In the last post we created a Logical Switch with two ports residing on different hypervisors. Communication between those two ports took place over the tunnel interface using Geneve encapsulation. Let’s now take a closer look at this overlay traffic.
Without diving too much into the packet processing in OVN, we need to know that each Logical Datapath (Logical Switch / Logical Router) has an ingress and an egress pipeline. Whenever a packet comes in, the ingress pipeline is executed and after the output action, the egress pipeline will run to deliver the packet to its destination. More info here: http://docs.openvswitch.org/en/latest/faq/ovn/#ovn
In our scenario, when we ping from VM1 to VM2, the ingress pipeline of each ICMP packet runs on Worker1 (where VM1 is bound) and the packet is pushed through the tunnel interface to Worker2 (where VM2 resides). When Worker2 receives the packet on its physical interface, the egress pipeline of the Logical Switch (network1) is executed to deliver the packet to VM2. But how does OVN know where the packet comes from and which Logical Datapath should process it? This is where the metadata in the Geneve headers comes in.
Let’s get back to our setup and ping from VM1 to VM2 and capture traffic on the physical interface (eth1) of Worker2:
[root@worker2 ~]# sudo tcpdump -i eth1 -vvvnnexx
17:02:13.403229 52:54:00:13:e0:a2 > 52:54:00:ac:67:5b, ethertype IPv4 (0x0800), length 156: (tos 0x0, ttl 64, id 63920, offset 0, flags [DF], proto UDP (17), length 142)
192.168.50.100.7549 > 192.168.50.101.6081: [bad udp cksum 0xe6a5 -> 0x7177!] Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options [class Open Virtual Networking (OVN) (0x102) type 0x80(C) len 8 data 00010002]
40:44:00:00:00:01 > 40:44:00:00:00:02, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 41968, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.0.11 > 192.168.0.12: ICMP echo request, id 1251, seq 6897, length 64
0x0000: 5254 00ac 675b 5254 0013 e0a2 0800 4500
0x0010: 008e f9b0 4000 4011 5a94 c0a8 3264 c0a8
0x0020: 3265 1d7d 17c1 007a e6a5 0240 6558 0000
0x0030: 0100 0102 8001 0001 0002 4044 0000 0002
0x0040: 4044 0000 0001 0800 4500 0054 a3f0 4000
0x0050: 4001 1551 c0a8 000b c0a8 000c 0800 c67b
0x0060: 04e3 1af1 94d9 6e5c 0000 0000 41a7 0e00
0x0070: 0000 0000 1011 1213 1415 1617 1819 1a1b
0x0080: 1c1d 1e1f 2021 2223 2425 2627 2829 2a2b
0x0090: 2c2d 2e2f 3031 3233 3435 3637
17:02:13.403268 52:54:00:ac:67:5b > 52:54:00:13:e0:a2, ethertype IPv4 (0x0800), length 156: (tos 0x0, ttl 64, id 46181, offset 0, flags [DF], proto UDP (17), length 142)
192.168.50.101.9683 > 192.168.50.100.6081: [bad udp cksum 0xe6a5 -> 0x6921!] Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options [class Open Virtual Networking (OVN) (0x102) type 0x80(C) len 8 data 00020001]
40:44:00:00:00:02 > 40:44:00:00:00:01, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 16422, offset 0, flags [none], proto ICMP (1), length 84)
192.168.0.12 > 192.168.0.11: ICMP echo reply, id 1251, seq 6897, length 64
0x0000: 5254 0013 e0a2 5254 00ac 675b 0800 4500
0x0010: 008e b465 4000 4011 9fdf c0a8 3265 c0a8
0x0020: 3264 25d3 17c1 007a e6a5 0240 6558 0000
0x0030: 0100 0102 8001 0002 0001 4044 0000 0001
0x0040: 4044 0000 0002 0800 4500 0054 4026 0000
0x0050: 4001 b91b c0a8 000c c0a8 000b 0000 ce7b
0x0060: 04e3 1af1 94d9 6e5c 0000 0000 41a7 0e00
0x0070: 0000 0000 1011 1213 1415 1617 1819 1a1b
0x0080: 1c1d 1e1f 2021 2223 2425 2627 2829 2a2b
0x0090: 2c2d 2e2f 3031 3233 3435 3637
Let’s now decode the ICMP request packet (I’m using this tool):
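The outer headers can also be walked by hand to find where the Geneve header starts. The following is a minimal Python sketch, not part of the original setup: it takes the first 64 bytes of the ICMP request frame from the hex dump above and assumes an untagged Ethernet frame carrying IPv4 and UDP. The function name `find_geneve_offset` is made up for this illustration.

```python
import struct

# First 64 bytes of the ICMP request frame, copied from the tcpdump hex dump.
FRAME_HEX = """
5254 00ac 675b 5254 0013 e0a2 0800 4500
008e f9b0 4000 4011 5a94 c0a8 3264 c0a8
3265 1d7d 17c1 007a e6a5 0240 6558 0000
0100 0102 8001 0001 0002 4044 0000 0002
"""

GENEVE_UDP_PORT = 6081  # destination port seen in the capture


def find_geneve_offset(frame: bytes) -> int:
    """Return the byte offset of the Geneve header inside the outer frame.

    Assumes untagged Ethernet + IPv4 + UDP, as in the capture.
    """
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != 0x0800:
        raise ValueError("outer frame is not IPv4")

    ip_start = 14
    ip_header_len = (frame[ip_start] & 0x0F) * 4  # IHL is in 32-bit words
    if frame[ip_start + 9] != 17:
        raise ValueError("outer packet is not UDP")

    udp_start = ip_start + ip_header_len
    _src_port, dst_port = struct.unpack_from("!HH", frame, udp_start)
    if dst_port != GENEVE_UDP_PORT:
        raise ValueError("UDP destination port is not the Geneve port")

    return udp_start + 8  # the UDP header is always 8 bytes


frame = bytes.fromhex("".join(FRAME_HEX.split()))
offset = find_geneve_offset(frame)
print(f"Geneve header starts at offset 0x{offset:02x}")
print("Geneve header + option:", frame[offset:offset + 16].hex(" "))
```

For this frame the Geneve header begins right after the 14-byte Ethernet, 20-byte IPv4 and 8-byte UDP headers.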
In the ovn-architecture(7) document, you can check how the Metadata is used in OVN in the Tunnel Encapsulations section. In short, OVN encodes the following information in the Geneve packets:
- Logical Datapath (switch/router) identifier (24 bits) – Geneve VNI
- Ingress and Egress port identifiers – Option with class 0x0102 and type 0x80 with 32 bits of data:
      1       15          16
    +---+------------+-----------+
    |rsv|ingress port|egress port|
    +---+------------+-----------+
      0
Back to our example: VNI = 0x000001 and Option Data = 00010002, so from the above:
Logical Datapath = 1
Ingress Port = 1
Egress Port = 2
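The same values can be extracted programmatically. Below is a minimal Python sketch that unpacks the 16 bytes of Geneve header and option taken from the ICMP request capture (offsets 0x2a to 0x39 of the hex dump). It assumes a single option, as in this capture, and the function name `decode_ovn_geneve` is made up for this illustration.

```python
import struct

# Geneve base header (8 bytes) + one option (8 bytes) from the ICMP request.
GENEVE_HEX = "0240 6558 0000 0100 0102 8001 0001 0002"


def decode_ovn_geneve(raw: bytes) -> dict:
    """Decode the Geneve base header and the first option, read as OVN metadata.

    Assumes exactly one option follows the base header, as in the capture.
    """
    ver_optlen, flags, protocol = struct.unpack_from("!BBH", raw, 0)
    vni = int.from_bytes(raw[4:7], "big")  # 24-bit VNI, then 8 reserved bits

    opt_class, opt_type, opt_len = struct.unpack_from("!HBB", raw, 8)
    (data,) = struct.unpack_from("!I", raw, 12)  # 32 bits of option data

    return {
        "version": ver_optlen >> 6,
        "options_len_bytes": (ver_optlen & 0x3F) * 4,
        "critical": bool(flags & 0x40),
        "protocol": protocol,
        "logical_datapath": vni,
        "opt_class": opt_class,
        "opt_type": opt_type,
        "opt_data_len_bytes": (opt_len & 0x1F) * 4,
        # Layout: 1 reserved bit, 15-bit ingress port, 16-bit egress port.
        "ingress_port": (data >> 16) & 0x7FFF,
        "egress_port": data & 0xFFFF,
    }


meta = decode_ovn_geneve(bytes.fromhex(GENEVE_HEX))
print(meta["logical_datapath"], meta["ingress_port"], meta["egress_port"])
```

Feeding it the bytes from the ICMP reply (`0240 6558 0000 0100 0102 8001 0002 0001`) swaps the two port identifiers, as expected for traffic flowing in the opposite direction.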
Let’s take a look at SB database contents to see if they match what we expect:
[root@central ~]# ovn-sbctl get Datapath_Binding network1 tunnel-key
1
[root@central ~]# ovn-sbctl get Port_Binding vm1 tunnel-key
1
[root@central ~]# ovn-sbctl get Port_Binding vm2 tunnel-key
2
We can see that the Logical Datapath corresponds to network1, that the ingress port is vm1, and that the egress port is vm2, which makes sense as we’re analyzing the ICMP request from VM1 to VM2.
By the time this packet hits the Worker2 hypervisor, OVN has all the information it needs to process the packet in the right pipeline and deliver it to VM2 without having to run the ingress pipeline again.
What if we don’t use any encapsulation?
This is technically possible in OVN, and there are scenarios that call for it, such as when we manage a physical network directly and don’t use any kind of overlay technology. In this case, our ICMP request packet would have been pushed directly onto the network, and when Worker2 receives it, OVN needs to figure out (based on the IP/MAC addresses) which ingress pipeline to execute. The ingress pipeline therefore runs twice, since Worker1 already executed it, before the packet can go to the egress pipeline and be delivered to VM2.