NSX Replication Modes
When two VMs connected to different ESXi hosts need to communicate directly, unicast VXLAN-encapsulated traffic is exchanged between the VTEP IP addresses of their respective hypervisors. Traffic originated by a VM may also need to be sent to all the other VMs belonging to the same logical switch. Specific instances of this type of L2 traffic include broadcast, unknown unicast, and multicast. These multi-destination traffic types are collectively referred to using the acronym BUM (Broadcast, Unknown Unicast, Multicast).
In each of these scenarios, traffic originated by a given ESXi host must be replicated to multiple remote hosts. NSX supports three different replication modes to enable multi-destination communication on VXLAN-backed Logical Switches: multicast, hybrid and unicast.
- Multicast Mode (+PIM & +IGMP Snooping)
In this mode, a multicast IP address must be associated with each defined VXLAN L2 segment (i.e., Logical Switch). L2 multicast capability is used to replicate traffic to all VTEPs in the local segment (i.e., VTEP IP addresses that are part of the same IP subnet). Additionally, IGMP snooping should be configured on the physical switches to optimize the delivery of L2 multicast traffic. To ensure multicast traffic is also delivered to VTEPs in a different subnet from the source VTEP, the network administrator must configure PIM and enable L3 multicast routing.
- Hybrid Mode (+IGMP Snooping)
Hybrid mode offers operational simplicity similar to unicast mode – IP multicast routing configuration is not required in the physical network – while leveraging the L2 multicast capability of physical switches.
Hybrid mode allows deployment of NSX in large L2 topologies by helping scale multicast at layer 2 with the simplicity of unicast. It also allows scaling in L3 leaf-spine topologies without requiring PIM to forward multicast (BUM frames) beyond layer 3 ToR boundaries, while still allowing multicast replication in the physical switches for layer 2 BUM replication.
- Unicast Mode (-PIM & -IGMP Snooping)
Unicast replication mode does not require explicit configuration on the physical network to enable distribution of multi-destination VXLAN traffic. This mode can be used for small to medium size environments where BUM traffic is not high and NSX is deployed in an L2 topology where all VTEPs are in the same subnet.
Source from NSX Reference Design Version 3.0
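As a rough illustration of the three modes above, the sketch below encodes the physical-network prerequisites suggested by the labels in the list (+/- PIM, +/- IGMP Snooping) and returns the modes a given underlay could support. The class and function names are invented for this example and are not part of NSX. It also treats IGMP snooping as a hard requirement for multicast and hybrid modes, which is a simplification of the "should be configured" wording above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReplicationMode:
    name: str
    needs_igmp_snooping: bool  # L2 multicast optimization on the physical switches
    needs_pim: bool            # L3 multicast routing between VTEP subnets


# Prerequisites as labelled in the list above (assumption: treated as strict).
MODES = {
    "multicast": ReplicationMode("multicast", True, True),
    "hybrid": ReplicationMode("hybrid", True, False),
    "unicast": ReplicationMode("unicast", False, False),
}


def usable_modes(igmp_snooping: bool, pim: bool) -> List[str]:
    """Return the modes whose physical-network prerequisites are met."""
    return [
        mode.name
        for mode in MODES.values()
        if (igmp_snooping or not mode.needs_igmp_snooping)
        and (pim or not mode.needs_pim)
    ]
```

For example, `usable_modes(igmp_snooping=True, pim=False)` leaves hybrid and unicast as candidates, which matches the description of hybrid mode as not needing IP multicast routing in the physical network.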
NSX-T does not differentiate between the different kinds of frames replicated to multiple destinations. Broadcast, unknown unicast, or multicast traffic will be flooded in a similar fashion across a logical switch. In the overlay model, the replication of a frame to be flooded on a logical switch is orchestrated by the different NSX-T components. NSX-T provides two different methods for flooding traffic, described in the following sections. They can be selected on a per logical switch basis.
- Head-End Replication Mode
In the head-end replication mode, the transport node at the origin of the frame to be flooded on a logical switch sends a copy to each and every other transport node that is connected to this logical switch. In this mode, the burden of the replication rests entirely on the source hypervisor. Seven copies of the tunnel packet carrying the frame are sent over the uplink of “HV1”. This should be taken into account when provisioning the bandwidth on this uplink. This type of replication is typically necessary when the underlay is an L2 fabric.
The source hypervisor transport node knows about the groups based on the information it has received from the NSX-T Controller. It does not matter which transport node is selected to perform replication in the remote groups so long as the remote transport node is up and available.
- Two-tier Hierarchical
The two-tier hierarchical mode achieves scalability through a divide-and-conquer method. In this mode, hypervisor transport nodes are grouped according to the subnet of the IP address of their TEP. Hypervisor transport nodes in the same rack typically share the same subnet for their TEP IPs, though this is not mandatory.
The default two-tier hierarchical flooding mode is recommended as a best practice as it typically performs better in terms of physical uplink bandwidth utilization and reduces CPU utilization.
Source from VMware NSX-T Reference Design Version 2.0
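To make the difference between the two NSX-T modes concrete, here is a minimal sketch that counts how many copies the source transport node has to send over its uplink in each mode. It assumes that in two-tier mode the source sends one copy to each peer in its own TEP subnet plus one copy to a single transport node in every remote subnet group, which then replicates locally. The TEP addresses and the /24 prefix are made up for the example.

```python
import ipaddress
from collections import defaultdict

# Hypothetical TEP addresses: eight transport nodes spread over three subnets.
TEPS = [
    "10.0.1.1", "10.0.1.2", "10.0.1.3",
    "10.0.2.1", "10.0.2.2", "10.0.2.3",
    "10.0.3.1", "10.0.3.2",
]


def group_by_subnet(tep_ips, prefix_len=24):
    """Group TEP IPs by the subnet they belong to."""
    groups = defaultdict(list)
    for ip in tep_ips:
        network = ipaddress.ip_interface(f"{ip}/{prefix_len}").network
        groups[network].append(ip)
    return dict(groups)


def head_end_copies(tep_ips, source):
    """Head-end: one copy per other transport node on the logical switch."""
    return sum(1 for ip in tep_ips if ip != source)


def two_tier_copies(tep_ips, source, prefix_len=24):
    """Two-tier: one copy per local peer plus one copy per remote group."""
    groups = group_by_subnet(tep_ips, prefix_len)
    source_net = ipaddress.ip_interface(f"{source}/{prefix_len}").network
    local_peers = len(groups[source_net]) - 1
    remote_groups = len(groups) - 1
    return local_peers + remote_groups
```

With these eight sample TEPs, `head_end_copies(TEPS, "10.0.1.1")` gives seven copies, in line with the "HV1" example above, while `two_tier_copies(TEPS, "10.0.1.1")` gives four, which is the uplink saving the recommendation refers to.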
Nested vLab Network Environment
The following lab is based on the T0 deployment performed in Part 7.
The picture below summarizes the environment, which is composed of:
- A vSphere Compute Cluster (3 Hosts)
- A KVM Compute Node
- A vSphere Cluster that hosts the Edge Cluster (2 ESGs) with a T0 Router (DLR-S1-01)
- 3 Logical Switches routed by T0 DLR
- 4 Linux VMs (each hosted by a vSphere/KVM hypervisor)
Setup is performed with the NSX-T Advanced workflow
Create Logical Switches
Under Advanced Networking and Security | Switching | Switches click on +ADD
Create 3 new LS (LS-OL-WEB01, LS-OL-APP02 & LS-OL-DB03)
- Type an LS name
- Select Overlay Transport Zone
- Leave Replication Mode as Two-Tier
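The same three switches could in principle be created through the NSX-T Manager REST API instead of the UI. The sketch below only builds the request bodies. The field names and the `MTEP` / `SOURCE` values for the replication mode reflect my understanding of `POST /api/v1/logical-switches` and should be checked against the API guide for your NSX-T version. The transport zone ID is a placeholder.

```python
# Placeholder: replace with the ID of the Overlay Transport Zone.
OVERLAY_TZ_ID = "<overlay-transport-zone-id>"

# Assumed API values: MTEP = two-tier hierarchical, SOURCE = head-end.
REPLICATION_MODES = ("MTEP", "SOURCE")


def logical_switch_payload(name, transport_zone_id, replication_mode="MTEP"):
    """Build the JSON body for creating one overlay logical switch."""
    if replication_mode not in REPLICATION_MODES:
        raise ValueError(f"unknown replication mode: {replication_mode}")
    return {
        "display_name": name,
        "transport_zone_id": transport_zone_id,
        "replication_mode": replication_mode,
        "admin_state": "UP",
    }


PAYLOADS = [
    logical_switch_payload(name, OVERLAY_TZ_ID)
    for name in ("LS-OL-WEB01", "LS-OL-APP02", "LS-OL-DB03")
]
```

Each dictionary in `PAYLOADS` would be sent as the JSON body of an authenticated POST request to the NSX Manager.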
LS are created…
…and on vSphere Client | Networking view we can verify the new objects attached to Compute Cluster ESXi hosts
Attach LS to DLR
Under Advanced Networking and Security | Routers select DLR, click on Configuration | Router Ports | +ADD
Add 3 Router Ports, one for each LS
- Type a Router Port name
- Select Type as Downlink
- Select LS
- Type a Switch Port name
- Add an IP Address with a Netmask (this will be the default GW of the network)
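Since the Router Port address becomes the default gateway, every VM on that LS needs an address inside the same subnet. A small sanity check, assuming the 10.1.3.1 / 255.255.255.0 gateway used for the DB network later in this post, could look like this (the function names are invented for the example):

```python
import ipaddress


def downlink_interface(gateway_ip, netmask):
    """Combine the router port IP and netmask into one interface object."""
    return ipaddress.ip_interface(f"{gateway_ip}/{netmask}")


def can_use_gateway(vm_ip, gateway):
    """True if vm_ip sits in the gateway's subnet and is not the gateway itself."""
    address = ipaddress.ip_address(vm_ip)
    return address in gateway.network and address != gateway.ip


db_gateway = downlink_interface("10.1.3.1", "255.255.255.0")
```

Here `db_gateway.network` is `10.1.3.0/24`, so `can_use_gateway("10.1.3.30", db_gateway)` is true, while a web VM address such as 10.1.1.10 is rejected.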
The 3 new RPs are ready
Bind vSphere VMs on LS
On vSphere Client, change vNIC parameters for the VMs
web-1-10 –> LS-OL-WEB01
web-1-11 –> LS-OL-WEB01
app-1-20 –> LS-OL-APP02
Add KVM VM on LS
Download a qcow2 image from the CentOS site, scp it to the KVM host under /var/lib/libvirt/images, and copy it as db-s1-30.qcow2
[root@cen-s1-20 images]# ls -lrt
total 8839068
drwx------ 2 root root 16384 Mar 19 14:52 lost+found
-rw-r--r-- 1 root root 1128 Mar 19 23:11 guestinfo.xml
-rw-r--r-- 1 root root 8112766976 Mar 20 00:47 nsx-unified-appliance-220.127.116.11.0.12456291.qcow2
-rw-r--r-- 1 root root 938409984 Apr 4 20:56 CentOS-7-x86_64-GenericCloud.qcow2
[root@cen-s1-20 images]# cp CentOS-7-x86_64-GenericCloud.qcow2 db-s1-30.qcow2
Customize the CentOS image by removing cloud-init and setting the root password
[root@cen-s1-20 images]# virt-customize -a db-s1-30.qcow2 --root-password password:centos --uninstall cloud-init
[ 0.0] Examining the guest ...
[ 2.2] Setting a random seed
[ 2.2] Uninstalling packages: cloud-init
[ 3.9] Setting passwords
[ 4.9] Finishing off
Install the KVM VM
[root@cen-s1-20 images]# virt-install --import --vnc --name db-s1-30 --ram 1024 --vcpu 1 --disk path=/var/lib/libvirt/images/db-s1-30.qcow2,format=qcow2 --network bridge:nsx-managed,model=virtio,virtualport_type=openvswitch
WARNING No operating system detected, VM performance may suffer. Specify an OS with --os-variant for optimal results.
WARNING Graphics requested but DISPLAY is not set. Not running virt-viewer.
WARNING No console to launch for the guest, defaulting to --wait -1
Starting install...
Domain installation still in progress. Waiting for installation to complete.
Check VM status
[root@cen-s1-20 ~]# virsh list
 Id    Name                           State
----------------------------------------------------
 6     db-s1-30                       running
Now we need the VM interface ID in order to create a new NSX-T logical port
[root@cen-s1-20 images]# virsh dumpxml db-s1-30 | grep interfaceid
<parameters interfaceid='28d256cc-54ff-4f67-bbde-00ef845953f7'/>
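If the interface ID has to be collected for several VMs, parsing the XML is a little more robust than grep. The sketch below uses only the Python standard library. The sample XML is a trimmed, assumed excerpt of what `virsh dumpxml` returns for an Open vSwitch attached interface; only the `parameters` element and its `interfaceid` value are taken from the output above.

```python
import xml.etree.ElementTree as ET

# Assumed, trimmed excerpt of `virsh dumpxml db-s1-30` output.
SAMPLE_XML = """
<domain type='kvm'>
  <name>db-s1-30</name>
  <devices>
    <interface type='bridge'>
      <source bridge='nsx-managed'/>
      <virtualport type='openvswitch'>
        <parameters interfaceid='28d256cc-54ff-4f67-bbde-00ef845953f7'/>
      </virtualport>
      <model type='virtio'/>
    </interface>
  </devices>
</domain>
"""


def interface_ids(domain_xml):
    """Return every interfaceid found in a libvirt domain XML document."""
    root = ET.fromstring(domain_xml.strip())
    return [
        params.get("interfaceid")
        for params in root.iter("parameters")
        if params.get("interfaceid")
    ]
```

In practice the XML string would come from running `virsh dumpxml <domain>` on the KVM host rather than from a constant.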
Under Advanced Networking and Security | Switching | Ports click on +ADD
- Type a Switch Port name
- Select the right LS
- Select VIF as Attachment Type
- Paste the KVM VM Interface ID under Attachment ID
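For reference, the same port could be described as a request body for the NSX-T Manager API. This is only a sketch: the field names follow my understanding of `POST /api/v1/logical-ports` and should be verified against the API guide for your version, and the port name and logical switch ID below are placeholders.

```python
import uuid


def vif_port_payload(name, logical_switch_id, interface_id):
    """Build the JSON body for a logical port with a VIF attachment."""
    # Fail early if the pasted interface ID is not a well-formed UUID.
    uuid.UUID(interface_id)
    return {
        "display_name": name,
        "logical_switch_id": logical_switch_id,
        "admin_state": "UP",
        "attachment": {
            "attachment_type": "VIF",
            "id": interface_id,
        },
    }


payload = vif_port_payload(
    "db-s1-30-port",                      # hypothetical switch port name
    "<LS-OL-DB03-id>",                    # placeholder logical switch ID
    "28d256cc-54ff-4f67-bbde-00ef845953f7",
)
```

The UUID check catches the most common mistake in this step, which is pasting a truncated or otherwise malformed interface ID.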
The db-s1-30 KVM VM SP is ready!
Complete KVM VM setup
Log in on the db-s1-30 console
[root@cen-s1-20 ~]# virsh console db-s1-30
Connected to domain db-s1-30
Escape character is ^]
CentOS Linux 7 (Core)
Kernel 3.10.0-957.1.3.el7.x86_64 on an x86_64
localhost login: root
Password:
Set up the networking parameters (just for testing)
[root@localhost ~]# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet6 fe80::5054:ff:fe14:d118 prefixlen 64 scopeid 0x20<link>
        ether 52:54:00:14:d1:18 txqueuelen 1000 (Ethernet)
        RX packets 8 bytes 656 (656.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 26 bytes 6812 (6.6 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
[root@localhost ~]# ifconfig eth0 10.1.3.30 netmask 255.255.255.0 broadcast 10.1.3.255 up
[root@localhost ~]# route add default gw 10.1.3.1
[root@localhost ~]# hostnamectl set-hostname db-s1-30
Test reachability and routing
From web-s1-10 (10.1.1.10):
- ping web-s1-11 (10.1.1.11) on the same LS and a different vSphere host
- ping app-s1-20 (10.1.2.20) on a different LS and a different vSphere host
- ping db-s1-30 (10.1.3.30) on a different LS on the KVM host
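The three checks above can also be scripted. The sketch below assumes a Linux guest with the iputils `ping` command (`-c` for packet count, `-W` for the reply timeout in seconds) and Python available on web-s1-10. The `runner` parameter exists only so the logic can be exercised without sending real packets.

```python
import subprocess

# Targets taken from the list above.
TARGETS = {
    "web-s1-11": "10.1.1.11",
    "app-s1-20": "10.1.2.20",
    "db-s1-30": "10.1.3.30",
}


def ping_command(ip, count=3, timeout_s=2):
    """Build the ping command line for one target (Linux iputils syntax)."""
    return ["ping", "-c", str(count), "-W", str(timeout_s), ip]


def check_reachability(targets, runner=subprocess.run):
    """Ping every target and map its name to True (reachable) or False."""
    results = {}
    for name, ip in targets.items():
        completed = runner(
            ping_command(ip),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        results[name] = completed.returncode == 0
    return results
```

Calling `check_reachability(TARGETS)` from web-s1-10 should report all three targets as reachable if switching on the same LS and routing through the T0 router both work.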