NSX-T: BUM Replication with Two-Tier Hierarchical Replication Mode Deep Dive

In this article we shall validate that Hierarchical Replication Mode in NSX-T uses unicast messaging, rather than the layer-2 multicast seen with Hybrid Mode in NSX-V.

As part of my continued learning of NSX-T, I wanted to confirm a few things about how BUM (broadcast, unknown unicast and multicast) replication is handled. There are numerous blogs and articles about head-end vs hierarchical replication, the two modes used with NSX-T. Yet I couldn't find anything that demonstrates, at a packet level, whether hierarchical replication is unicast or layer-2 multicast in nature. I caught up with some colleagues who were of the opinion that all messaging happens via unicast channels, so I wanted to validate this.

First, the environment: I have two ESXi hosts and two KVM hosts across two TEP segments. The ESXi hosts are attached to 192.168.140.x and the KVM hosts are attached to 192.168.150.x. I attached VMs that reside on both of my KVM hosts and on ESXi to the web logical switch, so that from a TEP perspective all hosts would be members of it. See below.

What I've highlighted above is that, from the perspective of esxcomp-02a, 192.168.150.152 (kvmcomp-02a) is the remote MTEP in my deployment. So what does this mean? Hierarchical replication results in one host in each remote TEP segment being nominated as the MTEP, or replication node, for BUM traffic in that segment. The source host sends a single copy to that MTEP, which then replicates it to the other hosts in its own segment. This is very similar to Unicast Mode (UTEP) and Hybrid Mode (MTEP) with NSX-V, but unlike Hybrid Mode in NSX-V the messages are sent as unicast, not layer-2 multicast. What initially threw me off was the "Is MTEP" part of the output: when I see MTEP I immediately associate it with Hybrid Mode, or at least some sort of multicast messaging. That is not the case. It seems the UI is simply using the term MTEP when in fact we should be talking about UTEPs.
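To make the two-tier behaviour concrete, here is a minimal Python sketch of the replication plan as I understand it from the description above. It is only a model, not how NSX-T is implemented. The assumptions are mine: each TEP segment is a /24, esxcomp-02a has the TEP address 192.168.140.152 (the article only gives the other three addresses), and `pick_mtep` is a hypothetical stand-in for however NSX-T actually nominates the MTEP.

```python
from ipaddress import ip_address, ip_interface

PREFIX = 24  # assumption: every TEP segment is a /24


def segment(tep):
    """Return the subnet that a TEP address belongs to."""
    return ip_interface(f"{tep}/{PREFIX}").network


def highest_ip(members):
    """Hypothetical MTEP election: simply pick the highest address."""
    return max(members, key=ip_address)


def two_tier_copies(source, teps, pick_mtep=highest_ip):
    """List the (sender, receiver) unicast copies made for one BUM frame."""
    copies = []
    remote = {}  # remote segment -> TEPs in that segment
    for tep in teps:
        if tep == source:
            continue
        if segment(tep) == segment(source):
            # Tier 1: the source sends directly to TEPs in its own segment.
            copies.append((source, tep))
        else:
            remote.setdefault(segment(tep), []).append(tep)
    for members in remote.values():
        mtep = pick_mtep(members)
        # Tier 1: exactly one copy crosses to each remote segment.
        copies.append((source, mtep))
        # Tier 2: the MTEP replicates inside its own segment.
        copies.extend((mtep, tep) for tep in members if tep != mtep)
    return copies


TEPS = [
    "192.168.140.151",  # ESXi-01 (from the article)
    "192.168.140.152",  # esxcomp-02a (address assumed)
    "192.168.150.151",  # the other KVM host (from the article)
    "192.168.150.152",  # kvmcomp-02a, the MTEP (from the article)
]

for sender, receiver in two_tier_copies("192.168.140.151", TEPS):
    kind = "multicast" if ip_address(receiver).is_multicast else "unicast"
    print(f"{sender} -> {receiver} ({kind})")
```

With these inputs the plan contains one copy to the local ESXi peer, one copy to 192.168.150.152, and one copy from 192.168.150.152 to 192.168.150.151. Every destination is an ordinary host address, which is the point of the exercise.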

I went on to validate this with a packet capture on kvmcomp-02a, which you'll see below.

In this output I’ve highlighted a few key elements:

  1. Remember that GENEVE uses UDP port 6081.
  2. The first and second boxes show hierarchical replication happening. I had initiated a ping from Web-01 to 172.16.10.14 (web04), where Web-01 resides on ESXi-01 (192.168.140.151). As you can see, there's a message going out over my GENEVE tunnel to my MTEP (192.168.150.152), which in turn sends a unicast message to the other KVM host (192.168.150.151) on its segment.
  3. I then initiated a ping from Web-01 to Web-03, to demonstrate that traffic goes directly between VMs for which the NSX controllers already hold MAC/ARP entries.
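If you want to repeat the check on your own capture without reading it by eye, the two things to look at are the Geneve base header and the outer destination address. The sketch below is a rough illustration under my own assumptions: it decodes the fixed 8-byte Geneve header that follows the UDP header (field layout as defined in the Geneve specification) and labels an outer destination IP as unicast or multicast. The function names are mine, and the sample bytes and VNI 5000 are made up for the example rather than taken from the capture.

```python
import struct
from ipaddress import ip_address

GENEVE_PORT = 6081  # UDP destination port used by Geneve


def parse_geneve(udp_payload):
    """Decode the fixed 8-byte Geneve base header at the start of a UDP payload."""
    if len(udp_payload) < 8:
        raise ValueError("payload too short for a Geneve header")
    first, flags, protocol, vni_and_reserved = struct.unpack(
        "!BBHI", udp_payload[:8]
    )
    return {
        "version": first >> 6,               # top 2 bits
        "opt_len_bytes": (first & 0x3F) * 4,  # Opt Len counts 4-byte words
        "oam": bool(flags & 0x80),            # O bit
        "critical": bool(flags & 0x40),       # C bit
        "protocol": protocol,                 # 0x6558 = Ethernet payload
        "vni": vni_and_reserved >> 8,         # top 24 bits; low 8 are reserved
    }


def outer_delivery(dst_ip):
    """Classify the outer destination IP of a tunnelled packet."""
    addr = ip_address(dst_ip)
    if addr.is_multicast:
        return "multicast"
    if str(addr) == "255.255.255.255":
        return "broadcast"
    # Directed subnet broadcasts cannot be detected without the netmask.
    return "unicast"


# Made-up header: version 0, no options, Ethernet payload, VNI 5000.
sample = bytes([0x00, 0x00, 0x65, 0x58, 0x00, 0x13, 0x88, 0x00])
print(parse_geneve(sample))
print(outer_delivery("192.168.150.151"))
```

Applied to the capture described above, both the packet towards 192.168.150.152 and the replicated packet towards 192.168.150.151 would be labelled unicast, since neither address falls in the 224.0.0.0/4 multicast range.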

Finally, it's always worth referring back to the IETF documentation: https://tools.ietf.org/html/draft-gross-geneve-01.

In the Geneve draft, the following statements are made in Section 4.1.3, Broadcast and Multicast:

Geneve tunnels may either be point-to-point unicast between two endpoints or may utilize broadcast or multicast addressing. It is not required that inner and outer addressing match in this respect. For example, in physical networks that do not support multicast, encapsulated multicast traffic may be replicated into multiple unicast tunnels or forwarded by policy to a unicast location (possibly to be replicated there).

Hopefully you find this article valuable for cementing your understanding and knowledge. If there are any errors, please provide feedback via the comments, and good luck with your NSX-T journey.

Thanks

Bal Birdy
Bal is an Open Group Certified IT Architect, and VCDX #269, specializing in the network and security arena, with over 15 years experience in enterprise level network/system technologies. His goal has always been to maintain a holistic view of the architecture allowing him to understand how various technology streams may impact the networking/infrastructure space.
Bal has a proven record of delivering on enterprise network designs, leading data center and site migrations as a result of business mergers and acquisitions, and vendor migrations e.g. Cisco to Checkpoint/Juniper. As part of this he worked across several business sectors: Utilities, Banking, Retail and Government, and can base designs around sector specific standards e.g. PCI-DSS, DSD and ISM. He is proficient in several technology areas including Cisco, Juniper, F5, VMware, Citrix and Microsoft. These skills are supported by non-technical certifications: Prince2 Project Management Practitioner, ITILv3, TOGAF 9.1 Certified and Open Group Certified IT Architect – Open CA.
In addition to supporting the Livefire Team, Bal leads several innovation efforts within the VMware WRACE organization, including projects investigating the use of Virtual Reality/Augmented Reality, AI/ML and Interactive 360, to support customer and partner enablement.

Certifications:
BSc (Hons) Computer Science
CCNP/CCDP
VCDX-NV #269
Open Group Certified Architect
Member of the Association of Enterprise Architects
