Reachability failure is a very generic term for a support engineer until he or she digs deeper to find the root cause. In this blog, my aim is to highlight a design scenario in which a feature, when enabled, can cause an NSX Edge internal interface to stop responding to ping.
Let’s first set the scene with the NSX design scenario: NSX Edges running in ECMP mode, connected to physical routers upstream and a Distributed Logical Router (DLR) downstream. This deployment model is simple and very common, found in customer environments and in HOL, where you practice NSX functionality.
As the diagram below shows, a transit VXLAN segment connects the two ECMP Edge nodes and the DLR. The internal interfaces of the ECMP Edge nodes and the DLR uplink share the same IP subnet (they are part of one broadcast segment):
In this scenario, a support engineer tries to reach the internal interface of an ECMP Edge device from an outside network (say, the HOL control centre), and the ping fails.
He or she finds that, to reach the internal interface IP address of Edge E1, the packet is routed to E8 first rather than to E1 directly, as depicted in the image below:
Points to Note
In the routing world, when there are multiple paths to a destination, the router always chooses the best path if the paths are not equal, and load-shares across them using a hashing algorithm if the destination is reachable via two or more equal-cost paths.
With multiple NSX Edges running in ECMP mode, routing is asymmetric in nature: the North-South path to a destination IP address may go through Edge 1, while the return South-North path may go through Edge 2. For this reason, we do not run stateful services on ECMP Edge nodes.
In the current design, the data center fabric runs an ECMP leaf-and-spine architecture, so the path chosen to reach the destination depends on the hash algorithm used in the fabric and can land on any Edge. In the present scenario, the support engineer found that the packet takes the following path:
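To make the two points above concrete, here is a minimal Python sketch of hash-based load sharing. It is only an illustration: the function name `pick_edge`, the IP addresses, and the use of SHA-256 over the source and destination addresses are my assumptions, not the hash that NSX or any physical fabric actually uses. What it shows is that each direction of a conversation is hashed on its own, so nothing forces the return traffic through the same Edge as the forward traffic.

```python
import hashlib

def pick_edge(src_ip, dst_ip, edges):
    """Pick one of several equal-cost next hops by hashing the flow.

    Hypothetical hash: real devices use their own (often 5-tuple) hash.
    """
    key = f"{src_ip}|{dst_ip}".encode()
    digest = hashlib.sha256(key).digest()
    # Map the hash onto the list of equal-cost next hops.
    return edges[int.from_bytes(digest[:4], "big") % len(edges)]

EDGES = ["E1", "E2"]  # hypothetical equal-cost Edge nodes

# Forward (North-South) and return (South-North) flows are hashed separately.
forward_edge = pick_edge("192.0.2.50", "172.16.10.20", EDGES)
return_edge = pick_edge("172.16.10.20", "192.0.2.50", EDGES)

print("forward path uses", forward_edge)
print("return path uses", return_edge)
```

The same flow always maps to the same Edge, but the forward and return flows are free to land on different Edges, which is why stateful services and ECMP do not mix.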
- The packet is received by Physical Router 2 and routed to E8 (NSX Edge)
- The E8 Edge node looks at the destination IP of the packet, finds that the destination subnet is directly connected, and routes the packet out of its connected internal interface after rewriting the Layer 2 destination MAC address.
- E1 receives the packet and simply discards it without processing it.
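The forwarding decision E8 makes in the second step can be sketched as a longest-prefix match. This is a rough Python illustration, not NSX code: the routing table, the subnets, and the address used for E1's internal interface are all made-up values for the example.

```python
import ipaddress

# Hypothetical routing table of E8: (prefix, outgoing interface, how it is reached)
E8_ROUTES = [
    ("0.0.0.0/0", "uplink", "via physical router"),
    ("192.168.5.0/29", "internal", "directly connected"),  # transit segment
    ("172.16.0.0/16", "internal", "via DLR"),
]

def forward(routes, dst_ip):
    """Return (interface, how) for the most specific route matching dst_ip."""
    addr = ipaddress.ip_address(dst_ip)
    matches = [
        (ipaddress.ip_network(prefix), iface, how)
        for prefix, iface, how in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    # Longest prefix wins; the default route guarantees at least one match here.
    _, iface, how = max(matches, key=lambda m: m[0].prefixlen)
    return iface, how

E1_INTERNAL_IP = "192.168.5.1"  # assumed address of E1's internal interface

print(forward(E8_ROUTES, E1_INTERNAL_IP))
```

Because the transit subnet is directly connected on E8, the connected route is more specific than the default route, and E8 sends the packet out of its internal interface straight towards E1.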
So routing is doing its job perfectly. What, then, is the reason for the failure?
When you look closely at the configuration of the internal interface on NSX Edge E1, you find that the Reverse Path Filter feature is Enabled, as shown in the screen capture below, and this is what causes the packet to be dropped. Alright, but what does this feature do, and is it really important to have it enabled?
Let’s put a spotlight on what this feature means and why it is important.
Reverse Path Filter, also known as Unicast RPF (Reverse Path Forwarding), is a security feature that limits malicious traffic inside the data centre network. When it is enabled, the Edge verifies the reachability of the source IP address of every packet it receives for forwarding or routing. To do so, it looks up the route to the source IP address or subnet and checks which outgoing interface that source is reached through. If the Edge finds that the source of the packet is reachable via interface X but the packet arrived on interface Y, it simply discards the packet and does not route it. This limits the appearance of spoofed addresses on a network.
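The check described above can be sketched in a few lines of Python. Treat this as an illustration of the idea only: the routing table of E1, the interface names, and the addresses are assumptions I made up for the example, and the real Edge performs this check in its own forwarding path.

```python
import ipaddress

def lookup_interface(routes, ip):
    """Longest-prefix match: return the interface used to reach ip, or None."""
    addr = ipaddress.ip_address(ip)
    best = None
    for prefix, iface in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface)
    return best[1] if best else None

def strict_rpf_accepts(routes, src_ip, in_iface):
    """Enabled (strict) mode: the packet must arrive on the interface
    that the Edge itself would use to reach the packet's source."""
    return lookup_interface(routes, src_ip) == in_iface

# Hypothetical routing table of E1: (prefix, outgoing interface)
E1_ROUTES = [
    ("0.0.0.0/0", "uplink"),          # everything outside the DC
    ("192.168.5.0/29", "internal"),   # transit segment shared with E8 and DLR
    ("172.16.0.0/16", "internal"),    # workloads behind the DLR
]

CONTROL_CENTRE = "192.0.2.50"  # assumed outside source address

# Ping arrives directly on E1's uplink: source is reachable via uplink.
print(strict_rpf_accepts(E1_ROUTES, CONTROL_CENTRE, "uplink"))
# Same ping relayed by E8 over the transit segment: arrives on "internal".
print(strict_rpf_accepts(E1_ROUTES, CONTROL_CENTRE, "internal"))
```

The second call models the failing ping in this blog: E1 would reach the control centre through its uplink, but the packet shows up on the internal interface, so the strict check rejects it.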
Reverse Path Filter provides three options: Enabled, Disabled, and Loose.
We know what Enabled means, and Disabled turns the feature off. The third option, Loose, is what its name suggests: it is loose and does not perform the full check. It asks a single question: can the Edge reach the source IP address or subnet of the received packet through any of its interfaces, whether upstream or downstream? If the answer is “yes, it can”, the Edge routes or processes the packet.
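As a rough Python sketch of the Loose behaviour, assuming a made-up routing table and addresses (none of this is taken from an actual Edge), the check shrinks to "is there any route at all back to the source?":

```python
import ipaddress

def loose_rpf_accepts(routes, src_ip):
    """Loose mode: accept if any route covers the source address,
    regardless of which interface the packet arrived on."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in ipaddress.ip_network(prefix) for prefix, _ in routes)

# Hypothetical routing tables: (prefix, outgoing interface)
ROUTES_WITH_DEFAULT = [
    ("0.0.0.0/0", "uplink"),
    ("192.168.5.0/29", "internal"),
]
ROUTES_WITHOUT_DEFAULT = [
    ("192.168.5.0/29", "internal"),
    ("172.16.0.0/16", "internal"),
]

OUTSIDE_SOURCE = "192.0.2.50"  # assumed outside source address

# A default route covers every source, so Loose accepts the packet.
print(loose_rpf_accepts(ROUTES_WITH_DEFAULT, OUTSIDE_SOURCE))
# With no default, summary, or specific route to the source, Loose drops it.
print(loose_rpf_accepts(ROUTES_WITHOUT_DEFAULT, OUTSIDE_SOURCE))
```

Note that once a default route is present, this simplified Loose check accepts every source address, so in the sketch it only filters traffic when the routing table has no route of any kind back to the source.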
It is not recommended to disable Reverse Path Filter as it helps in fighting malicious traffic within DC fabric.
7 thoughts on “NSX Edge Internal Interface Reachability failure”
Thanks for the great article. So, as “Disable” is not preferred, is the recommendation to use “Loose” for RPF on all interfaces of ECMP Edges?
Thanks, Mohamad. This feature has its major benefit when it is set to Enabled. Loose can be your second-best option, to be used only when it is really necessary. Keep in mind that security is a major concern for today’s DC, and this feature helps prevent the routing of IP packets whose source address has been forged in a man-in-the-middle attack. We have advanced, but this feature is still used in physical fabric design. So in my design I would keep it in Enabled mode.
I got you. Thanks a lot for the clarification, and keep up the good work.
With the Cisco implementation of loose mode, a packet will be dropped if it is sourced from the RFC1918 private range, which can be dangerous in a DC environment.
In the case of NSX, under what conditions will a packet be dropped in loose mode?
Thank you, Syed, for your comments. The NSX implementation of Reverse Path Forwarding with Loose is different and does not block or drop packets sourced from the RFC1918 private address range. As far as my understanding goes, the Loose setting drops a packet only when the ESG has no specific subnet route, summary route, or default route for the packet source in its routing table.
Hi, Amit. You’ve just demonstrated how ESG might drop legitimate traffic, when RP filter is enabled. If that’s the case, why do you still recommend to leave it enabled?
Sorry I could not catch up with your query earlier.
URPF is a security feature that has been used at the edge of the data center for a decade. It stops a packet from entering on a router interface that the router would never use to route traffic back to that packet’s source. In summary, if the source is an internal network that the Edge learned via the transit interface connecting to the DLR, the router refuses traffic on its uplink interface that claims to be sourced (forged) from those internal subnets.
In my scenario, the source (control center) sits outside the DC network and initiates a connection to an internal interface of the Edge for management, monitoring, and so on. We have multiple Edges forming ECMP, which means the traffic received by those Edges via their uplinks will not be symmetric in nature. If the traffic is routed to the Edge we are trying to manage, the connection is established without any issues. If it is received on a different Edge, that Edge lets the connection in and routes it out of its directly connected transit interface towards the appropriate Edge (all Edges share the transit subnet). But when the packet is received on the internal interface of the destination Edge, it is blocked, because according to that Edge’s forwarding table the source of the packet is reachable via the uplink and not via the internal transit interface.
I have explained how it works, and the way it works itself explains why you want it enabled. Personally, when designing, I make sure to keep it enabled in strict mode if the design does nothing similar to what is described in this blog, or otherwise in loose mode, which still gives basic security given the default route.
In case you have further questions, we can coordinate offline.