Improved support for p2p tunnels #1025
Changes from all commits
01a5ab0
89107ff
47b29aa
f203747
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
# Introduction | ||
As of SAI version 1.5, the “tunnel” object has a point-to-multipoint (p2mp) connotation: it holds the VTEP source IP (SIP) but no destination IP (DIP). The DIP is specified as part of the FDB entry or as part of the Next Hop entry. | ||
|
||
It is proposed to add the DIP as part of the `sai_tunnel_attr_t` structure as an optional parameter. | ||
|
||
The following are the motivations for introducing a DIP in the tunnel structure and to model it as a p2p entity. | ||
|
||
- Support Head End Replication as the method for handling BUM traffic. | ||
Review comment: There is an API to configure the flood vector for BUM traffic:
Review comment: It can also be done per remote IP.
|
||
- Support per remote IP and per remote IP+VNI Tx and Rx counters. | ||
- Support the notion of operational status per remote IP. | ||
- Support flushing FDB per remote IP. | ||
- Support learning enable/disable per tunnel. | ||
Comment on lines +9 to +12:
Review comment: For these 4 items above, a P2P tunnel can be created with the current API as well. By creating multiple tunnels with the same SRC_IP, the same functionality will be achieved.
Review comment: Could you please elaborate with examples?
Review comment: Please see below.
Review comment: The get attribute call will return an OID that is not created by the application. In the case of SONiC, this means a VID has to be created in syncd specifically for this get attribute call. This complicates the application, in addition to the bookkeeping associated with the L2MC group. The changes suggested in this PR, together with the use of VLAN_MEMBER_PORT for dot1q bridges and bridge ports for dot1d bridges, suffice to specify the flood vector. If there is concern about adding DESTIP and P2P/P2MP flags to the tunnel object because the tunnel terminator already has them, then allowing a new bridge port type of tunnel_terminator would also achieve the goal of specifying the flood vector without explicitly using the L2MC group.
||
|
||
# Head End Replication | ||
|
||
All the remote members as well as the local members are part of a broadcast domain. | ||
|
||
In a generic scenario the following cases are applicable. | ||
- Forwarding from local to remote members, remote to local members, local to local members should be allowed. | ||
- Forwarding from remote to remote members should not be allowed. | ||
- In a DCI case forwarding from the DCI tunnels to intra-DC tunnels and vice versa should be allowed. | ||
- Forwarding from one remote to another remote needs to be controlled by per-member configuration, such as a hub/spoke role or a generic split horizon group. | ||
- The remote end points can use any encapsulation, such as VXLAN, MPLS, or L2GRE, and a single broadcast domain could have local members as well as remote members of different encapsulations. | ||
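The forwarding cases above can be sketched as a small flood-filter model. This is only an illustration under stated assumptions: the types `member_kind_t` and `bd_member_t`, the function `may_flood`, and the convention that remotes with equal group values do not forward to each other are all hypothetical and are not part of the SAI headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; none of these names come from the SAI headers. */
typedef enum { MEMBER_LOCAL, MEMBER_REMOTE } member_kind_t;

typedef struct {
    member_kind_t kind;
    /* Assumed convention: remote members holding the same group value
     * (including the default 0) do not forward to each other. */
    int isolation_group;
} bd_member_t;

/* Returns true if a BUM frame received on src may be replicated to dst. */
static bool may_flood(const bd_member_t *src, const bd_member_t *dst)
{
    assert(src != NULL && dst != NULL);

    /* Never reflect a frame back to the member it arrived on. */
    if (src == dst) {
        return false;
    }
    /* local->local, local->remote and remote->local are always allowed. */
    if (src->kind == MEMBER_LOCAL || dst->kind == MEMBER_LOCAL) {
        return true;
    }
    /* remote->remote: allowed only across different groups,
     * for example DCI tunnels versus intra-DC tunnels. */
    return src->isolation_group != dst->isolation_group;
}
```

In this sketch, placing the intra-DC tunnels in one group and the DCI tunnels in another gives the DCI behaviour from the list, while two remotes left in the same group are kept apart.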
|
||
To handle all these scenarios it is proposed to model the remote members on similar lines and as a point to point entity. | ||
|
||
- For a dot1q bridge, | ||
- A tunnel is created for each remote passing the DIP as the newly introduced attribute. | ||
- A bridge_port is created with the following attributes. | ||
- Bridge port type as `SAI_BRIDGE_PORT_TYPE_TUNNEL` | ||
- `SAI_BRIDGE_PORT_ATTR_TUNNEL_ID` as the tunnel created above. | ||
- `SAI_BRIDGE_PORT_ATTR_BRIDGE_ID` as the default dot1q bridge. | ||
- `SAI_BRIDGE_PORT_ATTR_ISOLATION_GROUP` can be set appropriately depending on the application. | ||
- vlan_member objects are created with bridge port attribute as the above created entity. | ||
|
||
- For a dot1d bridge, the steps are similar except that the bridge port is created for each dot1d bridge, using the same tunnel created. There are no vlan_members created as the bridgeports serve that purpose. | ||
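The bridge port steps above can be sketched as building an attribute list. To keep the example self-contained, it uses stand-in types rather than the real SAI headers: `demo_attr_t`, the `DEMO_*` identifiers, and `build_tunnel_bridge_port_attrs` are all hypothetical, and the comments only point at the SAI attribute names mentioned in the text. A real caller would pass such a list to the bridge API's create call instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in attribute ids (hypothetical). The comments name the SAI
 * attributes from the proposal that each one mirrors. */
typedef enum {
    DEMO_ATTR_TYPE,            /* bridge port type */
    DEMO_ATTR_TUNNEL_ID,       /* SAI_BRIDGE_PORT_ATTR_TUNNEL_ID */
    DEMO_ATTR_BRIDGE_ID,       /* SAI_BRIDGE_PORT_ATTR_BRIDGE_ID */
    DEMO_ATTR_ISOLATION_GROUP  /* SAI_BRIDGE_PORT_ATTR_ISOLATION_GROUP */
} demo_attr_id_t;

typedef struct {
    demo_attr_id_t id;
    uint64_t value;
} demo_attr_t;

#define DEMO_TYPE_TUNNEL 1u /* stands in for SAI_BRIDGE_PORT_TYPE_TUNNEL */
#define DEMO_NULL_OID 0u    /* "no object" marker, an assumption of this sketch */

/* Fills out[] with the attributes of one tunnel bridge port and returns
 * how many were written. The isolation group is optional. */
static size_t build_tunnel_bridge_port_attrs(uint64_t tunnel_oid,
                                             uint64_t bridge_oid,
                                             uint64_t isolation_group_oid,
                                             demo_attr_t *out, size_t cap)
{
    assert(out != NULL && cap >= 4);

    size_t n = 0;
    out[n++] = (demo_attr_t){ DEMO_ATTR_TYPE, DEMO_TYPE_TUNNEL };
    out[n++] = (demo_attr_t){ DEMO_ATTR_TUNNEL_ID, tunnel_oid };
    out[n++] = (demo_attr_t){ DEMO_ATTR_BRIDGE_ID, bridge_oid };
    if (isolation_group_oid != DEMO_NULL_OID) {
        out[n++] = (demo_attr_t){ DEMO_ATTR_ISOLATION_GROUP, isolation_group_oid };
    }
    return n;
}
```

For a dot1q bridge the function would be called once per remote with the default dot1q bridge. For dot1d it would be called once per bridge, reusing the same tunnel object id each time.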
|
||
# Per Remote IP Tx and Rx Counters | ||
|
||
By modeling the tunnel as a p2p entity the `sai_get_tunnel_stats_ext_fn` can be used to fetch the per remote IP stats. | ||
|
||
# Per VNI Per Remote IP Tx and Rx counters | ||
|
||
In a dot1d bridge case the bridgeport counters can also be used to fetch per remote VNI, per remote IP counters. | ||
|
||
# Per Remote IP Operational Status | ||
Review comment: The description seems too vague. If there are not enough resources, SAI should return an error on creation. What attribute is proposed to be used for that?
Review comment: The mechanism to declare an operational status as up or down is vendor dependent, hence it seems a little vague.
Review comment: This reflects the hardware programming/forwarding status of the corresponding tunnel. Hence, SAI will indicate whether this specific tunnel is programmed in the hardware (and traffic can be forwarded).
||
|
||
With p2p modelling, an operational status can be associated with each remote IP. The operational status could be based on underlay IP reachability to the remote IP, other device-specific constraints, or resource availability considerations. | ||
|
||
# Per Remote IP FDB Flush | ||
|
||
With a p2p modelling it is possible to flush the FDB entries associated with a remote IP. The `SAI_FDB_FLUSH_ATTR_BRIDGE_PORT_ID` can be re-used for tunnel bridgeports when modelled as p2p. | ||
|
||
The flush might happen, | ||
- As a result of operational status going down as above. | ||
- Admin initiated. | ||
- Due to a VXLAN BFD session going down. | ||
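The effect of a per remote IP flush can be illustrated with a toy FDB table keyed by bridge port. This is a sketch of the semantics only, not of any SAI implementation: `fdb_entry_t` and `flush_by_bridge_port` are hypothetical names, and the bridge port value stands in for the object id of the tunnel bridge port that represents one remote IP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical FDB entry for this model. */
typedef struct {
    uint64_t mac;
    uint64_t bridge_port; /* stand-in for a bridge port object id */
    bool in_use;
} fdb_entry_t;

/* Invalidates every in-use entry learned on the given bridge port and
 * returns the number of entries flushed. */
static size_t flush_by_bridge_port(fdb_entry_t *table, size_t n,
                                   uint64_t bridge_port)
{
    assert(table != NULL || n == 0);

    size_t flushed = 0;
    for (size_t i = 0; i < n; i++) {
        if (table[i].in_use && table[i].bridge_port == bridge_port) {
            table[i].in_use = false;
            flushed++;
        }
    }
    return flushed;
}
```

Because each remote IP maps to its own tunnel bridge port in the p2p model, one call covers exactly the MACs learned from that remote and leaves entries for other remotes untouched.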
|
||
# Per Remote IP Learning enable/disable | ||
|
||
Another benefit is the ability to enable or disable learning per tunnel bridge port. Learning needs to be enabled for static tunnels and disabled for EVPN tunnels. |
Review comment: How can the P2P tunnel model be applied to a Next Hop entry?
Review comment: The SAI nexthop can refer to the p2p tunnel id, tunnel VNI and MAC just as it does today for p2mp. However, the p2p tunnel is not expected to replace the p2mp tunnel; nexthop programming and FDB are expected to continue to use the p2mp version.