[EVPN-MH] Add kernel patches for EVPN VXLAN Multihoming support #540

Open
bdfriedman wants to merge 1 commit into sonic-net:master from bdfriedman:evpn_mh

@bdfriedman bdfriedman commented Feb 25, 2026

Why I did it

This PR adds three critical Linux kernel patches required to enable EVPN VXLAN Multihoming in SONiC. These kernel enhancements provide the necessary infrastructure for:

  1. Extended neighbor flags for multi-homing peer synchronization
  2. Protocol field tracking in bridge FDB entries to distinguish control plane vs data plane learned MACs
  3. External validation flag to prevent kernel from invalidating externally managed neighbor entries

These patches are essential for implementing the EVPN-MH feature as described in the EVPN VXLAN Multihoming HLD.

Work item tracking
  • Microsoft ADO (number only):

How I did it

Added three kernel patches to the patches-sonic directory:

1. NDA_FLAGS_EXT Support with NTF_EXT_MH_PEER_SYNC (0001-vxlan-bridge-Add-NDA_FLAGS_EXT-support-with-NTF_EXT_.patch)

This patch adds extended flags support for VXLAN and bridge FDB entries to enable multi-homing peer synchronization:

  • New field: ext_flags in vxlan_fdb structure
  • New flag: NTF_EXT_MH_PEER_SYNC - Indicates FDB entry is synchronized across EVPN-MH peers
  • New neighbor update flag: NEIGH_UPDATE_F_EXT_MH_PEER_SYNC for propagating sync state
  • Modified functions:
    • vxlan_fdb_alloc() - Initialize ext_flags
    • vxlan_fdb_create() - Pass ext_flags parameter
    • vxlan_fdb_update_existing() - Handle ext_flags updates and notifications
    • vxlan_fdb_update_create() - Create FDB with ext_flags
    • vxlan_fdb_info() - Include NDA_FLAGS_EXT in netlink messages
    • Bridge FDB functions - Propagate ext_flags through bridge layer
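
To make the netlink side of this concrete, the following Python sketch packs and unpacks an NDA_FLAGS_EXT attribute as an ordinary netlink TLV (16-bit length, 16-bit type, payload padded to a 4-byte boundary). It is an illustration only and is not taken from the patch. NDA_FLAGS_EXT = 15 follows the upstream include/uapi/linux/neighbour.h enum. The bit position used for NTF_EXT_MH_PEER_SYNC is an assumption, because the value assigned by the patch is not stated in this description. The helper names nla_pack, nla_unpack and has_peer_sync are hypothetical.

```python
import struct

NDA_FLAGS_EXT = 15             # attribute type in upstream include/uapi/linux/neighbour.h
NTF_EXT_MH_PEER_SYNC = 1 << 3  # ASSUMPTION: placeholder bit, check the patched header

NLA_HDRLEN = 4                 # struct nlattr: u16 nla_len + u16 nla_type


def nla_pack(nla_type, payload):
    """Encode one netlink attribute, padding the payload to a 4-byte boundary."""
    nla_len = NLA_HDRLEN + len(payload)   # nla_len excludes the trailing padding
    padding = b"\x00" * (-nla_len % 4)
    return struct.pack("=HH", nla_len, nla_type) + payload + padding


def nla_unpack(buf):
    """Decode a run of netlink attributes into {type: payload}."""
    attrs = {}
    offset = 0
    while offset + NLA_HDRLEN <= len(buf):
        nla_len, nla_type = struct.unpack_from("=HH", buf, offset)
        if nla_len < NLA_HDRLEN or offset + nla_len > len(buf):
            raise ValueError("truncated netlink attribute")
        attrs[nla_type] = buf[offset + NLA_HDRLEN:offset + nla_len]
        offset += (nla_len + 3) & ~3      # advance to the next aligned attribute
    return attrs


def has_peer_sync(attrs):
    """True if NDA_FLAGS_EXT is present and carries the (assumed) peer-sync bit."""
    raw = attrs.get(NDA_FLAGS_EXT)
    if raw is None:
        return False                      # senders without the patch omit the attribute
    (ext_flags,) = struct.unpack("=I", raw)
    return bool(ext_flags & NTF_EXT_MH_PEER_SYNC)
```

Treating a missing attribute as "flag not set" mirrors the backward-compatible behavior claimed for the patch: messages from senders that do not know about NDA_FLAGS_EXT still parse.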

Files modified:

  • drivers/net/vxlan/vxlan_core.c (140 lines)
  • drivers/net/vxlan/vxlan_private.h (21 lines)
  • drivers/net/vxlan/vxlan_vnifilter.c (11 lines)
  • include/net/neighbour.h (4 lines)
  • include/uapi/linux/neighbour.h (1 line)
  • net/bridge/br.c (4 lines)
  • net/bridge/br_fdb.c (35 lines)
  • net/bridge/br_private.h (5 lines)
  • net/core/neighbour.c (13 lines)

2. Protocol Field in Bridge FDB (0001-net-bridge-vxlan-Protocol-field-in-bridge-fdb.patch)

This patch introduces an optional "protocol" field for bridge FDB entries to distinguish between control plane and data plane learned MAC addresses.

Purpose: In EVPN Multihoming, MAC addresses can be learned via:

  • Control plane (ZEBRA protocol): Static MACs distributed by FRR/BGP
  • Data plane (HW protocol): Dynamic MACs learned from traffic with aging enabled

This distinction enables:

  • Proper state machine management during MAC transitions
  • Handling traffic hashing between EVPN-MH peers
  • Managing MAC mobility across EVPN peers
  • Synchronization between control and data planes

Implementation:

  • New field: protocol in net_bridge_fdb_entry and vxlan_fdb structures
  • Protocol values: Uses standard routing protocol values (RTPROT_UNSPEC, RTPROT_ZEBRA, RTPROT_KERNEL, etc.)
  • Default: RTPROT_UNSPEC when protocol not specified (backward compatible)
  • NDA_PROTOCOL attribute: Encoded in netlink messages for FDB entries
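
The defaulting rule above can be sketched as a small Python model, assuming the attributes of an FDB netlink message have already been decoded into a dictionary keyed by attribute type. The RTPROT_* values and NDA_PROTOCOL = 12 come from the upstream uapi headers. FdbEntry, fdb_from_attrs and is_control_plane are hypothetical names used only for illustration. No numeric value is given for the "hw" protocol because this description does not state one.

```python
from dataclasses import dataclass

# Values from upstream include/uapi/linux/rtnetlink.h
RTPROT_UNSPEC = 0
RTPROT_KERNEL = 2
RTPROT_ZEBRA = 11

NDA_PROTOCOL = 12  # upstream attribute type, carries a single u8


@dataclass
class FdbEntry:
    mac: str
    vlan: int
    protocol: int = RTPROT_UNSPEC  # default keeps old senders working


def fdb_from_attrs(mac, vlan, attrs):
    """Build an entry, falling back to RTPROT_UNSPEC when NDA_PROTOCOL is absent."""
    raw = attrs.get(NDA_PROTOCOL)
    protocol = raw[0] if raw else RTPROT_UNSPEC
    return FdbEntry(mac, vlan, protocol)


def is_control_plane(entry):
    """Per the description, MACs installed by FRR carry the ZEBRA protocol."""
    return entry.protocol == RTPROT_ZEBRA
```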

Usage Example:

    # Add MAC with hardware protocol (data plane learned)
    bridge fdb add 00:00:00:00:00:88 dev hostbond2 vlan 1000 master dynamic extern_learn proto hw

    # Display with protocol field
    bridge -d fdb show dev hostbond2
    # Output: 00:00:00:00:00:88 vlan 1000 extern_learn master br1000 proto hw

    # Transition to zebra (control plane)
    bridge fdb replace 00:00:00:00:00:88 dev hostbond2 vlan 1000 master dynamic extern_learn proto zebra
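
For scripted checks, the output line shown above can be split into fields with a few lines of Python. This is a minimal sketch that assumes the output keeps the token layout from the example (MAC first, then keyword/value pairs mixed with bare flags). parse_fdb_line is a hypothetical helper, and the set of keywords it treats as taking a value is an assumption that may need extending for other devices.

```python
# Keywords assumed to be followed by a value in "bridge -d fdb show" output.
KEYED_TOKENS = {"dev", "vlan", "master", "proto", "dst", "vni"}


def parse_fdb_line(line):
    """Split one FDB output line into a dict of fields plus a list of bare flags."""
    tokens = line.split()
    if not tokens:
        raise ValueError("empty FDB line")
    entry = {"mac": tokens[0], "flags": []}
    i = 1
    while i < len(tokens):
        token = tokens[i]
        if token in KEYED_TOKENS and i + 1 < len(tokens):
            entry[token] = tokens[i + 1]   # keyword followed by its value
            i += 2
        else:
            entry["flags"].append(token)   # bare flag such as extern_learn
            i += 1
    return entry


sample = "00:00:00:00:00:88 vlan 1000 extern_learn master br1000 proto hw"
entry = parse_fdb_line(sample)
print(entry.get("proto", "unspec"))
```

An entry without a proto token simply has no "proto" key, which matches the optional nature of the field.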

Files modified:

  • drivers/net/vxlan/vxlan_core.c (55 lines)
  • drivers/net/vxlan/vxlan_private.h (5 lines)
  • drivers/net/vxlan/vxlan_vnifilter.c (4 lines)
  • net/bridge/br.c (2 lines)
  • net/bridge/br_fdb.c (55 lines)
  • net/bridge/br_private.h (5 lines)

3. NTF_EXT_VALIDATED Flag for External Validation (0001-neighbor-Add-NTF_EXT_VALIDATED-flag-for-externally-v.patch)

This patch adds a new "extern_valid" neighbor flag to indicate entries that were learned and validated externally and should not be invalidated by the kernel.

Background: In EVPN multi-homing:

  • Each host is multi-homed via Ethernet Segment (ES/LAG) to multiple VTEPs
  • Neighbor entries are distributed to ES peers using EVPN MAC/IP advertisement routes
  • When an ES link goes down, the corresponding EVPN routes are withdrawn, which causes intermittent routing failures

Solution (based on draft-rbickhart-evpn-ip-mac-proxy-adv-03):

  • ES peers install neighbor entries and inject proxy EVPN MAC/IP advertisements
  • When an ES link goes down, ES peers start aging timers instead of immediately withdrawing the routes
  • If an ES peer locally learns the entry (it becomes "reachable"), it restarts the timer and removes the proxy indication
  • This prevents intermittent routing failures during ES link transitions
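
The timer behavior in these bullets can be pictured with a toy Python model. It is a sketch of the description above, not of FRR or kernel code. The 30-second hold time, the ProxyNeighbor class and the three function names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

HOLD_TIME = 30.0  # ASSUMPTION: illustrative aging interval in seconds


@dataclass
class ProxyNeighbor:
    ip: str
    proxy: bool = True                  # advertised on behalf of an ES peer
    expires_at: Optional[float] = None  # None means no aging timer is running


def on_es_link_down(entry, now):
    """Start aging instead of withdrawing the advertisement immediately."""
    entry.expires_at = now + HOLD_TIME


def on_local_learn(entry, now):
    """A local learn restarts the timer and clears the proxy indication."""
    entry.proxy = False
    entry.expires_at = now + HOLD_TIME


def should_withdraw(entry, now):
    """Withdraw only once a running timer has expired."""
    return entry.expires_at is not None and now >= entry.expires_at
```

In this model an entry that is re-learned locally before the timer fires is never withdrawn during the transition, which is the failure the proxy mechanism is meant to avoid.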

Implementation:

  • New flag: NTF_EXT_VALIDATED (extern_valid) - Entry is externally validated
  • Behavior:
    • The kernel will NOT remove or invalidate the entry
    • The kernel can probe the entry and notify user space when it becomes "reachable"
    • If no confirmation is received, the kernel returns the entry to the "stale" state (NOT the "failed" state)
    • The control plane (FRR) manages the entry lifecycle
  • Initial state: "stale" when installed by control plane
  • State transitions: Kernel notifies control plane when entry becomes "reachable"
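
The state handling listed above can be summarized in a simplified Python model. It covers only the transitions named in this description and is not the kernel's neighbor state machine. The Neighbor class and its method names are hypothetical, and starting every entry in "stale" is a simplification that the description states only for entries installed by the control plane.

```python
from enum import Enum


class NudState(Enum):
    STALE = "stale"
    PROBE = "probe"
    REACHABLE = "reachable"
    FAILED = "failed"


class Neighbor:
    def __init__(self, extern_valid=False):
        self.extern_valid = extern_valid
        self.state = NudState.STALE  # installed as "stale" by the control plane
        self.notifications = []      # events that would be sent to user space

    def probe(self):
        """The kernel may still probe an externally validated entry."""
        self.state = NudState.PROBE

    def confirm(self):
        """A probe reply makes the entry reachable and notifies user space."""
        self.state = NudState.REACHABLE
        self.notifications.append("reachable")

    def probe_timeout(self):
        """Without confirmation, extern_valid entries go back to stale, not failed."""
        self.state = NudState.STALE if self.extern_valid else NudState.FAILED

    def gc_eligible(self):
        """Only the control plane may remove an extern_valid entry."""
        return not self.extern_valid
```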

Use case: Required for EVPN-MH proxy advertisements where control plane needs full control over neighbor entry validity and removal decisions.

Files modified:

  • Neighbor subsystem for external validation support
  • Netlink attributes for extern_valid flag
  • State machine modifications

How to verify it

  1. Build kernel with these patches applied:

    cd sonic-linux-kernel
    make BLDENV=bookworm
  2. Verify NDA_FLAGS_EXT support:

    # Add FDB entry with extended flags
    bridge fdb add <mac> dev <vxlan-dev> dst <vtep-ip> vni <vni> extern_learn
    
    # Verify in kernel via netlink dump
    bridge -d fdb show | grep <mac>
  3. Verify protocol field support:

    # Add MAC with specific protocol
    bridge fdb add <mac> dev <device> vlan <vid> master dynamic extern_learn proto hw
    
    # Verify protocol shows up
    bridge -d fdb show dev <device> | grep <mac>
    # Expected output includes: proto hw
    
    # Transition protocol
    bridge fdb replace <mac> dev <device> vlan <vid> master dynamic extern_learn proto zebra
    
    # Verify protocol changed
    bridge -d fdb show dev <device> | grep <mac>
    # Expected output includes: proto zebra
  4. Verify extern_valid flag:

    # Add neighbor with extern_valid flag (via FRR/control plane)
    # Entry should remain in "stale" state and not be removed by kernel GC
    
    # Monitor neighbor state transitions
    ip -d neigh show
  5. Integration testing with EVPN-MH:

    • Configure EVPN multi-homing with ES peers
    • Verify MAC/neighbor synchronization across peers
    • Test ES link failure scenarios
    • Verify proxy advertisements and aging behavior
    • Confirm no intermittent routing/ARP failures during transitions
  6. Compatibility testing:

    • Verify existing bridge/VXLAN functionality still works
    • Test backward compatibility (entries without new fields/flags)
    • Confirm no regressions in non-EVPN scenarios

Which release branch to backport (provide reason below if selected)

  • 202305
  • 202311
  • 202405
  • 202411
  • 202505
  • 202511

Tested branch (Please provide the tested image version)

Description for the changelog

Add kernel patches for EVPN VXLAN Multihoming: extended FDB flags (NTF_EXT_MH_PEER_SYNC), protocol field for bridge FDB entries, and extern_valid flag for externally validated neighbor entries

Link to config_db schema for YANG model changes

N/A - This PR only adds kernel patches, no CONFIG_DB schema changes

Depends on

Related upstream work

  • EVPN MAC/IP proxy advertisement draft: draft-rbickhart-evpn-ip-mac-proxy-adv-03
  • Kernel patch for protocol field: Authored by Mrinmoy Ghosh <mrghosh@cisco.com>

Summary:

  • Total patches: 3
  • Total lines added: +1,558
  • Kernel subsystems modified: VXLAN driver, bridge FDB, neighbor subsystem, netlink attributes
  • Backward compatible: Yes - all new fields/flags are optional with sensible defaults

Critical for EVPN-MH:
✅ Peer synchronization flag (NTF_EXT_MH_PEER_SYNC)
✅ Control/data plane MAC distinction (protocol field)
✅ External neighbor validation (extern_valid flag)
✅ Proxy advertisement support
✅ Prevents intermittent EVPN-MH failures

Signed-off-by: Barry Friedman (friedman) <friedman@cisco.com>
@bdfriedman bdfriedman requested a review from a team as a code owner February 25, 2026 22:19
@mssonicbld

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).
