Skip to content

Force bonds/vlans re-parsing when lower layer changes#5902

Open
milan-zededa wants to merge 1 commit intolf-edge:masterfrom
milan-zededa:fix-l2-parsing
Open

Force bonds/vlans re-parsing when lower layer changes#5902
milan-zededa wants to merge 1 commit intolf-edge:masterfrom
milan-zededa:fix-l2-parsing

Conversation

@milan-zededa
Copy link
Copy Markdown
Contributor

Description

parseBonds and parseVlans each use a content hash to skip parsing when their own config is unchanged. However, both store pointers to lower-layer objects (physio pointers in bond.lowerPhysPorts, bond pointers in vlan.lowerL2Ports) obtained at parse time. If a lower layer is parsed without the dependent layer being parsed, those pointers become stale (pointing at parsed old config).

Fix by propagating force-parse down the dependency chain:

  • parseBonds is forced to parse when physio changes
  • parseVlans is forced to parse when physio or bonds change

How to test and validate this PR

Set up a device with four network adapters (e.g. eth0eth3) and configure two bonds with VLAN sub-interfaces on top:

Initial config

  • bond1: aggregates eth0, eth1 — management uplink (DHCP)
  • bond2: aggregates eth2, eth3
  • vlan20, vlan21: parent port bond2

Apply this config and confirm the device connects and reports no port errors.

Config change — swap eth1 and eth2 between the bonds

  • bond1: aggregates eth0, eth2
  • bond2: aggregates eth1, eth3
  • VLANs unchanged

Expected (correct) result
/run/zedagent/DevicePortConfig/zedagent.json should show:

  • bond1.Bond.AggregatedPorts = ["eth0", "eth2"]
  • bond2.Bond.AggregatedPorts = ["eth1", "eth3"]
  • No LastError on any port

Regression (what the bug produced)
Without the fix, parsed VLAN config would remain stale from a prior parse and zedagent.json could show eth2 under both bond1 and bond2, resulting in:

LastError: "Port eth2 is aggregated by multiple bond interfaces (bond1, bond2, ...)"

Changelog notes

Fixed a bug where re-configuring bond membership (e.g. swapping physical interfaces between two bonds) could leave the device with a broken network configuration and a port aggregated by multiple bonds error, requiring a reboot to recover.

Checklist

  • I've provided a proper description
  • I've added the proper documentation
  • I've tested my PR on amd64 device
  • I've tested my PR on arm64 device
  • I've written the test verification instructions
  • I've set the proper labels to this PR
  • I've checked the boxes above, or I've provided a good reason why I didn't check them.

@milan-zededa milan-zededa added bug Something isn't working stable Should be backported to stable release(s) labels May 5, 2026
@github-actions github-actions Bot requested a review from OhmSpectator May 5, 2026 15:20
parseBonds and parseVlans each use a content hash to skip parsing
when their own config is unchanged. However, both store pointers to
lower-layer objects (physio pointers in bond.lowerPhysPorts, bond
pointers in vlan.lowerL2Ports) obtained at parse time. If a lower layer
is parsed without the dependent layer being parsed, those pointers
become stale (pointing at parsed old config).

Fix by propagating force-parse down the dependency chain:
- parseBonds is forced to parse when physio changes
- parseVlans is forced to parse when physio or bonds change

Signed-off-by: Milan Lenco <milan@zededa.com>
Copy link
Copy Markdown
Contributor

@eriknordmark eriknordmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Run tests. Some yetus issues to fix.

@milan-zededa
Copy link
Copy Markdown
Contributor Author

Run tests. Some yetus issues to fix.

Those yetus issues shown are for unchanged lines. In fact, the Yetus check is green, so this is some github/yetus/workflow fail.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 5, 2026

Codecov Report

❌ Patch coverage is 66.66667% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 17.11%. Comparing base (2281599) to head (15aceeb).
⚠️ Report is 644 commits behind head on master.

Files with missing lines Patch % Lines
pkg/pillar/cmd/zedagent/parseconfig.go 66.66% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5902      +/-   ##
==========================================
- Coverage   19.52%   17.11%   -2.42%     
==========================================
  Files          19      474     +455     
  Lines        3021    85692   +82671     
==========================================
+ Hits          590    14662   +14072     
- Misses       2310    69514   +67204     
- Partials      121     1516    +1395     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working stable Should be backported to stable release(s)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants