fix(network): join newly added node to the running mesh #181
Open
warku123 wants to merge 2 commits into
network create wires every node's compose to the shared trond-<name> network so siblings can resolve each other by container name. network add skipped this step — the new node landed only on its per-compose bridge and stayed at 0 peers, and any service on trond-<name> (e.g. the monitoring stack) could not reach it either. Append trond-<name> to node.Networks before render, mirroring the loop already used by network create.
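The attach step described in this commit message can be sketched as follows. This is a minimal illustration, not the PR's code: the `Node` type, the `attachSharedNetwork` helper, and the `devnet` network name are hypothetical stand-ins. Only the `trond-<name>` naming and the idea of appending it to `node.Networks` before render come from the text above.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for the per-node compose model;
// only the Networks field matters for this sketch.
type Node struct {
	Name     string
	Networks []string
}

// sharedNetworkName builds the shared docker network name, trond-<name>.
func sharedNetworkName(network string) string {
	return "trond-" + network
}

// attachSharedNetwork appends trond-<name> to the node's networks
// unless it is already listed, so calling it twice is harmless.
func attachSharedNetwork(n *Node, network string) {
	shared := sharedNetworkName(network)
	for _, existing := range n.Networks {
		if existing == shared {
			return
		}
	}
	n.Networks = append(n.Networks, shared)
}

func main() {
	n := &Node{Name: "fullnode-3"}
	attachSharedNetwork(n, "devnet")
	attachSharedNetwork(n, "devnet") // second call is a no-op
	fmt.Println(n.Networks)          // [trond-devnet]
}
```

The duplicate check is a design choice of this sketch rather than something the commit message states; it keeps the rendered compose file stable if the same node is processed more than once.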
Now that the new node attaches to the shared trond-<name> docker network (previous commit), it can resolve sibling container names, but it still has no active_peers entry, so it doesn't know which peers to dial. java-tron with discovery off (the private network default) only connects to nodes listed in node.active, so without this the new node sits silent on the network despite being reachable.
Populate the new node's active_peers from existing entries in state, mirroring autoWireActivePeers in network create. Because P2P connections are bidirectional once established, we only update the new node: existing nodes accept the incoming connection and broadcast back without needing a config rewrite or restart.
state.ManagedNode gains a P2PPort field so add can read each sibling's listen port from state instead of re-parsing intent files (which add no longer has access to for nodes deployed earlier). network create now persists it too. Legacy state entries that predate the field get P2PPort=0 and are omitted from the active list; the field comment documents the same fallback the existing HTTPPort / GRPCPort fields use.
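A minimal sketch of how the new node's peer list could be derived from state, under stated assumptions: the `ManagedNode` struct below keeps only two fields, `ContainerName` is a hypothetical field name, `activePeersFor` is a hypothetical helper, and the port values are illustrative. The behaviour it shows (dial siblings by container name, skip entries whose `P2PPort` is 0) follows the commit message above.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// ManagedNode mirrors only what this sketch needs.
// ContainerName is a hypothetical field; P2PPort is the field the PR adds.
type ManagedNode struct {
	ContainerName string
	P2PPort       int
}

// activePeersFor returns host:port entries for every existing node the
// new node should dial. Legacy entries (P2PPort == 0) and the new node
// itself are skipped.
func activePeersFor(newNode string, existing []ManagedNode) []string {
	peers := make([]string, 0, len(existing))
	for _, n := range existing {
		if n.ContainerName == newNode || n.P2PPort == 0 {
			continue
		}
		peers = append(peers, net.JoinHostPort(n.ContainerName, strconv.Itoa(n.P2PPort)))
	}
	return peers
}

func main() {
	state := []ManagedNode{
		{ContainerName: "witness-1", P2PPort: 18888},
		{ContainerName: "fullnode-1"}, // legacy entry, no port recorded
		{ContainerName: "fullnode-2", P2PPort: 18889},
	}
	fmt.Println(activePeersFor("fullnode-3", state))
	// [witness-1:18888 fullnode-2:18889]
}
```

Only the new node's list is computed here, matching the commit's reasoning that existing nodes accept the inbound connection without a config rewrite.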
What does this PR do?
Fixes trond network add so the new node actually joins the running private network instead of sitting orphaned. Two related fixes:
1. Attach the new node to the shared trond-<name> docker network (it was only on its per-compose bridge).
2. Populate node.active from existing nodes in state so the new node knows whom to dial.
Adds a state.ManagedNode.P2PPort field so step 2 can read the listen port without the create-time intent.
Why are these changes required?
network create already does both steps via the sharedNet loop and autoWireActivePeers in cmd/network/create.go. network add was never updated to mirror them, so an added node has no DNS reachability to siblings and no peer list to dial. It stays at 0 peers and never syncs blocks, even though state and naming look correct.
This PR has been tested by:
- go build ./... clean.
- Created a network with trond network create, then added a third fullnode via trond network add. Verified the new container attaches to trond-<name>, the rendered HOCON contains the populated node.active list, and the new node received blocks 1–38 over P2P from the witness within ~30s (RestartCount=0).
- P2PPort uses omitempty; legacy state entries read as 0 and are skipped from the auto-wire (same convention as HTTPPort / GRPCPort).
Follow up
- autoWireActivePeers runs only at create time; this PR fills the gap unilaterally for the new node (sufficient because P2P channels are bidirectional once established), but the design may warrant revisiting for denser networks.
- trond diagnose reports peer_count=0 for clearly-peered nodes because its probe queries /wallet/listnodes, which only surfaces discovery-found peers. Tracking separately.
Extra details
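As an illustration of the omitempty fallback mentioned in the testing notes, the sketch below round-trips a legacy entry that has no port recorded. It assumes state is serialized as JSON and uses hypothetical tag names (`name`, `p2p_port`); the PR text does not say which format or keys the real state file uses.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ManagedNode is a trimmed, hypothetical version of the state entry.
// The json tag names are assumptions made for this sketch.
type ManagedNode struct {
	Name    string `json:"name"`
	P2PPort int    `json:"p2p_port,omitempty"`
}

// decodeNode parses one state entry; a missing p2p_port leaves the
// field at its zero value.
func decodeNode(raw string) (ManagedNode, error) {
	var n ManagedNode
	err := json.Unmarshal([]byte(raw), &n)
	return n, err
}

// encodeNode serializes one state entry; omitempty drops a zero port.
func encodeNode(n ManagedNode) (string, error) {
	out, err := json.Marshal(n)
	return string(out), err
}

func main() {
	legacy, err := decodeNode(`{"name":"fullnode-1"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(legacy.P2PPort) // 0, so the auto-wire would skip this node

	out, err := encodeNode(legacy)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"name":"fullnode-1"}
}
```

With omitempty, rewriting a legacy entry does not introduce a `p2p_port` of 0 into the file, so old and new entries stay distinguishable on disk.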