Tunnel routing not live until next connector heartbeat fires after Programmed=True #167

@drewr

Description

This issue is part of a four-issue tunnel-creation story.

A newly created tunnel takes up to ~14 minutes before it reliably routes traffic. There are two distinct delays, each with an operator-side and a client-side component:

  • Delay 1 (~3-4 min): creation → toggle turns green
  • Delay 2 (~0-10 min): toggle green → traffic flows
  • UX consequence
    • app#160 — green toggle shown before tunnel is usable

Summary

After the NSO (network-services-operator) sets Programmed=True on an HTTPProxy, traffic routing through the iroh-gateway is not confirmed live until the connector (Datum Desktop) next posts a status heartbeat. The operator reconciles all HTTPProxies backed by a connector in response to connector status updates, not proactively when it marks a proxy programmed.

Observed

For tunnel-xnhnb (project drewr-y4nd1b), 2026-05-22:

| Event | Timestamp | Delta from Programmed |
|---|---|---|
| Programmed=True set on HTTPProxy | 19:37:37Z | |
| ConnectorMetadataProgrammed=True set | 19:37:37Z | |
| Datum Desktop patches connector datum-connect-jttwh status | 19:48:28Z | +10m51s |
| NSO sweeps all HTTPProxies for that connector | 19:48:28Z | +10m51s |

The UI toggle turns green at 19:37:37Z. The tunnel does not reliably route traffic until ~19:48:28Z.
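The gap between the green toggle and the first connector-driven sweep can be re-derived from the observed timestamps. This is only a sanity-check sketch; the helper name `parse_utc` is an illustrative assumption, not part of any Datum tooling.

```python
from datetime import datetime, timezone


def parse_utc(hms: str, day: str = "2026-05-22") -> datetime:
    """Parse an 'HH:MM:SSZ' timestamp on the observation date (hypothetical helper)."""
    parsed = datetime.strptime(f"{day}T{hms}", "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


# Timestamps copied from the observation for tunnel-xnhnb.
programmed_at = parse_utc("19:37:37Z")       # Programmed=True, toggle turns green
first_heartbeat_at = parse_utc("19:48:28Z")  # Datum Desktop patches connector status

gap = first_heartbeat_at - programmed_at
minutes, seconds = divmod(int(gap.total_seconds()), 60)
print(f"+{minutes}m{seconds}s")  # +10m51s
```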

Root Cause

The iroh-gateway resolves connector.local using the connector's Iroh node key (public key + relay addresses). The NSO pushes current connector addressing to downstream routes in response to connector status updates, not proactively on HTTPProxy reconcile completion. After Programmed=True is set, there is no operator activity until the Datum Desktop client posts a heartbeat.

The heartbeat interval is governed by lease_duration_seconds / 2 in the app (see app#159). For the observed connector, lease duration ≈ 1200s → ~10 min heartbeat interval.
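As a rough illustration of why Delay 2 spans roughly 0-10 minutes, the sketch below applies the `lease_duration_seconds / 2` rule to the observed lease. The function names are hypothetical, and the mean-wait figure assumes Programmed=True lands at a uniformly random point within the heartbeat cycle, which the issue does not establish.

```python
def heartbeat_interval_seconds(lease_duration_seconds: float) -> float:
    """Heartbeat interval as described in this issue: half the lease duration."""
    return lease_duration_seconds / 2


def wait_bounds_seconds(lease_duration_seconds: float) -> tuple[float, float, float]:
    """Return (best, mean, worst) wait until the next heartbeat.

    Assumption: Programmed=True is set at a uniformly random offset
    within the heartbeat cycle, so the mean wait is half the interval.
    """
    interval = heartbeat_interval_seconds(lease_duration_seconds)
    return 0.0, interval / 2, interval


lease = 1200  # approximate lease duration for the observed connector
best, mean, worst = wait_bounds_seconds(lease)
print(f"interval={heartbeat_interval_seconds(lease) / 60:.0f} min, "
      f"wait: best={best:.0f}s mean={mean:.0f}s worst={worst:.0f}s")
```

Under these assumptions, shortening the lease (app#159) only shrinks the window; it does not remove the dependency on the heartbeat.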

Expected

When the NSO marks an HTTPProxy Programmed=True, it should immediately push the connector's current addressing to the downstream route — not wait for the next connector-driven reconcile. The tunnel should be routing within seconds of the toggle turning green.
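The difference between the current and the expected behaviour can be shown with a toy in-memory model. This is not the NSO's code: `ToyOperator`, its fields, and the `push_on_programmed` switch are all hypothetical names invented for illustration, and the addressing is reduced to an opaque string.

```python
from dataclasses import dataclass, field


@dataclass
class ToyOperator:
    """Toy stand-in for the operator; every name here is hypothetical."""
    push_on_programmed: bool
    connector_addr: dict = field(default_factory=dict)  # connector -> latest addressing
    proxies: dict = field(default_factory=dict)         # proxy -> backing connector
    routes: dict = field(default_factory=dict)          # proxy -> addressing pushed downstream

    def on_connector_status(self, connector: str, addressing: str) -> None:
        # Connector heartbeat: record addressing and sweep every proxy it backs.
        self.connector_addr[connector] = addressing
        for proxy, owner in self.proxies.items():
            if owner == connector:
                self.routes[proxy] = addressing

    def mark_programmed(self, proxy: str, connector: str) -> None:
        # Proxy reconcile completes (Programmed=True).
        self.proxies[proxy] = connector
        if self.push_on_programmed and connector in self.connector_addr:
            # Expected behaviour: push known addressing immediately.
            self.routes[proxy] = self.connector_addr[connector]


def route_ready_after_programmed(push_on_programmed: bool) -> bool:
    op = ToyOperator(push_on_programmed=push_on_programmed)
    op.on_connector_status("datum-connect-jttwh", "addr-v1")  # earlier heartbeat
    op.mark_programmed("tunnel-xnhnb", "datum-connect-jttwh")
    return "tunnel-xnhnb" in op.routes


print(route_ready_after_programmed(False))  # current: waits for the next heartbeat
print(route_ready_after_programmed(True))   # expected: route is pushed right away
```

In the model, the heartbeat-driven sweep stays in place either way, so the proactive push is an addition rather than a replacement.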

Related

  • Companion NSO issue: network-services-operator#166 — 409 conflict burst causes Delay 1 (creation → toggle green)
  • Companion app issue: app#159 — heartbeat interval reduction
  • Companion app issue: app#160 — UX: green toggle ≠ usable tunnel

Labels: bug