Peer connection hangs indefinitely: missing timeouts on handshake I/O over degraded I2P tunnels #11

@rafabd1

Description

When I2P tunnels are degraded (frequent during bootstrapping or network instability), peer connection establishment hangs indefinitely because the handshake I/O operations (read_framed / write_framed) have no timeouts.

The I2P streaming layer accepts the inbound connection, but the application-level handshake never completes because data can't flow through broken tunnels — and nothing ever times out to report the failure.

Evidence from logs

  1. The embedded I2P router starts and builds tunnels, but tunnel health checks fail continuously:
tunnel test failed name=exploratory outbound=958854281 inbound=2225195137 error=Timeout

This repeats every ~20 seconds for the entire session, meaning the tunnels are essentially non-functional.

  2. A peer connects at the I2P streaming level (06:00:34):
inbound stream accepted local=f8SDGXUx remote=0-RohxV8 recv_stream_id=1233202003 send_stream_id=0 payload_len=0
  3. The peer retries the SYN 10 seconds later (06:00:44) because it never got a response (the handshake ACK from handle_incoming couldn't make it through the degraded tunnels):
received `SYN` to an active session local=f8SDGXUx remote=0-RohxV8 recv_id=1233202003 send_id=932169142
  4. This repeats again at 06:00:54, confirming the handshake is stuck.

  5. Eventually all NTCP2 sessions are forcibly reset (os error 10054 / WSAECONNRESET) at 06:01:48, killing all connectivity.

Root cause

In session.rs, both sides of the handshake lack timeouts:

Responder (handle_incoming, line ~274):

let frame = read_framed(&mut reader).await?;  // No timeout — blocks forever
// ...
write_framed(&mut writer, &ack).await?;  // No timeout

Initiator (initiate_session, line ~400-407):

write_framed(&mut writer, &init_msg).await?;  // No timeout
let ack_frame = read_framed(&mut reader).await?;  // No timeout — blocks forever

When I2P tunnels are degraded, these operations hang indefinitely. This also blocks the accept_loop since handle_incoming runs inline — no new connections can be accepted while one is stuck.

Additional problems visible in the logs

  • No I2P tunnel health feedback to user: The router status is set to "ready" once the SAM session is created, but the underlying tunnels may still be unusable. The user sees "connected" but connections fail silently.
  • No SAM session recovery: When tunnels degrade to the point of being non-functional (all tunnel tests failing), there's no mechanism to tear down and recreate the SAM session.
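
As a rough illustration of what tunnel health feedback could look like, the sketch below derives a status from consecutive tunnel test results. Every name in it (`TunnelHealth`, `TunnelHealthTracker`, `degraded_after`) is hypothetical; the real `router_status` type and the way tunnel test results reach the application are not shown in this issue.

```rust
/// Hypothetical status values; the project's real status type may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TunnelHealth {
    Ready,
    Degraded,
}

/// Counts consecutive tunnel test failures and reports `Degraded`
/// once the count reaches a configurable threshold.
#[derive(Debug)]
struct TunnelHealthTracker {
    consecutive_failures: u32,
    degraded_after: u32,
}

impl TunnelHealthTracker {
    fn new(degraded_after: u32) -> Self {
        Self { consecutive_failures: 0, degraded_after }
    }

    /// Records one tunnel test result and returns the resulting status.
    fn record(&mut self, test_passed: bool) -> TunnelHealth {
        if test_passed {
            // Any successful test resets the streak.
            self.consecutive_failures = 0;
        } else {
            self.consecutive_failures = self.consecutive_failures.saturating_add(1);
        }
        self.status()
    }

    fn status(&self) -> TunnelHealth {
        if self.consecutive_failures >= self.degraded_after {
            TunnelHealth::Degraded
        } else {
            TunnelHealth::Ready
        }
    }
}
```

With tunnel tests failing roughly every 20 seconds, as in the logs above, a threshold of 3 would flag degraded connectivity after about a minute. The same counter could also serve as the trigger for tearing down and recreating the SAM session.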

Proposed solution

  1. Add timeouts to all handshake I/O — wrap read_framed and write_framed calls in tokio::time::timeout (e.g., 30-60 seconds given I2P latency).
  2. Run handle_incoming with a timeout, ideally in its own task, so the accept loop isn't blocked by a stuck handshake.
  3. Emit a user-visible error when the handshake times out, instead of silently dropping the connection.
  4. Consider monitoring tunnel health — if tunnel tests fail continuously for an extended period, update router_status to reflect degraded connectivity so the user knows.
