fix: use direct bastion tunnel for non-SSH app ports #968

Open
rysweet wants to merge 1 commit into main from fix/tunnel-direct-bastion-for-app-ports

@rysweet rysweet commented Apr 10, 2026

Problem

`azlin tunnel open <vm> 8080` failed to create a working tunnel for private (bastion-routed) VMs. The `ssh -L` forward process would die as a zombie while the bastion tunnel stayed alive, leaving no local listener on port 8080.

Root Cause

The old implementation always routed through SSH port 22:

  1. `az bastion tunnel --resource-port 22 --port 50200` (bastion → VM:22)
  2. `ssh -N -L 8080:localhost:8080 -p 50200 user@127.0.0.1` (SSH forward)

This two-hop approach is unnecessary and fragile for app ports. The SSH process frequently died, and `tunnel close` killed only the parent PID, leaving orphaned python3 listeners.

Fix

Non-SSH ports (e.g., 8080, 3000, 5432): Use a direct `az bastion tunnel --resource-port <app_port>`. This is a single hop with no SSH layer, which is simpler and more reliable.

Port 22: Retains the two-hop approach (here SSH is the actual transport) and adds the `ServerAliveInterval` and `ExitOnForwardFailure` SSH options for resilience.
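The port-based routing described above can be sketched as a small command planner. This is an illustration, not the code in this PR: `plan_tunnel`, `bastion_tunnel_cmd`, the relay port default, and the `ServerAliveInterval=30` value are assumptions, and the sketch spells out the full Azure CLI form `az network bastion tunnel` with its `--name`, `--resource-group`, and `--target-resource-id` arguments, which the description above abbreviates.

```python
from typing import List

SSH_PORT = 22


def bastion_tunnel_cmd(bastion: str, resource_group: str, vm_id: str,
                       resource_port: int, local_port: int) -> List[str]:
    """Build the argv for a bastion tunnel to one port on the VM."""
    return [
        "az", "network", "bastion", "tunnel",
        "--name", bastion,
        "--resource-group", resource_group,
        "--target-resource-id", vm_id,
        "--resource-port", str(resource_port),
        "--port", str(local_port),
    ]


def plan_tunnel(bastion: str, resource_group: str, vm_id: str, user: str,
                remote_port: int, local_port: int,
                relay_port: int = 50200) -> List[List[str]]:
    """Return the commands to run, in order, for the requested port."""
    if remote_port != SSH_PORT:
        # App port: a single hop, bastion straight to the VM's app port.
        return [bastion_tunnel_cmd(bastion, resource_group, vm_id,
                                   remote_port, local_port)]
    # Port 22: bastion to VM:22 on a relay port, then an SSH forward on top.
    ssh_cmd = [
        "ssh", "-N",
        "-o", "ServerAliveInterval=30",      # assumed value
        "-o", "ExitOnForwardFailure=yes",
        "-L", f"{local_port}:localhost:{remote_port}",
        "-p", str(relay_port),
        f"{user}@127.0.0.1",
    ]
    return [
        bastion_tunnel_cmd(bastion, resource_group, vm_id, SSH_PORT, relay_port),
        ssh_cmd,
    ]
```

Returning argv lists rather than shell strings keeps the planner free of side effects, so the routing decision can be checked without an Azure subscription.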

Process cleanup: Spawn `az` as process group leader (setsid) so `tunnel close` kills the entire tree (az→bash→python3).
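A minimal sketch of the process-group technique on POSIX, assuming a Python implementation. `spawn_group_leader` is a hypothetical helper name, and the body of `kill_process_tree` here is an illustration of the idea rather than the function added in this PR. `start_new_session=True` makes the child call `setsid()`, so the child leads a new process group that its descendants inherit.

```python
import os
import signal
import subprocess
from typing import List


def spawn_group_leader(argv: List[str]) -> subprocess.Popen:
    """Start argv in a new session so it leads its own process group."""
    return subprocess.Popen(argv, start_new_session=True)


def kill_process_tree(pid: int, sig: int = signal.SIGTERM) -> None:
    """Signal every process in the group that pid belongs to."""
    try:
        os.killpg(os.getpgid(pid), sig)
    except ProcessLookupError:
        # The leader has already exited and been reaped; nothing to do.
        pass
```

Because the signal goes to the group rather than to a single PID, wrapper shells and helper processes started by the leader receive it too, provided they have not moved themselves into a different group.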

Startup detection: Replace fixed 3s sleep with `wait_for_listener()` polling.
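One way such polling could look, as a sketch only: the function name matches the one mentioned above, but the signature, the defaults, and the connect-based probe are assumptions rather than the PR's implementation.

```python
import socket
import time


def wait_for_listener(port: int, host: str = "127.0.0.1",
                      timeout: float = 15.0, interval: float = 0.2) -> bool:
    """Poll until something accepts TCP connections on host:port.

    Returns True once a connection succeeds, or False if the deadline
    passes without one.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or the probe timed out); retry shortly.
            time.sleep(interval)
    return False
```

Compared with a fixed sleep, polling returns as soon as the listener is up and reports a clear failure when it never appears. Note that a connect probe opens a real connection, which the tunnel process may briefly forward to the remote side.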

Testing

Verified on a live private VM (devy):

  • `azlin tunnel open devy 8080` → listener on localhost:8080, HTTP 303 from service ✓
  • `azlin tunnel list` → shows correct entry ✓
  • `azlin tunnel close devy` → port freed, no orphan processes ✓

The old implementation always routed through SSH port 22 and layered
an `ssh -N -L` forward on top, even for application ports like 8080.
This two-hop approach was fragile — the SSH process frequently died
as a zombie while the bastion tunnel stayed alive, leaving the user
with no local listener.

Changes:
- Non-SSH ports now use a direct `az bastion tunnel` with
  `--resource-port <app_port>` — single hop, no SSH layer needed
- Port 22 retains the two-hop approach (bastion→22 + SSH -L) with
  added ServerAliveInterval/ExitOnForwardFailure for resilience
- Replace fixed 3s sleep with `wait_for_listener()` polling for
  reliable startup detection
- Spawn `az` commands as process group leaders (setsid) so
  `tunnel close` kills the entire process tree (az→bash→python3)
  instead of leaving orphan python3 listeners
- Add `kill_process_tree()` that sends SIGTERM to the process group

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>