Skip to content

Standalone mode for the exporter#320

Open
bennyz wants to merge 5 commits intomainfrom
exporter-grpc-listener
Open

Standalone mode for the exporter#320
bennyz wants to merge 5 commits intomainfrom
exporter-grpc-listener

Conversation

@bennyz
Copy link
Member

@bennyz bennyz commented Mar 14, 2026

Allow exporters to serve directly over TCP without a controller, and clients to connect directly without going through a controller

# exporter
jmp run --tls-grpc-listener [HOST:]PORT [--tls-grpc-insecure]

# client
jmp shell --tls-grpc HOST:PORT [--tls-grpc-insecure]

Address #44

@netlify
Copy link

netlify bot commented Mar 14, 2026

Deploy Preview for jumpstarter-docs ready!

Name Link
🔨 Latest commit 88e12d7
🔍 Latest deploy log https://app.netlify.com/projects/jumpstarter-docs/deploys/69ba84f72f667c00089e231e
😎 Deploy Preview https://deploy-preview-320--jumpstarter-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@bennyz bennyz added the build-pr-images/jumpstarter request to build only the jumpstarter image from PR label Mar 14, 2026
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 14, 2026

📝 Walkthrough

Walkthrough

Adds a controller-less direct TCP/gRPC exporter mode with optional TLS and passphrase auth, client TCP dialing, DirectLease, CLI flags for listener and direct shell, session TCP serve helpers, exporter standalone lifecycle, e2e manifests/tests, and documentation for direct mode.

Changes

Cohort / File(s) Summary
E2E Manifests & Tests
e2e/exporters/exporter-direct-listener.yaml, e2e/exporters/exporter-direct-hooks-before.yaml, e2e/exporters/exporter-direct-hooks-both.yaml, e2e/tests-direct-listener.bats
Add ExporterConfig manifests and a bats suite validating standalone listener startup, TCP/gRPC connectivity, TLS/insecure/passphrase behavior, hook execution, readiness strategies, and lifecycle cleanup.
CLI — run
python/packages/jumpstarter-cli/jumpstarter_cli/run.py
Add listener bind parsing, TLS server credential builder, new CLI flags (--tls-grpc-listener, --tls-grpc-insecure, --tls-cert, --tls-key, --passphrase), validation, and route to exporter serve_standalone_tcp when listener provided.
CLI — shell
python/packages/jumpstarter-cli/jumpstarter_cli/shell.py
Add direct TLS/insecure shell path (_shell_direct_async), expand shell CLI signature with TLS/connect flags and passphrase, allow missing config for direct usage, and short-circuit to direct flow using DirectLease.
Client dialing & tests
python/packages/jumpstarter/jumpstarter/client/client.py, python/packages/jumpstarter/jumpstarter/client/client_test.py
Detect host:port addresses, extend client_from_path with tls_config/grpc_options/insecure/passphrase, support TLS or insecure TCP dialing with interceptors, and add tests for _is_tcp_address.
Lease model
python/packages/jumpstarter/jumpstarter/client/__init__.py, python/packages/jumpstarter/jumpstarter/client/lease.py
Introduce DirectLease dataclass for no-controller direct connections (address, tls_config, grpc_options, insecure, passphrase), export it, and provide minimal serve/monitor context managers.
Exporter config & creation
python/packages/jumpstarter/jumpstarter/config/exporter.py
create_exporter accepts standalone: bool, uses a dummy channel factory in standalone mode, and defers device instantiation by passing a factory callable.
Exporter runtime & session serving
python/packages/jumpstarter/jumpstarter/exporter/exporter.py, python/packages/jumpstarter/jumpstarter/exporter/session.py, python/packages/jumpstarter/jumpstarter/exporter/session_test.py
Add _standalone flag and _standalone_shutdown_waiter, implement serve_standalone_tcp, and add Session.serve_tcp_async / serve_tcp_and_unix_async with optional TLS and interceptor support. Add tests for insecure and passphrase-protected TCP serving.
Env & utils
python/packages/jumpstarter/jumpstarter/config/env.py, python/packages/jumpstarter/jumpstarter/utils/env.py, python/packages/jumpstarter/jumpstarter/common/utils.py
Add JMP_GRPC_INSECURE and JMP_GRPC_PASSPHRASE, thread insecure/passphrase through env parsing and client_from_path, and propagate them into launch_shell environment.
Common gRPC helper
python/packages/jumpstarter/jumpstarter/common/grpc.py
Add interceptors keyword to aio_secure_channel and pass through to grpc.aio.secure_channel; widen insecure-credential check to include JMP_GRPC_INSECURE.
Auth
python/packages/jumpstarter/jumpstarter/exporter/auth.py
Add passphrase-based server interceptor (PassphraseInterceptor) and client interceptor helpers (passphrase_client_interceptors) to inject/validate passphrase metadata and provide rejection handlers.
Docs
python/docs/source/getting-started/guides/index.md, python/docs/source/getting-started/guides/setup-direct-mode.md, python/docs/source/getting-started/configuration/files.md
Add a guide and toctree entry documenting direct mode usage, CLI flags, example manifests, hooks, client connection examples, and document JMP_GRPC_INSECURE alias.
Public API exports & tests updates
python/packages/jumpstarter/jumpstarter/client/__init__.py, python/packages/jumpstarter-cli/jumpstarter_cli/shell_test.py
Expose DirectLease in client package exports and update shell tests to include new TLS/passphrase parameters.

Sequence Diagram(s)

sequenceDiagram
    rect rgba(200,200,255,0.5)
    participant User as User/CLI
    participant Run as run.py
    participant Exporter as Exporter
    participant Session as Session
    participant gRPC as gRPC Server
    participant Client as jmp Client
    end

    User->>Run: start with --tls-grpc-listener host:port [+cert/key]
    Run->>Run: _parse_listener_bind(), _tls_server_credentials()
    Run->>Exporter: create_exporter(standalone=True)
    Run->>Exporter: serve_standalone_tcp(host,port,tls_creds,interceptors)
    Exporter->>Session: serve_tcp_async / serve_tcp_and_unix_async
    Session->>gRPC: create aio.Server, bind TCP (TLS/insecure), start
    User->>Client: jmp shell --tls-grpc-address host:port [--passphrase]
    Client->>Client: _is_tcp_address() -> True
    Client->>gRPC: dial (secure or insecure) + interceptors/passphrase
    Client->>gRPC: call driver methods
    gRPC->>Client: responses
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • Connection reset by peer #297 — touches exporter Session implementation and server serving methods overlapping session/server changes.
  • jumpstarter-dev/jumpstarter#359 — modifies gRPC channel construction and interceptor passthrough similar to this PR.
  • jumpstarter-dev/jumpstarter#367 — modifies CLI run/shell command implementations and parameter handling overlapping CLI changes.

Suggested labels

enhancement, python, build-pr-images

Suggested reviewers

  • evakhoni
  • mangelajo
  • NickCao

Poem

🐰 I hopped to the socket with certs in my pack,

Direct leases now chatter — no controller on track.
Hooks nibble before and after, scripts tidy and spry,
Passphrases guard the burrow while TLS watches nigh.
Clients connect, sockets hum — a snug secure sky.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Standalone mode for the exporter' clearly summarizes the main change: enabling exporters to run in standalone mode without a controller.
Description check ✅ Passed The description is directly related to the changeset, providing concrete CLI examples for both exporter and client usage in standalone mode, and referencing the GitHub issue being addressed.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch exporter-grpc-listener
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link

Container Images

The following container images have been built for this PR:

Image URI
jumpstarter quay.io/jumpstarter-dev/jumpstarter:pr-320

Images expire after 7 days.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (6)
python/packages/jumpstarter/jumpstarter/exporter/exporter.py (1)

38-43: Redundant import of Event inside function.

Event is already imported at the module level (line 11). The import inside _standalone_shutdown_waiter is unnecessary.

♻️ Proposed fix
 async def _standalone_shutdown_waiter():
     """Wait forever; used so serve_standalone_tcp can be cancelled by stop()."""
-    from anyio import Event
-
     await Event().wait()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/exporter/exporter.py` around lines 38
- 43, The function _standalone_shutdown_waiter contains a redundant local import
of Event; remove the inner "from anyio import Event" and use the module-level
Event import instead so that _standalone_shutdown_waiter simply awaits
Event().wait() without re-importing.
python/packages/jumpstarter/jumpstarter/client/lease.py (1)

65-68: Method name serve_unix_async is misleading for TCP addresses.

This method yields a TCP address (host:port), not a Unix socket path. The name serve_unix_async is inherited from the Lease interface but is semantically misleading here. Consider adding a docstring clarification or renaming if the interface allows.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/client/lease.py` around lines 65 -
68, The method serve_unix_async currently yields a TCP address (self.address)
while its name implies a Unix socket; update the implementation to clarify
intent by either renaming serve_unix_async to a more accurate name (e.g.,
serve_tcp_async) if the Lease interface permits, or at minimum update the
docstring of serve_unix_async to explicitly state it yields a TCP host:port
address (not a Unix socket); reference the serve_unix_async method and the Lease
interface and change the name and all call sites or adjust the docstring on
serve_unix_async accordingly so callers and implementers are not misled.
e2e/tests-direct-listener.bats (1)

6-6: Consider using a dynamic port or per-test port to avoid conflicts.

Using a fixed port (19090) could cause test failures if:

  1. Multiple test runs execute in parallel
  2. The port is already in use by another service
  3. A previous test's cleanup didn't complete before the next test starts

Consider either using port 0 (let the OS assign a free port and capture it) or using $BATS_TEST_NUMBER to derive a unique port per test.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@e2e/tests-direct-listener.bats` at line 6, Replace the fixed
LISTENER_PORT=19090 with a dynamic per-test assignment: modify the test setup to
choose a free port (either by binding to port 0 and capturing the assigned port,
or by deriving a unique port using $BATS_TEST_NUMBER) and export that value as
LISTENER_PORT for the rest of the test; ensure the code that starts the listener
reads LISTENER_PORT and that any cleanup/fallback handles port conflicts (retry
or fail with a clear message) so parallel runs and leftover processes won't
collide.
python/packages/jumpstarter/jumpstarter/client/client.py (1)

23-34: IPv6 addresses may be misidentified as invalid TCP addresses.

The _is_tcp_address function uses rsplit(":", 1) which correctly handles IPv6 addresses like [::1]:8080, but parts[0] would be [::1] (with brackets). However, bare IPv6 without brackets (e.g., ::1:8080) would fail since ::1 would be interpreted as the "port".

This may be acceptable if IPv6 addresses are expected to always use bracket notation ([host]:port), but consider documenting this requirement.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/client/client.py` around lines 23 -
34, The _is_tcp_address function misparses IPv6 forms like ::1:8080; update
_is_tcp_address to explicitly handle bracketed IPv6 by checking for host wrapped
in [ ] (e.g., parts[0] starting with '[' and ending with ']') and treating the
inner text as the host, while leaving the existing numeric port validation
(int(parts[1], 10) and 1<=port<=65535) intact; also update the function
docstring to state that IPv6 addresses must be bracketed as [host]:port and that
bare IPv6 with embedded colons is not supported.
python/packages/jumpstarter-cli/jumpstarter_cli/shell.py (1)

308-341: Consider documenting the security implications of unsafe=True for direct connections.

The DirectLease is created with allow=[] and unsafe=True, which bypasses driver allowlisting. While this is reasonable for direct connections (the user explicitly chose to connect to a specific exporter), consider adding a comment explaining this security model choice.

📝 Suggested documentation
     async with anyio.from_thread.BlockingPortal() as portal:
         lease = DirectLease(
             address=tls_grpc_address,
             portal=portal,
-            allow=[],
-            unsafe=True,
+            allow=[],  # Direct connection trusts the exporter
+            unsafe=True,  # No driver allowlist filtering for direct mode
             tls_config=TLSConfigV1Alpha1(),
             grpc_options={},
             insecure=tls_grpc_insecure,
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py` around lines 308 -
341, The DirectLease in _shell_direct_async is created with allow=[] and
unsafe=True which bypasses driver allowlisting; add a concise comment next to
the DirectLease construction (referencing DirectLease, unsafe=True, allow=[])
that documents the security implications (that driver restrictions are disabled
for direct connections because the user explicitly targets an exporter), and
note any recommended mitigations (e.g., user-facing warning/confirmation or
restricted use only for trusted endpoints) so future readers understand the risk
and rationale.
python/packages/jumpstarter/jumpstarter/client/client_test.py (1)

27-32: Consider whether port 0 should be valid for OS-assigned ports.

Port 0 is conventionally used to request an OS-assigned ephemeral port. The test asserts _is_tcp_address("host:0") returns False, which may be intentional to prevent user confusion, but could also be a limitation if the code is later used in contexts where port 0 is meaningful.

Verify this is the intended behavior for your use case.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/client/client_test.py` around lines
27 - 32, Decide whether port 0 should be considered a valid TCP address and make
the change consistently: if port 0 should be allowed, update the _is_tcp_address
implementation (the parsing/port validation logic) to accept 0 as valid (range
0..65535) and change test_invalid_port_not_tcp_address to assert
_is_tcp_address("host:0") is True; if port 0 should remain invalid, add a
clarifying comment in test_invalid_port_not_tcp_address (and/or in the
_is_tcp_address docstring) explaining why port 0 is rejected to prevent future
confusion, leaving the existing assertions unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py`:
- Around line 16-27: The _parse_listener_bind function incorrectly uses
rsplit(":", 1) which breaks on IPv6 addresses; update it to explicitly handle
bracketed IPv6 (if value starts with '[' find the matching ']', take host=inside
brackets and parse the port after the ']' ), otherwise if the string contains
more than one ':' but is not bracketed raise click.BadParameter instructing
callers to use bracketed IPv6 addresses, and for the normal hostname/IPv4 case
continue using rsplit(":", 1) to extract host and port; preserve the existing
port range validation and the param_hint "'--tls-grpc-listener'".

In `@python/packages/jumpstarter/jumpstarter/exporter/exporter.py`:
- Around line 858-878: serve_standalone_tcp is missing a finally path to run the
after-lease cleanup when the standalone session is cancelled (e.g., via stop()),
so _cleanup_after_lease and any hook_executor.run_after_lease_hook never
execute; wrap the async with session.serve_tcp_and_unix_async(...) block in a
try/finally (or add a finally after the with) that clears self._lease_context
and calls self._cleanup_after_lease(lease_scope) and, if hook_executor exists,
invokes hook_executor.run_after_lease_hook(...) (mirroring
handle_lease/_handle_end_session behavior) so the afterLease logic always runs
on shutdown.

---

Nitpick comments:
In `@e2e/tests-direct-listener.bats`:
- Line 6: Replace the fixed LISTENER_PORT=19090 with a dynamic per-test
assignment: modify the test setup to choose a free port (either by binding to
port 0 and capturing the assigned port, or by deriving a unique port using
$BATS_TEST_NUMBER) and export that value as LISTENER_PORT for the rest of the
test; ensure the code that starts the listener reads LISTENER_PORT and that any
cleanup/fallback handles port conflicts (retry or fail with a clear message) so
parallel runs and leftover processes won't collide.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py`:
- Around line 308-341: The DirectLease in _shell_direct_async is created with
allow=[] and unsafe=True which bypasses driver allowlisting; add a concise
comment next to the DirectLease construction (referencing DirectLease,
unsafe=True, allow=[]) that documents the security implications (that driver
restrictions are disabled for direct connections because the user explicitly
targets an exporter), and note any recommended mitigations (e.g., user-facing
warning/confirmation or restricted use only for trusted endpoints) so future
readers understand the risk and rationale.

In `@python/packages/jumpstarter/jumpstarter/client/client_test.py`:
- Around line 27-32: Decide whether port 0 should be considered a valid TCP
address and make the change consistently: if port 0 should be allowed, update
the _is_tcp_address implementation (the parsing/port validation logic) to accept
0 as valid (range 0..65535) and change test_invalid_port_not_tcp_address to
assert _is_tcp_address("host:0") is True; if port 0 should remain invalid, add a
clarifying comment in test_invalid_port_not_tcp_address (and/or in the
_is_tcp_address docstring) explaining why port 0 is rejected to prevent future
confusion, leaving the existing assertions unchanged.

In `@python/packages/jumpstarter/jumpstarter/client/client.py`:
- Around line 23-34: The _is_tcp_address function misparses IPv6 forms like
::1:8080; update _is_tcp_address to explicitly handle bracketed IPv6 by checking
for host wrapped in [ ] (e.g., parts[0] starting with '[' and ending with ']')
and treating the inner text as the host, while leaving the existing numeric port
validation (int(parts[1], 10) and 1<=port<=65535) intact; also update the
function docstring to state that IPv6 addresses must be bracketed as [host]:port
and that bare IPv6 with embedded colons is not supported.

In `@python/packages/jumpstarter/jumpstarter/client/lease.py`:
- Around line 65-68: The method serve_unix_async currently yields a TCP address
(self.address) while its name implies a Unix socket; update the implementation
to clarify intent by either renaming serve_unix_async to a more accurate name
(e.g., serve_tcp_async) if the Lease interface permits, or at minimum update the
docstring of serve_unix_async to explicitly state it yields a TCP host:port
address (not a Unix socket); reference the serve_unix_async method and the Lease
interface and change the name and all call sites or adjust the docstring on
serve_unix_async accordingly so callers and implementers are not misled.

In `@python/packages/jumpstarter/jumpstarter/exporter/exporter.py`:
- Around line 38-43: The function _standalone_shutdown_waiter contains a
redundant local import of Event; remove the inner "from anyio import Event" and
use the module-level Event import instead so that _standalone_shutdown_waiter
simply awaits Event().wait() without re-importing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e8b1862c-9f05-4be1-bde9-dc0842ce52fc

📥 Commits

Reviewing files that changed from the base of the PR and between ecc0afa and 9651198.

📒 Files selected for processing (15)
  • e2e/exporters/exporter-direct-listener.yaml
  • e2e/tests-direct-listener.bats
  • python/packages/jumpstarter-cli/jumpstarter_cli/run.py
  • python/packages/jumpstarter-cli/jumpstarter_cli/shell.py
  • python/packages/jumpstarter/jumpstarter/client/__init__.py
  • python/packages/jumpstarter/jumpstarter/client/client.py
  • python/packages/jumpstarter/jumpstarter/client/client_test.py
  • python/packages/jumpstarter/jumpstarter/client/lease.py
  • python/packages/jumpstarter/jumpstarter/common/utils.py
  • python/packages/jumpstarter/jumpstarter/config/env.py
  • python/packages/jumpstarter/jumpstarter/config/exporter.py
  • python/packages/jumpstarter/jumpstarter/exporter/exporter.py
  • python/packages/jumpstarter/jumpstarter/exporter/session.py
  • python/packages/jumpstarter/jumpstarter/exporter/session_test.py
  • python/packages/jumpstarter/jumpstarter/utils/env.py

Comment on lines +16 to +27
def _parse_listener_bind(value: str) -> tuple[str, int]:
"""Parse '[host:]port' into (host, port). Default host is 0.0.0.0."""
if ":" in value:
host, port_str = value.rsplit(":", 1)
host = host.strip() or "0.0.0.0"
port = int(port_str, 10)
else:
host = "0.0.0.0"
port = int(value, 10)
if not (1 <= port <= 65535):
raise click.BadParameter(f"port must be between 1 and 65535, got {port}", param_hint="'--tls-grpc-listener'")
return host, port
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

IPv6 address parsing may fail with rsplit(":", 1).

IPv6 addresses contain colons (e.g., [::1]:1234 or ::1). Using rsplit(":", 1) would incorrectly split ::1:1234 as host ::1 and port 1234, but [::1]:1234 would split as host [::1] and port 1234 which is closer but not ideal.

Consider handling bracketed IPv6 addresses explicitly or documenting that IPv6 addresses must be bracketed.

🛠️ Proposed fix to handle IPv6 addresses
 def _parse_listener_bind(value: str) -> tuple[str, int]:
-    """Parse '[host:]port' into (host, port). Default host is 0.0.0.0."""
-    if ":" in value:
-        host, port_str = value.rsplit(":", 1)
-        host = host.strip() or "0.0.0.0"
-        port = int(port_str, 10)
+    """Parse '[host:]port' into (host, port). Default host is 0.0.0.0.
+    
+    For IPv6, use bracketed notation: [::1]:1234
+    """
+    if value.startswith("["):
+        # IPv6 bracketed notation: [host]:port
+        bracket_end = value.find("]")
+        if bracket_end == -1:
+            raise click.BadParameter("Invalid IPv6 address: missing closing bracket", param_hint="'--tls-grpc-listener'")
+        host = value[1:bracket_end]
+        remainder = value[bracket_end + 1:]
+        if remainder.startswith(":"):
+            port = int(remainder[1:], 10)
+        elif remainder == "":
+            raise click.BadParameter("Port is required", param_hint="'--tls-grpc-listener'")
+        else:
+            raise click.BadParameter("Invalid format after IPv6 address", param_hint="'--tls-grpc-listener'")
+    elif ":" in value:
+        host, port_str = value.rsplit(":", 1)
+        host = host.strip() or "0.0.0.0"
+        port = int(port_str, 10)
     else:
         host = "0.0.0.0"
         port = int(value, 10)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py` around lines 16 - 27,
The _parse_listener_bind function incorrectly uses rsplit(":", 1) which breaks
on IPv6 addresses; update it to explicitly handle bracketed IPv6 (if value
starts with '[' find the matching ']', take host=inside brackets and parse the
port after the ']' ), otherwise if the string contains more than one ':' but is
not bracketed raise click.BadParameter instructing callers to use bracketed IPv6
addresses, and for the normal hostname/IPv4 case continue using rsplit(":", 1)
to extract host and port; preserve the existing port range validation and the
param_hint "'--tls-grpc-listener'".

@bennyz bennyz removed the build-pr-images/jumpstarter request to build only the jumpstarter image from PR label Mar 14, 2026
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
python/packages/jumpstarter/jumpstarter/exporter/exporter.py (1)

842-845: Reset _standalone on method exit to avoid stale runtime mode.

self._standalone is set at Line 842 but never reset. If the same Exporter instance is reused, controller-status/report paths stay disabled unexpectedly.

Suggested fix
     async def serve_standalone_tcp(
         self,
         host: str,
         port: int,
         *,
         tls_credentials: grpc.ServerCredentials | None = None,
     ) -> None:
@@
-        self._standalone = True
-        lease_scope = LeaseContext(lease_name="standalone", before_lease_hook=Event())
-        self._lease_context = lease_scope
+        self._standalone = True
+        lease_scope = LeaseContext(lease_name="standalone", before_lease_hook=Event())
+        self._lease_context = lease_scope

-        with TemporarySocket() as hook_path:
-            hook_path_str = str(hook_path)
-            with Session(
-                uuid=self.uuid,
-                labels=self.labels,
-                root_device=self.device_factory(),
-            ) as session:
-                session.lease_context = lease_scope
-                lease_scope.session = session
-                lease_scope.socket_path = hook_path_str
-                lease_scope.hook_socket_path = hook_path_str
+        try:
+            with TemporarySocket() as hook_path:
+                hook_path_str = str(hook_path)
+                with Session(
+                    uuid=self.uuid,
+                    labels=self.labels,
+                    root_device=self.device_factory(),
+                ) as session:
+                    session.lease_context = lease_scope
+                    lease_scope.session = session
+                    lease_scope.socket_path = hook_path_str
+                    lease_scope.hook_socket_path = hook_path_str

-                async with session.serve_tcp_and_unix_async(
-                    host, port, hook_path_str, tls_credentials=tls_credentials
-                ):
-                    try:
-                        async with create_task_group() as tg:
-                            self._tg = tg
-                            tg.start_soon(self._handle_end_session, lease_scope)
+                    async with session.serve_tcp_and_unix_async(
+                        host, port, hook_path_str, tls_credentials=tls_credentials
+                    ):
+                        try:
+                            async with create_task_group() as tg:
+                                self._tg = tg
+                                tg.start_soon(self._handle_end_session, lease_scope)

-                            if self.hook_executor:
-                                await self.hook_executor.run_before_lease_hook(
-                                    lease_scope,
-                                    self._report_status,
-                                    self.stop,
-                                    self._request_lease_release,
-                                )
-                            else:
-                                await self._report_status(ExporterStatus.LEASE_READY, "Ready for commands")
-                                lease_scope.before_lease_hook.set()
+                                if self.hook_executor:
+                                    await self.hook_executor.run_before_lease_hook(
+                                        lease_scope,
+                                        self._report_status,
+                                        self.stop,
+                                        self._request_lease_release,
+                                    )
+                                else:
+                                    await self._report_status(ExporterStatus.LEASE_READY, "Ready for commands")
+                                    lease_scope.before_lease_hook.set()

-                            await _standalone_shutdown_waiter()
-                    finally:
-                        await self._cleanup_after_lease(lease_scope)
-
-        self._lease_context = None
-        self._tg = None
+                                await _standalone_shutdown_waiter()
+                        finally:
+                            await self._cleanup_after_lease(lease_scope)
+        finally:
+            self._lease_context = None
+            self._tg = None
+            self._standalone = False

Also applies to: 881-882

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/exporter/exporter.py` around lines
842 - 845, The Exporter sets self._standalone = True and assigns
self._lease_context (via LeaseContext/lease_scope) but never resets them,
leaving the instance in standalone mode for reused Exporter objects; update the
method(s) where you set self._standalone = True and self._lease_context =
lease_scope (and the other similar block that sets those at the later location
around lines with self._standalone / lease_scope) to ensure you reset state on
exit: wrap the work in a try/finally (or use a context manager) and in the
finally set self._standalone = False and self._lease_context = None (and also
ensure any LeaseContext is properly closed/cleaned) so controller-status/report
paths are re-enabled for subsequent uses of the same Exporter instance.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@e2e/exporters/exporter-direct-hooks-before.yaml`:
- Around line 11-14: The hook script masks failures because the final `echo
"BEFORE_HOOK_DIRECT: complete"` can return success even if `j power on` failed;
update the script around the `j power on` invocation (the lines containing `j
power on` and the surrounding `echo` commands) to propagate errors — e.g.,
enable errexit (`set -e`) at the top of the script or explicitly test the exit
status of `j power on` and `exit` with that non-zero code so failures in `j
power on` cause the hook to fail immediately.

In `@e2e/tests-direct-listener.bats`:
- Around line 157-163: The test uses a fixed sleep after calling stop_exporter
which is flaky; replace the sleep 1 with a bounded polling loop that checks
"${BATS_TEST_TMPDIR}/exporter.log" for the expected "AFTER_HOOK_DIRECT:
executed" string until a short timeout (e.g. 5–10s) and fails if not found, then
run the final assertion (keep the run cat ... and assert_output --partial).
Implement the poll as a loop that sleeps briefly between checks, uses
stop_exporter as-is, and exits early when the log contains the marker so the
test is deterministic and safe to run in isolation.

---

Nitpick comments:
In `@python/packages/jumpstarter/jumpstarter/exporter/exporter.py`:
- Around line 842-845: The Exporter sets self._standalone = True and assigns
self._lease_context (via LeaseContext/lease_scope) but never resets them,
leaving the instance in standalone mode for reused Exporter objects; update the
method(s) where you set self._standalone = True and self._lease_context =
lease_scope (and the other similar block that sets those at the later location
around lines with self._standalone / lease_scope) to ensure you reset state on
exit: wrap the work in a try/finally (or use a context manager) and in the
finally set self._standalone = False and self._lease_context = None (and also
ensure any LeaseContext is properly closed/cleaned) so controller-status/report
paths are re-enabled for subsequent uses of the same Exporter instance.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f9c826cd-3a9c-4db8-aef4-c1c296136ae6

📥 Commits

Reviewing files that changed from the base of the PR and between 9651198 and 65117b3.

📒 Files selected for processing (4)
  • e2e/exporters/exporter-direct-hooks-before.yaml
  • e2e/exporters/exporter-direct-hooks-both.yaml
  • e2e/tests-direct-listener.bats
  • python/packages/jumpstarter/jumpstarter/exporter/exporter.py

@bennyz bennyz force-pushed the exporter-grpc-listener branch from 65117b3 to 9b379b7 Compare March 14, 2026 10:06
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
python/packages/jumpstarter/jumpstarter/exporter/exporter.py (1)

38-42: Remove redundant import inside function.

Event is already imported at line 11 from anyio. The import inside _standalone_shutdown_waiter is unnecessary.

♻️ Proposed fix
 async def _standalone_shutdown_waiter():
     """Wait forever; used so serve_standalone_tcp can be cancelled by stop()."""
-    from anyio import Event
-
     await Event().wait()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/exporter/exporter.py` around lines 38
- 42, Remove the redundant local import of Event inside
_standalone_shutdown_waiter: delete the line "from anyio import Event" within
the async def _standalone_shutdown_waiter(), and use the Event symbol already
imported at module top (line where Event is imported) so the function simply
awaits Event().wait() without re-importing.
e2e/tests-direct-listener.bats (1)

16-56: Consider extracting common logic to reduce duplication.

start_exporter and start_exporter_with_logs share identical logic except for stderr redirection. Similarly for the _bg variants. This is acceptable for test clarity, but could be consolidated with an optional parameter if more variants are needed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@e2e/tests-direct-listener.bats` around lines 16 - 56, start_exporter and
start_exporter_with_logs duplicate the same startup and readiness logic; extract
that common behavior into a helper function (e.g., start_exporter_common) that
accepts parameters for config and an optional stderr redirection flag or
redirection path, then have start_exporter and start_exporter_with_logs (and the
_bg variants) call this helper; preserve setting LISTENER_PID, writing
"${BATS_TEST_TMPDIR}/exporter.pid", the jmp run invocation differences (with or
without "2>..."), and the readiness loop that uses jmp shell --tls-grpc
"127.0.0.1:${LISTENER_PORT}" --tls-grpc-insecure -- j --help so behavior remains
identical.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@e2e/tests-direct-listener.bats`:
- Around line 16-56: start_exporter and start_exporter_with_logs duplicate the
same startup and readiness logic; extract that common behavior into a helper
function (e.g., start_exporter_common) that accepts parameters for config and an
optional stderr redirection flag or redirection path, then have start_exporter
and start_exporter_with_logs (and the _bg variants) call this helper; preserve
setting LISTENER_PID, writing "${BATS_TEST_TMPDIR}/exporter.pid", the jmp run
invocation differences (with or without "2>..."), and the readiness loop that
uses jmp shell --tls-grpc "127.0.0.1:${LISTENER_PORT}" --tls-grpc-insecure -- j
--help so behavior remains identical.

In `@python/packages/jumpstarter/jumpstarter/exporter/exporter.py`:
- Around line 38-42: Remove the redundant local import of Event inside
_standalone_shutdown_waiter: delete the line "from anyio import Event" within
the async def _standalone_shutdown_waiter(), and use the Event symbol already
imported at module top (line where Event is imported) so the function simply
awaits Event().wait() without re-importing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ef821b4d-a914-4db2-830f-cdd34e683a10

📥 Commits

Reviewing files that changed from the base of the PR and between 65117b3 and 9b379b7.

📒 Files selected for processing (4)
  • e2e/exporters/exporter-direct-hooks-before.yaml
  • e2e/exporters/exporter-direct-hooks-both.yaml
  • e2e/tests-direct-listener.bats
  • python/packages/jumpstarter/jumpstarter/exporter/exporter.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • e2e/exporters/exporter-direct-hooks-both.yaml
  • e2e/exporters/exporter-direct-hooks-before.yaml

@bennyz bennyz force-pushed the exporter-grpc-listener branch from 9b379b7 to 5c82ee9 Compare March 14, 2026 10:28
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
python/packages/jumpstarter-cli/jumpstarter_cli/run.py (1)

16-27: ⚠️ Potential issue | 🟡 Minor

Normalize listener-bind parse failures into click.BadParameter.

rsplit(":", 1) still doesn't support bracketed IPv6, and int(port_str, 10) will raise a raw ValueError for inputs like foo:bar. Since _parse_listener_bind() runs after the fork, those CLI mistakes currently surface as child-process failures instead of a clean usage error. Please handle bracketed IPv6 explicitly and wrap bad ports in click.BadParameter.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py` around lines 16 - 27,
_Parse_listener_bind currently misparses bracketed IPv6 and lets int()
ValueError leak; update _parse_listener_bind to detect bracketed IPv6 (if value
starts with '[' find the matching ']' and treat what's inside as host and what
follows after an optional ':' as port), otherwise use the existing host:port
split; wrap the port conversion in a try/except and raise click.BadParameter
with a clear message and the existing param_hint "'--tls-grpc-listener'" for any
missing/invalid port or non-numeric port values, and keep the existing range
check (1–65535) also raising click.BadParameter on failure.
🧹 Nitpick comments (3)
python/packages/jumpstarter/jumpstarter/exporter/exporter.py (1)

162-166: Consider resetting _standalone flag after serve_standalone_tcp exits.

The _standalone flag is set to True in serve_standalone_tcp (line 840) but never reset to False. While Exporter instances are typically not reused, this could cause unexpected behavior if someone calls serve() on the same instance after serve_standalone_tcp() completes.

♻️ Suggested fix
         self._lease_context = None
         self._tg = None
+        self._standalone = False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/exporter/exporter.py` around lines
162 - 166, The _standalone flag is set to True in serve_standalone_tcp but never
reset, which can cause subsequent calls to serve() or methods like
_report_status and __aexit__ to behave as if running standalone; update
serve_standalone_tcp to ensure _standalone is set back to False when the method
exits (use a try/finally or context manager) so the Exporter instance returns to
its normal state, referencing the _standalone attribute and the serve,
serve_standalone_tcp, _report_status, and __aexit__ methods to locate where to
add the reset.
python/packages/jumpstarter/jumpstarter/exporter/session.py (1)

166-205: Bound TCP port is not yielded when using port=0.

Unlike serve_tcp_async which yields the bound_port (useful when port=0 lets the OS choose), serve_tcp_and_unix_async discards the return value from add_secure_port/add_insecure_port and yields None.

If a caller uses port=0 to get an ephemeral port, they won't be able to discover which port was actually bound.

♻️ Suggested fix to capture and yield the bound port
     async def serve_tcp_and_unix_async(
         self,
         host: str,
         port: int,
         unix_path: str,
         *,
         tls_credentials: grpc.ServerCredentials | None = None,
     ):
         """Serve session on TCP (and optionally TLS) plus a Unix socket (e.g. for hooks).

         One server listens on host:port and unix://unix_path.
         Use for standalone mode where clients use TCP and hook j commands use Unix.

         Yields:
-            None - server runs until context exit
+            int - the bound TCP port (useful when port=0)
         """
         server = grpc.aio.server()
         address = f"{host}:{port}"
         server.add_insecure_port(f"unix://{unix_path}")
         logger.debug("Session server listening on unix://%s (hooks)", unix_path)
         if tls_credentials is not None:
-            server.add_secure_port(address, tls_credentials)
+            bound_port = server.add_secure_port(address, tls_credentials)
             logger.debug("Session server listening on %s (TLS)", address)
         else:
-            server.add_insecure_port(address)
+            bound_port = server.add_insecure_port(address)
             logger.debug("Session server listening on %s (insecure)", address)

         jumpstarter_pb2_grpc.add_ExporterServiceServicer_to_server(self, server)
         router_pb2_grpc.add_RouterServiceServicer_to_server(self, server)

         await server.start()
-        logger.info("Session server started on %s and unix://%s", address, unix_path)
+        logger.info("Session server started on %s:%d and unix://%s", host, bound_port, unix_path)
         try:
-            yield
+            yield bound_port
         finally:
             logger.info("Stopping session server...")
             await server.stop(grace=1.0)
             await sleep(0.1)
             logger.info("Session server stopped")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/exporter/session.py` around lines 166
- 205, The serve_tcp_and_unix_async coroutine currently discards the return
value from server.add_secure_port/add_insecure_port so callers cannot discover
the actual bound port when port=0; capture the integer returned by the chosen
add_*_port call (e.g. bound_port = server.add_insecure_port(...) or bound_port =
server.add_secure_port(...)), use that value when logging and yield it from
serve_tcp_and_unix_async (matching serve_tcp_async behavior) so callers receive
the ephemeral port; ensure the yielded value is returned before the try/finally
block suspends and that shutdown still awaits server.stop as before.
python/packages/jumpstarter/jumpstarter/common/utils.py (1)

96-122: Missing docstring for the insecure parameter.

The function docstring at lines 98-111 documents all parameters except the newly added insecure parameter.

📝 Suggested docstring addition
     Args:
         host: The jumpstarter host path
         context: The context of the shell (e.g. "local" or exporter name)
         allow: List of allowed drivers
         unsafe: Whether to allow drivers outside of the allow list
         use_profiles: Whether to load shell profile files
         command: Optional command to run instead of launching an interactive shell
         lease: Optional Lease object to set up lease ending callback
+        insecure: Whether to use insecure gRPC communication (sets JMP_GRPC_INSECURE env var)

     Returns:
         The exit code of the shell or command process
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/common/utils.py` around lines 96 -
122, Add a docstring entry for the newly added insecure parameter in the
function's Args section: document insecure as a boolean that enables insecure
gRPC (it causes JMP_GRPC_INSECURE to be set in the environment), and keep it
consistent with the style of the other Args entries; update the function's
docstring near the existing parameters (host, context, allow, unsafe,
use_profiles, command, lease) so the insecure parameter is described.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py`:
- Around line 308-341: _shell_direct_async currently builds a synthetic config
that discards the resolved client configuration from shell(), causing loss of
--client/--client-config settings and TLS/driver trust policy; modify
_shell_direct_async (and the similar block at lines 393-401) to accept and use
the resolved client configuration object passed in from shell() instead of
constructing SimpleNamespace(...) with
allow=[]/unsafe=True/TLSConfigV1Alpha1()/use_profiles=False, and only construct
that fallback config when no resolved config is provided; ensure the DirectLease
instantiation still receives tls_grpc_insecure and grpc options from the real
config and pass the real config into _run_shell_with_lease_async so
import_class(...) in client_from_channel() sees the correct
allow/unsafe/TLS/profile settings.

In `@python/packages/jumpstarter/jumpstarter/client/client.py`:
- Around line 23-34: _is_tcp_address currently only validates the trailing port
and therefore accepts malformed addresses like ":1234" and "::1"; update
_is_tcp_address to require a non-empty host for the plain HOST:PORT form and
only allow IPv6 addresses when they are bracketed (e.g. "[::1]:1234").
Concretely: if path starts with '[' parse until matching ']' and ensure a
trailing ":PORT" exists and port is numeric and in 1..65535; otherwise rsplit on
the last ':' and ensure the left-hand host is non-empty and does not contain
additional ':' characters (reject IPv6 unbracketed), then validate the port as
an integer in range. Apply this logic inside the _is_tcp_address function.

In `@python/packages/jumpstarter/jumpstarter/client/lease.py`:
- Around line 44-73: DirectLease is exposed but only implements serve_unix_async
and monitor_async, so callers expecting the full Lease surface (connect,
connect_async, serve_unix, monitor, and normal sync/async context-manager
behavior) will break; either stop exporting DirectLease or implement the missing
helpers: add connect() and connect_async() that return or await the same address
yielded by serve_unix_async, add synchronous wrappers serve_unix() and monitor()
that call the async context managers (or provide equivalent sync behavior), and
ensure enter/exit methods from ContextManagerMixin/AsyncContextManagerMixin are
present and delegate to serve_unix_async/monitor_async as appropriate so
DirectLease presents the same API as Lease.

---

Duplicate comments:
In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py`:
- Around line 16-27: _Parse_listener_bind currently misparses bracketed IPv6 and
lets int() ValueError leak; update _parse_listener_bind to detect bracketed IPv6
(if value starts with '[' find the matching ']' and treat what's inside as host
and what follows after an optional ':' as port), otherwise use the existing
host:port split; wrap the port conversion in a try/except and raise
click.BadParameter with a clear message and the existing param_hint
"'--tls-grpc-listener'" for any missing/invalid port or non-numeric port values,
and keep the existing range check (1–65535) also raising click.BadParameter on
failure.

---

Nitpick comments:
In `@python/packages/jumpstarter/jumpstarter/common/utils.py`:
- Around line 96-122: Add a docstring entry for the newly added insecure
parameter in the function's Args section: document insecure as a boolean that
enables insecure gRPC (it causes JMP_GRPC_INSECURE to be set in the
environment), and keep it consistent with the style of the other Args entries;
update the function's docstring near the existing parameters (host, context,
allow, unsafe, use_profiles, command, lease) so the insecure parameter is
described.

In `@python/packages/jumpstarter/jumpstarter/exporter/exporter.py`:
- Around line 162-166: The _standalone flag is set to True in
serve_standalone_tcp but never reset, which can cause subsequent calls to
serve() or methods like _report_status and __aexit__ to behave as if running
standalone; update serve_standalone_tcp to ensure _standalone is set back to
False when the method exits (use a try/finally or context manager) so the
Exporter instance returns to its normal state, referencing the _standalone
attribute and the serve, serve_standalone_tcp, _report_status, and __aexit__
methods to locate where to add the reset.

In `@python/packages/jumpstarter/jumpstarter/exporter/session.py`:
- Around line 166-205: The serve_tcp_and_unix_async coroutine currently discards
the return value from server.add_secure_port/add_insecure_port so callers cannot
discover the actual bound port when port=0; capture the integer returned by the
chosen add_*_port call (e.g. bound_port = server.add_insecure_port(...) or
bound_port = server.add_secure_port(...)), use that value when logging and yield
it from serve_tcp_and_unix_async (matching serve_tcp_async behavior) so callers
receive the ephemeral port; ensure the yielded value is returned before the
try/finally block suspends and that shutdown still awaits server.stop as before.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 11096d3c-e65c-470f-bff9-9d883a18bb72

📥 Commits

Reviewing files that changed from the base of the PR and between 9b379b7 and 5c82ee9.

📒 Files selected for processing (17)
  • e2e/exporters/exporter-direct-hooks-before.yaml
  • e2e/exporters/exporter-direct-hooks-both.yaml
  • e2e/exporters/exporter-direct-listener.yaml
  • e2e/tests-direct-listener.bats
  • python/packages/jumpstarter-cli/jumpstarter_cli/run.py
  • python/packages/jumpstarter-cli/jumpstarter_cli/shell.py
  • python/packages/jumpstarter/jumpstarter/client/__init__.py
  • python/packages/jumpstarter/jumpstarter/client/client.py
  • python/packages/jumpstarter/jumpstarter/client/client_test.py
  • python/packages/jumpstarter/jumpstarter/client/lease.py
  • python/packages/jumpstarter/jumpstarter/common/utils.py
  • python/packages/jumpstarter/jumpstarter/config/env.py
  • python/packages/jumpstarter/jumpstarter/config/exporter.py
  • python/packages/jumpstarter/jumpstarter/exporter/exporter.py
  • python/packages/jumpstarter/jumpstarter/exporter/session.py
  • python/packages/jumpstarter/jumpstarter/exporter/session_test.py
  • python/packages/jumpstarter/jumpstarter/utils/env.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • e2e/exporters/exporter-direct-hooks-both.yaml
  • e2e/exporters/exporter-direct-hooks-before.yaml
  • e2e/tests-direct-listener.bats
  • python/packages/jumpstarter/jumpstarter/utils/env.py

Comment on lines +23 to +34
def _is_tcp_address(path: str) -> bool:
"""Return True if path looks like host:port (TCP address)."""
if ":" not in path:
return False
parts = path.rsplit(":", 1)
if len(parts) != 2:
return False
try:
port = int(parts[1], 10)
return 1 <= port <= 65535
except ValueError:
return False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Tighten _is_tcp_address() so malformed direct targets fail early.

Because this only validates the last :PORT, it currently accepts inputs like :1234 and ::1 as TCP endpoints. In the new jmp shell --tls-grpc flow that pushes obviously bad input to channel creation / first RPC instead of rejecting it up front. Require a non-empty host for the plain HOST:PORT form and only accept IPv6 in bracketed form; the new :1234 assertion in python/packages/jumpstarter/jumpstarter/client/client_test.py would need to change with that fix.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/client/client.py` around lines 23 -
34, _is_tcp_address currently only validates the trailing port and therefore
accepts malformed addresses like ":1234" and "::1"; update _is_tcp_address to
require a non-empty host for the plain HOST:PORT form and only allow IPv6
addresses when they are bracketed (e.g. "[::1]:1234"). Concretely: if path
starts with '[' parse until matching ']' and ensure a trailing ":PORT" exists
and port is numeric and in 1..65535; otherwise rsplit on the last ':' and ensure
the left-hand host is non-empty and does not contain additional ':' characters
(reject IPv6 unbracketed), then validate the port as an integer in range. Apply
this logic inside the _is_tcp_address function.

@bennyz bennyz added the build-pr-images/jumpstarter request to build only the jumpstarter image from PR label Mar 14, 2026
@bennyz bennyz requested a review from mangelajo March 15, 2026 08:00
@bennyz bennyz changed the title Direct listener mode for the exporter Standalone mode for the exporter Mar 15, 2026
@bennyz bennyz requested review from bkhizgiy and evakhoni March 15, 2026 08:01
Comment on lines +146 to +151
if tls_credentials is not None:
bound_port = server.add_secure_port(address, tls_credentials)
logger.debug("Session server listening on %s (TLS)", address)
else:
bound_port = server.add_insecure_port(address)
logger.debug("Session server listening on %s (insecure)", address)
Copy link
Member

@bkhizgiy bkhizgiy Mar 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something is a bit confusing here, if we run for example jmp run --exporter-config x.yaml --tls-grpc-listener XXXX, based on the command syntax I would assume were using the secure flow since we didn't explicitly add --tls-grpc-insecure but based on this code it looks like were actually falling back to insecure.

@bennyz bennyz force-pushed the exporter-grpc-listener branch from 2e976d1 to 73feecf Compare March 15, 2026 19:38
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
python/packages/jumpstarter-cli/jumpstarter_cli/run.py (1)

16-27: ⚠️ Potential issue | 🟠 Major

Harden listener bind parsing for malformed and IPv6 inputs.

Current parsing can misinterpret colon-rich inputs and raises raw ValueError for invalid ports instead of a clear click.BadParameter.

🛡️ Suggested robust parser
 def _parse_listener_bind(value: str) -> tuple[str, int]:
     """Parse '[host:]port' into (host, port). Default host is 0.0.0.0."""
-    if ":" in value:
-        host, port_str = value.rsplit(":", 1)
-        host = host.strip() or "0.0.0.0"
-        port = int(port_str, 10)
-    else:
-        host = "0.0.0.0"
-        port = int(value, 10)
+    value = value.strip()
+    if value.startswith("["):
+        end = value.find("]")
+        if end == -1:
+            raise click.BadParameter("missing closing ']' for IPv6 host", param_hint="'--tls-grpc-listener'")
+        host = value[1:end]
+        if value[end + 1 : end + 2] != ":":
+            raise click.BadParameter("missing ':PORT' after IPv6 host", param_hint="'--tls-grpc-listener'")
+        port_str = value[end + 2 :]
+    elif ":" in value:
+        if value.count(":") > 1:
+            raise click.BadParameter(
+                "IPv6 addresses must be bracketed, e.g. [::1]:1234",
+                param_hint="'--tls-grpc-listener'",
+            )
+        host, port_str = value.rsplit(":", 1)
+        host = host.strip() or "0.0.0.0"
+    else:
+        host = "0.0.0.0"
+        port_str = value
+
+    try:
+        port = int(port_str, 10)
+    except ValueError:
+        raise click.BadParameter(f"invalid port '{port_str}'", param_hint="'--tls-grpc-listener'") from None
+
     if not (1 <= port <= 65535):
         raise click.BadParameter(f"port must be between 1 and 65535, got {port}", param_hint="'--tls-grpc-listener'")
     return host, port
🧹 Nitpick comments (2)
python/packages/jumpstarter/jumpstarter/exporter/session_test.py (1)

335-353: Add a TLS-mode companion test for serve_tcp_async.

This test validates only the insecure branch. A second test that passes tls_credentials would cover the secure binding path and reduce regression risk.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/exporter/session_test.py` around
lines 335 - 353, Add a companion async test to cover the TLS branch of
Session.serve_tcp_async by duplicating test_serve_tcp_async_insecure but passing
valid tls_credentials to Session.serve_tcp_async and connecting with
grpc.aio.secure_channel; use the same driver/Session setup and
jumpstarter_pb2_grpc.ExporterServiceStub, await stub.GetReport, and assert
response fields (reports length and uuid) just like the insecure test so the
secure binding path is exercised (reference Session.serve_tcp_async,
tls_credentials, and jumpstarter_pb2_grpc.ExporterServiceStub).
e2e/tests-direct-listener.bats (1)

6-14: Prefer a per-test free port instead of a fixed listener port.

Using a constant port (19090) can make this suite flaky when another process already owns that port.

🔧 Suggested isolation-friendly approach
-LISTENER_PORT=19090
+LISTENER_PORT=""
+
+_pick_free_port() {
+  while :; do
+    local candidate=$((20000 + RANDOM % 20000))
+    if ! nc -z 127.0.0.1 "$candidate" >/dev/null 2>&1; then
+      echo "$candidate"
+      return 0
+    fi
+  done
+}
@@
 setup() {
   bats_load_library bats-support
   bats_load_library bats-assert
 
   bats_require_minimum_version 1.5.0
+  LISTENER_PORT="$(_pick_free_port)"
 }
Based on learnings: E2E tests in bats should be runnable in isolation and avoid shared state across tests.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@e2e/tests-direct-listener.bats` around lines 6 - 14, The test currently
hardcodes LISTENER_PORT=19090 which causes flakes when the port is in use;
change to allocate a free port per test by removing the constant and setting
LISTENER_PORT inside setup() (or a new helper like find_free_port) at
runtime—e.g. run a small one-liner (python/python3:
socket.socket(...).bind(('',0)); print(s.getsockname()[1])) or use a similar
shell-based free-port discovery, export LISTENER_PORT for other functions that
reference LISTENER_PID/LISTENER_PORT, and ensure any process that binds uses
that dynamic port so tests run in isolation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py`:
- Around line 345-346: The code that handles the command argument coming from
the opt_config decorator (opt_config(allow_missing=True)) currently treats any
tuple-valued command as a generic usage error; change the branching around the
command variable (the handling currently around lines 403-406) to distinguish an
empty tuple (meaning no config was provided) from a non-empty tuple (meaning an
explicit config was requested but not found). Concretely, update the handler
that inspects the command tuple produced by click.argument("command", nargs=-1):
if command == () treat it as “no config provided” and proceed with the default
flow, but if command is a non-empty tuple raise a clear config-not-found error
(e.g., raise click.ClickException or a specific error including the requested
config name from the tuple) instead of the generic usage error so the real cause
is visible.

In `@python/packages/jumpstarter/jumpstarter/client/client.py`:
- Around line 45-63: The insecure TCP branch drops grpc_options; update the code
in the TCP/insecure path (where _is_tcp_address(path) and insecure is True and
grpc.aio.insecure_channel(path) is called) to pass the grpc_options into
insecure_channel (e.g., grpc.aio.insecure_channel(path, grpc_options or None))
so it receives the same channel options as the secure branch; keep the existing
use of client_from_channel and aio_secure_channel unchanged.

---

Nitpick comments:
In `@e2e/tests-direct-listener.bats`:
- Around line 6-14: The test currently hardcodes LISTENER_PORT=19090 which
causes flakes when the port is in use; change to allocate a free port per test
by removing the constant and setting LISTENER_PORT inside setup() (or a new
helper like find_free_port) at runtime—e.g. run a small one-liner
(python/python3: socket.socket(...).bind(('',0)); print(s.getsockname()[1])) or
use a similar shell-based free-port discovery, export LISTENER_PORT for other
functions that reference LISTENER_PID/LISTENER_PORT, and ensure any process that
binds uses that dynamic port so tests run in isolation.

In `@python/packages/jumpstarter/jumpstarter/exporter/session_test.py`:
- Around line 335-353: Add a companion async test to cover the TLS branch of
Session.serve_tcp_async by duplicating test_serve_tcp_async_insecure but passing
valid tls_credentials to Session.serve_tcp_async and connecting with
grpc.aio.secure_channel; use the same driver/Session setup and
jumpstarter_pb2_grpc.ExporterServiceStub, await stub.GetReport, and assert
response fields (reports length and uuid) just like the insecure test so the
secure binding path is exercised (reference Session.serve_tcp_async,
tls_credentials, and jumpstarter_pb2_grpc.ExporterServiceStub).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 54b4b432-d17b-49ab-8652-4e0024024ff8

📥 Commits

Reviewing files that changed from the base of the PR and between 5c82ee9 and 73feecf.

📒 Files selected for processing (19)
  • e2e/exporters/exporter-direct-hooks-before.yaml
  • e2e/exporters/exporter-direct-hooks-both.yaml
  • e2e/exporters/exporter-direct-listener.yaml
  • e2e/tests-direct-listener.bats
  • python/docs/source/getting-started/guides/index.md
  • python/docs/source/getting-started/guides/setup-direct-mode.md
  • python/packages/jumpstarter-cli/jumpstarter_cli/run.py
  • python/packages/jumpstarter-cli/jumpstarter_cli/shell.py
  • python/packages/jumpstarter/jumpstarter/client/__init__.py
  • python/packages/jumpstarter/jumpstarter/client/client.py
  • python/packages/jumpstarter/jumpstarter/client/client_test.py
  • python/packages/jumpstarter/jumpstarter/client/lease.py
  • python/packages/jumpstarter/jumpstarter/common/utils.py
  • python/packages/jumpstarter/jumpstarter/config/env.py
  • python/packages/jumpstarter/jumpstarter/config/exporter.py
  • python/packages/jumpstarter/jumpstarter/exporter/exporter.py
  • python/packages/jumpstarter/jumpstarter/exporter/session.py
  • python/packages/jumpstarter/jumpstarter/exporter/session_test.py
  • python/packages/jumpstarter/jumpstarter/utils/env.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • python/docs/source/getting-started/guides/index.md
  • e2e/exporters/exporter-direct-hooks-before.yaml
  • python/packages/jumpstarter/jumpstarter/exporter/exporter.py
  • python/packages/jumpstarter/jumpstarter/client/lease.py

Comment on lines +345 to 346
@opt_config(allow_missing=True)
@click.argument("command", nargs=-1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Differentiate “no config provided” from “requested config missing.”

With allow_missing=True (Line 345), missing explicit config targets can arrive as tuples. Line 403 currently maps all tuples to a generic usage error, which hides the actual “not found” cause.

Suggested fix
-    if config is None or isinstance(config, tuple):
-        raise click.UsageError(
-            "Specify one of: --client / --client-config, --exporter / --exporter-config, or --tls-grpc HOST:PORT"
-        )
+    if config is None:
+        raise click.UsageError(
+            "Specify one of: --client / --client-config, --exporter / --exporter-config, or --tls-grpc HOST:PORT"
+        )
+    if isinstance(config, tuple):
+        kind, value = config
+        if (kind, value) == ("client", "default"):
+            raise click.UsageError(
+                "Specify one of: --client / --client-config, --exporter / --exporter-config, or --tls-grpc HOST:PORT"
+            )
+        raise click.ClickException(f"Failed to load {kind}: {value}")

Also applies to: 403-406

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py` around lines 345 -
346, The code that handles the command argument coming from the opt_config
decorator (opt_config(allow_missing=True)) currently treats any tuple-valued
command as a generic usage error; change the branching around the command
variable (the handling currently around lines 403-406) to distinguish an empty
tuple (meaning no config was provided) from a non-empty tuple (meaning an
explicit config was requested but not found). Concretely, update the handler
that inspects the command tuple produced by click.argument("command", nargs=-1):
if command == () treat it as “no config provided” and proceed with the default
flow, but if command is a non-empty tuple raise a clear config-not-found error
(e.g., raise click.ClickException or a specific error including the requested
config name from the tuple) instead of the generic usage error so the real cause
is visible.

@bennyz bennyz force-pushed the exporter-grpc-listener branch from 73feecf to e66d97c Compare March 16, 2026 08:03
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
python/packages/jumpstarter/jumpstarter/common/utils.py (1)

100-108: ⚠️ Potential issue | 🟡 Minor

Docstring is missing the new insecure argument.

Please add insecure to the Args section so the function contract matches the signature.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/common/utils.py` around lines 100 -
108, The docstring Args block for the function in jumpstarter.common.utils that
currently lists host, context, allow, unsafe, use_profiles, command, and lease
is missing the new insecure parameter; update that Args section to include an
entry for insecure (describe that it controls skipping/allowing insecure
connections or verification) so the docstring matches the function signature and
mentions the insecure parameter alongside the other arguments.
♻️ Duplicate comments (3)
python/packages/jumpstarter-cli/jumpstarter_cli/run.py (1)

16-27: ⚠️ Potential issue | 🟡 Minor

Handle IPv6 and invalid numeric parsing explicitly.

Line 18/Line 19 still relies on rsplit(":", 1), which is fragile for IPv6 input forms and can yield confusing parse outcomes. Also, int(...) failures currently escape as raw ValueError instead of a parameter-specific CLI error.

Suggested patch
 def _parse_listener_bind(value: str) -> tuple[str, int]:
     """Parse '[host:]port' into (host, port). Default host is 0.0.0.0."""
-    if ":" in value:
-        host, port_str = value.rsplit(":", 1)
-        host = host.strip() or "0.0.0.0"
-        port = int(port_str, 10)
-    else:
-        host = "0.0.0.0"
-        port = int(value, 10)
+    try:
+        if value.startswith("["):
+            end = value.find("]")
+            if end == -1 or end + 1 >= len(value) or value[end + 1] != ":":
+                raise click.BadParameter(
+                    "Invalid IPv6 format, expected [HOST]:PORT",
+                    param_hint="'--tls-grpc-listener'",
+                )
+            host = value[1:end].strip() or "::"
+            port = int(value[end + 2 :], 10)
+        elif value.count(":") > 1:
+            raise click.BadParameter(
+                "IPv6 must be bracketed, e.g. [::1]:1234",
+                param_hint="'--tls-grpc-listener'",
+            )
+        elif ":" in value:
+            host, port_str = value.rsplit(":", 1)
+            host = host.strip() or "0.0.0.0"
+            port = int(port_str, 10)
+        else:
+            host = "0.0.0.0"
+            port = int(value, 10)
+    except ValueError as exc:
+        raise click.BadParameter(
+            "Invalid listener format, expected [HOST:]PORT",
+            param_hint="'--tls-grpc-listener'",
+        ) from exc
     if not (1 <= port <= 65535):
         raise click.BadParameter(f"port must be between 1 and 65535, got {port}", param_hint="'--tls-grpc-listener'")
     return host, port
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py` around lines 16 - 27,
The _parse_listener_bind function fails on IPv6 forms and leaks ValueError on
bad ports; update it to explicitly handle bracketed IPv6 addresses (e.g.,
"[::1]:8080") by detecting a leading '[' and finding the matching ']' to split
host and port, otherwise fall back to the existing ":"-based split for
IPv4/hostname, then wrap the port conversion in a try/except that raises
click.BadParameter with a clear message (including the invalid value) and the
same param_hint "'--tls-grpc-listener'"; keep the existing numeric range check
(1–65535) and default host behavior (empty -> "0.0.0.0").
python/packages/jumpstarter/jumpstarter/client/client.py (1)

23-34: ⚠️ Potential issue | 🟡 Minor

Reject malformed direct targets earlier in _is_tcp_address().

The helper still accepts malformed forms like :1234 and unbracketed IPv6-like input, so obvious input errors can slip into channel creation instead of failing fast.

Suggested fix
 def _is_tcp_address(path: str) -> bool:
     """Return True if path looks like host:port (TCP address)."""
-    if ":" not in path:
-        return False
-    parts = path.rsplit(":", 1)
-    if len(parts) != 2:
-        return False
-    try:
-        port = int(parts[1], 10)
-        return 1 <= port <= 65535
-    except ValueError:
-        return False
+    if path.startswith("["):
+        end = path.find("]")
+        if end <= 1 or end + 2 > len(path) or path[end + 1] != ":":
+            return False
+        host = path[1:end]
+        port_str = path[end + 2 :]
+    else:
+        if ":" not in path:
+            return False
+        host, port_str = path.rsplit(":", 1)
+        if not host or ":" in host:
+            return False
+
+    if not host:
+        return False
+
+    try:
+        port = int(port_str, 10)
+    except ValueError:
+        return False
+    return 1 <= port <= 65535
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/client/client.py` around lines 23 -
34, The _is_tcp_address function accepts malformed targets like ":1234" and
unbracketed IPv6-like hosts; fix it by rejecting empty hosts and unbracketed
IPv6 fragments: after splitting with parts = path.rsplit(":", 1) extract host =
parts[0] and return False if not host, and if ':' in host then only allow it
when host.startswith('[') and host.endswith(']') (i.e., bracketed IPv6);
otherwise proceed to parse and validate the port as currently done. Ensure these
checks are applied inside the existing _is_tcp_address function.
python/packages/jumpstarter-cli/jumpstarter_cli/shell.py (1)

403-406: ⚠️ Potential issue | 🟡 Minor

Differentiate “no config provided” from “requested config missing.”

This branch still maps tuple-valued config to a generic usage error, which hides the actual config-not-found cause for explicit --client/--client-config requests.

Suggested fix
-    if config is None or isinstance(config, tuple):
-        raise click.UsageError(
-            "Specify one of: --client / --client-config, --exporter / --exporter-config, or --tls-grpc HOST:PORT"
-        )
+    if config is None:
+        raise click.UsageError(
+            "Specify one of: --client / --client-config, --exporter / --exporter-config, or --tls-grpc HOST:PORT"
+        )
+    if isinstance(config, tuple):
+        kind, value = config
+        if (kind, value) == ("client", "default"):
+            raise click.UsageError(
+                "Specify one of: --client / --client-config, --exporter / --exporter-config, or --tls-grpc HOST:PORT"
+            )
+        raise click.ClickException(f"Failed to load {kind}: {value}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py` around lines 403 -
406, The current branch conflates "no config provided" and "requested config
missing" by treating tuple-valued config the same as None; change the logic so
that only config is None triggers the generic click.UsageError, and handle the
tuple case separately: when config is a tuple (the signal that the user
requested a specific config but it wasn't found), raise a more specific error
(e.g., click.FileError or a UsageError with a clear message including the
missing config identifier/path) so callers see the actual "config-not-found"
cause; update the branch around the config variable in jumpstarter_cli/shell.py
where the if config is None or isinstance(config, tuple) check occurs to
implement this split behavior.
🧹 Nitpick comments (2)
e2e/tests-direct-listener.bats (1)

6-6: Allow listener port override to improve test isolation.

Using a fixed port can cause intermittent failures when another process already binds it. Consider making it overrideable per test run.

Suggested tweak
-LISTENER_PORT=19090
+LISTENER_PORT="${JMP_E2E_LISTENER_PORT:-19090}"

Based on learnings: E2E tests in bats should be runnable in isolation and avoid shared external state assumptions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@e2e/tests-direct-listener.bats` at line 6, The test hardcodes
LISTENER_PORT=19090 which causes flakiness when the port is in use; change
tests-direct-listener.bats to read LISTENER_PORT from the environment with a
fallback (e.g., use ${LISTENER_PORT:-19090}) so callers can override the port
per-run or CI; update any references to LISTENER_PORT in the script and document
in test setup that LISTENER_PORT can be supplied as an env var to improve
isolation.
python/packages/jumpstarter/jumpstarter/exporter/exporter.py (1)

840-880: Consider resetting _standalone in finally for defensive state hygiene.

_standalone is set to True on entry but never explicitly reset. If this instance is reused in-process after an exception path, later controller-mode behavior can be affected.

Suggested refactor
     async def serve_standalone_tcp(
@@
-        self._standalone = True
-        lease_scope = LeaseContext(lease_name="standalone", before_lease_hook=Event())
-        self._lease_context = lease_scope
-
-        with TemporarySocket() as hook_path:
-            ...
-
-        self._lease_context = None
-        self._tg = None
+        self._standalone = True
+        lease_scope = LeaseContext(lease_name="standalone", before_lease_hook=Event())
+        self._lease_context = lease_scope
+        try:
+            with TemporarySocket() as hook_path:
+                ...
+        finally:
+            self._lease_context = None
+            self._tg = None
+            self._standalone = False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/exporter/exporter.py` around lines
840 - 880, The instance flag self._standalone is set to True on entry but never
reset; modify the exit/finalization path so that in the same finally where you
await self._cleanup_after_lease(lease_scope) you also set self._standalone =
False (and keep the existing resets of self._lease_context and self._tg) to
ensure the Exporter instance is returned to a non-standalone state even on
exceptions; locate this change around the finally block that calls
_cleanup_after_lease and restore the flag there.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py`:
- Around line 96-109: The current listener TLS logic in the block handling
listener_bind silently prefers tls_insecure over provided cert/key; update the
conditional order in run.py so that if tls_insecure is True and either tls_cert
or tls_key is also provided the code raises a UsageError or prints an error and
exits rather than silently disabling TLS; specifically adjust the checks around
tls_insecure, tls_cert, tls_key and the call to _tls_server_credentials (used to
create tls_credentials) so the precedence is: (1) if tls_insecure and no
cert/key -> no TLS, (2) if tls_insecure and any cert/key -> error and exit, (3)
elif both tls_cert and tls_key -> call _tls_server_credentials, (4) elif one of
tls_cert/tls_key -> error and exit, else raise UsageError as before.
- Around line 233-237: The run function currently only errors when
--tls-grpc-listener is set without --exporter-config; add a guard to reject
TLS-related flags when no listener is configured: inside run (function name run)
check if listener_bind is None and any of tls_insecure, tls_cert, or tls_key are
provided, and raise click.UsageError indicating that
--tls-grpc-insecure/--tls-cert/--tls-key require --tls-grpc-listener to be set;
keep the existing exporter-config validation and return path to
_serve_with_exc_handling unchanged.

---

Outside diff comments:
In `@python/packages/jumpstarter/jumpstarter/common/utils.py`:
- Around line 100-108: The docstring Args block for the function in
jumpstarter.common.utils that currently lists host, context, allow, unsafe,
use_profiles, command, and lease is missing the new insecure parameter; update
that Args section to include an entry for insecure (describe that it controls
skipping/allowing insecure connections or verification) so the docstring matches
the function signature and mentions the insecure parameter alongside the other
arguments.

---

Duplicate comments:
In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py`:
- Around line 16-27: The _parse_listener_bind function fails on IPv6 forms and
leaks ValueError on bad ports; update it to explicitly handle bracketed IPv6
addresses (e.g., "[::1]:8080") by detecting a leading '[' and finding the
matching ']' to split host and port, otherwise fall back to the existing
":"-based split for IPv4/hostname, then wrap the port conversion in a try/except
that raises click.BadParameter with a clear message (including the invalid
value) and the same param_hint "'--tls-grpc-listener'"; keep the existing
numeric range check (1–65535) and default host behavior (empty -> "0.0.0.0").

In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py`:
- Around line 403-406: The current branch conflates "no config provided" and
"requested config missing" by treating tuple-valued config the same as None;
change the logic so that only config is None triggers the generic
click.UsageError, and handle the tuple case separately: when config is a tuple
(the signal that the user requested a specific config but it wasn't found),
raise a more specific error (e.g., click.FileError or a UsageError with a clear
message including the missing config identifier/path) so callers see the actual
"config-not-found" cause; update the branch around the config variable in
jumpstarter_cli/shell.py where the if config is None or isinstance(config,
tuple) check occurs to implement this split behavior.

In `@python/packages/jumpstarter/jumpstarter/client/client.py`:
- Around line 23-34: The _is_tcp_address function accepts malformed targets like
":1234" and unbracketed IPv6-like hosts; fix it by rejecting empty hosts and
unbracketed IPv6 fragments: after splitting with parts = path.rsplit(":", 1)
extract host = parts[0] and return False if not host, and if ':' in host then
only allow it when host.startswith('[') and host.endswith(']') (i.e., bracketed
IPv6); otherwise proceed to parse and validate the port as currently done.
Ensure these checks are applied inside the existing _is_tcp_address function.

---

Nitpick comments:
In `@e2e/tests-direct-listener.bats`:
- Line 6: The test hardcodes LISTENER_PORT=19090 which causes flakiness when the
port is in use; change tests-direct-listener.bats to read LISTENER_PORT from the
environment with a fallback (e.g., use ${LISTENER_PORT:-19090}) so callers can
override the port per-run or CI; update any references to LISTENER_PORT in the
script and document in test setup that LISTENER_PORT can be supplied as an env
var to improve isolation.

In `@python/packages/jumpstarter/jumpstarter/exporter/exporter.py`:
- Around line 840-880: The instance flag self._standalone is set to True on
entry but never reset; modify the exit/finalization path so that in the same
finally where you await self._cleanup_after_lease(lease_scope) you also set
self._standalone = False (and keep the existing resets of self._lease_context
and self._tg) to ensure the Exporter instance is returned to a non-standalone
state even on exceptions; locate this change around the finally block that calls
_cleanup_after_lease and restore the flag there.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 247e1ca2-e75d-4fdf-83e6-094d7e297318

📥 Commits

Reviewing files that changed from the base of the PR and between 73feecf and e66d97c.

📒 Files selected for processing (19)
  • e2e/exporters/exporter-direct-hooks-before.yaml
  • e2e/exporters/exporter-direct-hooks-both.yaml
  • e2e/exporters/exporter-direct-listener.yaml
  • e2e/tests-direct-listener.bats
  • python/docs/source/getting-started/guides/index.md
  • python/docs/source/getting-started/guides/setup-direct-mode.md
  • python/packages/jumpstarter-cli/jumpstarter_cli/run.py
  • python/packages/jumpstarter-cli/jumpstarter_cli/shell.py
  • python/packages/jumpstarter/jumpstarter/client/__init__.py
  • python/packages/jumpstarter/jumpstarter/client/client.py
  • python/packages/jumpstarter/jumpstarter/client/client_test.py
  • python/packages/jumpstarter/jumpstarter/client/lease.py
  • python/packages/jumpstarter/jumpstarter/common/utils.py
  • python/packages/jumpstarter/jumpstarter/config/env.py
  • python/packages/jumpstarter/jumpstarter/config/exporter.py
  • python/packages/jumpstarter/jumpstarter/exporter/exporter.py
  • python/packages/jumpstarter/jumpstarter/exporter/session.py
  • python/packages/jumpstarter/jumpstarter/exporter/session_test.py
  • python/packages/jumpstarter/jumpstarter/utils/env.py
🚧 Files skipped from review as they are similar to previous changes (6)
  • e2e/exporters/exporter-direct-hooks-before.yaml
  • e2e/exporters/exporter-direct-hooks-both.yaml
  • python/docs/source/getting-started/guides/index.md
  • python/packages/jumpstarter/jumpstarter/config/env.py
  • e2e/exporters/exporter-direct-listener.yaml
  • python/packages/jumpstarter/jumpstarter/exporter/session_test.py

@bennyz bennyz force-pushed the exporter-grpc-listener branch from e66d97c to 3ee4798 Compare March 16, 2026 08:43
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
python/packages/jumpstarter-cli/jumpstarter_cli/run.py (2)

237-241: ⚠️ Potential issue | 🟡 Minor

Reject TLS-related flags when --tls-grpc-listener is not set.

Currently, passing --tls-grpc-insecure, --tls-cert, or --tls-key without --tls-grpc-listener is silently ignored. This could lead to misconfiguration where users believe TLS is configured when it actually isn't being applied.

🛠️ Proposed fix
 def run(config, listener_bind, tls_insecure, tls_cert, tls_key):
     """Run an exporter locally."""
+    if listener_bind is None and (tls_insecure or tls_cert or tls_key):
+        raise click.UsageError(
+            "--tls-grpc-insecure/--tls-cert/--tls-key require --tls-grpc-listener"
+        )
     if listener_bind is not None and config is None:
         raise click.UsageError("--exporter-config (or --exporter) is required when using --tls-grpc-listener")
     return _serve_with_exc_handling(config, listener_bind, tls_insecure, tls_cert, tls_key)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py` around lines 237 -
241, The run function currently ignores TLS flags when --tls-grpc-listener
(listener_bind) is not set; add an explicit validation in run (before calling
_serve_with_exc_handling) that if listener_bind is None and any of tls_insecure,
tls_cert, or tls_key is provided, raise click.UsageError with a clear message
like "--tls-grpc-listener must be set when using --tls-grpc-insecure,
--tls-cert, or --tls-key" so users cannot silently pass TLS-related flags
without an active gRPC listener.

16-27: ⚠️ Potential issue | 🟡 Minor

IPv6 address parsing may fail with rsplit(":", 1).

IPv6 addresses contain multiple colons (e.g., [::1]:1234). The current rsplit(":", 1) approach would incorrectly parse unbracketed IPv6 addresses.

Consider documenting that IPv6 addresses must use bracketed notation, or handle them explicitly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py` around lines 16 - 27,
The _parse_listener_bind function currently uses value.rsplit(":", 1) which
breaks for IPv6; update it to explicitly handle bracketed IPv6 notation: if
value starts with "[" find the matching "]" and take the substring inside as
host and require a trailing ":"+port after the closing bracket, otherwise fall
back to the existing host:port parsing (including the default host when no
colon), validate the parsed port range as before and raise click.BadParameter on
errors; reference _parse_listener_bind and the param_hint usage so the error
message remains correct.
🧹 Nitpick comments (3)
python/packages/jumpstarter/jumpstarter/client/client.py (1)

23-34: Consider tightening _is_tcp_address() to reject empty hosts.

The function accepts :1234 as a valid TCP address (empty host with valid port), which the test explicitly expects. However, this may lead to confusing errors downstream when grpc.aio.insecure_channel(":1234") or TLS connections are attempted with an empty host.

If this is intentional for binding to all interfaces (e.g., 0.0.0.0), consider documenting it. Otherwise, requiring a non-empty host would provide earlier, clearer error messages.

Optional: Require non-empty host
 def _is_tcp_address(path: str) -> bool:
     """Return True if path looks like host:port (TCP address)."""
     if ":" not in path:
         return False
     parts = path.rsplit(":", 1)
     if len(parts) != 2:
         return False
+    host, port_str = parts
+    if not host:
+        return False
     try:
-        port = int(parts[1], 10)
+        port = int(port_str, 10)
         return 1 <= port <= 65535
     except ValueError:
         return False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/client/client.py` around lines 23 -
34, The current _is_tcp_address(path: str) allows an empty host (e.g., ":1234");
update the validation to require a non-empty host component before accepting the
TCP address. In _is_tcp_address check that parts = path.rsplit(":", 1) yields a
non-empty, non-whitespace parts[0] (the host) in addition to validating the
numeric port range, so addresses like ":1234" are rejected unless you
intentionally want to allow empty hosts—document that behavior if left as-is.
python/packages/jumpstarter-cli/jumpstarter_cli/run.py (1)

107-109: Consider using click.UsageError for consistency.

Other validation errors in this block use click.UsageError (lines 100-102, 111-113), but this case uses click.echo + sys.exit(1). Using click.UsageError would provide consistent error formatting and exit behavior.

♻️ Proposed fix for consistency
                 elif tls_cert or tls_key:
-                    click.echo("Both --tls-cert and --tls-key are required for TLS", err=True)
-                    sys.exit(1)
+                    raise click.UsageError("Both --tls-cert and --tls-key are required for TLS")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py` around lines 107 -
109, Replace the click.echo/sys.exit(1) branch that handles when tls_cert or
tls_key is set by raising click.UsageError instead so the TLS validation matches
the other error branches; specifically, in the branch checking tls_cert or
tls_key (the same block that currently calls click.echo("Both --tls-cert and
--tls-key are required for TLS", err=True) and sys.exit(1)), raise
click.UsageError("Both --tls-cert and --tls-key are required for TLS") so the
CLI uses consistent error formatting and exit behavior with the other validation
checks.
python/packages/jumpstarter/jumpstarter/exporter/exporter.py (1)

853-854: Both socket_path and hook_socket_path point to the same Unix socket.

In session_for_lease() (lines 578-581), these are separate sockets to prevent SSL frame corruption. Here both are set to hook_path_str, which is the Unix socket. The TCP endpoint serves client connections, so this may be intentional, but the naming is potentially confusing.

If this is intentional (clients connect via TCP, hooks use the Unix socket), consider adding a clarifying comment.

📝 Suggested clarifying comment
                 session.lease_context = lease_scope
                 lease_scope.session = session
-                lease_scope.socket_path = hook_path_str
-                lease_scope.hook_socket_path = hook_path_str
+                # In standalone mode, clients connect via TCP (host:port).
+                # The Unix socket is only used for hook j commands.
+                # Both paths point to the same socket since there's no separate client socket.
+                lease_scope.socket_path = hook_path_str
+                lease_scope.hook_socket_path = hook_path_str
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/exporter/exporter.py` around lines
853 - 854, Assigning both lease_scope.socket_path and
lease_scope.hook_socket_path to hook_path_str makes it look like both client and
hook traffic use the same Unix socket; if this is intentional (clients use the
TCP endpoint while hooks use the Unix socket to avoid SSL frame corruption, as
done in session_for_lease), add a concise clarifying comment immediately above
the two assignments stating that socket_path is intentionally set to the hook
UNIX socket for hooks only while client connections are served over TCP and
reference session_for_lease behavior; if it was unintentional, instead set
lease_scope.socket_path to the TCP endpoint variable rather than hook_path_str.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py`:
- Around line 237-241: The run function currently ignores TLS flags when
--tls-grpc-listener (listener_bind) is not set; add an explicit validation in
run (before calling _serve_with_exc_handling) that if listener_bind is None and
any of tls_insecure, tls_cert, or tls_key is provided, raise click.UsageError
with a clear message like "--tls-grpc-listener must be set when using
--tls-grpc-insecure, --tls-cert, or --tls-key" so users cannot silently pass
TLS-related flags without an active gRPC listener.
- Around line 16-27: The _parse_listener_bind function currently uses
value.rsplit(":", 1) which breaks for IPv6; update it to explicitly handle
bracketed IPv6 notation: if value starts with "[" find the matching "]" and take
the substring inside as host and require a trailing ":"+port after the closing
bracket, otherwise fall back to the existing host:port parsing (including the
default host when no colon), validate the parsed port range as before and raise
click.BadParameter on errors; reference _parse_listener_bind and the param_hint
usage so the error message remains correct.

---

Nitpick comments:
In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py`:
- Around line 107-109: Replace the click.echo/sys.exit(1) branch that handles
when tls_cert or tls_key is set by raising click.UsageError instead so the TLS
validation matches the other error branches; specifically, in the branch
checking tls_cert or tls_key (the same block that currently calls
click.echo("Both --tls-cert and --tls-key are required for TLS", err=True) and
sys.exit(1)), raise click.UsageError("Both --tls-cert and --tls-key are required
for TLS") so the CLI uses consistent error formatting and exit behavior with the
other validation checks.

In `@python/packages/jumpstarter/jumpstarter/client/client.py`:
- Around line 23-34: The current _is_tcp_address(path: str) allows an empty host
(e.g., ":1234"); update the validation to require a non-empty host component
before accepting the TCP address. In _is_tcp_address check that parts =
path.rsplit(":", 1) yields a non-empty, non-whitespace parts[0] (the host) in
addition to validating the numeric port range, so addresses like ":1234" are
rejected unless you intentionally want to allow empty hosts—document that
behavior if left as-is.

In `@python/packages/jumpstarter/jumpstarter/exporter/exporter.py`:
- Around line 853-854: Assigning both lease_scope.socket_path and
lease_scope.hook_socket_path to hook_path_str makes it look like both client and
hook traffic use the same Unix socket; if this is intentional (clients use the
TCP endpoint while hooks use the Unix socket to avoid SSL frame corruption, as
done in session_for_lease), add a concise clarifying comment immediately above
the two assignments stating that socket_path is intentionally set to the hook
UNIX socket for hooks only while client connections are served over TCP and
reference session_for_lease behavior; if it was unintentional, instead set
lease_scope.socket_path to the TCP endpoint variable rather than hook_path_str.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 57298e7e-2750-4cdf-95e9-5b20fba9a314

📥 Commits

Reviewing files that changed from the base of the PR and between e66d97c and 3ee4798.

📒 Files selected for processing (19)
  • e2e/exporters/exporter-direct-hooks-before.yaml
  • e2e/exporters/exporter-direct-hooks-both.yaml
  • e2e/exporters/exporter-direct-listener.yaml
  • e2e/tests-direct-listener.bats
  • python/docs/source/getting-started/guides/index.md
  • python/docs/source/getting-started/guides/setup-direct-mode.md
  • python/packages/jumpstarter-cli/jumpstarter_cli/run.py
  • python/packages/jumpstarter-cli/jumpstarter_cli/shell.py
  • python/packages/jumpstarter/jumpstarter/client/__init__.py
  • python/packages/jumpstarter/jumpstarter/client/client.py
  • python/packages/jumpstarter/jumpstarter/client/client_test.py
  • python/packages/jumpstarter/jumpstarter/client/lease.py
  • python/packages/jumpstarter/jumpstarter/common/utils.py
  • python/packages/jumpstarter/jumpstarter/config/env.py
  • python/packages/jumpstarter/jumpstarter/config/exporter.py
  • python/packages/jumpstarter/jumpstarter/exporter/exporter.py
  • python/packages/jumpstarter/jumpstarter/exporter/session.py
  • python/packages/jumpstarter/jumpstarter/exporter/session_test.py
  • python/packages/jumpstarter/jumpstarter/utils/env.py
🚧 Files skipped from review as they are similar to previous changes (9)
  • python/packages/jumpstarter/jumpstarter/utils/env.py
  • python/packages/jumpstarter/jumpstarter/config/exporter.py
  • python/packages/jumpstarter/jumpstarter/config/env.py
  • python/packages/jumpstarter/jumpstarter/common/utils.py
  • e2e/exporters/exporter-direct-listener.yaml
  • e2e/exporters/exporter-direct-hooks-both.yaml
  • python/docs/source/getting-started/guides/index.md
  • python/packages/jumpstarter/jumpstarter/exporter/session_test.py
  • python/docs/source/getting-started/guides/setup-direct-mode.md

@mangelajo
Copy link
Member

Should we add some sort of passphrase support?

@bennyz
Copy link
Member Author

bennyz commented Mar 16, 2026

Should we add some sort of passphrase support?

ack

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
python/packages/jumpstarter/jumpstarter/common/utils.py (1)

117-125: ⚠️ Potential issue | 🟠 Major

Clear inherited gRPC overrides when this launch doesn't set them.

common_env starts from os.environ, and python/packages/jumpstarter/jumpstarter/utils/env.py:9-43 reads JMP_GRPC_INSECURE / JMP_GRPC_PASSPHRASE directly. A parent shell with JMP_GRPC_INSECURE=1 or a stale passphrase will silently override insecure=False / passphrase=None here. if passphrase: also leaves a parent value in place for "". Explicitly remove both keys when the current launch doesn't want them.

Suggested fix
-    common_env = os.environ | {
-        JUMPSTARTER_HOST: host,
-        JMP_DRIVERS_ALLOW: "UNSAFE" if unsafe else ",".join(allow),
-        "_JMP_SUPPRESS_DRIVER_WARNINGS": "1",  # Already warned during client initialization
-    }
-    if insecure:
-        common_env = common_env | {JMP_GRPC_INSECURE: "1"}
-    if passphrase:
-        common_env = common_env | {JMP_GRPC_PASSPHRASE: passphrase}
+    common_env = dict(os.environ)
+    common_env.update(
+        {
+            JUMPSTARTER_HOST: host,
+            JMP_DRIVERS_ALLOW: "UNSAFE" if unsafe else ",".join(allow),
+            "_JMP_SUPPRESS_DRIVER_WARNINGS": "1",  # Already warned during client initialization
+        }
+    )
+    if insecure:
+        common_env[JMP_GRPC_INSECURE] = "1"
+    else:
+        common_env.pop(JMP_GRPC_INSECURE, None)
+    if passphrase is not None:
+        common_env[JMP_GRPC_PASSPHRASE] = passphrase
+    else:
+        common_env.pop(JMP_GRPC_PASSPHRASE, None)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/common/utils.py` around lines 117 -
125, common_env is built from os.environ so existing
JMP_GRPC_INSECURE/JMP_GRPC_PASSPHRASE can leak into launches; update the block
that sets common_env (and the branches using insecure and passphrase) to
explicitly remove these keys when the current launch does not want them: if
insecure is truthy set JMP_GRPC_INSECURE="1" else pop JMP_GRPC_INSECURE from
common_env; if passphrase is not None and not empty set
JMP_GRPC_PASSPHRASE=passphrase, otherwise pop JMP_GRPC_PASSPHRASE from
common_env (ensure you treat empty string and None as “unset”); refer to the
common_env variable and the JMP_GRPC_INSECURE / JMP_GRPC_PASSPHRASE symbols to
locate the change.
♻️ Duplicate comments (4)
python/packages/jumpstarter-cli/jumpstarter_cli/run.py (2)

256-260: ⚠️ Potential issue | 🟡 Minor

Reject TLS-related flags unless --tls-grpc-listener is set.

Passing --tls-grpc-insecure, --tls-cert, --tls-key, or --passphrase without --tls-grpc-listener is silently ignored, which could lead to misconfiguration.

Suggested fix
 def run(config, listener_bind, tls_insecure, tls_cert, tls_key, passphrase):
     """Run an exporter locally."""
+    if listener_bind is None and (tls_insecure or tls_cert or tls_key or passphrase):
+        raise click.UsageError(
+            "--tls-grpc-insecure, --tls-cert, --tls-key, and --passphrase require --tls-grpc-listener"
+        )
     if listener_bind is not None and config is None:
         raise click.UsageError("--exporter-config (or --exporter) is required when using --tls-grpc-listener")
     return _serve_with_exc_handling(config, listener_bind, tls_insecure, tls_cert, tls_key, passphrase)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py` around lines 256 -
260, In run(), validate that TLS-related flags are only used when
--tls-grpc-listener is set: if listener_bind is None and any of tls_insecure,
tls_cert, tls_key, or passphrase is provided, raise click.UsageError explaining
that --tls-grpc-listener is required when using --tls-grpc-insecure, --tls-cert,
--tls-key, or --passphrase; update the existing early-check (the block
referencing listener_bind and config) to perform this additional validation
before calling _serve_with_exc_handling.

16-27: ⚠️ Potential issue | 🟡 Minor

IPv6 address parsing may fail with rsplit(":", 1).

IPv6 addresses contain colons (e.g., [::1]:1234). Using rsplit(":", 1) would incorrectly parse unbracketed IPv6 addresses. Consider handling bracketed IPv6 explicitly or documenting that IPv6 must use bracketed notation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py` around lines 16 - 27,
The _parse_listener_bind function incorrectly splits on the last ":" and will
break for IPv6; update it to handle bracketed IPv6 first: if value starts with
"[" then find the matching "]", extract host between brackets and require a ":"
after the "]" for the port (validate brackets and port range), otherwise if no
brackets fall back to the existing host:port parsing but explicitly reject
unbracketed IPv6 by detecting multiple ":" characters and raise
click.BadParameter instructing callers to use bracketed form (e.g. [::1]:1234);
continue to default host to "0.0.0.0" when host is empty and keep the same port
range validation and param_hint "'--tls-grpc-listener'".
python/packages/jumpstarter-cli/jumpstarter_cli/shell.py (1)

427-430: ⚠️ Potential issue | 🟡 Minor

Differentiate "no config provided" from "requested config missing."

With allow_missing=True, a tuple is returned when an explicitly requested config cannot be found. The current logic treats all tuples the same as None, hiding the actual cause.

Suggested fix
     if config is None or isinstance(config, tuple):
+        if isinstance(config, tuple):
+            kind, value = config
+            if (kind, value) != ("client", "default"):
+                raise click.ClickException(f"Failed to load {kind} config: {value}")
         raise click.UsageError(
             "Specify one of: --client / --client-config, --exporter / --exporter-config, or --tls-grpc HOST:PORT"
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py` around lines 427 -
430, The current check conflates a completely missing config (config is None)
with an explicitly requested-but-missing config (config is a tuple returned when
allow_missing=True); change the logic around the variable config so that if
config is None you raise the existing generic click.UsageError, but if
isinstance(config, tuple) you raise a distinct UsageError that indicates the
requested config could not be found and include the tuple's identifying info
(e.g., extract the config name/identifier from the tuple) so callers know which
config was requested and missing; update the error text in the block that
currently raises click.UsageError to differentiate these two cases and reference
the config tuple contents when forming the missing-config message.
python/packages/jumpstarter/jumpstarter/client/client.py (1)

23-34: ⚠️ Potential issue | 🟡 Minor

Tighten _is_tcp_address() so malformed direct targets fail early.

The function accepts :1234 (empty host) as valid since it only checks that the port portion is numeric and in range. For the --tls-grpc flow, this pushes obviously bad input to channel creation rather than rejecting it upfront.

Suggested fix
 def _is_tcp_address(path: str) -> bool:
     """Return True if path looks like host:port (TCP address)."""
     if ":" not in path:
         return False
     parts = path.rsplit(":", 1)
     if len(parts) != 2:
         return False
+    host, port_str = parts
+    if not host:
+        return False  # Reject empty host like ":1234"
     try:
-        port = int(parts[1], 10)
+        port = int(port_str, 10)
         return 1 <= port <= 65535
     except ValueError:
         return False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/client/client.py` around lines 23 -
34, The _is_tcp_address function currently accepts strings like ":1234" because
it only validates the port; update it to also validate the host portion: after
parts = path.rsplit(":", 1) ensure parts[0] is non-empty and not just whitespace
(e.g., host = parts[0].strip(); if not host: return False) before parsing the
port, so malformed direct targets with empty hosts are rejected early.
🧹 Nitpick comments (4)
python/docs/source/getting-started/guides/setup-direct-mode.md (2)

45-47: Clarify hook execution timing for direct mode.

The description states "beforeLease runs once when the exporter starts (before any client connects)". This is accurate for standalone/direct mode where there's no controller-managed lease lifecycle. However, it may help to explicitly contrast this with distributed mode where hooks run per-lease, to avoid confusion for users familiar with both modes.

Consider adding a brief note like: "In direct mode, hooks run once per exporter session since there is no controller to manage leases."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/docs/source/getting-started/guides/setup-direct-mode.md` around lines
45 - 47, Update the hooks documentation to explicitly state that in direct mode
(no controller), hooks like beforeLease and afterLease run once per exporter
session rather than per-lease; contrast this with distributed mode where hooks
execute per lease managed by the controller so users understand the different
lifecycles of beforeLease and afterLease between direct and distributed modes.

59-60: Consider adding a production TLS example.

The docs mention using --tls-cert and --tls-key for production but don't show the syntax. A brief example would help users configure secure deployments:

$ jmp run --exporter-config example-direct.yaml \
    --tls-grpc-listener 0.0.0.0:19090 \
    --tls-cert /path/to/server.crt \
    --tls-key /path/to/server.key
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/docs/source/getting-started/guides/setup-direct-mode.md` around lines
59 - 60, Add a short production TLS example showing the full CLI syntax using
the TLS flags mentioned; include the flags --tls-grpc-listener, --tls-cert, and
--tls-key together with the existing --exporter-config flag so users can copy
the exact command (e.g. show setting listener address and paths to server.crt
and server.key) and place it after the paragraph that currently references using
--tls-cert and --tls-key for production.
e2e/tests-direct-listener.bats (2)

140-146: Consider adding a brief poll for log completeness after stop_exporter.

While stop_exporter now waits for the process to exit (via wait "$pid"), there's a small race window where the process has exited but the log file hasn't been fully flushed to disk. On slower runners or under I/O pressure, the cat immediately after may miss the final log lines.

A brief bounded poll (similar to the readiness checks) would make this more robust:

Suggested fix
   # Stop the exporter (SIGTERM triggers _cleanup_after_lease).
   # stop_exporter waits for the process to exit, so the log is complete.
   stop_exporter
+  
+  # Brief poll to ensure log is fully flushed
+  local retries=10
+  while ! grep -q "AFTER_HOOK_DIRECT: executed" "${BATS_TEST_TMPDIR}/exporter.log" 2>/dev/null; do
+    retries=$((retries - 1))
+    [ "$retries" -gt 0 ] || { echo "afterLease hook output not found" >&2; break; }
+    sleep 0.1
+  done

   # afterLease hook output should appear in the exporter's stderr log
   run cat "${BATS_TEST_TMPDIR}/exporter.log"

Based on learnings: E2E tests in bats should be runnable in isolation. Do not rely on prior suite setup or shared state across tests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@e2e/tests-direct-listener.bats` around lines 140 - 146, Add a short bounded
poll after calling stop_exporter and before cat-ing the log to ensure
exporter.log has the final lines; specifically, after stop_exporter call in the
test that asserts "AFTER_HOOK_DIRECT: executed", loop with a small sleep and
timeout checking for the presence of that string in
"${BATS_TEST_TMPDIR}/exporter.log" (similar to existing readiness checks) and
only then run cat and assert_output, so stop_exporter and the assertion reliably
see the flushed log contents.

6-7: Fixed port may cause issues in certain CI environments.

If tests run in parallel or the port is in use, tests will fail. Consider using a dynamic port or adding a port-availability check. For now, this is acceptable if tests run sequentially.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@e2e/tests-direct-listener.bats` around lines 6 - 7, The test currently
hardcodes LISTENER_PORT=19090 (and uses LISTENER_PID) which can fail in CI or
parallel runs; update the test to pick a dynamic free port instead of a fixed
one: implement a small port-selection step that finds an available TCP port
(e.g., try random high ports or use a helper like `nc -z`/`ss`/`lsof` probe
loop) and export that value to LISTENER_PORT before starting the listener, and
keep LISTENER_PID logic unchanged so teardown still kills the right process if
startup fails then retry or fail with a clear message; ensure the logic that
references LISTENER_PORT (in tests-direct-listener.bats) reads this computed
value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py`:
- Line 212: The function signature for _serve_with_exc_handling is too long and
exceeds the 120-character limit; split the parameters across multiple lines to
wrap the signature (for example place each parameter on its own indented line or
group related parameters on separate lines) so the def
_serve_with_exc_handling(...): line is under 120 characters while preserving the
same parameter names (config, listener_bind, tls_insecure, tls_cert, tls_key,
passphrase) and any default values.

In `@python/packages/jumpstarter/jumpstarter/client/client.py`:
- Around line 78-82: The Unix socket branch creates a grpc.aio.secure_channel
without passing the interceptors, so client-side interceptors (e.g., passphrase
metadata injection) are skipped; update the async with
grpc.aio.secure_channel(...) call in client.client.Client (the block that
currently yields await client_from_channel(channel, portal, stack, allow,
unsafe)) to include interceptors=interceptors so the same interceptors used for
TCP connections are applied to UDS, ensuring passphrase auth is injected
consistently across connection types.

In `@python/packages/jumpstarter/jumpstarter/exporter/auth.py`:
- Around line 23-38: intercept_service currently returns a fixed unary-unary
_unauthenticated_handler which breaks streaming RPCs; instead call
continuation(handler_call_details) once, inspect its returned RpcMethodHandler
(the continuation result), and construct a new RpcMethodHandler that wraps each
of its callable attributes (unary_unary, unary_stream, stream_unary,
stream_stream) to call _abort_unauthenticated (or return an appropriate
coroutine) so the cardinality matches; replace returning
_unauthenticated_handler with this synthesized handler built from the
continuation result so streaming vs unary semantics from continuation are
preserved while still aborting via _abort_unauthenticated.

In `@python/packages/jumpstarter/jumpstarter/exporter/session_test.py`:
- Around line 3-10: Ruff complains about import ordering: group and alphabetize
the jumpstarter package imports so they pass lint; specifically reorder the
local imports so from jumpstarter.common.utils import serve, from
jumpstarter.driver import Driver, then from jumpstarter.exporter.auth import
PASSPHRASE_METADATA_KEY, PassphraseInterceptor, and finally from
jumpstarter.exporter.session import Session (i.e., swap the export.auth and
export.session imports) to satisfy I001.

---

Outside diff comments:
In `@python/packages/jumpstarter/jumpstarter/common/utils.py`:
- Around line 117-125: common_env is built from os.environ so existing
JMP_GRPC_INSECURE/JMP_GRPC_PASSPHRASE can leak into launches; update the block
that sets common_env (and the branches using insecure and passphrase) to
explicitly remove these keys when the current launch does not want them: if
insecure is truthy set JMP_GRPC_INSECURE="1" else pop JMP_GRPC_INSECURE from
common_env; if passphrase is not None and not empty set
JMP_GRPC_PASSPHRASE=passphrase, otherwise pop JMP_GRPC_PASSPHRASE from
common_env (ensure you treat empty string and None as “unset”); refer to the
common_env variable and the JMP_GRPC_INSECURE / JMP_GRPC_PASSPHRASE symbols to
locate the change.

---

Duplicate comments:
In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py`:
- Around line 256-260: In run(), validate that TLS-related flags are only used
when --tls-grpc-listener is set: if listener_bind is None and any of
tls_insecure, tls_cert, tls_key, or passphrase is provided, raise
click.UsageError explaining that --tls-grpc-listener is required when using
--tls-grpc-insecure, --tls-cert, --tls-key, or --passphrase; update the existing
early-check (the block referencing listener_bind and config) to perform this
additional validation before calling _serve_with_exc_handling.
- Around line 16-27: The _parse_listener_bind function incorrectly splits on the
last ":" and will break for IPv6; update it to handle bracketed IPv6 first: if
value starts with "[" then find the matching "]", extract host between brackets
and require a ":" after the "]" for the port (validate brackets and port range),
otherwise if no brackets fall back to the existing host:port parsing but
explicitly reject unbracketed IPv6 by detecting multiple ":" characters and
raise click.BadParameter instructing callers to use bracketed form (e.g.
[::1]:1234); continue to default host to "0.0.0.0" when host is empty and keep
the same port range validation and param_hint "'--tls-grpc-listener'".

In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py`:
- Around line 427-430: The current check conflates a completely missing config
(config is None) with an explicitly requested-but-missing config (config is a
tuple returned when allow_missing=True); change the logic around the variable
config so that if config is None you raise the existing generic
click.UsageError, but if isinstance(config, tuple) you raise a distinct
UsageError that indicates the requested config could not be found and include
the tuple's identifying info (e.g., extract the config name/identifier from the
tuple) so callers know which config was requested and missing; update the error
text in the block that currently raises click.UsageError to differentiate these
two cases and reference the config tuple contents when forming the
missing-config message.

In `@python/packages/jumpstarter/jumpstarter/client/client.py`:
- Around line 23-34: The _is_tcp_address function currently accepts strings like
":1234" because it only validates the port; update it to also validate the host
portion: after parts = path.rsplit(":", 1) ensure parts[0] is non-empty and not
just whitespace (e.g., host = parts[0].strip(); if not host: return False)
before parsing the port, so malformed direct targets with empty hosts are
rejected early.

---

Nitpick comments:
In `@e2e/tests-direct-listener.bats`:
- Around line 140-146: Add a short bounded poll after calling stop_exporter and
before cat-ing the log to ensure exporter.log has the final lines; specifically,
after stop_exporter call in the test that asserts "AFTER_HOOK_DIRECT: executed",
loop with a small sleep and timeout checking for the presence of that string in
"${BATS_TEST_TMPDIR}/exporter.log" (similar to existing readiness checks) and
only then run cat and assert_output, so stop_exporter and the assertion reliably
see the flushed log contents.
- Around line 6-7: The test currently hardcodes LISTENER_PORT=19090 (and uses
LISTENER_PID) which can fail in CI or parallel runs; update the test to pick a
dynamic free port instead of a fixed one: implement a small port-selection step
that finds an available TCP port (e.g., try random high ports or use a helper
like `nc -z`/`ss`/`lsof` probe loop) and export that value to LISTENER_PORT
before starting the listener, and keep LISTENER_PID logic unchanged so teardown
still kills the right process if startup fails then retry or fail with a clear
message; ensure the logic that references LISTENER_PORT (in
tests-direct-listener.bats) reads this computed value.

In `@python/docs/source/getting-started/guides/setup-direct-mode.md`:
- Around line 45-47: Update the hooks documentation to explicitly state that in
direct mode (no controller), hooks like beforeLease and afterLease run once per
exporter session rather than per-lease; contrast this with distributed mode
where hooks execute per lease managed by the controller so users understand the
different lifecycles of beforeLease and afterLease between direct and
distributed modes.
- Around line 59-60: Add a short production TLS example showing the full CLI
syntax using the TLS flags mentioned; include the flags --tls-grpc-listener,
--tls-cert, and --tls-key together with the existing --exporter-config flag so
users can copy the exact command (e.g. show setting listener address and paths
to server.crt and server.key) and place it after the paragraph that currently
references using --tls-cert and --tls-key for production.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4cca6dda-ebac-4665-a677-df1a1a7d392b

📥 Commits

Reviewing files that changed from the base of the PR and between 3ee4798 and c7b2655.

📒 Files selected for processing (21)
  • e2e/exporters/exporter-direct-hooks-before.yaml
  • e2e/exporters/exporter-direct-hooks-both.yaml
  • e2e/exporters/exporter-direct-listener.yaml
  • e2e/tests-direct-listener.bats
  • python/docs/source/getting-started/guides/index.md
  • python/docs/source/getting-started/guides/setup-direct-mode.md
  • python/packages/jumpstarter-cli/jumpstarter_cli/run.py
  • python/packages/jumpstarter-cli/jumpstarter_cli/shell.py
  • python/packages/jumpstarter/jumpstarter/client/__init__.py
  • python/packages/jumpstarter/jumpstarter/client/client.py
  • python/packages/jumpstarter/jumpstarter/client/client_test.py
  • python/packages/jumpstarter/jumpstarter/client/lease.py
  • python/packages/jumpstarter/jumpstarter/common/grpc.py
  • python/packages/jumpstarter/jumpstarter/common/utils.py
  • python/packages/jumpstarter/jumpstarter/config/env.py
  • python/packages/jumpstarter/jumpstarter/config/exporter.py
  • python/packages/jumpstarter/jumpstarter/exporter/auth.py
  • python/packages/jumpstarter/jumpstarter/exporter/exporter.py
  • python/packages/jumpstarter/jumpstarter/exporter/session.py
  • python/packages/jumpstarter/jumpstarter/exporter/session_test.py
  • python/packages/jumpstarter/jumpstarter/utils/env.py
🚧 Files skipped from review as they are similar to previous changes (6)
  • e2e/exporters/exporter-direct-hooks-before.yaml
  • python/docs/source/getting-started/guides/index.md
  • python/packages/jumpstarter/jumpstarter/utils/env.py
  • e2e/exporters/exporter-direct-hooks-both.yaml
  • python/packages/jumpstarter/jumpstarter/client/client_test.py
  • python/packages/jumpstarter/jumpstarter/client/lease.py

Comment on lines +78 to +82
else:
async with grpc.aio.secure_channel(
f"unix://{path}", grpc.local_channel_credentials(grpc.LocalConnectionType.UDS)
) as channel:
yield await client_from_channel(channel, portal, stack, allow, unsafe)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if Unix socket connections are expected to use passphrase/interceptors
rg -n -C3 'unix.*passphrase|passphrase.*unix' python/

Repository: jumpstarter-dev/jumpstarter

Length of output: 53


🏁 Script executed:

# Read the context around lines 78-82 to understand the full function
sed -n '1,120p' python/packages/jumpstarter/jumpstarter/client/client.py | cat -n

Repository: jumpstarter-dev/jumpstarter

Length of output: 4891


🏁 Script executed:

# Also search for where interceptors are defined/used in the TCP connection path
rg -n 'interceptor' python/packages/jumpstarter/jumpstarter/client/ -A 2 -B 2

Repository: jumpstarter-dev/jumpstarter

Length of output: 2347


🏁 Script executed:

# Search for passphrase usage in the codebase
rg -n 'passphrase' python/packages/jumpstarter/ --max-count=20

Repository: jumpstarter-dev/jumpstarter

Length of output: 4547


Unix socket connections should pass interceptors but currently don't—this appears to be a bug.

The passphrase parameter is documented as being "injected as metadata on every RPC via client interceptors," yet the Unix socket connection (lines 79–82) omits the interceptors parameter, unlike TCP connections. This contradicts the comment in session.py (lines 188–189) which explicitly states "interceptors (e.g. passphrase auth) apply to ALL ports including the Unix socket." Add interceptors=interceptors to the grpc.aio.secure_channel() call to ensure consistent auth behavior across connection types, or clarify why Unix sockets intentionally bypass passphrase auth.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/client/client.py` around lines 78 -
82, The Unix socket branch creates a grpc.aio.secure_channel without passing the
interceptors, so client-side interceptors (e.g., passphrase metadata injection)
are skipped; update the async with grpc.aio.secure_channel(...) call in
client.client.Client (the block that currently yields await
client_from_channel(channel, portal, stack, allow, unsafe)) to include
interceptors=interceptors so the same interceptors used for TCP connections are
applied to UDS, ensuring passphrase auth is injected consistently across
connection types.

Comment on lines +841 to +883
self._standalone = True
lease_scope = LeaseContext(lease_name="standalone", before_lease_hook=Event())
self._lease_context = lease_scope

with TemporarySocket() as hook_path:
hook_path_str = str(hook_path)
with Session(
uuid=self.uuid,
labels=self.labels,
root_device=self.device_factory(),
) as session:
session.lease_context = lease_scope
lease_scope.session = session
lease_scope.socket_path = hook_path_str
lease_scope.hook_socket_path = hook_path_str

async with session.serve_tcp_and_unix_async(
host, port, hook_path_str,
tls_credentials=tls_credentials,
interceptors=interceptors,
):
try:
async with create_task_group() as tg:
self._tg = tg
tg.start_soon(self._handle_end_session, lease_scope)

if self.hook_executor:
await self.hook_executor.run_before_lease_hook(
lease_scope,
self._report_status,
self.stop,
self._request_lease_release,
)
else:
await self._report_status(ExporterStatus.LEASE_READY, "Ready for commands")
lease_scope.before_lease_hook.set()

await _standalone_shutdown_waiter()
finally:
await self._cleanup_after_lease(lease_scope)

self._lease_context = None
self._tg = None
Copy link
Contributor

@coderabbitai coderabbitai bot Mar 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Reset standalone mode state on every exit path.

self._standalone is set before any of the bind/context-manager work, but it's never cleared. If serve_standalone_tcp() exits early or the same Exporter instance is reused, later _report_status() / _request_lease_release() calls will keep behaving as controller-less. Move _standalone = False and the _lease_context / _tg cleanup into an outer finally.

Suggested fix
     async def serve_standalone_tcp(
         self,
         host: str,
         port: int,
@@
     ) -> None:
@@
         """
         self._standalone = True
         lease_scope = LeaseContext(lease_name="standalone", before_lease_hook=Event())
         self._lease_context = lease_scope
-
-        with TemporarySocket() as hook_path:
-            hook_path_str = str(hook_path)
-            with Session(
-                uuid=self.uuid,
-                labels=self.labels,
-                root_device=self.device_factory(),
-            ) as session:
-                session.lease_context = lease_scope
-                lease_scope.session = session
-                lease_scope.socket_path = hook_path_str
-                lease_scope.hook_socket_path = hook_path_str
-
-                async with session.serve_tcp_and_unix_async(
-                    host, port, hook_path_str,
-                    tls_credentials=tls_credentials,
-                    interceptors=interceptors,
-                ):
-                    try:
-                        async with create_task_group() as tg:
-                            self._tg = tg
-                            tg.start_soon(self._handle_end_session, lease_scope)
-
-                            if self.hook_executor:
-                                await self.hook_executor.run_before_lease_hook(
-                                    lease_scope,
-                                    self._report_status,
-                                    self.stop,
-                                    self._request_lease_release,
-                                )
-                            else:
-                                await self._report_status(ExporterStatus.LEASE_READY, "Ready for commands")
-                                lease_scope.before_lease_hook.set()
-
-                            await _standalone_shutdown_waiter()
-                    finally:
-                        await self._cleanup_after_lease(lease_scope)
-
-        self._lease_context = None
-        self._tg = None
+        try:
+            with TemporarySocket() as hook_path:
+                hook_path_str = str(hook_path)
+                with Session(
+                    uuid=self.uuid,
+                    labels=self.labels,
+                    root_device=self.device_factory(),
+                ) as session:
+                    session.lease_context = lease_scope
+                    lease_scope.session = session
+                    lease_scope.socket_path = hook_path_str
+                    lease_scope.hook_socket_path = hook_path_str
+
+                    async with session.serve_tcp_and_unix_async(
+                        host, port, hook_path_str,
+                        tls_credentials=tls_credentials,
+                        interceptors=interceptors,
+                    ):
+                        try:
+                            async with create_task_group() as tg:
+                                self._tg = tg
+                                tg.start_soon(self._handle_end_session, lease_scope)
+
+                                if self.hook_executor:
+                                    await self.hook_executor.run_before_lease_hook(
+                                        lease_scope,
+                                        self._report_status,
+                                        self.stop,
+                                        self._request_lease_release,
+                                    )
+                                else:
+                                    await self._report_status(ExporterStatus.LEASE_READY, "Ready for commands")
+                                    lease_scope.before_lease_hook.set()
+
+                                await _standalone_shutdown_waiter()
+                        finally:
+                            await self._cleanup_after_lease(lease_scope)
+        finally:
+            self._standalone = False
+            self._lease_context = None
+            self._tg = None

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really need to clear it? why? , the exporter will exit, and that's it. A exporter instance does not run in both standalone and distributed mode at once

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mangelajo, you're right — my concern doesn't apply here. Based on the fork-based architecture (where the child process terminates when the serve method exits and the parent restarts a fresh process), the same Exporter instance will never transition from standalone mode back to controller mode. Clearing _standalone is unnecessary. I'll retract this suggestion.

(╯°□°)╯


🧠 Learnings used
Learnt from: bennyz
Repo: jumpstarter-dev/jumpstarter PR: 199
File: python/packages/jumpstarter/jumpstarter/client/lease.py:68-76
Timestamp: 2026-02-06T11:33:27.799Z
Learning: In `python/packages/jumpstarter/jumpstarter/client/lease.py`, the `refresh_channel()` method should NOT call `close()` on the old channel before replacing it. Token refresh happens asynchronously while RPCs may be in-flight, and calling `close()` would immediately terminate active RPCs. Instead, rely on Python's reference counting to clean up the old channel after in-flight operations complete.

Learnt from: michalskrivanek
Repo: jumpstarter-dev/jumpstarter PR: 589
File: packages/jumpstarter/jumpstarter/exporter/exporter.py:147-155
Timestamp: 2025-08-14T13:11:35.034Z
Learning: In the jumpstarter fork-based architecture, when the exporter's serve() method exits (e.g., due to lease changes), the child process terminates and the parent process automatically restarts it, eliminating concerns about orphaned tasks since the entire process is restarted.

@bennyz bennyz force-pushed the exporter-grpc-listener branch from cccce0a to f351e0b Compare March 16, 2026 13:15
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
python/packages/jumpstarter/jumpstarter/client/client.py (1)

79-82: ⚠️ Potential issue | 🟠 Major

Apply passphrase interceptors on Unix socket channels too.

At Line 79, the Unix grpc.aio.secure_channel(...) path omits interceptors, so passphrase metadata injection is skipped there.

Proposed fix
     else:
         async with grpc.aio.secure_channel(
-            f"unix://{path}", grpc.local_channel_credentials(grpc.LocalConnectionType.UDS)
+            f"unix://{path}",
+            grpc.local_channel_credentials(grpc.LocalConnectionType.UDS),
+            interceptors=interceptors,
         ) as channel:
             yield await client_from_channel(channel, portal, stack, allow, unsafe)
#!/bin/bash
# Verify interceptor wiring symmetry across TCP/Unix branches and passphrase env bridge.
rg -n -C4 'passphrase_client_interceptors|insecure_channel|aio_secure_channel|secure_channel\(' \
  python/packages/jumpstarter/jumpstarter/client/client.py

rg -n -C3 'JMP_GRPC_PASSPHRASE|client_from_path\(' \
  python/packages/jumpstarter/jumpstarter/utils/env.py
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/client/client.py` around lines 79 -
82, The Unix socket branch uses grpc.aio.secure_channel(...) without passing the
interceptors, so passphrase metadata injection is skipped; update the Unix
branch where grpc.aio.secure_channel(f"unix://{path}",
grpc.local_channel_credentials(...)) is created to include the same interceptors
used for the TCP branch (e.g. passphrase_client_interceptors or the variable
passed into client_from_channel) so that client_from_channel(...) receives a
channel with passphrase interceptors applied; ensure the interceptors argument
mirrors the TCP/insecure_channel usage and that client_from_channel,
passphrase_client_interceptors, and any env-bridging (JMP_GRPC_PASSPHRASE)
behavior remain unchanged.
python/packages/jumpstarter/jumpstarter/exporter/session_test.py (1)

3-10: ⚠️ Potential issue | 🟡 Minor

Fix Ruff I001 import ordering in this module.

The import block is still unsorted, and lint-python is failing on this exact segment.

Proposed fix
 from jumpstarter.common.utils import serve
 from jumpstarter.driver import Driver
-from jumpstarter.exporter.session import Session
 from jumpstarter.exporter.auth import PASSPHRASE_METADATA_KEY, PassphraseInterceptor
+from jumpstarter.exporter.session import Session
As per coding guidelines, "Use Ruff for code formatting and linting on all Python files except `jumpstarter-protocol`, invoked via `make lint` and `make lint-fix`".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/exporter/session_test.py` around
lines 3 - 10, The import block in session_test.py is unsorted causing Ruff I001;
reorder the imports into grouped, alphabetized order (standard library,
third-party, then local) and ensure a blank line separates third-party and local
imports—specifically sort the third-party imports so google.protobuf.empty_pb2
and grpc and pytest are alphabetized and then keep the local jumpstarter imports
(from jumpstarter.common.utils import serve, from jumpstarter.driver import
Driver, from jumpstarter.exporter.session import Session, from
jumpstarter.exporter.auth import PASSPHRASE_METADATA_KEY, PassphraseInterceptor)
below them; run Ruff or make lint-fix (e.g., ruff --fix or make lint-fix) to
apply the change automatically.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@python/packages/jumpstarter/jumpstarter/exporter/session.py`:
- Around line 153-158: The code currently calls server.add_secure_port() /
server.add_insecure_port() and proceeds to server.start() without checking the
return value; since these methods return 0 on failure, add explicit validation
after each call to server.add_secure_port(address, tls_credentials) and
server.add_insecure_port(address) (the bound_port variable) and if bound_port ==
0 raise or log an error and abort startup (or raise an exception) so the server
does not start silently unbound; apply the same check in the second bind/start
site referenced around the server.start() calls so both bind locations validate
bound_port before calling server.start().

---

Duplicate comments:
In `@python/packages/jumpstarter/jumpstarter/client/client.py`:
- Around line 79-82: The Unix socket branch uses grpc.aio.secure_channel(...)
without passing the interceptors, so passphrase metadata injection is skipped;
update the Unix branch where grpc.aio.secure_channel(f"unix://{path}",
grpc.local_channel_credentials(...)) is created to include the same interceptors
used for the TCP branch (e.g. passphrase_client_interceptors or the variable
passed into client_from_channel) so that client_from_channel(...) receives a
channel with passphrase interceptors applied; ensure the interceptors argument
mirrors the TCP/insecure_channel usage and that client_from_channel,
passphrase_client_interceptors, and any env-bridging (JMP_GRPC_PASSPHRASE)
behavior remain unchanged.

In `@python/packages/jumpstarter/jumpstarter/exporter/session_test.py`:
- Around line 3-10: The import block in session_test.py is unsorted causing Ruff
I001; reorder the imports into grouped, alphabetized order (standard library,
third-party, then local) and ensure a blank line separates third-party and local
imports—specifically sort the third-party imports so google.protobuf.empty_pb2
and grpc and pytest are alphabetized and then keep the local jumpstarter imports
(from jumpstarter.common.utils import serve, from jumpstarter.driver import
Driver, from jumpstarter.exporter.session import Session, from
jumpstarter.exporter.auth import PASSPHRASE_METADATA_KEY, PassphraseInterceptor)
below them; run Ruff or make lint-fix (e.g., ruff --fix or make lint-fix) to
apply the change automatically.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8b2bc9ef-84a6-4b99-aa43-15192d940e16

📥 Commits

Reviewing files that changed from the base of the PR and between c7b2655 and cccce0a.

📒 Files selected for processing (15)
  • e2e/tests-direct-listener.bats
  • python/docs/source/getting-started/guides/index.md
  • python/docs/source/getting-started/guides/setup-direct-mode.md
  • python/packages/jumpstarter-cli/jumpstarter_cli/run.py
  • python/packages/jumpstarter-cli/jumpstarter_cli/shell.py
  • python/packages/jumpstarter/jumpstarter/client/client.py
  • python/packages/jumpstarter/jumpstarter/client/lease.py
  • python/packages/jumpstarter/jumpstarter/common/grpc.py
  • python/packages/jumpstarter/jumpstarter/common/utils.py
  • python/packages/jumpstarter/jumpstarter/config/env.py
  • python/packages/jumpstarter/jumpstarter/exporter/auth.py
  • python/packages/jumpstarter/jumpstarter/exporter/exporter.py
  • python/packages/jumpstarter/jumpstarter/exporter/session.py
  • python/packages/jumpstarter/jumpstarter/exporter/session_test.py
  • python/packages/jumpstarter/jumpstarter/utils/env.py
🚧 Files skipped from review as they are similar to previous changes (7)
  • e2e/tests-direct-listener.bats
  • python/packages/jumpstarter/jumpstarter/exporter/auth.py
  • python/docs/source/getting-started/guides/index.md
  • python/packages/jumpstarter/jumpstarter/common/grpc.py
  • python/packages/jumpstarter-cli/jumpstarter_cli/run.py
  • python/packages/jumpstarter/jumpstarter/utils/env.py
  • python/packages/jumpstarter/jumpstarter/config/env.py

@bennyz bennyz force-pushed the exporter-grpc-listener branch from f351e0b to 4a93476 Compare March 16, 2026 14:09
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (4)
python/packages/jumpstarter/jumpstarter/client/client.py (2)

78-82: ⚠️ Potential issue | 🟠 Major

UDS branch drops passphrase interceptors (and channel options).

passphrase is documented as metadata on every RPC, but the Unix socket grpc.aio.secure_channel(...) call does not pass interceptors, so passphrase auth is skipped on this path.

Proposed fix
     else:
         async with grpc.aio.secure_channel(
-            f"unix://{path}", grpc.local_channel_credentials(grpc.LocalConnectionType.UDS)
+            f"unix://{path}",
+            grpc.local_channel_credentials(grpc.LocalConnectionType.UDS),
+            options=_override_default_grpc_options(grpc_options),
+            interceptors=interceptors,
         ) as channel:
             yield await client_from_channel(channel, portal, stack, allow, unsafe)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/client/client.py` around lines 78 -
82, The UDS branch using grpc.aio.secure_channel(f"unix://{path}",
grpc.local_channel_credentials(...)) omits the interceptors and channel options
so per-RPC passphrase metadata and any channel options are skipped; update the
else branch to create the channel with the same interceptors and channel options
used in the non-UDS path (pass the interceptors and options into
grpc.aio.secure_channel) before calling client_from_channel(portal, stack,
allow, unsafe) so the passphrase metadata logic is preserved; ensure you
reference the existing passphrase interceptor and any channel options variables
and pass them into grpc.aio.secure_channel and maintain the await
client_from_channel(...) call.

23-34: ⚠️ Potential issue | 🟡 Minor

Harden _is_tcp_address validation to reject malformed targets.

This parser still accepts invalid forms like :1234 and unbracketed IPv6 (e.g. ::1) because it only validates the final :PORT. That defers obvious input errors to channel creation / first RPC instead of failing fast.

Proposed fix
 def _is_tcp_address(path: str) -> bool:
     """Return True if path looks like host:port (TCP address)."""
-    if ":" not in path:
+    if ":" not in path:
         return False
-    parts = path.rsplit(":", 1)
-    if len(parts) != 2:
-        return False
+
+    if path.startswith("["):
+        end = path.find("]")
+        if end == -1:
+            return False
+        host = path[1:end]
+        rest = path[end + 1 :]
+        if not host or not rest.startswith(":"):
+            return False
+        port_s = rest[1:]
+    else:
+        host, sep, port_s = path.rpartition(":")
+        # Require non-empty host and reject unbracketed IPv6
+        if not sep or not host or ":" in host:
+            return False
+
     try:
-        port = int(parts[1], 10)
+        port = int(port_s, 10)
         return 1 <= port <= 65535
     except ValueError:
         return False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/client/client.py` around lines 23 -
34, The _is_tcp_address function currently only checks the trailing :PORT and
accepts malformed inputs like ":1234" and unbracketed IPv6 (e.g. "::1"); update
it to validate host and port properly by: when path starts with '[' require a
matching ']' and parse host as the bracketed IPv6 literal with a single ":"
following it for the port; otherwise ensure there is exactly one ':' (no extra
colons) and the host portion is non-empty and contains no ':' (to reject
unbracketed IPv6 and bare ports); then parse and validate the port range as
before. Reference _is_tcp_address to locate and fix this logic.
python/packages/jumpstarter/jumpstarter/exporter/exporter.py (1)

841-883: ⚠️ Potential issue | 🟠 Major

Handle early-stop race and reset standalone state in an outer finally.

If stop() lands before self._tg is assigned, _stop_requested is set but this path never checks it, so _standalone_shutdown_waiter() can hang. Also, standalone flags are only reset on the happy path.

🛠️ Suggested fix
     async def serve_standalone_tcp(
@@
-        self._standalone = True
-        lease_scope = LeaseContext(lease_name="standalone", before_lease_hook=Event())
-        self._lease_context = lease_scope
-
-        with TemporarySocket() as hook_path:
-            ...
-                async with session.serve_tcp_and_unix_async(
-                    host, port, hook_path_str,
-                    tls_credentials=tls_credentials,
-                    interceptors=interceptors,
-                ):
-                    try:
-                        async with create_task_group() as tg:
-                            self._tg = tg
-                            tg.start_soon(self._handle_end_session, lease_scope)
-                            ...
-                            await _standalone_shutdown_waiter()
-                    finally:
-                        await self._cleanup_after_lease(lease_scope)
-
-        self._lease_context = None
-        self._tg = None
+        self._standalone = True
+        lease_scope = LeaseContext(lease_name="standalone", before_lease_hook=Event())
+        self._lease_context = lease_scope
+        try:
+            with TemporarySocket() as hook_path:
+                ...
+                    async with session.serve_tcp_and_unix_async(
+                        host, port, hook_path_str,
+                        tls_credentials=tls_credentials,
+                        interceptors=interceptors,
+                    ):
+                        try:
+                            async with create_task_group() as tg:
+                                self._tg = tg
+                                if self._stop_requested:
+                                    tg.cancel_scope.cancel()
+                                tg.start_soon(self._handle_end_session, lease_scope)
+                                ...
+                                await _standalone_shutdown_waiter()
+                        finally:
+                            await self._cleanup_after_lease(lease_scope)
+        finally:
+            self._standalone = False
+            self._lease_context = None
+            self._tg = None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/exporter/exporter.py` around lines
841 - 883, The standalone path can hang if stop() sets _stop_requested before
self._tg is assigned and standalone flags are only cleared on the happy path; to
fix, wrap the entire standalone block in an outer try/finally that always resets
self._standalone = False, self._lease_context = None and self._tg = None in the
finally, ensure self._tg is assigned (e.g., set to the task group handle) before
starting background tasks, and add an early check of self._stop_requested (or
call stop condition) before awaiting _standalone_shutdown_waiter() so the waiter
exits if a stop was requested; keep calling _cleanup_after_lease(lease_scope) in
the inner finally as shown.
python/packages/jumpstarter-cli/jumpstarter_cli/shell.py (1)

432-435: ⚠️ Potential issue | 🟡 Minor

Differentiate “no config provided” vs “requested config missing.”

This still maps tuple-valued config outcomes to a generic usage error, which hides explicit config lookup failures.

🛠️ Suggested fix
-    if config is None or isinstance(config, tuple):
+    if config is None:
         raise click.UsageError(
             "Specify one of: --client / --client-config, --exporter / --exporter-config, or --tls-grpc HOST:PORT"
         )
+    if isinstance(config, tuple):
+        kind, value = config
+        raise click.ClickException(f"Failed to load {kind}: {value}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py` around lines 432 -
435, Replace the single generic UsageError for both missing and failed config
lookups with two distinct error paths: if config is None, raise click.UsageError
with the existing "Specify one of: ..." message to indicate no config was
provided; if isinstance(config, tuple), treat it as a lookup failure and raise a
more specific error (e.g., click.UsageError or click.FileError) that includes
the tuple details (such as the requested key/name and the underlying error
message) so callers can see which requested config was missing or why the lookup
failed; update the conditional around the config variable in the same block to
implement these two separate branches and ensure the error text references the
tuple contents for debugging.
🧹 Nitpick comments (1)
python/packages/jumpstarter/jumpstarter/common/utils.py (1)

96-97: Update launch_shell docstring for new parameters.

insecure and passphrase were added to the signature but are still missing from the Args section.

✏️ Suggested docstring patch
 def launch_shell(
@@
     Args:
         host: The jumpstarter host path
         context: The context of the shell (e.g. "local" or exporter name)
         allow: List of allowed drivers
         unsafe: Whether to allow drivers outside of the allow list
         use_profiles: Whether to load shell profile files
         command: Optional command to run instead of launching an interactive shell
         lease: Optional Lease object to set up lease ending callback
+        insecure: Whether to force insecure gRPC mode for launched subprocesses
+        passphrase: Optional passphrase propagated to subprocesses for gRPC auth
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/common/utils.py` around lines 96 -
97, The docstring for launch_shell is missing entries for the newly added
parameters insecure and passphrase; update the Args section of the launch_shell
function's docstring to document insecure: bool (default False) describing its
effect on connection security, and passphrase: Optional[str] describing its
purpose and usage (e.g., for unlocking keys or encrypted connections), including
default behavior when None and any validation or side-effects.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py`:
- Around line 18-24: The port parsing currently calls int(...) on port_str or
value and lets ValueError bubble up; wrap both int conversions in a try/except
that catches ValueError (and optionally TypeError) and raises click.BadParameter
with a clear message including the original value (e.g., f"Invalid port:
{value}"). Update the parsing logic in run.py where the ":" branch (host,
port_str = value.rsplit(...)) and the else branch convert to int so they both
raise click.BadParameter instead of raw ValueError, referencing the same
parameter name "value" in the error message.

---

Duplicate comments:
In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py`:
- Around line 432-435: Replace the single generic UsageError for both missing
and failed config lookups with two distinct error paths: if config is None,
raise click.UsageError with the existing "Specify one of: ..." message to
indicate no config was provided; if isinstance(config, tuple), treat it as a
lookup failure and raise a more specific error (e.g., click.UsageError or
click.FileError) that includes the tuple details (such as the requested key/name
and the underlying error message) so callers can see which requested config was
missing or why the lookup failed; update the conditional around the config
variable in the same block to implement these two separate branches and ensure
the error text references the tuple contents for debugging.

In `@python/packages/jumpstarter/jumpstarter/client/client.py`:
- Around line 78-82: The UDS branch using
grpc.aio.secure_channel(f"unix://{path}", grpc.local_channel_credentials(...))
omits the interceptors and channel options so per-RPC passphrase metadata and
any channel options are skipped; update the else branch to create the channel
with the same interceptors and channel options used in the non-UDS path (pass
the interceptors and options into grpc.aio.secure_channel) before calling
client_from_channel(portal, stack, allow, unsafe) so the passphrase metadata
logic is preserved; ensure you reference the existing passphrase interceptor and
any channel options variables and pass them into grpc.aio.secure_channel and
maintain the await client_from_channel(...) call.
- Around line 23-34: The _is_tcp_address function currently only checks the
trailing :PORT and accepts malformed inputs like ":1234" and unbracketed IPv6
(e.g. "::1"); update it to validate host and port properly by: when path starts
with '[' require a matching ']' and parse host as the bracketed IPv6 literal
with a single ":" following it for the port; otherwise ensure there is exactly
one ':' (no extra colons) and the host portion is non-empty and contains no ':'
(to reject unbracketed IPv6 and bare ports); then parse and validate the port
range as before. Reference _is_tcp_address to locate and fix this logic.

In `@python/packages/jumpstarter/jumpstarter/exporter/exporter.py`:
- Around line 841-883: The standalone path can hang if stop() sets
_stop_requested before self._tg is assigned and standalone flags are only
cleared on the happy path; to fix, wrap the entire standalone block in an outer
try/finally that always resets self._standalone = False, self._lease_context =
None and self._tg = None in the finally, ensure self._tg is assigned (e.g., set
to the task group handle) before starting background tasks, and add an early
check of self._stop_requested (or call stop condition) before awaiting
_standalone_shutdown_waiter() so the waiter exits if a stop was requested; keep
calling _cleanup_after_lease(lease_scope) in the inner finally as shown.

---

Nitpick comments:
In `@python/packages/jumpstarter/jumpstarter/common/utils.py`:
- Around line 96-97: The docstring for launch_shell is missing entries for the
newly added parameters insecure and passphrase; update the Args section of the
launch_shell function's docstring to document insecure: bool (default False)
describing its effect on connection security, and passphrase: Optional[str]
describing its purpose and usage (e.g., for unlocking keys or encrypted
connections), including default behavior when None and any validation or
side-effects.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: be94b9f0-29dd-41be-8344-3fd03e32cf8b

📥 Commits

Reviewing files that changed from the base of the PR and between cccce0a and 4a93476.

📒 Files selected for processing (14)
  • e2e/tests-direct-listener.bats
  • python/packages/jumpstarter-cli/jumpstarter_cli/run.py
  • python/packages/jumpstarter-cli/jumpstarter_cli/shell.py
  • python/packages/jumpstarter-cli/jumpstarter_cli/shell_test.py
  • python/packages/jumpstarter/jumpstarter/client/client.py
  • python/packages/jumpstarter/jumpstarter/client/lease.py
  • python/packages/jumpstarter/jumpstarter/common/grpc.py
  • python/packages/jumpstarter/jumpstarter/common/utils.py
  • python/packages/jumpstarter/jumpstarter/config/env.py
  • python/packages/jumpstarter/jumpstarter/exporter/auth.py
  • python/packages/jumpstarter/jumpstarter/exporter/exporter.py
  • python/packages/jumpstarter/jumpstarter/exporter/session.py
  • python/packages/jumpstarter/jumpstarter/exporter/session_test.py
  • python/packages/jumpstarter/jumpstarter/utils/env.py
🚧 Files skipped from review as they are similar to previous changes (6)
  • python/packages/jumpstarter/jumpstarter/client/lease.py
  • python/packages/jumpstarter/jumpstarter/exporter/session_test.py
  • python/packages/jumpstarter/jumpstarter/config/env.py
  • python/packages/jumpstarter/jumpstarter/utils/env.py
  • python/packages/jumpstarter/jumpstarter/exporter/session.py
  • python/packages/jumpstarter/jumpstarter/common/grpc.py

@bennyz bennyz force-pushed the exporter-grpc-listener branch from 4a93476 to 47aae1e Compare March 16, 2026 14:49
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (3)
python/packages/jumpstarter/jumpstarter/exporter/session.py (1)

153-158: ⚠️ Potential issue | 🟠 Major

Validate bind return values before server.start() to prevent silent startup failures.

Line 154/157 and Line 197/200/203 bind ports but never reject 0 (bind failure). That can leave the server “started” with no active listener.

Proposed fix
         if tls_credentials is not None:
             bound_port = server.add_secure_port(address, tls_credentials)
             logger.debug("Session server listening on %s (TLS)", address)
         else:
             bound_port = server.add_insecure_port(address)
             logger.debug("Session server listening on %s (insecure)", address)
+        if bound_port == 0:
+            raise OSError(f"Failed to bind session server to {address}")
-        server.add_insecure_port(f"unix://{unix_path}")
+        unix_bound = server.add_insecure_port(f"unix://{unix_path}")
+        if unix_bound == 0:
+            raise OSError(f"Failed to bind session server to unix://{unix_path}")
         logger.debug("Session server listening on unix://%s (hooks)", unix_path)
         if tls_credentials is not None:
-            server.add_secure_port(address, tls_credentials)
+            tcp_bound = server.add_secure_port(address, tls_credentials)
             logger.debug("Session server listening on %s (TLS)", address)
         else:
-            server.add_insecure_port(address)
+            tcp_bound = server.add_insecure_port(address)
             logger.debug("Session server listening on %s (insecure)", address)
+        if tcp_bound == 0:
+            raise OSError(f"Failed to bind session server to {address}")
In Python grpc.aio Server API, what do add_insecure_port() and add_secure_port() return when bind/listen fails?

Also applies to: 197-204

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/exporter/session.py` around lines 153
- 158, The call to server.add_secure_port(address, tls_credentials) /
server.add_insecure_port(address) can return 0 on bind failure but the code
(around tls_credentials, server.add_secure_port, server.add_insecure_port and
before server.start()) does not check this, allowing the server to "start"
without an active listener; update the code to check the returned bound_port
value after each add_*_port call (including the same pattern around lines
referencing the other binds) and if bound_port == 0 log an error (with context:
address and whether TLS was used) and raise an exception or exit to prevent
server.start() from running with no listener.
python/packages/jumpstarter-cli/jumpstarter_cli/run.py (1)

18-24: ⚠️ Potential issue | 🟡 Minor

Normalize invalid port parsing to click.BadParameter.

Line 21 and Line 24 can raise raw ValueError on invalid input. Convert that into a CLI parameter error for consistent UX.

Proposed fix
 def _parse_listener_bind(value: str) -> tuple[str, int]:
     """Parse '[host:]port' into (host, port). Default host is 0.0.0.0."""
-    if ":" in value:
-        host, port_str = value.rsplit(":", 1)
-        host = host.strip() or "0.0.0.0"
-        port = int(port_str, 10)
-    else:
-        host = "0.0.0.0"
-        port = int(value, 10)
+    try:
+        if ":" in value:
+            host, port_str = value.rsplit(":", 1)
+            host = host.strip() or "0.0.0.0"
+            port = int(port_str, 10)
+        else:
+            host = "0.0.0.0"
+            port = int(value, 10)
+    except ValueError as e:
+        raise click.BadParameter("invalid [HOST:]PORT value", param_hint="'--tls-grpc-listener'") from e
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py` around lines 18 - 24,
The port parsing currently calls int(...) on port_str or value and can raise
ValueError; update the parsing in run.py around the host/port block to catch
ValueError and re-raise click.BadParameter with a clear message including the
invalid port string; specifically, wrap the conversions for port (the
int(port_str, 10) and int(value, 10) calls) in a try/except (catching
ValueError) and raise click.BadParameter(f"Invalid port: {port_str or value}")
(or similar) so invalid numeric input produces a CLI parameter error instead of
a raw ValueError.
python/packages/jumpstarter-cli/jumpstarter_cli/shell.py (1)

432-435: ⚠️ Potential issue | 🟡 Minor

Differentiate “no config provided” from “requested config not found.”

Line 432 currently collapses both states into one usage error, which obscures explicit config-load failures.

Proposed fix
-    if config is None or isinstance(config, tuple):
+    if config is None:
         raise click.UsageError(
             "Specify one of: --client / --client-config, --exporter / --exporter-config, or --tls-grpc HOST:PORT"
         )
+    if isinstance(config, tuple):
+        kind, value = config
+        if (kind, value) == ("client", "default"):
+            raise click.UsageError(
+                "Specify one of: --client / --client-config, --exporter / --exporter-config, or --tls-grpc HOST:PORT"
+            )
+        raise click.ClickException(f"Failed to load {kind}: {value}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py` around lines 432 -
435, Separate the combined check into two branches so "no config provided" and
"requested config not found" are reported differently: keep the current branch
if config is None and raise click.UsageError with the existing usage message,
and add an elif isinstance(config, tuple) branch that unpacks the tuple (e.g.,
requested_source, error_info) and raises a distinct error such as
click.ClickException (or click.FileError) with a clear message like "Requested
config '<requested_source>' not found: <error_info>" so callers see explicit
load-failure details rather than the generic usage error.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py`:
- Around line 18-24: The port parsing currently calls int(...) on port_str or
value and can raise ValueError; update the parsing in run.py around the
host/port block to catch ValueError and re-raise click.BadParameter with a clear
message including the invalid port string; specifically, wrap the conversions
for port (the int(port_str, 10) and int(value, 10) calls) in a try/except
(catching ValueError) and raise click.BadParameter(f"Invalid port: {port_str or
value}") (or similar) so invalid numeric input produces a CLI parameter error
instead of a raw ValueError.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py`:
- Around line 432-435: Separate the combined check into two branches so "no
config provided" and "requested config not found" are reported differently: keep
the current branch if config is None and raise click.UsageError with the
existing usage message, and add an elif isinstance(config, tuple) branch that
unpacks the tuple (e.g., requested_source, error_info) and raises a distinct
error such as click.ClickException (or click.FileError) with a clear message
like "Requested config '<requested_source>' not found: <error_info>" so callers
see explicit load-failure details rather than the generic usage error.

In `@python/packages/jumpstarter/jumpstarter/exporter/session.py`:
- Around line 153-158: The call to server.add_secure_port(address,
tls_credentials) / server.add_insecure_port(address) can return 0 on bind
failure but the code (around tls_credentials, server.add_secure_port,
server.add_insecure_port and before server.start()) does not check this,
allowing the server to "start" without an active listener; update the code to
check the returned bound_port value after each add_*_port call (including the
same pattern around lines referencing the other binds) and if bound_port == 0
log an error (with context: address and whether TLS was used) and raise an
exception or exit to prevent server.start() from running with no listener.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a30bbc14-b622-4070-80bb-2b2ab4b4ac35

📥 Commits

Reviewing files that changed from the base of the PR and between cccce0a and 47aae1e.

📒 Files selected for processing (14)
  • e2e/tests-direct-listener.bats
  • python/packages/jumpstarter-cli/jumpstarter_cli/run.py
  • python/packages/jumpstarter-cli/jumpstarter_cli/shell.py
  • python/packages/jumpstarter-cli/jumpstarter_cli/shell_test.py
  • python/packages/jumpstarter/jumpstarter/client/client.py
  • python/packages/jumpstarter/jumpstarter/client/lease.py
  • python/packages/jumpstarter/jumpstarter/common/grpc.py
  • python/packages/jumpstarter/jumpstarter/common/utils.py
  • python/packages/jumpstarter/jumpstarter/config/env.py
  • python/packages/jumpstarter/jumpstarter/exporter/auth.py
  • python/packages/jumpstarter/jumpstarter/exporter/exporter.py
  • python/packages/jumpstarter/jumpstarter/exporter/session.py
  • python/packages/jumpstarter/jumpstarter/exporter/session_test.py
  • python/packages/jumpstarter/jumpstarter/utils/env.py
🚧 Files skipped from review as they are similar to previous changes (8)
  • e2e/tests-direct-listener.bats
  • python/packages/jumpstarter/jumpstarter/config/env.py
  • python/packages/jumpstarter/jumpstarter/utils/env.py
  • python/packages/jumpstarter/jumpstarter/exporter/session_test.py
  • python/packages/jumpstarter/jumpstarter/exporter/auth.py
  • python/packages/jumpstarter-cli/jumpstarter_cli/shell_test.py
  • python/packages/jumpstarter/jumpstarter/exporter/exporter.py
  • python/packages/jumpstarter/jumpstarter/client/client.py

@bennyz
Copy link
Member Author

bennyz commented Mar 18, 2026

@mangelajo added passphrase

Copy link
Member

@mangelajo mangelajo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a nit

but looks great

drivers = ClientConfigV1Alpha1Drivers()

insecure = os.environ.get(JMP_GRPC_INSECURE, "") == "1"
passphrase = os.environ.get(JMP_GRPC_PASSPHRASE) or None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh nice I was going to ask for an env var :D

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also have JUMPSTARTER_GRPC_INSECURE

IMHO we should just migrate to JMP_GRPC_INSECURE everywhere, so where JUMPSTARTER_GRPC_INSECURE is accepted it should also take JMP_GRPC_INSECURE for a while

we should also document the new env vars here: https://jumpstarter.dev/main/getting-started/configuration/files.html#client-configuration

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack, will add

except* Exception as excgroup:
_handle_exporter_exceptions(excgroup)
if listener_bind is not None:
host, port = _parse_listener_bind(listener_bind)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_parse_listener_bind is called inside _handle_child which runs after os.fork(). If the user passes an invalid value (jmp run --tls-grpc-listener foo:bar), the int() ValueError crashes the child, the parent sees non-zero exit from waitpid, and restarts it in a loop. IMHO parsing should move to the run() click command before the fork so the error shows up cleanly.

if tls_insecure:
if passphrase:
click.echo(
"WARNING: --passphrase has no effect without TLS; "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong - Its just transmitted in plain text.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's what this means, see the line below :)

bennyz added 5 commits March 18, 2026 12:47
Allow exporters to serve directly over TCP without a controller,
and clients to connect directly without going through the
a controller

Exporter: jmp run --tls-grpc-listener [HOST:]PORT [--tls-grpc-insecure]
Client:   jmp shell --tls-grpc HOST:PORT [--tls-grpc-insecure]

Signed-off-by: Benny Zlotnik <bzlotnik@redhat.com>
Assisted-by: claude-sonnet-4.6
Signed-off-by: Benny Zlotnik <bzlotnik@redhat.com>
Assisted-by: claude-sonnet-4.6
Also add hook tests for the standalone mode

Signed-off-by: Benny Zlotnik <bzlotnik@redhat.com>
Assisted-by: claude-opus-4.6
Signed-off-by: Benny Zlotnik <bzlotnik@redhat.com>
Server: --passphrase flag on `jmp run` creates a PassphraseInterceptor
that rejects RPCs missing the correct x-jumpstarter-passphrase metadata.
Client: --passphrase flag on `jmp shell` injects the passphrase via
client channel interceptors on every RPC. Subprocess `j` commands
inherit the passphrase via JMP_GRPC_PASSPHRASE env var.
passphrase would be transmitted in plaintext.

Assisted-by: claude-sonnet-4.6
Signed-off-by: Benny Zlotnik <bzlotnik@redhat.com>
@bennyz bennyz force-pushed the exporter-grpc-listener branch from 47aae1e to 88e12d7 Compare March 18, 2026 10:56
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (5)
python/packages/jumpstarter/jumpstarter/client/client.py (2)

79-82: ⚠️ Potential issue | 🟠 Major

Apply passphrase interceptors in the UDS branch as well.

passphrase is documented as metadata on every RPC, but the Unix-channel path drops interceptors. That creates inconsistent auth behavior between TCP and UDS connections.

Suggested fix
-        async with grpc.aio.secure_channel(
-            f"unix://{path}", grpc.local_channel_credentials(grpc.LocalConnectionType.UDS)
-        ) as channel:
+        async with grpc.aio.secure_channel(
+            f"unix://{path}",
+            grpc.local_channel_credentials(grpc.LocalConnectionType.UDS),
+            interceptors=interceptors,
+        ) as channel:
             yield await client_from_channel(channel, portal, stack, allow, unsafe)
#!/bin/bash
# Verify passphrase interceptor wiring across client_from_path branches.
rg -n -C4 'def client_from_path|interceptors|insecure_channel|aio_secure_channel|unix://' \
  python/packages/jumpstarter/jumpstarter/client/client.py
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/client/client.py` around lines 79 -
82, The UDS branch opens a grpc.aio.secure_channel but never applies the
passphrase/interceptors, causing inconsistent metadata auth versus the TCP path;
update the UDS branch in client_from_path to wrap the created channel with the
same passphrase/interceptors before calling client_from_channel (e.g., use
grpc.aio.intercept_channel or the same interceptor-wrapping logic used in the
TCP/insecure branch) so that the passphrase metadata interceptor (and any other
interceptors) are applied for Unix-domain sockets as well.

23-34: ⚠️ Potential issue | 🟡 Minor

Tighten TCP-address detection to reject malformed targets early.

Current detection accepts ambiguous values by only validating the final port token. Enforce non-empty host and bracketed IPv6 for multi-colon addresses.

Suggested fix
 def _is_tcp_address(path: str) -> bool:
     """Return True if path looks like host:port (TCP address)."""
-    if ":" not in path:
+    if ":" not in path:
         return False
-    parts = path.rsplit(":", 1)
-    if len(parts) != 2:
-        return False
+    if path.startswith("["):
+        end = path.find("]")
+        if end == -1 or end + 1 >= len(path) or path[end + 1] != ":":
+            return False
+        host = path[1:end]
+        port_str = path[end + 2 :]
+    else:
+        host, port_str = path.rsplit(":", 1)
+        if not host or ":" in host:
+            return False
     try:
-        port = int(parts[1], 10)
+        port = int(port_str, 10)
         return 1 <= port <= 65535
     except ValueError:
         return False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/client/client.py` around lines 23 -
34, The _is_tcp_address function currently only validates the final token as a
port and accepts ambiguous multi-colon addresses; tighten it by rejecting empty
hosts and unbracketed IPv6 forms: if path starts with '[' locate the matching
']' and ensure there is a single ':' after the closing bracket for the port,
parse that port; otherwise if path contains more than one ':' (i.e., an
unbracketed IPv6) return False; for the normal host:port case ensure the host
portion (left of the final ':') is non-empty and the port parses to an int in
1..65535 before returning True.
python/packages/jumpstarter-cli/jumpstarter_cli/run.py (1)

16-32: ⚠️ Potential issue | 🟡 Minor

Reject ambiguous unbracketed IPv6 in listener parsing.

rsplit(":", 1) still accepts malformed multi-colon input and can misparse IPv6-like values instead of failing fast. Require bracketed IPv6 ([::1]:1234) or raise a BadParameter for unbracketed multi-colon hosts.

Suggested fix
 def _parse_listener_bind(value: str) -> tuple[str, int]:
     """Parse '[host:]port' into (host, port). Default host is 0.0.0.0."""
-    if ":" in value:
+    if value.startswith("["):
+        end = value.find("]")
+        if end == -1:
+            raise click.BadParameter("invalid bracketed IPv6 host", param_hint="'--tls-grpc-listener'")
+        host = value[1:end].strip() or "::"
+        if end + 1 >= len(value) or value[end + 1] != ":":
+            raise click.BadParameter("missing :PORT after IPv6 host", param_hint="'--tls-grpc-listener'")
+        port_str = value[end + 2 :]
+    elif ":" in value:
         host, port_str = value.rsplit(":", 1)
         host = host.strip() or "0.0.0.0"
+        if ":" in host:
+            raise click.BadParameter(
+                "use bracketed IPv6 format, e.g. [::1]:1234",
+                param_hint="'--tls-grpc-listener'",
+            )
     else:
         host = "0.0.0.0"
         port_str = value
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py` around lines 16 - 32,
The _parse_listener_bind function currently uses rsplit(":", 1) which can
misparse unbracketed IPv6-like values; update parsing to accept bracketed IPv6
only and reject unbracketed multi-colon hosts by raising click.BadParameter.
Specifically, detect bracketed form (value starts with '[') and parse host
between '[' and ']' with a following ':' for port, otherwise if the host portion
contains more than one ':' (i.e., an unbracketed IPv6-like string) raise
click.BadParameter with the same param_hint "'--tls-grpc-listener'"; otherwise
continue with the existing host/port parsing and integer/port-range validation
in _parse_listener_bind.
python/packages/jumpstarter/jumpstarter/exporter/session.py (1)

153-158: ⚠️ Potential issue | 🟠 Major

Fail fast when gRPC port bind returns 0.

At Line 153/Line 157 and Line 197/Line 200/Line 203, bind results are not validated before server.start(). A failed bind can silently proceed with no usable listener.

Proposed fix
         if tls_credentials is not None:
             bound_port = server.add_secure_port(address, tls_credentials)
             logger.debug("Session server listening on %s (TLS)", address)
         else:
             bound_port = server.add_insecure_port(address)
             logger.debug("Session server listening on %s (insecure)", address)
+        if bound_port == 0:
+            raise OSError(f"Failed to bind session server to {address}")
         server = grpc.aio.server(interceptors=interceptors)
         address = f"{host}:{port}"
-        server.add_insecure_port(f"unix://{unix_path}")
+        unix_bound = server.add_insecure_port(f"unix://{unix_path}")
+        if unix_bound == 0:
+            raise OSError(f"Failed to bind session server to unix://{unix_path}")
         logger.debug("Session server listening on unix://%s (hooks)", unix_path)
         if tls_credentials is not None:
-            server.add_secure_port(address, tls_credentials)
+            tcp_bound = server.add_secure_port(address, tls_credentials)
             logger.debug("Session server listening on %s (TLS)", address)
         else:
-            server.add_insecure_port(address)
+            tcp_bound = server.add_insecure_port(address)
             logger.debug("Session server listening on %s (insecure)", address)
+        if tcp_bound == 0:
+            raise OSError(f"Failed to bind session server to {address}")
#!/bin/bash
set -euo pipefail

echo "1) Inspect current bind sites in session.py"
cat -n python/packages/jumpstarter/jumpstarter/exporter/session.py | sed -n '148,210p'

echo
echo "2) Verify gRPC AsyncIO add_*_port return contract from upstream source"
curl -sL https://raw.githubusercontent.com/grpc/grpc/master/src/python/grpcio/grpc/aio/_server.py \
  | sed -n '110,220p' | grep -nE 'def add_insecure_port|def add_secure_port|add_http2_port|return'

Also applies to: 197-204

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/exporter/session.py` around lines 153
- 158, The gRPC port bind result from server.add_secure_port /
server.add_insecure_port is not validated before calling server.start(), which
can silently proceed if the bind returns 0; after calling server.add_secure_port
or server.add_insecure_port assign to bound_port and immediately check if
bound_port == 0 (or falsy), and on failure raise an exception or log an error
and abort (do not call server.start()); apply the same check for every usage of
server.add_secure_port/server.add_insecure_port in this module (referencing the
bound_port variable and server.start calls) so the process fails fast when
binding fails.
python/packages/jumpstarter-cli/jumpstarter_cli/shell.py (1)

432-435: ⚠️ Potential issue | 🟡 Minor

Differentiate missing-default from explicit missing-config tuple.

At Line 432, treating every tuple config as generic usage error hides explicit “requested config not found” cases and makes troubleshooting harder.

Proposed fix
-    if config is None or isinstance(config, tuple):
+    if config is None:
         raise click.UsageError(
             "Specify one of: --client / --client-config, --exporter / --exporter-config, or --tls-grpc HOST:PORT"
         )
+    if isinstance(config, tuple):
+        kind, value = config
+        if (kind, value) == ("client", "default"):
+            raise click.UsageError(
+                "Specify one of: --client / --client-config, --exporter / --exporter-config, or --tls-grpc HOST:PORT"
+            )
+        raise click.ClickException(f"Failed to load {kind}: {value}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py` around lines 432 -
435, The current check lumps config is None and isinstance(config, tuple)
together and raises a generic click.UsageError; split into two branches so
missing-default (config is None) still raises the original usage error, while an
explicit missing-config represented by isinstance(config, tuple) raises a
clearer error that includes the tuple details (e.g., the requested config
name/identifier) to aid troubleshooting; modify the block around the config
variable so: if config is None -> raise the existing click.UsageError, elif
isinstance(config, tuple) -> raise a new click error (UsageError or FileError)
with a message that mentions the requested config (use the tuple contents) so
the user sees which config was not found.
🧹 Nitpick comments (2)
python/docs/source/getting-started/guides/setup-direct-mode.md (1)

59-60: Consider adding certificate generation guidance.

The documentation mentions using --tls-cert and --tls-key for production but doesn't provide guidance on generating certificates. Consider adding a brief example or reference (e.g., using OpenSSL or referencing a certificate generation guide) to help users transition from development to production.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/docs/source/getting-started/guides/setup-direct-mode.md` around lines
59 - 60, Add a short certificate-generation subsection to the existing TLS note
that shows users how to create self-signed certs for testing and points them to
production options; specifically mention the flags --tls-cert and --tls-key,
provide a brief OpenSSL example (describe the commands to generate a private key
and a self-signed cert) for local/development use, and add a sentence
recommending using a CA-signed certificate (e.g., Let's Encrypt or corporate CA)
with a reference link to a certificate/CSR guide for production deployments so
readers know how to transition from --tls-grpc-insecure to --tls-cert/--tls-key.
python/packages/jumpstarter/jumpstarter/exporter/auth.py (1)

97-104: Add stream-unary interceptor for complete passphrase coverage.

passphrase_client_interceptors() is missing a stream-unary interceptor. Stream-unary RPC methods exist in the service (e.g., AuditStream), so passphrase metadata won't be injected for those calls. Add _StreamUnaryInterceptor to match the server-side auth logic.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/packages/jumpstarter/jumpstarter/exporter/auth.py` around lines 97 -
104, The passphrase_client_interceptors function currently returns interceptors
for unary-unary, unary-stream, and stream-stream but omits stream-unary, so
stream-unary RPCs (e.g., AuditStream) won't receive the passphrase metadata;
update passphrase_client_interceptors to include _StreamUnaryInterceptor(extra)
in the returned list alongside _UnaryUnaryInterceptor, _UnaryStreamInterceptor,
and _StreamStreamInterceptor so all RPC types are covered and match the
server-side auth logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@e2e/tests-direct-listener.bats`:
- Around line 76-78: start_exporter_with_passphrase currently only waits for a
TCP socket because it passes an empty string for the gRPC readiness probe, so
the tests can flake; modify start_exporter_with_passphrase (and any similar
invocations in the 149-170 region) to thread the known-good passphrase into the
existing gRPC readiness check by replacing the empty string argument (the third
parameter to _start_exporter) with the passphrase (e.g., "$1" or a
KNOWN_GOOD_PASSPHRASE variable) so the readiness probe performs an authenticated
gRPC check before running the assertions.
- Line 6: The test pins LISTENER_PORT=19090 and may accidentally talk to a
leftover exporter; update the test to verify the exporter process around
readiness by checking the variable LISTENER_PID with a kill -0 "$LISTENER_PID"
(or equivalent pid-alive check) both immediately after starting the new process
and again after the readiness probe succeeds, and fail the test if the PID is
not alive; apply the same PID-alive checks in the other related blocks
referenced (around the readiness/wait logic in the 42-68 region) so the test
fails on bind/startup errors rather than connecting to a stale process.

---

Duplicate comments:
In `@python/packages/jumpstarter-cli/jumpstarter_cli/run.py`:
- Around line 16-32: The _parse_listener_bind function currently uses
rsplit(":", 1) which can misparse unbracketed IPv6-like values; update parsing
to accept bracketed IPv6 only and reject unbracketed multi-colon hosts by
raising click.BadParameter. Specifically, detect bracketed form (value starts
with '[') and parse host between '[' and ']' with a following ':' for port,
otherwise if the host portion contains more than one ':' (i.e., an unbracketed
IPv6-like string) raise click.BadParameter with the same param_hint
"'--tls-grpc-listener'"; otherwise continue with the existing host/port parsing
and integer/port-range validation in _parse_listener_bind.

In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py`:
- Around line 432-435: The current check lumps config is None and
isinstance(config, tuple) together and raises a generic click.UsageError; split
into two branches so missing-default (config is None) still raises the original
usage error, while an explicit missing-config represented by isinstance(config,
tuple) raises a clearer error that includes the tuple details (e.g., the
requested config name/identifier) to aid troubleshooting; modify the block
around the config variable so: if config is None -> raise the existing
click.UsageError, elif isinstance(config, tuple) -> raise a new click error
(UsageError or FileError) with a message that mentions the requested config (use
the tuple contents) so the user sees which config was not found.

In `@python/packages/jumpstarter/jumpstarter/client/client.py`:
- Around line 79-82: The UDS branch opens a grpc.aio.secure_channel but never
applies the passphrase/interceptors, causing inconsistent metadata auth versus
the TCP path; update the UDS branch in client_from_path to wrap the created
channel with the same passphrase/interceptors before calling client_from_channel
(e.g., use grpc.aio.intercept_channel or the same interceptor-wrapping logic
used in the TCP/insecure branch) so that the passphrase metadata interceptor
(and any other interceptors) are applied for Unix-domain sockets as well.
- Around line 23-34: The _is_tcp_address function currently only validates the
final token as a port and accepts ambiguous multi-colon addresses; tighten it by
rejecting empty hosts and unbracketed IPv6 forms: if path starts with '[' locate
the matching ']' and ensure there is a single ':' after the closing bracket for
the port, parse that port; otherwise if path contains more than one ':' (i.e.,
an unbracketed IPv6) return False; for the normal host:port case ensure the host
portion (left of the final ':') is non-empty and the port parses to an int in
1..65535 before returning True.

In `@python/packages/jumpstarter/jumpstarter/exporter/session.py`:
- Around line 153-158: The gRPC port bind result from server.add_secure_port /
server.add_insecure_port is not validated before calling server.start(), which
can silently proceed if the bind returns 0; after calling server.add_secure_port
or server.add_insecure_port assign to bound_port and immediately check if
bound_port == 0 (or falsy), and on failure raise an exception or log an error
and abort (do not call server.start()); apply the same check for every usage of
server.add_secure_port/server.add_insecure_port in this module (referencing the
bound_port variable and server.start calls) so the process fails fast when
binding fails.

---

Nitpick comments:
In `@python/docs/source/getting-started/guides/setup-direct-mode.md`:
- Around line 59-60: Add a short certificate-generation subsection to the
existing TLS note that shows users how to create self-signed certs for testing
and points them to production options; specifically mention the flags --tls-cert
and --tls-key, provide a brief OpenSSL example (describe the commands to
generate a private key and a self-signed cert) for local/development use, and
add a sentence recommending using a CA-signed certificate (e.g., Let's Encrypt
or corporate CA) with a reference link to a certificate/CSR guide for production
deployments so readers know how to transition from --tls-grpc-insecure to
--tls-cert/--tls-key.

In `@python/packages/jumpstarter/jumpstarter/exporter/auth.py`:
- Around line 97-104: The passphrase_client_interceptors function currently
returns interceptors for unary-unary, unary-stream, and stream-stream but omits
stream-unary, so stream-unary RPCs (e.g., AuditStream) won't receive the
passphrase metadata; update passphrase_client_interceptors to include
_StreamUnaryInterceptor(extra) in the returned list alongside
_UnaryUnaryInterceptor, _UnaryStreamInterceptor, and _StreamStreamInterceptor so
all RPC types are covered and match the server-side auth logic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 09a6ddaa-670e-4a3c-a566-a2d35dd89e1e

📥 Commits

Reviewing files that changed from the base of the PR and between 47aae1e and 88e12d7.

📒 Files selected for processing (23)
  • e2e/exporters/exporter-direct-hooks-before.yaml
  • e2e/exporters/exporter-direct-hooks-both.yaml
  • e2e/exporters/exporter-direct-listener.yaml
  • e2e/tests-direct-listener.bats
  • python/docs/source/getting-started/configuration/files.md
  • python/docs/source/getting-started/guides/index.md
  • python/docs/source/getting-started/guides/setup-direct-mode.md
  • python/packages/jumpstarter-cli/jumpstarter_cli/run.py
  • python/packages/jumpstarter-cli/jumpstarter_cli/shell.py
  • python/packages/jumpstarter-cli/jumpstarter_cli/shell_test.py
  • python/packages/jumpstarter/jumpstarter/client/__init__.py
  • python/packages/jumpstarter/jumpstarter/client/client.py
  • python/packages/jumpstarter/jumpstarter/client/client_test.py
  • python/packages/jumpstarter/jumpstarter/client/lease.py
  • python/packages/jumpstarter/jumpstarter/common/grpc.py
  • python/packages/jumpstarter/jumpstarter/common/utils.py
  • python/packages/jumpstarter/jumpstarter/config/env.py
  • python/packages/jumpstarter/jumpstarter/config/exporter.py
  • python/packages/jumpstarter/jumpstarter/exporter/auth.py
  • python/packages/jumpstarter/jumpstarter/exporter/exporter.py
  • python/packages/jumpstarter/jumpstarter/exporter/session.py
  • python/packages/jumpstarter/jumpstarter/exporter/session_test.py
  • python/packages/jumpstarter/jumpstarter/utils/env.py
✅ Files skipped from review due to trivial changes (1)
  • python/docs/source/getting-started/configuration/files.md
🚧 Files skipped from review as they are similar to previous changes (7)
  • e2e/exporters/exporter-direct-listener.yaml
  • python/packages/jumpstarter/jumpstarter/config/env.py
  • e2e/exporters/exporter-direct-hooks-both.yaml
  • e2e/exporters/exporter-direct-hooks-before.yaml
  • python/packages/jumpstarter/jumpstarter/client/client_test.py
  • python/packages/jumpstarter/jumpstarter/client/init.py
  • python/packages/jumpstarter/jumpstarter/exporter/session_test.py


SCRIPT_DIR="$(cd "$(dirname "$BATS_TEST_FILENAME")" && pwd)"
EXPORTER_CONFIG="${SCRIPT_DIR}/exporters/exporter-direct-listener.yaml"
LISTENER_PORT=19090
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fail fast if the new exporter never owns the listener port.

Because Line 6 pins a fixed port, a leftover listener on 19090 can satisfy the readiness probe even when this jmp run process already died on bind. Please verify "$LISTENER_PID" is still alive before and after readiness so the test fails against the startup error instead of talking to a stale exporter.

🛠️ Suggested hardening
   LISTENER_PID=$!
   echo "$LISTENER_PID" > "${BATS_TEST_TMPDIR}/exporter.pid"

   local retries=30
+  local startup_error="Exporter exited before becoming ready; is port ${LISTENER_PORT} already in use?"
   if [ "$readiness" = "port" ]; then
     # TCP-only check: doesn't drain the LogStream queue, so hook output
     # remains buffered for the test command to consume.
     while ! nc -z 127.0.0.1 "$LISTENER_PORT" 2>/dev/null; do
+      if ! ps -p "$LISTENER_PID" > /dev/null 2>&1; then
+        echo "$startup_error" >&2
+        return 1
+      fi
       retries=$((retries - 1))
       if [ "$retries" -le 0 ]; then
         echo "Port $LISTENER_PORT did not become available" >&2
         return 1
       fi
       sleep 0.5
     done
   else
     # Full gRPC check: ensures exporter is ready for commands.
     # Drains LogStream queue (unsuitable for hook output tests).
     while ! jmp shell --tls-grpc "127.0.0.1:${LISTENER_PORT}" --tls-grpc-insecure -- j --help >/dev/null 2>&1; do
+      if ! ps -p "$LISTENER_PID" > /dev/null 2>&1; then
+        echo "$startup_error" >&2
+        return 1
+      fi
       retries=$((retries - 1))
       if [ "$retries" -le 0 ]; then
         echo "Exporter did not become ready in time" >&2
         return 1
       fi
       sleep 0.5
     done
   fi
+  if ! ps -p "$LISTENER_PID" > /dev/null 2>&1; then
+    echo "$startup_error" >&2
+    return 1
+  fi
 }

Based on learnings: E2E tests in bats should be runnable in isolation. Do not rely on prior suite setup or shared state across tests. Each test (or setup/teardown blocks) should explicitly prepare its prerequisites, reset global state, and ensure deterministic results.

Also applies to: 42-68

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@e2e/tests-direct-listener.bats` at line 6, The test pins LISTENER_PORT=19090
and may accidentally talk to a leftover exporter; update the test to verify the
exporter process around readiness by checking the variable LISTENER_PID with a
kill -0 "$LISTENER_PID" (or equivalent pid-alive check) both immediately after
starting the new process and again after the readiness probe succeeds, and fail
the test if the PID is not alive; apply the same PID-alive checks in the other
related blocks referenced (around the readiness/wait logic in the 42-68 region)
so the test fails on bind/startup errors rather than connecting to a stale
process.

Comment on lines +76 to +78
# Port-only readiness: can't do gRPC readiness without threading
# the passphrase through the readiness check.
start_exporter_with_passphrase() { _start_exporter "${2:-$EXPORTER_CONFIG}" port "" "$1"; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Use an authenticated gRPC readiness probe for passphrase mode.

start_exporter_with_passphrase only waits for the TCP socket, so Line 154 can flake and Lines 162/170 can pass on “server not ready yet” rather than actual passphrase rejection. Since these tests do not inspect exporter logs, thread the known-good passphrase into the existing gRPC readiness check before running the assertions.

🛠️ Suggested hardening
 _start_exporter() {
   local config="${1:-$EXPORTER_CONFIG}"
   local readiness="${2:-grpc}"
   local capture_logs="${3:-}"
   local passphrase="${4:-}"
+  local probe_passphrase="${5:-}"
@@
   else
+    local probe_args=(--tls-grpc "127.0.0.1:${LISTENER_PORT}" --tls-grpc-insecure)
+    if [ -n "$probe_passphrase" ]; then
+      probe_args+=(--passphrase "$probe_passphrase")
+    fi
     # Full gRPC check: ensures exporter is ready for commands.
     # Drains LogStream queue (unsuitable for hook output tests).
-    while ! jmp shell --tls-grpc "127.0.0.1:${LISTENER_PORT}" --tls-grpc-insecure -- j --help >/dev/null 2>&1; do
+    while ! jmp shell "${probe_args[@]}" -- j --help >/dev/null 2>&1; do
       retries=$((retries - 1))
       if [ "$retries" -le 0 ]; then
         echo "Exporter did not become ready in time" >&2
         return 1
       fi
       sleep 0.5
     done
   fi
 }
@@
-start_exporter_with_passphrase() { _start_exporter "${2:-$EXPORTER_CONFIG}" port "" "$1"; }
+start_exporter_with_passphrase() { _start_exporter "${2:-$EXPORTER_CONFIG}" grpc "" "$1" "$1"; }

Also applies to: 149-170

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@e2e/tests-direct-listener.bats` around lines 76 - 78,
start_exporter_with_passphrase currently only waits for a TCP socket because it
passes an empty string for the gRPC readiness probe, so the tests can flake;
modify start_exporter_with_passphrase (and any similar invocations in the
149-170 region) to thread the known-good passphrase into the existing gRPC
readiness check by replacing the empty string argument (the third parameter to
_start_exporter) with the passphrase (e.g., "$1" or a KNOWN_GOOD_PASSPHRASE
variable) so the readiness probe performs an authenticated gRPC check before
running the assertions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

build-pr-images/jumpstarter request to build only the jumpstarter image from PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants