Revert "Revert "shell: keep shell usable after token refreshes""#290
Revert "Revert "shell: keep shell usable after token refreshes""#290
Conversation
📝 WalkthroughWalkthroughAdds lease-aware token auto-refresh and disk-backed token reload to the CLI token-monitoring flow, updates monitor to accept a lease, adds Lease.channel refresh and end-time helpers, and introduces comprehensive tests plus an anyio pytest fixture. Changes
Sequence DiagramsequenceDiagram
participant Monitor as _monitor_token_expiry
participant Config
participant OIDC as OIDC_Server
participant Lease as Lease_Channel
participant Disk as Disk_Storage
Monitor->>Config: Check token remaining time
alt remaining <= threshold
Monitor->>Monitor: _attempt_token_recovery()
alt try refresh
Monitor->>OIDC: Refresh token (refresh_token)
alt refresh success
OIDC-->>Config: New token (+ optional refresh_token)
Monitor->>Config: Save updated token
Monitor->>Lease: _update_lease_channel(config, lease)
else refresh failed
Monitor->>Disk: Fallback -> try reload from disk
end
end
alt disk reload success
Disk-->>Config: Newer valid token
Monitor->>Lease: _update_lease_channel(config, lease)
else no valid token
Monitor->>Monitor: _warn_refresh_failed(remaining)
end
end
Monitor->>Monitor: Sleep (30s if above threshold, 5s if below)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip You can get early access to new features in CodeRabbit.Enable the |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell_test.py`:
- Around line 339-341: The test currently mutates the global Mock class via
type(cancel_scope).cancel_called = property(...), which leaks state; instead
define a local subclass (e.g., class CancelMock(Mock): cancel_called =
property(lambda self: check_cancelled())) and instantiate cancel_scope =
CancelMock() (or similarly create the subclass inline) so the cancel_called
property is only on that test's instance/class and does not modify
unittest.mock.Mock globally; replace references to the existing instance
creation and the type(...) assignment with this local subclass instantiation.
In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py`:
- Around line 180-199: The refresh-failure flag warned_refresh_failed can
permanently suppress the final expired warning; update the logic so whenever you
emit the expired warning you also clear warned_refresh_failed (and set
warned_expiry appropriately). Concretely, in the loop branches that call
_warn_expired (or otherwise display the token-expired red message) ensure you
set warned_refresh_failed = False and warned_expiry = True (and mirror the same
reset in the similar block around lines 211-216) so a prior refresh-failure
warning cannot hide the later expired warning.
In `@python/packages/jumpstarter/jumpstarter/client/lease.py`:
- Around line 290-296: The current _get_lease_end_time method returns None
whenever duration or effective_begin_time is missing, even if
lease.effective_end_time is present; change the logic in _get_lease_end_time to
first check and return lease.effective_end_time if it exists, and only if it
does not exist then require both lease.effective_begin_time and lease.duration
(return None if those are missing) and otherwise return
lease.effective_begin_time + lease.duration so monitor_async can use explicit
end times.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ccedc8d2-f256-479e-a629-f46aefaf66d3
📒 Files selected for processing (5)
python/packages/jumpstarter-cli/conftest.pypython/packages/jumpstarter-cli/jumpstarter_cli/shell.pypython/packages/jumpstarter-cli/jumpstarter_cli/shell_test.pypython/packages/jumpstarter/jumpstarter/client/lease.pypython/packages/jumpstarter/jumpstarter/client/lease_test.py
e288984 to
ec8a54c
Compare
✅ Deploy Preview for jumpstarter-docs ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
90676bb to
0e45e8b
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (2)
python/packages/jumpstarter/jumpstarter/client/lease.py (1)
350-355:⚠️ Potential issue | 🟠 MajorCheck
effective_end_timebefore the fallback guard.Line 352 returns
Nonebefore looking atlease.effective_end_time, so leases that carry an explicit end time but omiteffective_begin_timeordurationnever report expiry tomonitor_async().💡 Proposed fix
def _get_lease_end_time(self, lease) -> datetime | None: """Extract the end time from a lease response, or None if not available.""" - if not (lease.effective_begin_time and lease.duration): - return None if lease.effective_end_time: return lease.effective_end_time + if not (lease.effective_begin_time and lease.duration): + return None return lease.effective_begin_time + lease.duration🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/packages/jumpstarter/jumpstarter/client/lease.py` around lines 350 - 355, In _get_lease_end_time, check lease.effective_end_time first and return it if present, before the current guard that returns None when effective_begin_time or duration are missing; otherwise fall back to computing end time from lease.effective_begin_time + lease.duration (if both present) and return None only when neither an explicit effective_end_time nor the begin+duration can be derived—this ensures monitor_async() sees explicit expiries even when effective_begin_time or duration are omitted.python/packages/jumpstarter-cli/jumpstarter_cli/shell_test.py (1)
339-340:⚠️ Potential issue | 🟠 MajorKeep
cancel_calledoff the baseMockclass.Line 340 mutates
unittest.mock.Mockitself, so laterMock(cancel_called=False)instances in this file can read the property instead of their own value. That makes the suite order-dependent.💡 Proposed fix
- cancel_scope = Mock() - type(cancel_scope).cancel_called = property(lambda self: check_cancelled()) + class _CancelScope: + `@property` + def cancel_called(self): + return check_cancelled() + + cancel_scope = _CancelScope()Run this to confirm the leak:
#!/bin/bash python - <<'PY' from unittest.mock import Mock first = Mock() type(first).cancel_called = property(lambda self: True) second = Mock(cancel_called=False) print(second.cancel_called) PYIf this prints
True, the class-level property leaked into laterMockinstances.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell_test.py` around lines 339 - 340, The test is mutating unittest.mock.Mock by assigning type(cancel_scope).cancel_called = property(...), leaking into other Mock instances; instead define a dedicated subclass (e.g. class CancelScopeMock(Mock): `@property` def cancel_called(self): return check_cancelled()) and instantiate that (CancelScopeMock()) for cancel_scope, or attach the needed boolean/result directly to the instance (cancel_scope.cancel_called = <value>) so you do not modify Mock itself; update the creation of cancel_scope where type(cancel_scope).cancel_called is used to use the new subclass or instance attribute.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell_test.py`:
- Line 395: The test file contains a redundant inner import of
_warn_refresh_failed; remove the unused "from jumpstarter_cli.shell import
_warn_refresh_failed" statement in jumpstarter_cli/shell_test.py (the inner
import at the indicated location) so Ruff F401 is resolved—ensure any references
to _warn_refresh_failed use the existing valid import scope (or add a single
top-level import if the test actually needs it) and then run make lint to verify
the warning is gone.
In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py`:
- Around line 129-133: The current logic conditionally copies
disk_config.refresh_token into config only when disk_refresh is not None,
causing in-memory config to retain an old refresh token if the disk value is
explicitly None; change the behavior in the block that currently sets
config.token and reads disk_refresh (referencing config, disk_config, and
refresh_token) to always assign config.refresh_token = getattr(disk_config,
"refresh_token", None) (i.e., mirror the disk refresh_token exactly, removing
the conditional) so the in-memory session matches the reloaded disk credentials.
- Around line 184-239: _extract_monitor_token_expiry complexity by moving the
recovery/warning logic and sleep selection out of _monitor_token_expiry into
small helper functions (e.g., _handle_recovery_and_warnings(config, lease,
remaining, warned_expiry, warned_refresh_failed, warned_token_expired) which
returns updated warned_* flags and any recovery message, and
_select_sleep_duration(remaining) which returns the sleep seconds); update
_monitor_token_expiry to call these helpers and await anyio.sleep(duration)
while preserving all existing behaviors and state transitions for warned_expiry,
warned_refresh_failed, and warned_token_expired so Ruff’s C901 complexity is no
longer exceeded.
---
Duplicate comments:
In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell_test.py`:
- Around line 339-340: The test is mutating unittest.mock.Mock by assigning
type(cancel_scope).cancel_called = property(...), leaking into other Mock
instances; instead define a dedicated subclass (e.g. class
CancelScopeMock(Mock): `@property` def cancel_called(self): return
check_cancelled()) and instantiate that (CancelScopeMock()) for cancel_scope, or
attach the needed boolean/result directly to the instance
(cancel_scope.cancel_called = <value>) so you do not modify Mock itself; update
the creation of cancel_scope where type(cancel_scope).cancel_called is used to
use the new subclass or instance attribute.
In `@python/packages/jumpstarter/jumpstarter/client/lease.py`:
- Around line 350-355: In _get_lease_end_time, check lease.effective_end_time
first and return it if present, before the current guard that returns None when
effective_begin_time or duration are missing; otherwise fall back to computing
end time from lease.effective_begin_time + lease.duration (if both present) and
return None only when neither an explicit effective_end_time nor the
begin+duration can be derived—this ensures monitor_async() sees explicit
expiries even when effective_begin_time or duration are omitted.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: bfe3f43c-ef68-4dbe-9a1f-b4ceea5f31c8
📒 Files selected for processing (5)
python/packages/jumpstarter-cli/conftest.pypython/packages/jumpstarter-cli/jumpstarter_cli/shell.pypython/packages/jumpstarter-cli/jumpstarter_cli/shell_test.pypython/packages/jumpstarter/jumpstarter/client/lease.pypython/packages/jumpstarter/jumpstarter/client/lease_test.py
🚧 Files skipped from review as they are similar to previous changes (1)
- python/packages/jumpstarter-cli/conftest.py
| # Token on disk is valid and different - use it | ||
| config.token = disk_token | ||
| disk_refresh = getattr(disk_config, "refresh_token", None) | ||
| if disk_refresh is not None: | ||
| config.refresh_token = disk_refresh |
There was a problem hiding this comment.
Mirror the disk refresh_token exactly.
If the reloaded config has refresh_token=None, keeping the old in-memory value leaves the session with mixed credentials. The next auto-refresh can then retry a stale refresh token against a token that came from a different login.
💡 Proposed fix
- disk_refresh = getattr(disk_config, "refresh_token", None)
- if disk_refresh is not None:
- config.refresh_token = disk_refresh
+ config.refresh_token = getattr(disk_config, "refresh_token", None)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py` around lines 129 -
133, The current logic conditionally copies disk_config.refresh_token into
config only when disk_refresh is not None, causing in-memory config to retain an
old refresh token if the disk value is explicitly None; change the behavior in
the block that currently sets config.token and reads disk_refresh (referencing
config, disk_config, and refresh_token) to always assign config.refresh_token =
getattr(disk_config, "refresh_token", None) (i.e., mirror the disk refresh_token
exactly, removing the conditional) so the in-memory session matches the reloaded
disk credentials.
raballew
left a comment
There was a problem hiding this comment.
_attempt_token_recovery accepts a remaining parameter that's never used in the function body.
_try_refresh_token hardcodes client_id="jumpstarter-cli", but the login flow supports server-configured client IDs (login.py line 181). If someone logged in with a different client ID, token refresh will silently fail and the user won't know why. Same issue exists in relogin_client (has a TODO comment about it).
In _try_refresh_token, if _update_lease_channel fails after config.token was already updated, the rollback restores config.token and config.refresh_token but doesn't undo the channel swap on the lease object -- the lease could be left with a broken channel even though the function returns False.
monitor_async does await sleep(min(remain.total_seconds(), check_interval)) -- when remain is tiny (e.g. 0.001s near lease end), this creates a tight polling loop hitting the gRPC server rapidly. A minimum sleep (e.g. 1s) would prevent this.
Signed-off-by: Benny Zlotnik <bzlotnik@redhat.com> Assisted-by: claude-opus-4.6
Signed-off-by: Benny Zlotnik <bzlotnik@redhat.com>
Consider effective_begin_time or duration Signed-off-by: Benny Zlotnik <bzlotnik@redhat.com>
Signed-off-by: Benny Zlotnik <bzlotnik@redhat.com>
6a4fb1d to
44a4190
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
python/packages/jumpstarter-cli/jumpstarter_cli/shell.py (1)
131-135:⚠️ Potential issue | 🟡 MinorMirror the disk
refresh_tokenexactly.The conditional assignment
if disk_refresh is not Nonecan leave stale credentials when the disk config explicitly hasrefresh_token=None. This creates a mixed credential state where the session uses a new access token with an old refresh token.💡 Proposed fix
# Token on disk is valid and different - use it config.token = disk_token - disk_refresh = getattr(disk_config, "refresh_token", None) - if disk_refresh is not None: - config.refresh_token = disk_refresh + config.refresh_token = getattr(disk_config, "refresh_token", None)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py` around lines 131 - 135, The current code conditionally sets config.refresh_token only when disk_refresh is not None which can leave a stale refresh token; instead always mirror the disk value by assigning config.refresh_token = getattr(disk_config, "refresh_token", None) (remove the if disk_refresh is not None check) so the in-memory config exactly matches disk_config.refresh_token even when it is explicitly None; update the block that currently sets config.token and disk_refresh to unconditionally set config.refresh_token using disk_config/ disk_refresh.
🧹 Nitpick comments (1)
python/packages/jumpstarter/jumpstarter/client/lease_test.py (1)
488-523: Consider increasing the timing margins for CI stability.The test uses a 50ms future end time (line 496) and a 200ms sleep (line 518). While this works locally, tight timing can cause flaky failures in slow CI environments. Consider using slightly larger margins (e.g., 200ms future, 500ms sleep) to improve reliability without significantly increasing test duration.
💡 Suggested adjustment
- future_end = datetime.now(tz=timezone.utc) + timedelta(milliseconds=50) + future_end = datetime.now(tz=timezone.utc) + timedelta(milliseconds=200) ... - await asyncio.sleep(0.2) + await asyncio.sleep(0.5)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/packages/jumpstarter/jumpstarter/client/lease_test.py` around lines 488 - 523, The timing in test_estimates_expiry_from_last_known_end_time is tight; increase the future_end margin and the monitor sleep to make the test stable in CI — specifically, change the future_end creation (currently +50ms) to a larger value (e.g., +200ms) and increase the await asyncio.sleep(...) (currently 0.2s) to a larger value (e.g., 0.5s) so the monitor loop (lease.monitor_async and the get_then_fail sequence) has enough time to run reliably.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@python/packages/jumpstarter-cli/jumpstarter_cli/shell.py`:
- Around line 131-135: The current code conditionally sets config.refresh_token
only when disk_refresh is not None which can leave a stale refresh token;
instead always mirror the disk value by assigning config.refresh_token =
getattr(disk_config, "refresh_token", None) (remove the if disk_refresh is not
None check) so the in-memory config exactly matches disk_config.refresh_token
even when it is explicitly None; update the block that currently sets
config.token and disk_refresh to unconditionally set config.refresh_token using
disk_config/ disk_refresh.
---
Nitpick comments:
In `@python/packages/jumpstarter/jumpstarter/client/lease_test.py`:
- Around line 488-523: The timing in
test_estimates_expiry_from_last_known_end_time is tight; increase the future_end
margin and the monitor sleep to make the test stable in CI — specifically,
change the future_end creation (currently +50ms) to a larger value (e.g.,
+200ms) and increase the await asyncio.sleep(...) (currently 0.2s) to a larger
value (e.g., 0.5s) so the monitor loop (lease.monitor_async and the
get_then_fail sequence) has enough time to run reliably.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 220fe33f-fa60-402b-b130-8d4db26f64da
📒 Files selected for processing (5)
python/packages/jumpstarter-cli/conftest.pypython/packages/jumpstarter-cli/jumpstarter_cli/shell.pypython/packages/jumpstarter-cli/jumpstarter_cli/shell_test.pypython/packages/jumpstarter/jumpstarter/client/lease.pypython/packages/jumpstarter/jumpstarter/client/lease_test.py
🚧 Files skipped from review as they are similar to previous changes (1)
- python/packages/jumpstarter-cli/conftest.py
Reverts #210