Merged
74 changes: 74 additions & 0 deletions docs/maintenance/2026-03-17-status-rate-limit-visibility.md
@@ -0,0 +1,74 @@
# /status Rate Limit Visibility

## Feature Summary

Enhance the Telegram `/status` command so it shows the active half-day usage window, current usage percentage, remaining percentage, and the refresh time or countdown.

## Problem Background

Current behavior:

1. Open a Telegram chat that uses `nonebot-plugin-codex`.
2. Send `/status`.
3. The plugin opens the generic workspace panel only.

Current gaps:

- no current morning or afternoon usage state
- no usage percentage for the active quota window
- no remaining percentage
- no refresh time or countdown
- no graceful status text when rate-limit data is temporarily unavailable

In practice, `/status` is currently an alias of `/panel`: the panel shows chat preferences, workdir, session state, and recent history, but it exposes no quota information.

## Proposal

- Keep `/status` as the Telegram entrypoint for operational state.
- Extend the existing workspace or status rendering path with a dedicated rate-limit section.
- Prefer official Codex account rate-limit data from the `codex app-server` lane when available.
- Display percentage-based data and refresh timing rather than guessed absolute credit counts.
- Show morning or afternoon wording based on the active local window.
- Fall back to explicit unavailable text when upstream rate-limit data cannot be fetched.

Expected user-visible result:

- current morning or afternoon status
- used percentage
- remaining percentage
- reset time
- human-readable time until refresh

## Alternatives

- Infer quota state only from local session token-usage logs.
- This is insufficient because local usage does not equal account-level remaining quota or refresh timing.
- Keep `/status` unchanged and add a separate quota command.
- This is possible, but weaker for discoverability because users already expect `/status` to answer this question.

## Scope And Constraints

- Preserve command compatibility for `/status` and `/panel` unless a deliberate behavior change is documented.
- Do not silently change documented config semantics.
- Keep the implementation small and reviewable.
- Follow TDD for behavior changes.
- If upstream rate-limit data is unavailable, degrade cleanly instead of estimating.

Affected files or commands likely include:

- `src/nonebot_plugin_codex/service.py`
- `src/nonebot_plugin_codex/telegram.py`
- `src/nonebot_plugin_codex/native_client.py`
- `tests/test_service.py`
- `tests/test_telegram_handlers.py`
- `/status`
- `/panel`
- `codex app-server`

## Verification Plan

- Add service-level tests covering status rendering with and without rate-limit data.
- Add Telegram handler tests confirming `/status` shows the enriched status panel.
- Run `pdm run pytest tests/test_service.py tests/test_telegram_handlers.py -q`.
- Run `pdm run pytest -q`.
- Run `pdm run ruff check .`.
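
A minimal shape for the with/without rate-limit tests could look like this. The async fetch hook and `build_status_text` helper are hypothetical stand-ins; the real helper names live in `service.py` and may differ.

```python
import asyncio

async def build_status_text(fetch_rate_limits) -> str:
    # Render the quota line, falling back to explicit unavailable text
    # when the upstream fetch fails.
    try:
        snapshot = await fetch_rate_limits()
    except RuntimeError:
        return "rate-limit data unavailable"
    used = snapshot["usedPercent"]  # hypothetical field name
    return f"used {used}%, remaining {100 - used}%"

async def _ok():
    return {"usedPercent": 25}

async def _down():
    raise RuntimeError("upstream unavailable")

def test_status_with_rate_limits():
    assert asyncio.run(build_status_text(_ok)) == "used 25%, remaining 75%"

def test_status_without_rate_limits():
    assert asyncio.run(build_status_text(_down)) == "rate-limit data unavailable"
```
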
18 changes: 18 additions & 0 deletions docs/maintenance/2026-03-17-telegram-subagent-visibility.md
@@ -32,6 +32,17 @@ When the plugin uses the native `codex app-server` lane and Codex delegates work
- The native client forwarded all `agentMessage` deltas and completed texts without checking `phase`.
- Commentary text could therefore appear in the stream/final reply path.
- Collaboration tool calls were ignored, so the Telegram progress panel lacked main-agent/subagent context.
- In follow-up testing, main-agent final text could still be lost when it only existed in
`item/agentMessage/delta` frames, because the native fallback looked up the wrong
buffered key on `turn/completed`.
- Another follow-up issue appeared after multi-agent support landed: the native runner
  treated any `turn/completed` as the end of the active run, including subagent turns.
  That could bind later follow-up prompts to the subagent thread instead of the main
  thread, and it could also leak subagent result text into the main-agent
  final-answer path.
- When that happened, Telegram could still finalize the main progress panel as
`Codex 已完成。`, then separately send `Codex 已完成,但没有返回可展示的最终文本。`,
which made successful-looking runs appear to stop right after a subagent failure.

## Affected Modules

@@ -48,3 +59,10 @@ When the plugin uses the native `codex app-server` lane and Codex delegates work
- final-answer `agentMessage`
- Confirm that only the final answer reaches `on_stream_text`.
- Confirm that progress updates mention both the main agent and the subagent state.
- Add a native-client regression where a subagent reports `errored` but the main agent
still produces a final answer through delta-only fallback.
- Add a native-client regression where a subagent emits its own `turn/completed`
before the main thread finishes, and confirm the client waits for the main
`turn/completed` before returning or updating the stored thread id.
- Confirm that Telegram uses `Codex 已完成,但没有返回可展示的最终文本。` for the main
progress panel instead of a plain `Codex 已完成。` when no final text is available.
10 changes: 10 additions & 0 deletions src/nonebot_plugin_codex/__init__.py
@@ -134,6 +134,12 @@ async def _sync_telegram_commands(bot: Bot) -> None:
block=True,
rule=handlers.is_workspace_callback,
)
status_callback = on_type(
CallbackQueryEvent,
priority=10,
block=True,
rule=handlers.is_status_callback,
)

@codex_cmd.handle()
async def _handle_codex(
@@ -247,6 +253,10 @@ async def _handle_workspace_callback(
) -> None:
await handlers.handle_workspace_callback(bot, event)

@status_callback.handle()
async def _handle_status_callback(bot: Bot, event: CallbackQueryEvent) -> None:
await handlers.handle_status_callback(bot, event)

@follow_up.handle()
async def _handle_follow_up(bot: Bot, event: MessageEvent) -> None:
await handlers.handle_follow_up(bot, event)
87 changes: 78 additions & 9 deletions src/nonebot_plugin_codex/native_client.py
@@ -15,7 +15,14 @@ class NativeAgentUpdate:
text: str


Callback = Callable[[NativeAgentUpdate], object]
@dataclass(slots=True)
class NativeTokenUsage:
total_tokens: int
model_context_window: int | None = None


Callback = Callable[[Any], object]
TokenUsageCallback = Callable[[NativeTokenUsage], object]
ProcessLauncher = Callable[..., Awaitable[Any]]


@@ -65,7 +72,7 @@ def _thread_summary_from_payload(thread: dict[str, Any]) -> NativeThreadSummary:
)


async def _maybe_call(callback: Callback | None, update: NativeAgentUpdate) -> None:
async def _maybe_call(callback: Callback | None, update: Any) -> None:
if callback is None:
return
result = callback(update)
@@ -211,7 +218,6 @@ def _format_collab_tool_progress(

return updates


async def _terminate_process(process: Any, timeout: float) -> None:
if process is None:
return
@@ -321,10 +327,12 @@ async def run_turn(
reasoning_effort: str | None = None,
on_progress: Callback | None = None,
on_stream_text: Callback | None = None,
on_token_usage: TokenUsageCallback | None = None,
) -> NativeRunResult:
diagnostics: list[str] = []
final_text = ""
pending_agent_messages: dict[str, str] = {}
pending_agent_message_phases: dict[str, str | None] = {}
last_streamed_text: dict[str, str] = {}
last_compaction_notice: dict[str, str] = {}

@@ -414,15 +422,30 @@ async def emit_compaction_notice(agent_key: str, text: str) -> None:
continue
if item_type == "agentMessage":
item_id = item.get("id")
if isinstance(item_id, str) and item_id:
pending_agent_messages.pop(f"{agent_key}:{item_id}", None)
phase = item.get("phase")
item_key = (
f"{agent_key}:{item_id}"
if isinstance(item_id, str) and item_id
else None
)
if item_key is not None:
pending_agent_message_phases[item_key] = (
phase if isinstance(phase, str) else None
)
if (
method == "item/completed"
and isinstance(item_id, str)
and item_id
):
pending_agent_messages.pop(item_key, None)
pending_agent_message_phases.pop(item_key, None)
text = item.get("text")
if isinstance(text, str) and text.strip():
phase = item.get("phase")
stripped = text.strip()
await emit_stream_update(agent_key, stripped)
if phase != "commentary" and agent_key == "main":
final_text = stripped
if phase != "commentary":
if agent_key == "main":
final_text = stripped
continue

if method == "item/agentMessage/delta":
@@ -454,9 +477,44 @@ async def emit_compaction_notice(agent_key: str, text: str) -> None:
)
notice = _extract_compaction_notice(params) or "已压缩较早对话上下文。"
await emit_compaction_notice(agent_key, notice)

if method == "thread/tokenUsage/updated":
agent_key = _normalize_agent_key(
params.get("threadId"),
main_thread_id=thread_id,
)
if agent_key != "main":
continue
token_usage = params.get("tokenUsage")
if not isinstance(token_usage, dict):
continue
total = token_usage.get("total")
total_tokens = (
total.get("totalTokens") if isinstance(total, dict) else None
)
model_context_window = token_usage.get("modelContextWindow")
if not isinstance(total_tokens, int):
continue
if model_context_window is not None and not isinstance(
model_context_window, int
):
model_context_window = None
await _maybe_call(
on_token_usage,
NativeTokenUsage(
total_tokens=total_tokens,
model_context_window=model_context_window,
),
)
continue

if method == "turn/completed":
completed_agent_key = _normalize_agent_key(
params.get("threadId"),
main_thread_id=thread_id,
)
if completed_agent_key != "main":
continue
turn = params.get("turn")
if not isinstance(turn, dict):
return NativeRunResult(
@@ -470,10 +528,14 @@ async def emit_compaction_notice(agent_key: str, text: str) -> None:
(
key
for key in reversed(list(pending_agent_messages))
if key.endswith(":main") or key == "__legacy__:main"
if key.startswith("main:") or key == "__legacy__:main"
),
None,
)
if fallback_key is not None:
fallback_phase = pending_agent_message_phases.get(fallback_key)
if fallback_phase == "commentary":
fallback_key = None
if fallback_key is not None:
buffered_text = pending_agent_messages[fallback_key].strip()
if buffered_text:
@@ -580,6 +642,13 @@ async def list_threads(self) -> list[NativeThreadSummary]:

return threads

async def read_rate_limits(self) -> dict[str, Any]:
result = await self._request("account/rateLimits/read", {})
snapshot = result.get("rateLimits")
if not isinstance(snapshot, dict):
raise RuntimeError("account/rateLimits/read 缺少 rateLimits 响应。")
return snapshot

def _permission_params(self, permission_mode: str) -> dict[str, str]:
if permission_mode == "safe":
return {"approvalPolicy": "never", "sandbox": "workspace-write"}