Merged
74 changes: 74 additions & 0 deletions docs/maintenance/2026-03-17-status-rate-limit-visibility.md
@@ -0,0 +1,74 @@
# /status Rate Limit Visibility

## Feature Summary

Enhance the Telegram `/status` command so it shows the active half-day usage window, current usage percentage, remaining percentage, and the refresh time or countdown.

## Problem Background

Current behavior:

1. Open a Telegram chat that uses `nonebot-plugin-codex`.
2. Send `/status`.
3. The plugin opens the generic workspace panel only.

Current gaps:

- no current morning or afternoon usage state
- no usage percentage for the active quota window
- no remaining percentage
- no refresh time or countdown
- no graceful status text when rate-limit data is temporarily unavailable

In practice, `/status` is currently an alias of `/panel`: the panel shows chat preferences, workdir, session state, and recent history, but it exposes no quota information.

## Proposal

- Keep `/status` as the Telegram entrypoint for operational state.
- Extend the existing workspace or status rendering path with a dedicated rate-limit section.
- Prefer official Codex account rate-limit data from the `codex app-server` lane when available.
- Display percentage-based data and refresh timing rather than guessed absolute credit counts.
- Show morning or afternoon wording based on the active local window.
- Fall back to explicit unavailable text when upstream rate-limit data cannot be fetched.

Expected user-visible result:

- current morning or afternoon status
- used percentage
- remaining percentage
- reset time
- human-readable time until refresh

## Alternatives

- Infer quota state only from local session token-usage logs.
- This is insufficient because local usage does not equal account-level remaining quota or refresh timing.
- Keep `/status` unchanged and add a separate quota command.
- This is possible, but weaker for discoverability because users already expect `/status` to answer this question.

## Scope And Constraints

- Preserve command compatibility for `/status` and `/panel` unless a deliberate behavior change is documented.
- Do not silently change documented config semantics.
- Keep the implementation small and reviewable.
- Follow TDD for behavior changes.
- If upstream rate-limit data is unavailable, degrade cleanly instead of estimating.

Affected files or commands likely include:

- `src/nonebot_plugin_codex/service.py`
- `src/nonebot_plugin_codex/telegram.py`
- `src/nonebot_plugin_codex/native_client.py`
- `tests/test_service.py`
- `tests/test_telegram_handlers.py`
- `/status`
- `/panel`
- `codex app-server`

## Verification Plan

- Add service-level tests covering status rendering with and without rate-limit data.
- Add Telegram handler tests confirming `/status` shows the enriched status panel.
- Run `pdm run pytest tests/test_service.py tests/test_telegram_handlers.py -q`.
- Run `pdm run pytest -q`.
- Run `pdm run ruff check .`.
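
A minimal shape for the with/without rate-limit tests could look like this. The async fetch hook and `build_status_text` helper are hypothetical stand-ins; the real helper names live in `service.py` and may differ.

```python
import asyncio

async def build_status_text(fetch_rate_limits) -> str:
    # Render the quota line, falling back to explicit unavailable text
    # when the upstream fetch fails.
    try:
        snapshot = await fetch_rate_limits()
    except RuntimeError:
        return "rate-limit data unavailable"
    used = snapshot["usedPercent"]  # hypothetical field name
    return f"used {used}%, remaining {100 - used}%"

async def _ok():
    return {"usedPercent": 25}

async def _down():
    raise RuntimeError("upstream unavailable")

def test_status_with_rate_limits():
    assert asyncio.run(build_status_text(_ok)) == "used 25%, remaining 75%"

def test_status_without_rate_limits():
    assert asyncio.run(build_status_text(_down)) == "rate-limit data unavailable"
```
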
18 changes: 18 additions & 0 deletions docs/maintenance/2026-03-17-telegram-subagent-visibility.md
@@ -32,6 +32,17 @@ When the plugin uses the native `codex app-server` lane and Codex delegates work
- The native client forwarded all `agentMessage` deltas and completed texts without checking `phase`.
- Commentary text could therefore appear in the stream/final reply path.
- Collaboration tool calls were ignored, so the Telegram progress panel lacked main-agent/subagent context.
- In follow-up testing, main-agent final text could still be lost when it only existed in
`item/agentMessage/delta` frames, because the native fallback looked up the wrong
buffered key on `turn/completed`.
- Another follow-up issue appeared after multi-agent support landed: the native runner
  treated any `turn/completed` as the end of the active run, including subagent turns.
  That could bind later follow-up prompts to the subagent thread instead of the main
  thread, and it could also leak subagent result text into the main-agent
  final-answer path.
- When that happened, Telegram could still finalize the main progress panel as
`Codex 已完成。`, then separately send `Codex 已完成,但没有返回可展示的最终文本。`,
which made successful-looking runs appear to stop right after a subagent failure.

## Affected Modules

@@ -48,3 +59,10 @@ When the plugin uses the native `codex app-server` lane and Codex delegates work
- final-answer `agentMessage`
- Confirm that only the final answer reaches `on_stream_text`.
- Confirm that progress updates mention both the main agent and the subagent state.
- Add a native-client regression where a subagent reports `errored` but the main agent
still produces a final answer through delta-only fallback.
- Add a native-client regression where a subagent emits its own `turn/completed`
before the main thread finishes, and confirm the client waits for the main
`turn/completed` before returning or updating the stored thread id.
- Confirm that Telegram uses `Codex 已完成,但没有返回可展示的最终文本。` for the main
progress panel instead of a plain `Codex 已完成。` when no final text is available.
10 changes: 10 additions & 0 deletions src/nonebot_plugin_codex/__init__.py
@@ -134,6 +134,12 @@ async def _sync_telegram_commands(bot: Bot) -> None:
block=True,
rule=handlers.is_workspace_callback,
)
status_callback = on_type(
CallbackQueryEvent,
priority=10,
block=True,
rule=handlers.is_status_callback,
)

@codex_cmd.handle()
async def _handle_codex(
@@ -247,6 +253,10 @@ async def _handle_workspace_callback(
) -> None:
await handlers.handle_workspace_callback(bot, event)

@status_callback.handle()
async def _handle_status_callback(bot: Bot, event: CallbackQueryEvent) -> None:
await handlers.handle_status_callback(bot, event)

@follow_up.handle()
async def _handle_follow_up(bot: Bot, event: MessageEvent) -> None:
await handlers.handle_follow_up(bot, event)
87 changes: 78 additions & 9 deletions src/nonebot_plugin_codex/native_client.py
@@ -15,7 +15,14 @@ class NativeAgentUpdate:
text: str


Callback = Callable[[NativeAgentUpdate], object]
@dataclass(slots=True)
class NativeTokenUsage:
total_tokens: int
model_context_window: int | None = None


Callback = Callable[[Any], object]
TokenUsageCallback = Callable[[NativeTokenUsage], object]
ProcessLauncher = Callable[..., Awaitable[Any]]


@@ -65,7 +72,7 @@ def _thread_summary_from_payload(thread: dict[str, Any]) -> NativeThreadSummary:
)


async def _maybe_call(callback: Callback | None, update: NativeAgentUpdate) -> None:
async def _maybe_call(callback: Callback | None, update: Any) -> None:
if callback is None:
return
result = callback(update)
@@ -211,7 +218,6 @@ def _format_collab_tool_progress(

return updates


async def _terminate_process(process: Any, timeout: float) -> None:
if process is None:
return
@@ -321,10 +327,12 @@ async def run_turn(
reasoning_effort: str | None = None,
on_progress: Callback | None = None,
on_stream_text: Callback | None = None,
on_token_usage: TokenUsageCallback | None = None,
) -> NativeRunResult:
diagnostics: list[str] = []
final_text = ""
pending_agent_messages: dict[str, str] = {}
pending_agent_message_phases: dict[str, str | None] = {}
last_streamed_text: dict[str, str] = {}
last_compaction_notice: dict[str, str] = {}

@@ -414,15 +422,30 @@ async def emit_compaction_notice(agent_key: str, text: str) -> None:
continue
if item_type == "agentMessage":
item_id = item.get("id")
if isinstance(item_id, str) and item_id:
pending_agent_messages.pop(f"{agent_key}:{item_id}", None)
phase = item.get("phase")
item_key = (
f"{agent_key}:{item_id}"
if isinstance(item_id, str) and item_id
else None
)
if item_key is not None:
pending_agent_message_phases[item_key] = (
phase if isinstance(phase, str) else None
)
if (
method == "item/completed"
and isinstance(item_id, str)
and item_id
):
pending_agent_messages.pop(item_key, None)
pending_agent_message_phases.pop(item_key, None)
text = item.get("text")
if isinstance(text, str) and text.strip():
phase = item.get("phase")
stripped = text.strip()
await emit_stream_update(agent_key, stripped)
if phase != "commentary" and agent_key == "main":
final_text = stripped
if phase != "commentary":
if agent_key == "main":
final_text = stripped
continue

if method == "item/agentMessage/delta":
@@ -454,9 +477,44 @@ async def emit_compaction_notice(agent_key: str, text: str) -> None:
)
notice = _extract_compaction_notice(params) or "已压缩较早对话上下文。"
await emit_compaction_notice(agent_key, notice)

if method == "thread/tokenUsage/updated":
agent_key = _normalize_agent_key(
params.get("threadId"),
main_thread_id=thread_id,
)
if agent_key != "main":
continue
token_usage = params.get("tokenUsage")
if not isinstance(token_usage, dict):
continue
total = token_usage.get("total")
total_tokens = (
total.get("totalTokens") if isinstance(total, dict) else None
)
model_context_window = token_usage.get("modelContextWindow")
if not isinstance(total_tokens, int):
continue
if model_context_window is not None and not isinstance(
model_context_window, int
):
model_context_window = None
await _maybe_call(
on_token_usage,
NativeTokenUsage(
total_tokens=total_tokens,
model_context_window=model_context_window,
),
)
continue

if method == "turn/completed":
completed_agent_key = _normalize_agent_key(
params.get("threadId"),
main_thread_id=thread_id,
)
if completed_agent_key != "main":
continue
turn = params.get("turn")
if not isinstance(turn, dict):
return NativeRunResult(
@@ -470,10 +528,14 @@ async def emit_compaction_notice(agent_key: str, text: str) -> None:
(
key
for key in reversed(list(pending_agent_messages))
if key.endswith(":main") or key == "__legacy__:main"
if key.startswith("main:") or key == "__legacy__:main"
),
None,
)
if fallback_key is not None:
fallback_phase = pending_agent_message_phases.get(fallback_key)
if fallback_phase == "commentary":
fallback_key = None
if fallback_key is not None:
buffered_text = pending_agent_messages[fallback_key].strip()
if buffered_text:
@@ -580,6 +642,13 @@ async def list_threads(self) -> list[NativeThreadSummary]:

return threads

async def read_rate_limits(self) -> dict[str, Any]:
result = await self._request("account/rateLimits/read", {})
snapshot = result.get("rateLimits")
if not isinstance(snapshot, dict):
raise RuntimeError("account/rateLimits/read 缺少 rateLimits 响应。")
return snapshot

def _permission_params(self, permission_mode: str) -> dict[str, str]:
if permission_mode == "safe":
return {"approvalPolicy": "never", "sandbox": "workspace-write"}