Skip to content

fix: show external Codex usage without inflating OpenCode Zen#119

Merged
kargnas merged 6 commits intoopgginc:mainfrom
Daltonganger:fix/minimal-codex-lb-usage
Apr 7, 2026
Merged

fix: show external Codex usage without inflating OpenCode Zen#119
kargnas merged 6 commits intoopgginc:mainfrom
Daltonganger:fix/minimal-codex-lb-usage

Conversation

@Daltonganger
Copy link
Copy Markdown
Contributor

Summary

  • add codex-lb self-service /v1/usage support for external OpenAI API-key Codex accounts and map the returned limits into Codex quota data
  • keep OpenCode Zen from counting openai/* spend as pay-as-you-go when OpenCode routes OpenAI through provider.openai.options.baseURL
  • add focused TokenManager and decoding support for OpenAI API-key auth plus regression tests for both flows

Testing

  • make lint-swift
  • xcodebuild -project "CopilotMonitor.xcodeproj" -scheme "CopilotMonitor" -configuration Debug build -quiet

@kargnas
Copy link
Copy Markdown
Member

kargnas commented Apr 2, 2026

@op-gg-ai-devops

@opgginc opgginc deleted a comment from op-gg-ai-devops bot Apr 2, 2026
@op-gg-ai-devops
Copy link
Copy Markdown
Contributor

op-gg-ai-devops bot commented Apr 2, 2026

✅ AI Code Review Completed

Status Task Result
Transform severity/category into shield badges and create final payload Done
Post review via gh api Done — 6 inline comments posted
Update status comment on success Done

Review posted successfully — APPROVED with 6 minor inline comments.

Reviewed with anthropic/claude-opus-4-6


📋 View Full Logs

Copy link
Copy Markdown
Contributor

@op-gg-ai-devops op-gg-ai-devops bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: fix: show external Codex usage without inflating OpenCode Zen

Solid contribution from an external contributor. This PR does three things well:

  1. Self-service /v1/usage endpoint for API-key Codex accounts — clean routing logic with proper URL path rewriting, flexible decoding for mixed API response shapes, and a sensible partition of base vs spark limits. The SelfServiceLimit decoder is defensive and handles all the weird type coercion this project's APIs love to throw. Good stuff.

  2. OpenCode Zen cost deduction — removing openai/* model costs from the pay-as-you-go total when OpenCode routes through an external Codex LB prevents double-counting. The proportional avgCostPerDay adjustment is mathematically sound.

  3. OpenAI API key auth flow — reading openaiAPIKey from both auth.json (as APIKey struct) and ~/.codex/auth.json (as raw OPENAI_API_KEY string) expands auth coverage cleanly.

Tests are thorough — URL rewriting, self-service payload mapping, missing limits, and the OpenCode Zen exclusion logic all have focused test cases. CI is green across build, lint, and test.

A few minor things to consider below, but nothing blocking.


Reviewed with anthropic/claude-opus-4-6


The bot will automatically re-review when new commits are pushed to this branch. You can also request a review by mentioning @op-gg-ai-devops in a comment. Examples:

  • @op-gg-ai-devops review typos
  • @op-gg-ai-devops focus on security
  • @op-gg-ai-devops check the URL rewriting logic

@op-gg-ai-devops
Copy link
Copy Markdown
Contributor

op-gg-ai-devops bot commented Apr 2, 2026

❌ AI Code Review Cancelled

Newer commit detected or workflow was manually cancelled.


📋 View Logs | 🤖 Model: anthropic/claude-opus-4-6

@kargnas
Copy link
Copy Markdown
Member

kargnas commented Apr 3, 2026

@Daltonganger there is a conflict!

@kargnas kargnas marked this pull request as draft April 3, 2026 16:02
@Daltonganger
Copy link
Copy Markdown
Contributor Author

I will have a look! Thank you for the double check, Im still learning from it!

Keep the safer OpenCode OpenAI auth decoding from main while preserving the PR's API-key and JSONC config coverage for Codex usage.
@Daltonganger Daltonganger marked this pull request as ready for review April 3, 2026 16:21
@kargnas
Copy link
Copy Markdown
Member

kargnas commented Apr 7, 2026

@op-gg-ai-devops

@op-gg-ai-devops
Copy link
Copy Markdown
Contributor

op-gg-ai-devops bot commented Apr 7, 2026

✅ AI Code Review Completed

Status Task Result
Read runtime context Done — English, IS_FORK=true
Gather PR context & resolved issues Done — 6 prior threads all resolved
Code analysis Done — 3 findings (2 major, 1 minor)
Build review payload Done — review_payload.json ready (REQUEST_CHANGES)
Post review Done — Review posted successfully

Reviewed with openrouter/qwen/qwen3.6-plus:free


📋 View Full Logs

Copy link
Copy Markdown
Contributor

@op-gg-ai-devops op-gg-ai-devops bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: fix: show external Codex usage without inflating OpenCode Zen (re-review)

All six inline comments from the previous round were addressed cleanly. The codexWindowMetadata helper deriving window labels from limit_window_seconds is exactly the right approach, sameUsageValue with epsilon comparison is solid, and the subscriptionId doc comment is back where it belongs. The double-/v1 URL injection fix is well-protected by the hasSuffix("/v1/usage") early-exit guard, and the SelfServiceLimit decoder handles mixed numeric types gracefully.

Two things caught my attention in this re-review:

1. formatCodexWindowLabel threshold vs. test expectation

compactWindowLabel(2419200) returns "4w" (exactly 4 weeks), which is passed into formatCodexWindowLabel. That function reaches the hours path: 4 × 24 × 7 = 672 hours → 28 days → hits if days >= 28 { return "Monthly" } and returns "Monthly". But testDecodeUsagePayloadDerivesStandardWindowLabelsFromLimitSeconds asserts sparkSecondaryWindowLabel == "28d" for the 2419200-second window. These two disagree — the test would fail if actually run. Either the threshold should be >= 30 (to reserve "Monthly" for 30-31 day periods and let 28-day windows render as "28d"), or the test expectation should be updated to "Monthly". Based on the test intent, >= 30 looks correct.

2. CI "Test" job doesn't actually run tests

test.yml has two xcodebuild build steps but zero xcodebuild test steps. The green "Test" CI badge only confirms the project compiles, not that any XCTest assertion passes. This is how the threshold mismatch above slipped through: no tests were executed. Adding xcodebuild test -scheme CopilotMonitor -destination 'platform=macOS' would catch regressions like this.


The isEmpty != false optional comparison in TokenManager is a minor style nit — see inline.

Note: Smoke tests could not be run in the review environment (no Xcode available on Linux runner). CI build results were used as the build signal.

if hours % 24 == 0 {
let days = hours / 24
if days == 7 { return "Weekly" }
if days >= 28 { return "Monthly" }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Major: bug threshold mismatch: 28-day window returns Monthly, but test expects 28d
compactWindowLabel(2419200) produces "4w", and then formatCodexWindowLabel("4w") computes 4 × 24 × 7 = 672 hours → 28 days → hits this branch and returns "Monthly". But the test at line 468 of CodexProviderTests.swift asserts sparkSecondaryWindowLabel == "28d" for that same input. The test would fail at runtime. Change to >= 30 to keep 28-day (4-week) windows as "28d":

if days >= 30 { return "Monthly" }

XCTAssertEqual(payload.details.codexSecondaryWindowHours, 336)
XCTAssertEqual(payload.details.sparkPrimaryWindowLabel, "12h")
XCTAssertEqual(payload.details.sparkPrimaryWindowHours, 12)
XCTAssertEqual(payload.details.sparkSecondaryWindowLabel, "28d")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Major: bug test expectation conflicts with code: 28-day window formatted as Monthly, not 28d
This test expects "28d" for limit_window_seconds: 2419200 (= 4 × 604800, exactly 4 weeks), but formatCodexWindowLabel would return "Monthly" because 2419200 → compactWindowLabel"4w" → 672h → 28 days → days >= 28 → "Monthly". The assertion will fail when tests are actually run. See the corresponding fix in CodexProvider.swift:842.

}

if let codexAuth = readCodexAuth(),
codexAuth.tokens?.accessToken?.isEmpty != false,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: style confusing nil comparison: isEmpty != false on Optional
The != false comparison on an Optional<Bool> is correct but hard to read at a glance. Prefer:

(codexAuth.tokens?.accessToken ?? "").isEmpty

This returns true when accessToken is nil or empty, and false when it has a real value — the same semantics but obviously so.

@kargnas
Copy link
Copy Markdown
Member

kargnas commented Apr 7, 2026

@op-gg-ai-devops 스크린샷 보여줘

@op-gg-ai-devops
Copy link
Copy Markdown
Contributor

op-gg-ai-devops bot commented Apr 7, 2026

✅ AI Code Review Completed

Status Task Result
[S1] Read runtime context and lock review language Context read and English locked.
[S2] Initialize todo tracking Task tracking initialized.
[S3] Recall past team feedback via supermemory Team feedback retrieved.
[S4] Gather resolved issues, dependency, and repo context Repo and issue context mapped.
[S5] Save durable team feedback to supermemory Insights persisted to memory.
[S6] Run large-PR domain analysis Domain impact analyzed.
[S7] Review PR exhaustively and filter duplicates Review items generated and deduplicated.
[S8] Smoke tests — macOS project, Linux CI, build-only Build verification successful.
[S9] Collect CI status, existing reviews, and env gaps Environment gaps identified.
[S10] Save stable repo knowledge to supermemory Codebase patterns indexed.
[S11] Visual snapshot — skipped (native macOS app, no web UI) Skipped (Native macOS).
[S12] Build review_payload.json Payload successfully generated.
[S13] Update status comment and stop Completed

Reviewed with anthropic/claude-opus-4-6|max


📋 View Full Logs

Copy link
Copy Markdown
Contributor

@op-gg-ai-devops op-gg-ai-devops bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

Codex self-service API key 인증 지원과 OpenCode Zen 이중 과금 방지 로직 전반적으로 잘 짜여있음. OpenAICredentialType enum 설계도 깔끔하고, codexWindowMetadata 헬퍼로 window label 추출하는 것도 적절한 리팩토링임.

좋은 점:

  • SelfServiceUsageResponse의 유연한 타입 디코딩 (parseFlexibleDateString, numeric field fallbacks) — 외부 API 스키마가 불안정할 때 이런 방어적 디코딩이 필수적임
  • adjustStatsForDisplay의 static 설계로 테스트 용이성 확보한 점
  • buildStandardPayloadbaseWindows?.shortWindow ?? primaryWindow ?? synthetic zero fallback 체인이 resilience 측면에서 좋음
  • 테스트 커버리지 추가가 의미 있음 — 특히 CodexProviderTests에 8개 새 케이스

주요 지적사항:

  • formatCodexWindowLabel에서 days >= 28을 "Monthly"로 매핑하는데, 테스트는 28일 window에 "28d"를 기대함 — CI가 xcodebuild test를 실행하지 않아서 잡히지 않고 있음
  • Self-service limit 배열 디코딩이 all-or-nothing이라 한 원소만 깨져도 전체 limit이 사라짐
  • Usage percent가 100으로 하드캡되어 overage 표시 불가 (standard path와 불일치)

CI 상태: Lint ✅ | Build ✅ | Test ✅ | CI ✅ (단, Test는 build만 하고 실제 unit test는 실행하지 않음)

Visual Changes

📸 스크린샷 캡처 실패: 이 PR은 macOS 네이티브 Swift 앱의 백엔드/데이터 레이어 변경사항(CodexProvider, OpenCodeZenProvider, TokenManager)만 포함하고 있어서 웹 UI 스크린샷 캡처가 불가능함. 변경된 파일이 모두 .swift 파일이며, UI 컴포넌트 변경은 없음.

Avoid treating numeric 28-31 day windows as monthly so Codex usage labels remain exact across February and other month lengths. Also run real XCTest coverage in CI and align merge-related TokenManager call sites with the updated models.
@op-gg-ai-devops
Copy link
Copy Markdown
Contributor

op-gg-ai-devops bot commented Apr 7, 2026

✅ AI Code Review Completed

Status Task Result
S1: Read runtime context and lock review language Context read and English locked.
S2: Initialize todo tracking Todos initialized for tracking.
S3: Recall past team feedback via supermemory Recalled feedback from supermemory.
S4: Gather resolved issues, dependency, and repo context Context and dependencies gathered.
S5: Save durable team feedback to supermemory Cancelled.
S6: Large-PR domain analysis Domain analysis finished.
S7: Exhaustive code review and filter duplicates Review complete and nits filtered.
S8: Run smoke tests and dependency installation Tests passed successfully.
S9: Collect CI status, existing reviews, env gaps CI and review context gathered.
S10: Save stable repo knowledge to supermemory Cancelled.
S11: Visual snapshot (conditional) Cancelled.
S12: Build review_payload.json Payload generated.
S13: Post review and update status comment Review posted successfully.

Reviewed with anthropic/claude-opus-4-6|max


📋 View Full Logs

Copy link
Copy Markdown
Contributor

@op-gg-ai-devops op-gg-ai-devops bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: fix: keep Codex day windows explicit

nice work here — the self-service API key path, flexible decoding, and OpenCode Zen cost dedup are all well thought out. the DecodedUsagePayload extraction is clean and the adjustStatsForDisplay logic is a solid approach to avoid double-counting routed OpenAI usage.

a few things worth addressing:

good stuff:

  • decodeFlexibleDouble / decodeFlexibleDate handle the typical API type chaos gracefully (Double, Int, String, timestamps vs ISO8601). solid defensive coding.
  • buildSelfServicePayload and buildStandardPayload split is clean — keeps the two response formats from contaminating each other.
  • sameUsageValue with epsilon comparison is the right call for floating-point dedup.
  • CI workflow set -o pipefail fix means actual build failures won't silently pass anymore. good catch.
  • test coverage for the new URL rewriting, self-service payload decoding, and stats adjustment is thorough.

heads up:

  • the limits array decoding swallows the entire array if any single entry fails — one bad limit kills all valid ones. see inline comment.
  • self-service path caps usage at 100% while standard path allows raw percentages through. inconsistent behavior for overage scenarios.
  • ISO8601DateFormatter is expensive to allocate and gets created fresh on every parseFlexibleDateString call.
  • the codexAuth.tokens?.accessToken?.isEmpty != false guard in TokenManager works correctly but reads like a logic puzzle.

all CI checks are green (Build, SwiftLint, GitHub Actions Lint, Build & Test all pass). could not run local smoke tests since this is a macOS Xcode project and review runs on Linux.


This branch will be automatically reviewed again on new pushes. You can also request a re-review by mentioning @op-gg-ai-devops in a PR comment with optional directions, e.g.:

  • @op-gg-ai-devops review typos
  • @op-gg-ai-devops focus on error handling
  • @op-gg-ai-devops check the test coverage

Reviewed with anthropic/claude-opus-4-6|max

totalTokens = try container.decodeIfPresent(Int.self, forKey: .totalTokens)
cachedInputTokens = try container.decodeIfPresent(Int.self, forKey: .cachedInputTokens)
totalCostUSD = try container.decodeIfPresent(Double.self, forKey: .totalCostUSD)
limits = (try? container.decodeIfPresent([SelfServiceLimit].self, forKey: .limits)) ?? []
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Major: bug all-or-nothing limit decode: one bad entry drops all valid windows
(try? container.decodeIfPresent([SelfServiceLimit].self, forKey: .limits)) ?? [] drops the entire array if any single element fails to decode. one malformed limit entry (e.g. unexpected type for max_value) silently wipes all valid windows.

consider a lossy per-element approach:

let rawLimits = (try? container.decodeIfPresent([AnyCodable].self, forKey: .limits)) ?? []
limits = rawLimits.compactMap { element in
    try? JSONDecoder().decode(SelfServiceLimit.self, from: JSONEncoder().encode(element))
}

or use a LossyDecodableArray<SelfServiceLimit> wrapper so valid entries survive alongside broken ones.

}

let rawPercent = (currentValue / maxValue) * 100.0
let usagePercent = min(max(rawPercent, 0), 100)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: bug usage percent cap inconsistency: self-service clips at 100% but standard path doesn't
min(max(rawPercent, 0), 100) caps self-service usage at 100%, but buildStandardPayload passes raw used_percent straight through without clamping. if the API reports >100% during overage, the standard path shows it while self-service hides it.

either cap both paths consistently or let both pass raw values through — the display layer should decide how to present overages.

}

if let codexAuth = readCodexAuth(),
codexAuth.tokens?.accessToken?.isEmpty != false,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: style triple-optional negation: correct but reads like a logic puzzle
codexAuth.tokens?.accessToken?.isEmpty != false is a triple-optional-negation chain that correctly means "proceed when there's no usable OAuth token" — but it reads like a logic puzzle. a future maintainer could easily misread or break this.

something like this would be self-documenting:

let hasOAuthToken = codexAuth.tokens?.accessToken.map { !$0.isEmpty } ?? false
guard !hasOAuthToken,

: Date(timeIntervalSince1970: timestamp)
}

let formatterWithFractional = ISO8601DateFormatter()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: perf ISO8601DateFormatter allocated per call: expensive object re-created on every parse
ISO8601DateFormatter() is expensive to allocate — two new instances get created on every parseFlexibleDateString call. for a provider that fetches periodically, this adds up.

hoist them to static properties:

private static let isoFormatterWithFrac: ISO8601DateFormatter = {
    let f = ISO8601DateFormatter()
    f.formatOptions = [.withInternetDateTime, .withFractionalSeconds]
    return f
}()
private static let isoFormatter: ISO8601DateFormatter = {
    let f = ISO8601DateFormatter()
    f.formatOptions = [.withInternetDateTime]
    return f
}()

@kargnas kargnas merged commit 03f3824 into opgginc:main Apr 7, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants