feat: revenue + TEE hardening (Phase 2, 7 sprints) #4
Phase 2 Sprint 1 of the revenue-and-tee track. Adds two privacy primitives that run on the existing API service without infra changes.

Auto-purge
- New sthrip/services/purge_service.py with per-table purge functions for transactions, escrow_deals, escrow_milestones, message_relays, and audit_log, plus a run_full_purge orchestrator.
- Terminal-status guard on the three payment-graph tables; FK protection on transactions (active escrow referencing same agents blocks delete).
- HMAC chain rolling reset for audit_log: synthetic chain_reset row with NULL prev_hmac/entry_hmac, which the existing verify_chain already treats as legacy. The new chain restarts seamlessly.
- New env STHRIP_DATA_RETENTION_DAYS (default 60, validated 7..365).

Warrant canary
- New sthrip/services/canary_service.py: Ed25519 detached signature over canonical JSON. publish_daily_canary upserts row id=1. get_current_canary returns None when stored signature is older than 48h (staleness signal).
- New env CANARY_SIGNING_KEY (base64-encoded Ed25519 32-byte seed).
- Dedicated key per Lead Decision: compromise of one signing key does not cascade into webhook integrity.
- New endpoint GET /.well-known/canary.txt: 200 fresh, 503 stale.

Migration
- migrations/versions/w4x5y6z7a8b9_purge_metadata.py adds purge_metadata and canary_state tables. Idempotent CREATE / DROP. Up/down round-trip verified on SQLite stamped at the predecessor revision.

Scheduler
- Two new background tasks in api/main_v2.py: _purge_loop (daily 03:00 UTC) and _canary_loop (daily 03:05 UTC). Both use the existing Redis distributed-lease pattern so only one replica runs each cycle. Canary loop short-circuits when CANARY_SIGNING_KEY is unset (opt-in).

Tests
- 17 new tests across tests/test_purge_service.py and tests/test_canary_service.py. Covers all 10 contract acceptance criteria plus 7 supporting cases. Full suite: 2812 passed (+17 vs baseline 2795), 24 pre-existing failures unchanged, zero regressions.

Threat model
- docs/THREAT_MODEL.md notes the data-minimization addition.
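To make the canary staleness rule above concrete, here is a minimal sketch of the 48h freshness check and the 200/503 mapping. It is not the code in canary_service.py. The row shape (`payload`, `signed_at`), the `canary_response` helper, and the choice of sorted keys with compact separators as "canonical JSON" are all assumptions for illustration. Ed25519 signing is left out because the Python standard library does not provide it.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

# Source: a stored signature older than 48h is treated as stale.
STALE_AFTER = timedelta(hours=48)


def canonical_json(payload: dict) -> bytes:
    """Deterministic encoding (assumption: sorted keys, compact separators)."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def get_current_canary(row: Optional[dict], now: datetime) -> Optional[dict]:
    """Return the stored row, or None when it is missing or older than 48h."""
    if row is None:
        return None
    if now - row["signed_at"] > STALE_AFTER:
        return None
    return row


def canary_response(row: Optional[dict], now: datetime) -> Tuple[int, str]:
    """Map freshness to the endpoint contract: 200 fresh, 503 stale."""
    current = get_current_canary(row, now)
    if current is None:
        return 503, "canary stale or missing"
    return 200, canonical_json(current["payload"]).decode("utf-8")
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the staleness boundary easy to test.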
… (Phase 2 Sprint 3)

Adds the FREE-tier 100-tx/month enforcement primitive plus the self-service tier-management endpoints required by lead-decisions.md.

Schema (migration y6z7a8b9c0d1):
- agents.tier_grace_until TIMESTAMPTZ NULL: billing cron sets this when an XMR auto-debit fails; middleware preserves declared tier until expiry, after which the agent is enforced as FREE.
- New table agent_monthly_stats (PK on (agent_id, month_start)): per-agent monthly transfer counter with composite-key index.
- Grandfather invariant: rows with tier IS NULL are set to FREE.

Stats counter (sthrip/services/agent_stats_service.py):
- record_transaction uses INSERT ... ON CONFLICT DO UPDATE on Postgres and INSERT OR IGNORE + guarded UPDATE on SQLite; atomic on both.
- Wired into TransactionRepository.create_with_commission AFTER the fee-row insert and BEFORE commit (single DB transaction). Legacy create() path is intentionally NOT wired; system-op writers don't bump.

Middleware (api/middleware/tier_limit.py):
- Gates POST /v2/payments/hub-routing and /v2/payments/internal-transfer for FREE agents at FREE_TIER_MONTHLY_LIMIT (100). Paid tiers unlimited. 429 body matches contract: tier_limit_reached / current_count / limit / upgrade_url / message.
- Honors tier_grace_until: future = preserve declared paid tier; past = enforce as FREE.
- Registered in api/main_v2.py AFTER configure_middleware so request-id / metrics / auth run first.

Self-service endpoints (api/routers/agents.py):
- POST /v2/me/upgrade: accepts both labels (pro/enterprise) and enum (verified/premium/free), case-insensitive. Audit-logs tier_upgrade.
- POST /v2/me/downgrade: symmetric; audit-logs tier_downgrade.
- GET /v2/me/tier: returns tier/label/current_month_count/limit/remaining/tier_grace_until.
- XMR billing stubbed with explicit "TODO: Sprint 4" markers.

Tests:
- 14 new tests in tests/test_tier_enforcement.py covering all 14 contract criteria (limit enforcement, upgrade/downgrade label parsing, audit emission, grace semantics both ways, month rollover, counter atomicity, system-op bypass, migration grandfather).
- Existing fixtures updated to create the new agent_monthly_stats table: tests/conftest.py, tests/test_commission_on_transfer.py, tests/test_idempotency_db_v2.py.
- tests/test_production_fixes_round2.py::test_body_limit_covers_delete now finds api/middleware.py OR api/middleware/__init__.py (the middleware module became a package this sprint).

Suite delta: 2842 -> 2856 passed (+14). Failure-set unchanged (diff against baseline = empty). Zero regressions.

Migration round-trip verified in isolation: alembic stamp x5y6z7a8b9c0 -> upgrade -> downgrade -> upgrade
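The grace and limit rules in the Sprint 3 commit above can be sketched as a pure decision function. This is an illustration under stated assumptions, not the code in api/middleware/tier_limit.py. The function names, the `"error"` key carrying `tier_limit_reached`, the `upgrade_url` value, and the message text are hypothetical. Only `FREE_TIER_MONTHLY_LIMIT`, the listed body fields, and the future/past grace semantics come from the commit message.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional

FREE_TIER_MONTHLY_LIMIT = 100  # from the commit message


def effective_tier(declared_tier: str,
                   tier_grace_until: Optional[datetime],
                   now: datetime) -> str:
    """Grace in the future (or unset) keeps the declared tier; expired grace means FREE."""
    if tier_grace_until is not None and tier_grace_until <= now:
        return "free"
    return declared_tier


def check_tier_limit(declared_tier: str,
                     tier_grace_until: Optional[datetime],
                     current_count: int,
                     now: datetime) -> Optional[Dict[str, Any]]:
    """Return None when the transfer may proceed, else a 429 description."""
    tier = effective_tier(declared_tier, tier_grace_until, now)
    if tier != "free" or current_count < FREE_TIER_MONTHLY_LIMIT:
        return None  # paid tiers are unlimited; FREE is allowed below the limit
    return {
        "status_code": 429,
        "body": {
            "error": "tier_limit_reached",        # key name is an assumption
            "current_count": current_count,
            "limit": FREE_TIER_MONTHLY_LIMIT,
            "upgrade_url": "/v2/me/upgrade",      # hypothetical value
            "message": "FREE tier monthly transfer limit reached",
        },
    }
```

With a count of 100 already recorded, the next request is the 101st transfer and is rejected, which matches the behavior named in the test plan.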
…2 Sprint 4)

Wires the actual money flow into the Sprint 3 self-service tier endpoints, adds the monthly billing cron, daily grace-expiry sweep, live XMR/USD rate cache, and per-agent billing ledger. Closes Phase 2 Revenue.

* sthrip/services/subscription_billing_service.py: bill_pro_subscriptions, handle_grace_expiry, start_grace_period, prorate_charge, compute_refund. Idempotent on (agent_id, month_start, status='monthly_charge'). Atomic balance-deduct + history-insert + tier-mutation in one DB transaction.
* sthrip/services/xmr_rate_service.py: CoinGecko free-tier fetch with 5-minute fresh / 24h stale cache. Raises RateUnavailableError beyond staleness window so a chronic feed outage cannot silently use an ancient rate.
* sthrip/db/models.py + migration z7a8b9c0d1e2: agent_billing_history ledger with partial unique index for cron idempotency on Postgres.
* api/routers/agents.py: /v2/me/upgrade now charges pro-rated XMR (returns 402 on insufficient balance, no tier change). /v2/me/downgrade refunds the unused portion to balance.
* api/main_v2.py: _subscription_billing_loop (1st of month 04:00 UTC) and _grace_expiry_loop (daily 04:30 UTC), both gated by Redis lease.
* api/middleware/tier_limit.py: Sprint 3 carry-over: emit a metric counter when the DB-backed counter lookup fails open.
* sthrip/services/metrics.py: tier_limit_fail_open_total + subscription_billing_total counters.
* tests/test_subscription_billing.py: 15 tests covering all contract acceptance criteria (charge, grace, expiry, prorate, refund, atomic rollback, rate cache, FREE-skip, idempotency).
* PRIVACY_FEATURES.md, docs/THREAT_MODEL.md: note billing data retention follows the Phase 1 auto-purge contract.

Sprint 4 contract: .harness/phase2-money-and-tee/sprint-4-contract.md
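The two-window rate cache described above (5-minute fresh, 24h stale, then an error) can be sketched as follows. This is not the code in xmr_rate_service.py. The `XmrRateCache` class, its constructor arguments, and the injected `fetch` callable are hypothetical. Only the two windows and the `RateUnavailableError` name come from the commit message, and the actual CoinGecko call is deliberately left out.

```python
import time
from typing import Callable, Optional

FRESH_TTL_SECONDS = 5 * 60        # serve without refetching inside this window
STALE_TTL_SECONDS = 24 * 60 * 60  # fall back to the cached rate up to this age


class RateUnavailableError(Exception):
    """No rate could be fetched and the cached one is older than 24h."""


class XmrRateCache:
    def __init__(self, fetch: Callable[[], float],
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch = fetch    # hypothetical: returns the XMR/USD rate or raises
        self._clock = clock    # injectable so tests can move time forward
        self._rate: Optional[float] = None
        self._fetched_at: float = 0.0

    def get_rate(self) -> float:
        now = self._clock()
        age = now - self._fetched_at
        if self._rate is not None and age <= FRESH_TTL_SECONDS:
            return self._rate  # fresh: no network call
        try:
            rate = self._fetch()
        except Exception:
            if self._rate is not None and age <= STALE_TTL_SECONDS:
                return self._rate  # feed down, cached value still acceptable
            raise RateUnavailableError("no XMR/USD rate within the staleness window")
        self._rate, self._fetched_at = rate, now
        return rate
```

Raising past the 24h window, rather than returning the last known value forever, is what prevents a long feed outage from billing at an outdated rate.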
…e 3 Sprint 5)

Phase 3 kickoff: TEE migration deploy artefacts. No live GCP resources provisioned; runbook ready for operator to execute Sprint 6+.

* gcp/payment_tee_deploy/payment_service.py: minimal FastAPI app with /health, /attestation, /v2/payments/hub-routing. Re-implements the hub routing core via TransactionRepository.create_with_commission, avoiding imports of marketplace/escrow/MCP/admin-UI/Tor/etc.
* gcp/payment_tee_deploy/import_guard.py: boot-time scan of sys.modules with project-only restriction + ALLOWED_OVERRIDES carve-outs. Opt-in via TEE_ENFORCE_BOUNDARY=1 (Dockerfile sets it; tests force it via subprocess).
* Multi-stage Dockerfile (python:3.9-slim, non-root uid 10001, EXPOSE 8080, --no-install-recommends discipline, OCI labels).
* setup-vm.sh / teardown-vm.sh: idempotent provisioning with --dry-run, AMD SEV-SNP n2d-standard-2 + shielded-VM hardening + named static IP.
* mtls/generate-certs.sh: openssl CA + server + client cert mint with proper extensions and 600/644 perms.
* Comprehensive operator README + mtls README.
* tests/test_payment_service_self_contained.py: 8 contract tests covering import allowlist (positive + negative), endpoints, Dockerfile lints, setup-vm dry-run, idempotency probe, and cert generation.

Suite: 2871 → 2879 passed (+8). Same 24 pre-existing failures. Zero regressions.
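A boot-time import guard of the kind described above might look like the sketch below. It is an illustration only, not the contents of import_guard.py. The package roots, the allowed prefixes, the example `ALLOWED_OVERRIDES` entry, and both function names are hypothetical. Only the `sys.modules` scan, the project-only restriction, the `ALLOWED_OVERRIDES` name, and the `TEE_ENFORCE_BOUNDARY=1` opt-in come from the commit message.

```python
import os
import sys
from typing import Iterable, List, Mapping, Optional

PROJECT_ROOTS = ("sthrip", "api")                            # assumption
ALLOWED_PREFIXES = ("sthrip.db", "sthrip.services.metrics")  # hypothetical allowlist
ALLOWED_OVERRIDES = frozenset({"api.helpers"})               # hypothetical carve-out


def find_violations(module_names: Iterable[str]) -> List[str]:
    """Return project modules that fall outside the allowlist."""
    violations = []
    for name in module_names:
        if name.split(".", 1)[0] not in PROJECT_ROOTS:
            continue  # project-only restriction: stdlib and third-party are ignored
        if name in ALLOWED_OVERRIDES:
            continue
        allowed = any(
            name == prefix
            or name.startswith(prefix + ".")   # submodule of an allowed package
            or prefix.startswith(name + ".")   # parent package of an allowed module
            for prefix in ALLOWED_PREFIXES
        )
        if not allowed:
            violations.append(name)
    return sorted(violations)


def enforce_boundary(environ: Mapping[str, str] = os.environ,
                     modules: Optional[Iterable[str]] = None) -> None:
    """Raise ImportError at boot when enforcement is on and a violation is loaded."""
    if environ.get("TEE_ENFORCE_BOUNDARY") != "1":
        return  # opt-in: do nothing unless explicitly enabled
    names = list(sys.modules) if modules is None else list(modules)
    violations = find_violations(names)
    if violations:
        raise ImportError("modules outside the TEE boundary: " + ", ".join(violations))
```

Allowing parent packages of allowed modules matters because importing `sthrip.db` also registers `sthrip` in `sys.modules`.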
…se 3 Sprint 7 — FINAL)
Summary
Phase 2 — revenue streams + TEE migration. 7 sprints, 8 commits, 104 new tests, 0 regressions, suite 2795 → 2899.
Built via autonomous /harness-long-task with cron-driven /loop. All 7 sprints PASSed independent Evaluator review.
What ships
Phase 1: Data minimization (1 sprint)
- a3a6e38): auto-purge cron (60-day default) + warrant canary at `/.well-known/canary.txt` (Ed25519 signed JSON, 503 on 48h staleness)

Phase 2: Revenue (3 sprints)
- 768d0ea+b1d05a3): commission deduction at transfer write, 0.3% Free / 0.1% Pro+. Iter 2 wired hub-routing path and removed legacy 1% fee_collector to prevent 1.3%/1.1% double-charge.
- dd29657): tier enforcement middleware (FREE 100 tx/month → 429 with upgrade hint), `/v2/me/upgrade|downgrade|tier` endpoints, atomic stats counter wired only to commission path
- 959377a): XMR subscription billing cron (1st of month), 7-day grace period, mid-month proration + refund, CoinGecko rate cache (5-min TTL, 24h fallback then `RateUnavailableError`)

Phase 3: TEE migration (3 sprints)
- ed3821c): GCP Confidential VM artifacts: Dockerfile, payment-only service, import_guard, setup-vm.sh, mTLS cert generation. No live GCP resources provisioned in this PR; operator runs deploy per CUTOVER.md.
- 6fee072): Railway → GCP payment dispatch proxy with `STHRIP_PAYMENT_VIA_TEE` feature flag (default false), 4xx vs 5xx routing distinction, fall-back on TEEUnreachable/TEEServerError, HubRoute row written Railway-side after TEE 2xx (M-1 fix from Sprint 5)
- 1ffb77c): SEV-SNP attestation service + SDK `verify_tee=True` parameter + `gcp/payment_tee_deploy/CUTOVER.md` operator runbook (13 steps)

Test plan
- `pytest tests/ -q --ignore=tests/test_cli_client.py --ignore=tests/test_cli_commands.py` → expect 2899 passed / 24 pre-existing failed, 0 regressions
- `alembic upgrade head; alembic downgrade -4; alembic upgrade head`
- `/.well-known/canary.txt` returns signed JSON; daily purge cron logs visible
- `fee_collections` row appears with `rate_applied_bps=30`
- `GET /v2/me/tier` returns current usage; FREE agent gets 429 at 101st transfer
- `agent_billing_history` populated on 1st-of-month cron
- `gcp/payment_tee_deploy/CUTOVER.md`. Soak with `STHRIP_PAYMENT_VIA_TEE=false` 24-48h. Flip in staging first, then prod.

Key saves during harness
- fee_collector. Independent Evaluator read the production caller end-to-end after the additive method passed unit tests.
- `TEEUserError` client-side + dispatcher `HTTPException` translation.

Deferred to future
- `sthrip/payment_core/` submodule to remove `ALLOWED_OVERRIDES` carve-out (Sprint 5 deferred)
- `/admin/revenue` dashboard (mentioned in product-spec, not in harness scope)

Operator action items (post-merge, sequential)
See `gcp/payment_tee_deploy/CUTOVER.md`. TL;DR:
- `CANARY_SIGNING_KEY` (Ed25519 base64) + `STHRIP_DATA_RETENTION_DAYS` on Railway
- `attestation_anchors.py` update
- `STHRIP_PAYMENT_VIA_TEE=true` in staging, monitor, then prod
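For reviewers, the Sprint 6 dispatch behavior that the flag flip relies on can be sketched as below. This is a hedged illustration, not the dispatcher in this PR. The function names, the `(status, body)` return shape of `call_tee`, and the exception constructors are hypothetical. The sketch also assumes that a TEE 4xx is surfaced to the caller without falling back, which the description implies through `TEEUserError` but does not spell out. Writing the HubRoute row after a 2xx is omitted.

```python
from typing import Any, Callable, Dict, Tuple

Payload = Dict[str, Any]


class TEEUnreachable(Exception):
    """The TEE could not be contacted at all."""


class TEEServerError(Exception):
    """The TEE answered with a non-2xx, non-4xx status."""


class TEEUserError(Exception):
    """The TEE rejected the request (4xx); falling back would not help."""

    def __init__(self, status_code: int, detail: Any) -> None:
        super().__init__(str(detail))
        self.status_code = status_code
        self.detail = detail


def call_tee_checked(call_tee: Callable[[Payload], Tuple[int, Any]],
                     payload: Payload) -> Any:
    """Classify the TEE response: 2xx returns, 4xx is a user error, the rest is a server error."""
    status, body = call_tee(payload)
    if 200 <= status < 300:
        return body
    if 400 <= status < 500:
        raise TEEUserError(status, body)
    raise TEEServerError(f"TEE returned {status}")


def dispatch_payment(payload: Payload,
                     via_tee: bool,
                     call_tee: Callable[[Payload], Tuple[int, Any]],
                     call_local: Callable[[Payload], Any]) -> Any:
    """Route to the TEE when the flag is on; fall back locally on outage or 5xx."""
    if not via_tee:  # STHRIP_PAYMENT_VIA_TEE defaults to false
        return call_local(payload)
    try:
        return call_tee_checked(call_tee, payload)
    except (TEEUnreachable, TEEServerError):
        return call_local(payload)
```

Keeping `TEEUserError` out of the fallback path means a bad request is not retried locally.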