gotempsh · dviejokfs · May 22, 2026 · May 19, 2026 · May 19, 2026 · May 19, 2026
diff --git a/AGENTS.md b/AGENTS.md
@@ -0,0 +1,91 @@
+# AGENTS.md
+
+Conventions for AI coding agents working on this repo (Claude Code,
+Codex, aider, etc.). The detailed engineering rules live in
+[`CLAUDE.md`](./CLAUDE.md); this file is the short list of process
+conventions that go *around* the code. Read both.
+
+## Always update `CHANGELOG.md`
+
+Every user-visible change in this repo lands with a `CHANGELOG.md`
+entry under `## [Unreleased]`, in the same commit as the code change.
+"User-visible" means anything an operator could notice: behaviour
+change, new flag, new endpoint, removed flag, UI change, performance
+characteristic, error-message format, dependency bump that changes
+the operator surface. Internal refactors with no observable impact
+don't need an entry, but when in doubt, write one.
+
+The file follows [Keep a Changelog](https://keepachangelog.com/en/1.0.0/):
+- Sections: `### Added`, `### Changed`, `### Removed`, `### Fixed`,
+  `### Tests` (the last is project-specific).
+- Each bullet starts with a **bolded short headline**, then a colon,
+  then a self-contained explanation. Include *why* — not just *what*.
+- Reference migration filenames, endpoint paths, env vars, and crate
+  names by their exact identifiers so the entry is greppable later.
+- Test-only changes go under `### Tests`.
+
+If you're touching code without writing a CHANGELOG entry, you're
+either doing the wrong thing or you forgot. Stop and add the entry
+before staging the commit.
+
+## Use the generated OpenAPI SDK in `web/`
+
+The frontend has a generated TypeScript SDK at `web/src/api/client/`
+(`types.gen.ts`, `sdk.gen.ts`, `@tanstack/react-query.gen.ts`) produced
+by `bun run openapi-ts` against the running backend. **Use it.**
+
+- Do not write hand-rolled `fetch` helpers under `web/src/lib/`. There
+  used to be one (`backup-schedules.ts`) and it caused a real bug —
+  someone added a field to the backend, forgot to mirror it in the
+  shim's local type, and a UI feature silently dropped the field on
+  PATCH.
+- If a binding you need is missing from the generated SDK, the cause
+  is that the backend handler isn't fully decorated for OpenAPI. Fix it
+  there: add `#[utoipa::path]`, register the schema in `ApiDoc`,
+  restart the server, regenerate. Don't paper over with a `fetch`
+  shim.
+- If you can't get the binding to generate, **ask for help** before
+  reaching for a shim. The shim creates two copies of the API surface
+  that drift apart.
+
+## Restart the server when you change the OpenAPI surface
+
+If your backend change touches handlers, request/response shapes,
+schemas, or routes, you must:
+1. Restart `temps serve` (use the `start-temps` skill).
+2. `cd web && bun run openapi-ts` to regenerate the SDK against the
+   live server.
+3. Commit the regenerated files. They're tracked in git on purpose so
+   reviewers see the API delta.
+
+The quickest way to spot a missing step: TypeScript compile errors
+in `web/src/` that say "Module ... has no exported member ...". That
+means the SDK is stale.
+
+## Pre-commit hooks run cargo fmt and cargo clippy
+
+Hooks **will** reformat your files and **will** fail the commit if
+clippy finds issues. Plan for it:
+
+- Don't fight the formatter. If `cargo fmt` modifies a file during a
+  commit, re-stage and commit again.
+- Multiple atomic commits run hooks once each. If you're committing
+  three related changes, prefer one commit so clippy/fmt run once.
+  (The wall-clock cost of clippy on this workspace is ~3–5 min.)
+- Never pass `--no-verify` unless the user explicitly asks. CLAUDE.md
+  forbids it. If a hook is broken, fix the hook; don't bypass it.
+
+## Conventional Commits
+
+Already in CLAUDE.md, but reinforced here because it's a hard rule:
+`type(scope): description` where type is one of `feat`, `fix`,
+`docs`, `style`, `refactor`, `perf`, `test`, `build`, `ci`, `chore`,
+`revert`. Scope is the affected crate or area (`backup`, `web`,
+`deployments`, etc.).
+
+## Don't sweep unrelated dirty files into your commits
+
+If you arrive at a working tree that's already dirty (because a
+previous session left files modified), confirm with the user whether
+to include those files before staging them. Sweeping unrelated work
+into a focused PR makes review slower and history harder to bisect.
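The Conventional Commits rule in the new `AGENTS.md` can be checked mechanically. The sketch below is illustrative only: `isConventionalHeader` is a hypothetical helper, not something in this repo, and it assumes the scope is required because the file writes the format as `type(scope): description`.

```typescript
// Commit types listed in the "Conventional Commits" section of AGENTS.md.
const COMMIT_TYPES: string[] = [
  "feat", "fix", "docs", "style", "refactor", "perf",
  "test", "build", "ci", "chore", "revert",
];

// Hypothetical helper: true when `header` looks like `type(scope): description`.
// Assumption: the scope is mandatory, as in the format shown in AGENTS.md.
function isConventionalHeader(header: string): boolean {
  const match = /^([a-z]+)\(([^()\s]+)\): (\S.*)$/.exec(header);
  if (match === null) {
    return false;
  }
  // The shape matched; now the type must be one of the allowed values.
  return COMMIT_TYPES.indexOf(match[1] ?? "") !== -1;
}

console.log(isConventionalHeader("fix(backup): tolerate R2 tagging errors")); // true
console.log(isConventionalHeader("update stuff")); // false
```

A check like this could run in a `commit-msg` hook, but whether the repo wants one is a separate decision.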
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -8,6 +8,30 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
 
 ### Added
+-
+
+### Changed
+-
+
+### Fixed
+-
+
+
+## [0.1.0-beta.19] - 2026-05-20
+
+### Added
+- **Manual (non-git) project creation from the CLI**: `bunx @temps-sdk/cli projects create` gains `--manual`, `--source-type` (`manual`, `docker_image`, or `static_files`), `--image`, and `--port` flags so you can create Docker-image and static-files projects without linking a git repository. The git-based flow is unchanged when `--repo` is supplied.
+
+### Fixed
+- **AI Gateway returned 401 for valid API keys**: the OpenAI-compatible endpoints (`/ai/v1/chat/completions`, `/ai/v1/models`, `/ai/v1/embeddings`) were registered via `configure_public_routes`, which mounts on the no-auth router — but the handlers use the `RequireAuth` extractor, which reads the `AuthContext` injected by `auth_middleware`. Since that middleware only runs on the authenticated router, every request 401'd with "Authentication Required" *before* the `tk_` API key was ever validated, so no diagnostic was logged. The gateway routes now register via `configure_routes` alongside the admin/usage/pricing routes, so they sit on the authenticated surface: valid API keys authenticate and the `AiGatewayExecute` permission check runs as intended.
+- **Static deployments were not served until an unrelated route reload**: `mark_deployment_complete` flipped `current_deployment_id` and fired the route-table `NOTIFY` before writing `static_dir_location`/`image_name`, which `load_routes()` reads to build an environment's backend. For static deployments the `NOTIFY` fired while `static_dir_location` was still NULL, so the proxy built a route with no static directory. A new Phase 0 step persists the routing-relevant deployment fields first, so the route table sees a consistent record the moment the `NOTIFY` fires.
+- **Inflated session-engagement and bot traffic in analytics**: auto-fired view events (`page_view`, `page_leave`, `*_viewed`) — which intersection observers trigger for bots too — could mark a session "engaged" on their own. A session now counts as engaged only with ≥10s of measured wall-clock time or a genuine interaction event. Zero-duration session replays (never-finalized single-burst sessions, typically bots) are excluded from replay listings, and user-agent bot detection in the events pipeline is broadened.
+
+
+## [0.1.0-beta.18] - 2026-05-19
+
+### Added
+- **Per-schedule backup scope — pick which databases a schedule backs up, and whether the control plane is included**: backup schedules used to fan out to every external service on the host unconditionally, with an unavoidable control-plane backup attached to every run. Two new boolean fields on `backup_schedules` give operators real control: `target_all_services` (defaults `true`) auto-includes every current and future external DB so the common case stays one-click, and a new `backup_schedule_services` join table (migration `m20260519_000001`) carries the explicit list when an operator opts into "Specific databases". `include_control_plane` (defaults `true`) lets schedules that exist purely to orchestrate external-DB backups drop the control-plane row. Service-layer validators (`BackupService::{create,update}_backup_schedule`) reject states that would have nothing to back up (control plane off + target_all_services off + no attached services); flipping `target_all_services → true` clears the explicit membership ("all means all"). Four new endpoints — `GET/POST /backups/schedules/{id}/services`, `DELETE /backups/schedules/{id}/services/{service_id}`, `GET /backups/external-services/{service_id}/schedules` — with audit logging and OpenAPI registration. UI: reusable `ScheduleServicesSelector` (checkbox list with indeterminate "Select all", hides already-attached); Create and Edit pages get an "All databases (recommended) / Specific databases" radio plus an "Also back up the Temps control plane" Switch; the schedule detail page surfaces both flags in the configuration card and only renders the per-service attach/detach card in 'specific' mode. Migration backfills existing rows to `target_all_services=true` and `include_control_plane=true` so behaviour is identical on upgrade. Covered by 6 unit tests (MockDatabase, Docker-skip) + 3 integration tests against TestDatabase (attach/detach round-trip, flip-to-all clears membership, fan-out skips control plane when flag is off).
 - **S3 bucket lifecycle rules enforce backup retention even when temps is offline**: every backup upload now carries `temps-managed=true` and `temps-retention-days=N` object tags (plus `temps-schedule-id` / `temps-backup-id` for traceability), and a new `S3LifecycleService` reconciles per-bucket `BucketLifecycleConfiguration` rules from current `backup_schedules` state. One tag-filtered rule per distinct retention value (`temps-retention-7d`, `temps-retention-30d`, …) so S3 expires expired objects autonomously. Reconcile fires fire-and-forget on schedule create/update/delete (only when `retention_period` or `enabled` changes), plus an hourly drift sweep that re-pushes the desired state — manual edits in the AWS console eventually converge. Tag-based filters were chosen over per-schedule prefixes so existing backup keys are untouched and restore still works; old objects (written before this change) simply lack the tags and are ignored by the rules. App-side `enforce_retention` still runs as the primary cleanup path; providers that reject `PutBucketLifecycleConfiguration` (Cloudflare R2, Backblaze B2, or insufficient IAM permissions) return `ReconcileOutcome::Unsupported` and we silently fall back — backups are never blocked because S3 rejected a lifecycle config. Live testcontainer roundtrip coverage against MinIO and RustFS validates the full `apply_lifecycle` → `get_bucket_lifecycle_configuration` shape; skips gracefully without Docker. Solves the "control plane offline for a week → storage costs balloon" failure mode.
 - **Public/admin console listener split**: the control plane can now bind admin/management routes (auth, dashboard, CRUD, settings, SwaggerUI, the SPA) to a separate address from public ingest (analytics events, error tracking, AI gateway, worker node sync, email tracking, Sentry/OTLP). Set `TEMPS_CONSOLE_ADMIN_ADDRESS=127.0.0.1:8081` (or any private interface) to enable; leave it unset for the existing single-listener behavior. Optional defense-in-depth via `TEMPS_ADMIN_ALLOWED_IPS` (comma-separated IPs/CIDRs), `TEMPS_ADMIN_ALLOWED_HOSTS` (comma-separated Host header values), and `TEMPS_ADMIN_TRUST_FORWARDED_FOR` (honor `X-Forwarded-For` only from loopback peers, anti-spoof). Denied requests on the admin gate return `404 Not Found`, not `403 Forbidden`, so probes can't fingerprint the admin surface. Each plugin classifies its own routes via the existing `configure_routes` (admin) / `configure_public_routes` (public) hooks — analytics events, session replay, performance, error tracking (Sentry + sentry-cli), email tracking, AI gateway, and the worker-facing multi-node endpoints have been split accordingly. SwaggerUI and the embedded SPA now mount on the admin listener only. See [docs/howto/admin-listener](docs/howto/admin-listener/page.mdx).
 - **Paginated "visitors in segment" page**: clicking any non-page dimension row (e.g. "Chrome" in Browsers, "United States" in Countries, an event name, a referrer, a UTM value) now navigates to `/projects/:slug/analytics/segments/:dimension/:value` — a paginated list of visitors that match the segment in the selected date range, sorted by last action descending (25 per page). Rows link to the existing visitor detail page so you can see the full journey for any visitor. Powered by new optional `filter_*` query params on `GET /analytics/visitors` (`filter_country`, `filter_region`, `filter_city`, `filter_channel`, `filter_referrer`, `filter_event`, `filter_browser`, `filter_os`, `filter_device`, `filter_language`, `filter_utm_source`, `filter_utm_medium`, `filter_utm_campaign`, `filter_utm_term`, `filter_utm_content`); visitor-side filters resolve against `visitor` / `ip_geolocations` while event-side filters use an `EXISTS (SELECT 1 FROM events …)` semi-join scoped by `(project_id, visitor_id, timestamp)` so existing composite indexes (`idx_events_visitor_timestamp`, `idx_visitor_project_last_seen`) carry the query. Date filter (quick or custom) is preserved across overview → dimensions → segment visitors → back.
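The S3 lifecycle entry in this hunk describes "one tag-filtered rule per distinct retention value". A minimal sketch of that derivation follows, to make the de-duplication concrete. The `Schedule` and `LifecycleRule` shapes and the `desiredLifecycleRules` name are hypothetical, not the real `S3LifecycleService` types. It also assumes that disabled schedules contribute no rule and that a rule expires objects after `N` days; the changelog implies both but does not state them outright.

```typescript
// Hypothetical shapes; the real backup_schedules columns may differ.
interface Schedule {
  id: number;
  enabled: boolean;
  retentionDays: number;
}

interface LifecycleRule {
  id: string;
  tags: Record<string, string>;
  expireAfterDays: number;
}

// Collapse schedules into one rule per distinct retention value.
function desiredLifecycleRules(schedules: Schedule[]): LifecycleRule[] {
  const seen: { [days: number]: boolean } = {};
  const days: number[] = [];
  for (const s of schedules) {
    // Assumption: disabled schedules and non-positive retention add no rule.
    if (!s.enabled || s.retentionDays <= 0 || seen[s.retentionDays]) {
      continue;
    }
    seen[s.retentionDays] = true;
    days.push(s.retentionDays);
  }
  days.sort((a, b) => a - b);
  return days.map((d) => ({
    id: `temps-retention-${d}d`,
    // Tag names taken from the changelog entry.
    tags: { "temps-managed": "true", "temps-retention-days": String(d) },
    expireAfterDays: d,
  }));
}

const rules = desiredLifecycleRules([
  { id: 1, enabled: true, retentionDays: 30 },
  { id: 2, enabled: true, retentionDays: 7 },
  { id: 3, enabled: true, retentionDays: 7 },
  { id: 4, enabled: false, retentionDays: 90 },
]);
console.log(rules.map((r) => r.id)); // [ 'temps-retention-7d', 'temps-retention-30d' ]
```

Because the rule set depends only on the distinct retention values, an hourly drift sweep can recompute it and push it again without tracking which schedule produced which rule.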
@@ -17,6 +41,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - **Postgres WAL health probe + service-detail warning panel**: detects four "silent disk-filler" conditions on managed Postgres services (WAL bloat vs `max_wal_size`, stale replication slots, archive backlog, `archive_mode=on` with empty `archive_command`) and surfaces them on the service detail page with copy-to-clipboard remediation SQL. New `GET /external-services/{id}/wal-health` endpoint; snapshot persisted under the generic new `external_services.health_metadata` JSONB column so future engines can add sibling signals without further migrations.
 
 ### Changed
+- **`EditBackupSchedule` page uses the generated OpenAPI SDK instead of a hand-rolled fetch shim**: `web/src/lib/backup-schedules.ts` (a hand-rolled `PATCH /api/backups/schedules/{id}` helper that predated the endpoint being in the OpenAPI surface) is deleted; the Edit page now uses `updateBackupScheduleMutation` and `UpdateBackupScheduleRequest` from the generated client. Removes a maintenance hazard where new fields on the request body had to be added in two places. Convention reinforced in `AGENTS.md`: hand-rolled `fetch` helpers under `web/src/lib/` are not allowed; if a binding is missing the fix is to expose the endpoint via `utoipa::path` and regenerate, not to write a shim.
 - **`temps login` is now browser-only for interactive use; `--api-key` is the headless path.** All credential entry happens in the web UI — there is no terminal password prompt anymore. Headless / CI authentication uses a pre-minted API key from the dashboard's **Settings → API Keys** page, passed via `--api-key`.
 - **Default agent turn caps raised**: committed agents now default to `max_turns: 30` (was 10), and the ephemeral dry-run cap rises to 50 (was 20). The Claude CLI invocation in `temps-agents` now treats `max_turns <= 0` as "omit the `--max-turns` flag entirely", letting a reviewed YAML opt into unlimited turns while `timeout_seconds` + `daily_budget_cents` still bound the run.
 
@@ -25,6 +50,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - **CLI flags `--email` / `--password` / `--magic` / `--mfa` / `--device`** on `temps login`. The interactive flow is the browser device flow unconditionally; `--api-key` is preserved for headless / CI. Magic-link login through the CLI is no longer supported (magic links still work for browser logins from the web `/login` page).
 
 ### Fixed
+- **Backup uploads to Cloudflare R2 no longer fail with `service error`**: every backup against an R2 bucket failed with `create_multipart_upload failed: service error` (5+ minute wall-clock, no diagnostic detail). Two root causes: (1) every S3 SDK call site rendered errors via `format!("...: {}", e)`, which for any 4xx/5xx collapses to the string "service error" — the HTTP status, service code, request id, and XML body were all thrown away; (2) the AWS SDK sends `x-amz-tagging` as a request header on `PutObject` and `CreateMultipartUpload`, and R2 returns `501 NotImplemented` on that header. Moving tagging to a follow-up `PutObjectTagging` call still failed — R2 also returns `501 NotImplemented` on that endpoint. Object tagging is simply not implemented on R2. Fix: added `describe_sdk_error` in `engines::v2_common` that pattern-matches every `SdkError` variant and surfaces HTTP status / service code / request_id / x-amz-id-2 / a truncated response body; all upload sites (single-part, create/upload/complete multipart, metadata companion, `head_bucket`) and the three `From<SdkError> for BackupError` impls now use it, so future S3 failures will say *what* actually went wrong. Tags are still applied via `PutObjectTagging` after every successful upload, but `apply_object_tags` now treats failures matching `is_unsupported_error` (NotImplemented, MethodNotAllowed, MalformedXML, AccessDenied, lifecycle-specific InvalidArgument) as best-effort — it logs a warn under target `temps_backup::tagging` and returns Ok so the backup itself succeeds. AWS S3 / MinIO / compliant stores still tag normally; tag-driven bucket lifecycle is unavailable on R2 (already best-effort in the reconciler) so app-side `BackupService::enforce_retention` is the retention source of truth there. Two regression tests pin the exact R2 error shapes for both the `x-amz-tagging` upload-header path and the `PutObjectTagging` path so a future SDK upgrade can't silently regress the matcher.
 - **GitHub App scoped token mint failures are now logged with context**: each fallible step of the GitHub App installation token flow (private key parse, JWT creation, octocrab client build, installation fetch, `access_tokens_url` parse, GitHub `access_tokens` POST) now emits an `error!` line with `installation_id` and `app_id` so a "GitHub rejected access_tokens" failure can be traced back to the specific installation. The new logs call out the two common causes — requested repo not selected on the installation, or the App lacks the requested permission — so operators stop having to re-derive context from the call site. Pure observability change; no behavior change to the token mint itself.
 - **Sandbox bring-up now runs a dedicated `normalize_ownership` step on both create and recover.** The container post-start chown is factored into a separate method that does `chown -R temps:temps` on both the home volume (best-effort: warns on non-zero exit, continues) and the bind-mounted `/home/temps/workspace` (fatal with `stat`-based verification so dev-machine bind-mount backends that return EPERM for logical no-ops don't abort, but real prod permission failures do). This is the in-container defense-in-depth that complements the host-side `chown_workdir_to_sandbox_user` from beta.9 — fixes the residual "Permission denied" failures on `mkdir reports/`, `git commit`, and lockfile creation under workspace.
 - **Postgres `archive_mode=on` with empty `archive_command` no longer causes runaway `pg_wal` growth.** Earlier versions baked `archive_mode=on` into the container CMD unconditionally, so any Postgres service whose `archive_command` was never set (no S3 source linked, or `enable_wal_archiving` never reached) accumulated WAL forever — we observed 191 GB `pg_wal` in production. New services now start with `archive_mode=off`; `enable_wal_archiving` recreates the container with `archive_mode=on` baked into CMD when WAL-G is configured. `PostgresService::start` reconciles by probing the volume for `walg.env` and comparing against the running container's CMD, recreating if they disagree — operator-initiated Stop/Start auto-repairs existing services with the bad combo. The bad combo is now unrepresentable for any service that's been restarted at least once. `start_service` also refreshes the WAL health snapshot inline after a recreate so the UI reflects the new state within ~1s instead of waiting for the next 30s probe cycle.
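The reconcile step in the `archive_mode` entry compares two facts: whether `walg.env` exists on the volume, and whether the running container's CMD enables archiving. A rough sketch of that decision is below. Both function names are hypothetical, and it assumes the CMD passes the setting as its own argument (for example `-c archive_mode=on`); the real `PostgresService::start` may inspect the container differently.

```typescript
// Assumption: the setting appears as its own CMD argument, e.g. "-c", "archive_mode=on".
function cmdHasArchiveModeOn(cmd: string[]): boolean {
  return cmd.some((arg) => arg.replace(/\s+/g, "") === "archive_mode=on");
}

// Hypothetical decision helper: recreate when the volume and the CMD disagree.
function needsRecreate(walgEnvPresent: boolean, cmd: string[]): boolean {
  return walgEnvPresent !== cmdHasArchiveModeOn(cmd);
}

// The bad combo from the changelog: archiving on, but WAL-G never configured.
console.log(needsRecreate(false, ["postgres", "-c", "archive_mode=on"])); // true
// A consistent service: WAL-G configured and archiving on.
console.log(needsRecreate(true, ["postgres", "-c", "archive_mode=on"])); // false
```

Treating any disagreement as a reason to recreate covers both directions: it repairs the runaway-WAL case and also turns archiving on for a service whose `walg.env` was written while the container still ran with archiving off.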