cmd/seed: synthetic dataset for local development #70
Add a deterministic ~6-month synthetic dataset (~9.8k rows, gentle sinusoid with occasional spikes and quiet days) for exercising the dashboard locally without needing real production exports. The generator deliberately spans every period (7d / 30d / 3m / 6m / 1y) so the chart UI has data to render at any range.

Safety properties:
- Refuses to run unless Config.Environment == "development".
- INSERT … ON CONFLICT (id) DO NOTHING, so re-running is a no-op.
- Steam IDs use a clearly-synthetic 76561198000000000 prefix.
- Snowflake IDs encode the same created_at + sequence layout as the production generator, so synthetic rows sort chronologically alongside any real rows already in the DB.

internal-docs/ and internal/devseed/fixtures/ are added to .gitignore to keep author scratch space and any future local CSV fixtures out of the public repo.

Co-authored-by: Cursor <cursoragent@cursor.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit b067b6b.
        }
    }()

    reversals := devseed.GenerateSynthetic(time.Now().UTC())
Seed not idempotent due to time-dependent snowflake IDs
Medium Severity
GenerateSynthetic receives time.Now().UTC(), and all generated snowflake IDs embed createdAt timestamps derived from that value. Since today shifts daily and the nowMs cap changes every millisecond, re-running the seed at a different time produces entirely different snowflake IDs. Because ON CONFLICT (id) DO NOTHING keys on these IDs, a second run on a different day inserts ~9.8k additional rows instead of being a no-op, contradicting the documented idempotency guarantee. Anchoring to a fixed reference time instead of time.Now() would make the output truly deterministic.
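The effect described in this comment can be reproduced in isolation. The following is a minimal sketch, not the PR's code: the bit layout (`seqBits`), the `epoch`, the `seedAnchor` value, and the `snowflake` / `generateIDs` helpers are all assumptions made up for illustration, since the production snowflake layout is not shown here. It only demonstrates that IDs derived from a fixed reference time repeat across runs, while IDs derived from a drifting reference (as with `time.Now()`) do not.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical layout for illustration only: milliseconds since a custom
// epoch in the high bits, a 12-bit sequence in the low bits.
const seqBits = 12

// epoch and seedAnchor are assumed values, not taken from the repository.
var (
	epoch      = time.Date(2020, 1, 1, 0, 0, 0, 0, time.UTC)
	seedAnchor = time.Date(2025, 1, 1, 0, 0, 0, 0, time.UTC)
)

// snowflake packs a creation time and a sequence number into one int64.
func snowflake(createdAt time.Time, seq uint16) int64 {
	ms := createdAt.Sub(epoch).Milliseconds()
	return ms<<seqBits | int64(seq&(1<<seqBits-1))
}

// generateIDs returns one ID per day for the `days` days ending at ref,
// oldest first.
func generateIDs(ref time.Time, days int) []int64 {
	ids := make([]int64, 0, days)
	for d := days - 1; d >= 0; d-- {
		ids = append(ids, snowflake(ref.AddDate(0, 0, -d), 0))
	}
	return ids
}

func main() {
	first := generateIDs(seedAnchor, 3)
	second := generateIDs(seedAnchor, 3)
	fmt.Println("anchored runs match:", first[2] == second[2])

	// A reference that drifts between runs produces different IDs, so
	// ON CONFLICT (id) would not deduplicate the second run.
	drifted := generateIDs(seedAnchor.Add(36*time.Hour), 3)
	fmt.Println("drifted run matches:", first[2] == drifted[2])
}
```

With a fixed anchor, every run yields the same primary keys, which is what lets `ON CONFLICT (id) DO NOTHING` treat a re-run as a no-op. The trade-off is that the dataset no longer tracks "today" unless the anchor is bumped deliberately.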


Summary
Add a deterministic ~6-month synthetic dataset (~9.8k rows, gentle sinusoid with occasional spikes and quiet days) for exercising the dashboard locally without needing a real production export. The generator deliberately spans every period the chart picker offers (7d / 30d / 3m / 6m / 1y), so any range renders meaningful data.
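As a rough illustration of the volume shape described above, the sketch below produces a per-day row count from a sinusoid plus occasional spikes and quiet days. It is not the generator from `internal/devseed`: the function name `dailyVolumes`, the base of 54 rows per day, the 30-day period, and the spike and quiet-day probabilities are all assumed values chosen only to land in the same general range.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// dailyVolumes returns a row count per day: a gentle sinusoid around a base
// level, with occasional spikes and quiet days. All constants here are
// illustrative assumptions, not values from the repository.
func dailyVolumes(days int, seed int64) []int {
	// A fixed seed makes the sequence identical on every run.
	rng := rand.New(rand.NewSource(seed))
	out := make([]int, days)
	for d := range out {
		// 30-day period, +/-20% amplitude around 54 rows per day.
		v := 54.0 * (1 + 0.2*math.Sin(2*math.Pi*float64(d)/30))
		switch r := rng.Float64(); {
		case r < 0.05:
			v *= 3 // occasional spike
		case r < 0.10:
			v *= 0.1 // quiet day
		}
		out[d] = int(math.Round(v))
	}
	return out
}

func main() {
	vols := dailyVolumes(180, 1)
	total := 0
	for _, v := range vols {
		total += v
	}
	fmt.Println("first week:", vols[:7])
	fmt.Println("total rows:", total)
}
```

Seeding the random source explicitly keeps the curve reproducible, which matters only if the row timestamps and IDs are also derived from a stable reference.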
Safety properties
.gitignore
Adds `internal-docs/` (author scratch space) and `internal/devseed/fixtures/` (room for any future local-only CSV fixtures) so neither leaks into the public repo.
Test plan
Made with Cursor
Note
Low Risk
Dev-only CLI with an environment guard and idempotent inserts; not linked to the production server binary.
Overview
Adds a dev-only seed path so local Postgres can hold a deterministic ~6-month reversal history (~9.8k rows) without production exports.
`go run ./cmd/seed` loads config, exits unless `Environment` is `development`, then bulk-inserts generated rows into the public DB via `ON CONFLICT (id) DO NOTHING` (safe to re-run). `internal/devseed` builds daily volume with variance, marketplace mix, sources, optional expungements, synthetic Steam IDs, and snowflake IDs aligned with production ordering.
README documents the workflow and dashboard period coverage; `.gitignore` excludes `internal-docs/` and optional `internal/devseed/fixtures/`.
Reviewed by Cursor Bugbot for commit b067b6b. Bugbot is set up for automated code reviews on this repo.