Release: Phase 10 Dashboard + The Forge (Randomized Database Seeder)#67

Merged: w7-mgfcode merged 8 commits into main from dev on Feb 2, 2026

Conversation

@w7-mgfcode
Owner

@w7-mgfcode w7-mgfcode commented Feb 2, 2026

Summary

  • Phase 10 Dashboard: Complete React frontend implementation with shadcn/ui
  • The Forge: Randomized database seeder for generating realistic synthetic retail data
  • Documentation and sync updates

Changes Included

Phase 10 - Dashboard (#61)

  • React + Vite + TypeScript frontend
  • shadcn/ui components (26 components)
  • TanStack Query for data fetching
  • TanStack Table for data grids
  • Recharts for visualizations

The Forge - Database Seeder (#66)

  • Dimension generators (store, product, calendar)
  • Fact generators with time-series patterns (sales, inventory, price, promotion)
  • Pre-built scenarios (retail_standard, holiday_rush, high_variance, stockout_heavy, sparse)
  • RAG + Agent E2E validation scenario
  • CLI script with YAML config support
  • 77 tests (61 unit + 16 integration)

Test Plan

  • All CI checks pass on dev
  • Unit tests pass
  • Integration tests pass
  • Type checking passes (mypy + pyright)
  • Linting passes (ruff)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Full-featured randomized data seeder with multiple presets and reproducible scenarios; seeding operations: full-new, append, delete, status, verify, plus an end-to-end RAG+Agent validation scenario.
    • CLI for all seeder operations with safety guards and dry-run support; YAML config support and defaults.
  • Documentation

    • Seeder README, example seed configs (holiday, sparse) and a Git/GitHub guide.
  • Tests

    • Extensive unit and PostgreSQL integration test suites covering generators, orchestration, scenarios.
  • Chores

    • .gitignore expanded to exclude local envs, build caches, and generated artifacts.

w7-mgfcode and others added 4 commits February 2, 2026 05:57
* docs(frontend): split INITIAL-11 into three parts with shadcn/ui enhancements

Split the dashboard specification into manageable documents:
- INITIAL-11A.md: Overview, Tech Stack, Feature descriptions
- INITIAL-11B.md: Page Structure, UX flows, wireframes
- INITIAL-11C.md: Components, Hooks, Config, Success Criteria

Key enhancements:
- Simplified app shell to Top Navigation + Tabs pattern
- Added validated shadcn/ui component recommendations per section
- Updated code snippets to use shadcn primitives (table, chart, collapsible)
- Added installation commands and theme configuration
- Cross-referenced all three parts with navigation links

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs(frontend): restructure INITIAL-11 documents for developer workflow

Reorganize the three-part frontend specification so developers can start
immediately:
- 11A: Setup & Config (installation, dependencies, environment)
- 11B: Architecture & Features (tech stack, app shell, feature descriptions)
- 11C: Pages & Components (routes, wireframes, code patterns, hooks)

All documents now include Documentation Links and Other Considerations sections.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(frontend): scaffold Vite + React 19 + Tailwind CSS 4 + shadcn/ui project

Complete frontend scaffolding per PRP-11A:
- Vite + React 19 + TypeScript with strict mode
- Tailwind CSS 4 via @tailwindcss/vite plugin
- shadcn/ui initialized with New York style (26 components)
- Path aliases (@/) configured in vite.config.ts and tsconfig
- TanStack Query, TanStack Table, React Router installed
- API proxy configured for backend integration
- Environment configuration template (.env.example)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: update documentation for frontend scaffolding

- Add Dashboard to Features list in README
- Add Node.js/pnpm to prerequisites
- Add frontend setup instructions (steps 8-9)
- Add Frontend Commands section with dev/build/lint commands
- Update project structure to include frontend/ directory
- Add Frontend Stack section with technology table
- Update ADR-0002 status to Implemented with tech details

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs(phase): add Phase 10 Dashboard documentation

- Create docs/PHASE/10-DASHBOARD.md with frontend scaffolding details
- Document technology stack (React 19, Vite 7, Tailwind CSS 4, shadcn/ui)
- Include configuration files, commands, and validation results
- Update PHASE-index.md with Phase 10 status (In Progress)
- Add sub-phase tracking (10A complete, 10B/10C pending)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs(prp): add PRP-11B dashboard architecture specification

Comprehensive PRP for implementing ForecastLab Dashboard architecture
including TypeScript types, TanStack Query hooks, DataTable with
server-side pagination, charts, WebSocket chat, and theme toggle.

Key components:
- Full TypeScript types matching backend API schemas
- API client with RFC 7807 error handling
- TanStack Query v5 hooks with keepPreviousData pattern
- DataTable with 1-indexed pagination conversion
- WebSocket hook with exponential backoff reconnection
- shadcn/ui chart integration

Confidence: 8.5/10

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(frontend): implement dashboard architecture (PRP-11B)

Add complete React dashboard implementation including:
- TypeScript types for all API responses
- API client with typed fetch wrapper and RFC 7807 error handling
- TanStack Query hooks for stores, products, KPIs, drilldowns, runs, jobs, RAG sources
- WebSocket hook with reconnection and exponential backoff
- Theme provider with dark/light mode toggle
- App shell with top navigation and mobile drawer
- DataTable component with server-side pagination
- Common components: StatusBadge, DateRangePicker, ErrorDisplay, LoadingState
- Chart components: KPICard, TimeSeriesChart, BacktestFoldsChart
- Chat components for agent interaction with tool call display
- All page components: Dashboard, Explorer (stores/products/runs/jobs/sales),
  Visualize (forecast/backtest), Chat, Admin
- React Router v7 setup with lazy loading for code splitting

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: update frontend README and Phase 10 documentation

- Replace generic Vite template README with comprehensive frontend docs
- Document project structure, routes, API integration, and component patterns
- Update Phase 10 docs to mark 10B and 10C as complete
- Add implementation details for architecture and page deliverables

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs(prp): add PRP-11C validation specification

Add comprehensive validation PRP for INITIAL-11C (Pages & Components):
- 8 validation tasks covering build, routes, components, and UX
- 4-level validation loop (build, visual, integration, cross-browser)
- 30+ checklist items for thorough testing
- Validation report template for documentation
- Documents all implemented routes and components

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(frontend): enable LAN access for dev server

Set host: true to bind Vite to 0.0.0.0 for mobile/LAN testing.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: add CORS middleware to enable frontend-backend communication

Enable cross-origin requests from Vite dev server to FastAPI backend. This allows the React dashboard to successfully fetch data from API endpoints during development, resolving CORS preflight failures that blocked all frontend integration.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: center layout containers for proper alignment on wide screens

- Add mx-auto and px-4 to container classes in app-shell and top-nav
- Add LAN IP (10.0.0.121) to CORS allowed origins for development

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* style: format main.py

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
- Phase 10 marked as Completed (was "In Progress")
- Sub-phases 10B and 10C now ✅ Completed (were 🔲 Pending)
- Added comprehensive Phase 10 summary with deliverables
- Added version history entries for 10B, 10C completion (2026-02-02)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…#66)

* feat(data-platform): add INITIAL-12 randomized database seeder spec

Add specification for synthetic data generation utility ("The Forge"):
- Full new: complete dataset generation from scratch
- Delete: safe removal with confirmation guards
- Append: add data without violating constraints
- RAG + Agent scenario: E2E workflow validation

Additional features brainstormed:
- Realistic time-series patterns (trend, seasonality, noise)
- Retail-specific patterns (promotions, stockouts, price elasticity)
- Pre-built scenarios (holiday_rush, sparse, new_launches)
- YAML configuration support
- CLI with --dry-run and --confirm safety flags

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore: update gitignore and add INITIAL-11C validation report

- Add .venv, .playwright-mcp/, frontend/.vite/, artifacts/ to gitignore
- Add validation report documenting frontend dashboard testing results

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: add Git & GitHub guide with branch rules and CI/CD workflows

Comprehensive guide covering:
- Branching strategy (main, dev, feat/*, phase-*)
- GitHub Actions workflows (CI, schema validation, releases)
- Contributor use cases and quick-reference commands

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(data-platform): implement randomized database seeder (The Forge)

Add comprehensive synthetic data generation system for development and testing:

- Core seeder module (app/shared/seeder/) with dimension and fact generators
- Time-series patterns: trend, seasonality, noise, anomalies, promotion lift
- 6 pre-built scenarios: retail_standard, holiday_rush, high_variance, etc.
- CLI script (scripts/seed_random.py) with --full-new, --delete, --append
- Safety guards: production env blocked, --confirm required, --dry-run
- YAML configuration support for custom scenarios
- 40 unit tests covering all generators and configurations
- Examples and documentation in examples/seed/

Implements INITIAL-12 specification.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: add PRP-12 for completing randomized database seeder

Add comprehensive PRP documenting remaining tasks to complete INITIAL-12:
- RAG + Agent E2E scenario (--scenario rag-agent)
- Integration tests for seeder module
- config_sparse.yaml example file
- Unit tests for DataSeeder orchestration class

Includes detailed implementation blueprints, validation gates, and
code examples for all 6 tasks. Confidence score: 8/10.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(seeder): add RAG scenario and comprehensive tests (PRP-12)

- Add RAG + Agent E2E scenario module (rag_scenario.py) for validating
  the complete stack: document indexing, agent sessions, and citations
- Add --run-scenario mode to seed_random.py with rag-agent scenario
- Add unit tests for DataSeeder core orchestration (21 tests)
- Add integration tests requiring PostgreSQL (16 tests)
- Export new types from seeder __init__.py

Total: 77 tests passing (61 unit + 16 integration)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(seeder): add mypy disable directive for test_generators.py

Tests access dict values with known runtime types but mypy infers
overly broad union types. Added targeted disable-error-code directive
for union-attr, arg-type, operator, return-value error codes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

@sourcery-ai sourcery-ai Bot left a comment

Sorry @w7-mgfcode, your pull request is larger than the review limit of 150000 diff characters

@coderabbitai

coderabbitai Bot commented Feb 2, 2026

📝 Walkthrough

Walkthrough

Adds a randomized database seeder subsystem: configuration dataclasses/enums, multiple realistic data generators, core orchestration (generate/append/delete/verify), RAG+Agent end‑to‑end scenario, an async CLI, example configs/docs, and a comprehensive unit + integration test suite; includes safety guards and deterministic RNG seeding.
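
The deterministic RNG seeding mentioned in the walkthrough can be illustrated with a minimal sketch. The function name, coefficients, and weekly cycle below are assumptions for illustration, not the seeder's actual code; the point is that a private `random.Random(seed)` instance makes the generated series reproducible.

```python
import math
import random


def synthetic_daily_demand(seed: int, days: int, base: float = 100.0) -> list[float]:
    """Hypothetical helper: trend + weekly seasonality + Gaussian noise."""
    rng = random.Random(seed)  # private RNG, so the global random state is untouched
    series = []
    for day in range(days):
        trend = 0.05 * day  # slow linear growth (assumed coefficient)
        seasonality = 10.0 * math.sin(2 * math.pi * day / 7)  # weekly cycle
        noise = rng.gauss(0.0, 5.0)  # random day-to-day variation
        series.append(max(0.0, base + trend + seasonality + noise))  # demand is never negative
    return series


# The same seed reproduces the same series; a different seed gives a different one.
first = synthetic_daily_demand(seed=42, days=30)
second = synthetic_daily_demand(seed=42, days=30)
print(first == second)
```

Keeping the RNG as a local object rather than calling `random.seed()` globally avoids interference between generators that run in the same process.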

Changes

| Cohort / File(s) | Summary |
|---|---|
| Project & Settings: `/.gitignore`, `app/core/config.py` | Add ignore rules for local env/build artifacts; add seven `seeder_*` settings to `Settings` (defaults for seed, stores/products, batch size, progress flags and production/confirm guards). |
| Seeder Config & Package Init: `app/shared/seeder/config.py`, `app/shared/seeder/__init__.py` | New `SeederConfig` and related dataclasses/enums (`ScenarioPreset`, `TimeSeriesConfig`, `RetailPatternConfig`, `DimensionConfig`, `SparsityConfig`, `HolidayConfig`) with `from_scenario`; package `__init__` re-exports public API. |
| Generators: `app/shared/seeder/generators/...`, `app/shared/seeder/generators/__init__.py` | Add `CalendarGenerator` (US holidays), `StoreGenerator`, `ProductGenerator`, `SalesDailyGenerator`, `PriceHistoryGenerator`, `PromotionGenerator`, `InventorySnapshotGenerator`; centralize exports via generators `__init__`. |
| Core Orchestration: `app/shared/seeder/core.py` | New `DataSeeder` and `SeederResult` orchestrating dimension/fact generation, batched DB inserts (`ON CONFLICT DO NOTHING`), `generate_full`/`append`/`delete`/`verify` methods, transactional behavior and integrity checks. |
| RAG + Agent Scenario: `app/shared/seeder/rag_scenario.py` | New `RAGScenarioRunner` and `RAGScenarioResult` implementing E2E RAG+Agent flow (health, index documents, create agent session, chat, verify citations, cleanup) with dry-run support. |
| CLI: `scripts/seed_random.py` | New async CLI providing YAML config loading, modes (`--full-new`, `--append`, `--delete`, `--status`, `--verify`, `--run-scenario` including `rag-agent`), production safety checks, session management, and result reporting; exposes helper functions for testing. |
| Tests: `app/shared/seeder/tests/...` | Comprehensive tests: fixtures, unit tests for config/core/generators, and PostgreSQL integration tests covering generation, idempotency, FK integrity, sparsity, reproducibility, destructive-test guard and verification. |
| Examples & Docs: `examples/seed/*`, `docs/GIT-GITHUB-GUIDE.md`, `INITIAL-12.md`, `PRPs/*` | Add example YAMLs (holiday, sparse), README for seeder, design doc INITIAL-12, validation report and other docs. |
| Misc: `manifest_file`, `requirements.txt`, `pyproject.toml` | Project manifest / dependency files updated (new entries referenced). |

Sequence Diagram(s)

sequenceDiagram
    actor User as CLI User
    participant CLI as seed_random.py
    participant DS as DataSeeder
    participant G as Generators
    participant DB as PostgreSQL

    User->>CLI: --full-new --config config.yaml
    CLI->>CLI: Load SeederConfig from YAML
    CLI->>CLI: Validate production safety / confirm
    CLI->>DS: Create DataSeeder(config)
    DS->>DS: Seed RNG with config.seed

    CLI->>DS: generate_full(session)
    DS->>G: CalendarGenerator.generate()
    G-->>DS: dates
    DS->>G: StoreGenerator.generate()
    G-->>DS: stores
    DS->>G: ProductGenerator.generate()
    G-->>DS: products
    DS->>DB: Batch insert dimensions
    DB-->>DS: inserted ids

    DS->>G: SalesDailyGenerator.generate()
    G-->>DS: sales
    DS->>G: PriceHistoryGenerator.generate()
    G-->>DS: prices
    DS->>G: PromotionGenerator.generate()
    G-->>DS: promotions
    DS->>G: InventorySnapshotGenerator.generate()
    G-->>DS: inventory
    DS->>DB: Batch insert facts (ON CONFLICT DO NOTHING)
    DB-->>DS: counts

    DS->>DB: verify_data_integrity()
    DB-->>DS: validation results
    DS-->>CLI: SeederResult
    CLI->>User: Print summary
sequenceDiagram
    actor User as CLI User
    participant CLI as seed_random.py
    participant RAG as RAGScenarioRunner
    participant API as ForecastLabAI API

    User->>CLI: --run-scenario rag-agent
    CLI->>RAG: RAGScenarioRunner(api_base_url, seed)
    RAG->>RAG: Seed RNG

    RAG->>API: GET /health
    API-->>RAG: 200 OK
    RAG->>RAG: _generate_test_documents()

    loop for each test document
      RAG->>API: POST /rag/index {content, source_path}
      API-->>RAG: 200 OK
    end

    RAG->>API: POST /agents/sessions
    API-->>RAG: {session_id}
    RAG->>API: POST /agents/sessions/{id}/chat {message}
    API-->>RAG: {response, citations}
    RAG->>RAG: verify citations_found

    RAG->>API: DELETE /agents/sessions/{id}
    API-->>RAG: 200 OK

    RAG-->>CLI: RAGScenarioResult
    CLI->>User: Print results

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

autorelease: tagged

Suggested reviewers

  • w7-learn

Poem

🐰 I hopped through configs, seeds in paw,
Calendars, stores, and sales I saw.
I indexed docs and made an agent chat,
Checked citations, cleaned up this and that.
The Forge is warm — come seed with me, tata!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the two major features delivered: Phase 10 Dashboard and The Forge (Randomized Database Seeder), providing specific information about the changeset's main components. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 11

🤖 Fix all issues with AI agents
In `@app/shared/seeder/core.py`:
- Around line 483-491: When scope == "dimensions" the code deletes fact tables
for FK reasons but never logs or returns their counts; update the deletion block
in the function that uses variables scope, fact_tables, dimension_tables,
counts, db.execute(delete(model)) and logger.info so that you first iterate
fact_tables to record and log each fact table's count (e.g., use
counts.get(_name, 0) and a logger.info call per fact table) before executing the
delete, then preserve those entries in the counts dict returned to the caller so
implicit fact deletions are visible in the output.
- Around line 112-114: The current logic treats a row_count of 0 as falsy and
falls back to len(batch), inflating total_inserted; update the check to detect
presence explicitly: use getattr(cursor_result, "rowcount", None) into row_count
and then add row_count when row_count is not None, otherwise fall back to
len(batch) — reference variables: cursor_result, row_count, total_inserted,
batch.
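
The rowcount pitfall described above can be reproduced in isolation. This is a minimal sketch with hypothetical helper names; `SimpleNamespace` stands in for the database cursor result, and the `or`-based fallback in the buggy variant is an assumed shape of the original logic, based on the comment's description.

```python
from types import SimpleNamespace


def rows_inserted_buggy(cursor_result: object, batch: list[dict]) -> int:
    # A rowcount of 0 is falsy, so `or` wrongly falls back to len(batch).
    return getattr(cursor_result, "rowcount", None) or len(batch)


def rows_inserted(cursor_result: object, batch: list[dict]) -> int:
    # An explicit None check keeps a legitimate rowcount of 0.
    row_count = getattr(cursor_result, "rowcount", None)
    if row_count is not None:
        return row_count
    return len(batch)


batch = [{"id": 1}, {"id": 2}, {"id": 3}]
all_conflicts = SimpleNamespace(rowcount=0)  # e.g. every row hit ON CONFLICT DO NOTHING

print(rows_inserted_buggy(all_conflicts, batch))  # 3 (inflated)
print(rows_inserted(all_conflicts, batch))        # 0 (correct)
```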

In `@app/shared/seeder/generators/calendar.py`:
- Around line 12-19: The DEFAULT_US_HOLIDAYS mapping is missing Juneteenth;
update the dictionary DEFAULT_US_HOLIDAYS to include the fixed-date entry for
Juneteenth by adding the tuple key (6, 19) with the value "Juneteenth" alongside
the other fixed-date holidays so the calendar generator correctly recognizes
June 19 as a US federal holiday.

In `@app/shared/seeder/generators/facts.py`:
- Around line 106-110: Guard against division by zero when computing
ramp_factor: before computing ramp_factor = days_since_launch /
self.retail_config.new_product_ramp_days inside the block that checks
product_launch_date, ensure self.retail_config.new_product_ramp_days is > 0 (or
handle the 0 case explicitly by skipping the ramp or using a safe default factor
like 1.0); update the logic around days_since_launch, ramp_factor and the demand
multiplication so that when new_product_ramp_days == 0 you do not perform the
division and demand remains unchanged (or uses the chosen default factor).
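
A minimal sketch of the guard, assuming a standalone helper (the real code computes the factor inline) and assuming the factor is clamped to the range 0.0 to 1.0, which the comment does not state:

```python
def ramp_factor(days_since_launch: int, ramp_days: int) -> float:
    """Hypothetical helper: fraction of full demand reached since product launch."""
    if ramp_days <= 0:
        # No ramp configured: skip the division and leave demand unchanged.
        return 1.0
    # Clamp so pre-launch days give 0.0 and post-ramp days give 1.0 (assumed behavior).
    return min(1.0, max(0.0, days_since_launch / ramp_days))


demand = 200.0
print(demand * ramp_factor(days_since_launch=15, ramp_days=30))  # 100.0
print(demand * ramp_factor(days_since_launch=15, ramp_days=0))   # 200.0
```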

In `@app/shared/seeder/generators/product.py`:
- Around line 136-142: The _generate_unique_sku function can loop forever if SKU
space is exhausted; modify _generate_unique_sku to guard against exhaustion by
adding a maximum-attempts cap (e.g., attempt_count with a MAX_ATTEMPTS constant)
and after that either widen the random range (increase digits) or raise a clear
exception, and also consider checking len(self._used_skus) against the SKU space
(current_space = 90000 or computed from rng bounds) before entering the loop;
reference _generate_unique_sku, self._used_skus, and the RNG range to implement
the attempt cap and the fallback behavior.
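
One way to bound the retry loop is sketched below. The `SKU-` prefix, the numeric range (chosen to match the 90,000-value space the comment mentions), and the `MAX_ATTEMPTS` value are all assumptions; the function is written standalone rather than as a method.

```python
import random

SKU_MIN, SKU_MAX = 10_000, 99_999  # assumed range: 90,000 possible values
MAX_ATTEMPTS = 1_000               # assumed cap on random retries


def generate_unique_sku(rng: random.Random, used: set[str]) -> str:
    """Return an unused SKU, or raise instead of looping forever."""
    if len(used) >= SKU_MAX - SKU_MIN + 1:
        raise RuntimeError("SKU space exhausted")
    for _ in range(MAX_ATTEMPTS):
        sku = f"SKU-{rng.randint(SKU_MIN, SKU_MAX)}"
        if sku not in used:
            used.add(sku)
            return sku
    raise RuntimeError(f"no unique SKU found after {MAX_ATTEMPTS} attempts")


rng = random.Random(0)
used: set[str] = set()
skus = [generate_unique_sku(rng, used) for _ in range(5)]
print(len(set(skus)))  # 5
```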

In `@app/shared/seeder/generators/store.py`:
- Around line 76-93: The `_generate_unique_code` method can loop forever if you
need more than 9,999 unique codes; add a fail-fast guard that checks the
available code space before attempting generation (e.g., compute max_codes =
9999 and if len(self._used_codes) >= max_codes or self.config.stores > max_codes
raise a ValueError with a clear message). Also validate in `__init__` (or at
start of `_generate_unique_code`) that `config.stores` does not exceed the
maximum to prevent trying to generate more codes than possible; reference
`_generate_unique_code`, `_used_codes`, `__init__`, and `config.stores` when
adding the checks and the descriptive exception.

In `@app/shared/seeder/rag_scenario.py`:
- Around line 291-319: Before contacting the API, add a preflight RAG
configuration check (e.g., a helper like is_rag_configured() or inspect whatever
config/provider/api key fields exist) at the start of the async block in run()
(before calling _check_api_health); if RAG is not configured set a “skipped”
state on the result (e.g., self.result.skipped = True or self.result.status =
"skipped"), append no error, and return self.result so the flow exits cleanly
without treating it as a failure; keep the existing _check_api_health,
_generate_test_documents, _index_document, and _create_agent_session logic
unchanged for the configured case.

In `@app/shared/seeder/tests/test_integration.py`:
- Around line 6-79: The db_session fixture currently runs destructive deletes on
production-like DBs without a safety guard; add an explicit opt-in check before
executing any delete statements: in the db_session fixture (symbols: db_session,
settings, session_maker, settings.database_url) verify a test-only flag (e.g.,
settings.testing is True) or an env var (e.g., ALLOW_DESTRUCTIVE_TEST_DB=true)
and raise a clear RuntimeError if the guard is not present; only proceed to run
the delete(...) calls and commits when the guard passes. Ensure the check occurs
before the initial "Pre-test cleanup" block and before the "Post-test cleanup"
block so deletes cannot run unless explicitly enabled.

In `@examples/seed/config_holiday.yaml`:
- Around line 1-5: The YAML file contains CRLF line endings which fail yamllint;
convert the file to use LF line endings (Unix style) so the YAML (including the
top-level keys like "dimensions" and "stores") passes yamllint. Ensure your
editor or commit normalizes line endings (or run a tool to replace CRLF with LF)
and re-commit the updated config_holiday.yaml.

In `@examples/seed/config_sparse.yaml`:
- Around line 1-5: The YAML file that begins with the comment "Sparse data
scenario configuration" currently uses CRLF line endings which yamllint flags;
convert the file's line endings from CRLF to Unix LF (e.g., via dos2unix or your
editor's line-ending setting) so all lines (including the top comment block) use
LF to satisfy yamllint.
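
If `dos2unix` is not available, the same conversion can be done with a few lines of Python. This is a generic sketch (the helper name is an assumption); it works on raw bytes so that no text-mode newline translation interferes.

```python
from pathlib import Path


def normalize_line_endings(path: Path) -> bool:
    """Rewrite CRLF as LF in place; return True if the file was changed."""
    raw = path.read_bytes()
    fixed = raw.replace(b"\r\n", b"\n")
    if fixed == raw:
        return False  # already LF-only, nothing to write
    path.write_bytes(fixed)
    return True


if __name__ == "__main__":
    # Paths taken from the review comments above; missing files are skipped.
    for name in ("examples/seed/config_holiday.yaml", "examples/seed/config_sparse.yaml"):
        target = Path(name)
        if target.exists():
            print(name, "converted" if normalize_line_endings(target) else "unchanged")
```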

In `@scripts/seed_random.py`:
- Around line 355-357: The code calls ScenarioPreset(args.scenario) directly
which will raise ValueError for invalid names like "rag-agent"; update the
branch that constructs config via SeederConfig.from_scenario to validate or map
args.scenario before creating the enum: either translate known aliases (e.g.,
map "rag-agent" to the correct ScenarioPreset member) or wrap the conversion in
a safe lookup that raises a friendly error (or falls back) instead of letting
ScenarioPreset(...) throw; adjust the logic around SeederConfig.from_scenario
and ScenarioPreset to use the safe mapping/validation so passing "--scenario
rag-agent" no longer crashes.
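
A minimal sketch of the safe lookup. The enum below is a stand-in whose member values are assumed to match the preset names listed in this PR; the real `ScenarioPreset` lives in `app/shared/seeder/config.py`, and `parse_scenario` is a hypothetical helper.

```python
from enum import Enum


class ScenarioPreset(Enum):  # stand-in for the real enum; values are assumed
    RETAIL_STANDARD = "retail_standard"
    HOLIDAY_RUSH = "holiday_rush"
    HIGH_VARIANCE = "high_variance"
    STOCKOUT_HEAVY = "stockout_heavy"
    NEW_LAUNCHES = "new_launches"
    SPARSE = "sparse"


def parse_scenario(name: str) -> ScenarioPreset:
    """Convert a CLI string to a preset, with a readable error for unknown names."""
    try:
        return ScenarioPreset(name)
    except ValueError:
        valid = ", ".join(preset.value for preset in ScenarioPreset)
        raise ValueError(f"Unknown scenario {name!r}; valid presets: {valid}") from None


print(parse_scenario("holiday_rush"))
try:
    parse_scenario("rag-agent")
except ValueError as exc:
    print(exc)
```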
🧹 Nitpick comments (4)
app/shared/seeder/rag_scenario.py (1)

264-279: Reset scenario state at the start of each run

Reusing a single RAGScenarioResult can leak errors/flags across multiple run() calls. Reinitialize at the start so each execution is clean.

♻️ Proposed fix
 async def run(self, dry_run: bool = False) -> RAGScenarioResult:
     """Execute the RAG + Agent E2E scenario.
@@
-        logger.info("seeder.rag_scenario.started", dry_run=dry_run)
+        self.result = RAGScenarioResult()
+        logger.info("seeder.rag_scenario.started", dry_run=dry_run)
app/shared/seeder/tests/test_config.py (1)

52-92: Add coverage for NEW_LAUNCHES preset.
ScenarioPreset.NEW_LAUNCHES is defined but not exercised in TestSeederConfig.

✅ Suggested test
     def test_from_scenario_stockout_heavy(self):
         """Test stockout_heavy scenario preset."""
         config = SeederConfig.from_scenario(ScenarioPreset.STOCKOUT_HEAVY)
 
         assert config.retail.stockout_probability == 0.25
         assert config.retail.stockout_behavior == "zero"
 
+    def test_from_scenario_new_launches(self):
+        """Test new_launches scenario preset."""
+        config = SeederConfig.from_scenario(ScenarioPreset.NEW_LAUNCHES)
+
+        assert config.dimensions.products == 100
+        assert config.retail.new_product_ramp_days == 45
+
     def test_from_scenario_sparse(self):
         """Test sparse scenario preset."""
         config = SeederConfig.from_scenario(ScenarioPreset.SPARSE)
app/shared/seeder/config.py (1)

97-112: Add validation for sparsity ranges.
missing_combinations_pct outside [0,1] or gap_min_days > gap_max_days can lead to invalid behavior later.

🧰 Suggested validation
 @dataclass
 class SparsityConfig:
@@
     missing_combinations_pct: float = 0.0
     random_gaps_per_series: int = 0
     gap_min_days: int = 1
     gap_max_days: int = 7
+
+    def __post_init__(self) -> None:
+        if not 0.0 <= self.missing_combinations_pct <= 1.0:
+            raise ValueError("missing_combinations_pct must be between 0.0 and 1.0")
+        if self.random_gaps_per_series < 0:
+            raise ValueError("random_gaps_per_series must be >= 0")
+        if self.gap_min_days < 1 or self.gap_max_days < self.gap_min_days:
+            raise ValueError("gap_min_days must be >= 1 and <= gap_max_days")
scripts/seed_random.py (1)

303-308: Engine resource not disposed after use.

The AsyncEngine created here is never explicitly disposed. While the session is closed in main()'s finally block, the engine's connection pool persists until process exit.

For a CLI script that exits immediately, this is acceptable, but for completeness consider returning both or disposing the engine.

♻️ Optional: Return engine for explicit disposal
-async def get_session() -> AsyncSession:
+async def get_session() -> tuple[AsyncSession, AsyncEngine]:
     """Create database session."""
     settings = get_settings()
     engine = create_async_engine(settings.database_url)
     session_maker = async_sessionmaker(engine, class_=AsyncSession, expire_on_commit=False)
-    return session_maker()
+    return session_maker(), engine

Then in main():

session, engine = await get_session()
try:
    # ...
finally:
    await session.close()
    await engine.dispose()

Comment thread app/shared/seeder/core.py Outdated
Comment thread app/shared/seeder/core.py
Comment thread app/shared/seeder/generators/calendar.py
Comment thread app/shared/seeder/generators/facts.py
Comment thread app/shared/seeder/generators/product.py Outdated
Comment thread app/shared/seeder/rag_scenario.py
Comment thread app/shared/seeder/tests/test_integration.py
Comment thread examples/seed/config_holiday.yaml Outdated
Comment on lines +1 to +5
# Holiday rush scenario configuration
# Use with: uv run python scripts/seed_random.py --full-new --config examples/seed/config_holiday.yaml --confirm

dimensions:
  stores:

⚠️ Potential issue | 🟡 Minor

Use LF line endings to satisfy yamllint

Same CRLF line-ending issue flagged by YAMLlint. Convert to LF.

🧰 Tools
🪛 YAMLlint (1.38.0)

[error] 1-1: wrong new line character: expected \n

(new-lines)

🤖 Prompt for AI Agents
In `@examples/seed/config_holiday.yaml` around lines 1 - 5, The YAML file contains
CRLF line endings which fail yamllint; convert the file to use LF line endings
(Unix style) so the YAML (including the top-level keys like "dimensions" and
"stores") passes yamllint. Ensure your editor or commit normalizes line endings
(or run a tool to replace CRLF with LF) and re-commit the updated
config_holiday.yaml.

Comment thread examples/seed/config_sparse.yaml Outdated
Comment thread scripts/seed_random.py
Core improvements:
- Fix rowcount detection to use explicit None check (avoids treating 0 as falsy)
- Log and track fact table counts when scope="dimensions" for visibility
- Add Juneteenth to DEFAULT_US_HOLIDAYS

Safety guards:
- Add max attempts cap for SKU generation to prevent infinite loops
- Add max attempts cap for store code generation to prevent infinite loops
- Validate SKU/code space capacity in constructors
- Add safety guard for destructive test operations (requires opt-in)
- Add preflight RAG configuration check before running scenario

Bug fixes:
- Guard against division by zero in new_product_ramp_days calculation
- Validate scenario argument before creating ScenarioPreset enum
- Convert YAML files from CRLF to LF line endings

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@app/shared/seeder/core.py`:
- Around line 1-117: Ruff formatting is failing for app/shared/seeder/core.py;
run the formatter and commit the changes. Reformat the file (e.g., run the
project task uv run ruff format or locally run ruff format) and ensure
DataSeeder and the _batch_insert method follow the project's ruff/black style so
CI's ruff format --check passes; then stage and commit the reformatted core.py.

In `@app/shared/seeder/generators/store.py`:
- Around line 101-123: The _generate_unique_code method currently relies solely
on randomized retries and can raise RuntimeError near capacity; change it to
first check remaining = MAX_CODE_SPACE - len(self._used_codes) and if remaining
is small (e.g., <= MAX_CODE_ATTEMPTS) or after failing the randomized loop,
compute the set of available codes (all codes S0001..S9999 minus
self._used_codes) and pick one deterministically (e.g., pop() or sorted()[0])
and add it to self._used_codes before returning; keep the existing early
exhaustion check using MAX_CODE_SPACE and only raise RuntimeError if no
available codes remain.
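
The fallback described above might look like the sketch below. `MAX_CODE_SPACE` follows the S0001..S9999 range named in the comment; the `MAX_CODE_ATTEMPTS` value, the standalone function form, and the choice of the lowest unused code are assumptions.

```python
import random

MAX_CODE_SPACE = 9_999    # codes S0001..S9999, per the review comment
MAX_CODE_ATTEMPTS = 100   # assumed cap on random retries


def generate_unique_code(rng: random.Random, used: set[str]) -> str:
    """Try random codes first; near capacity, pick the lowest unused code instead."""
    remaining = MAX_CODE_SPACE - len(used)
    if remaining <= 0:
        raise RuntimeError("store code space exhausted")
    if remaining > MAX_CODE_ATTEMPTS:
        for _ in range(MAX_CODE_ATTEMPTS):
            code = f"S{rng.randint(1, MAX_CODE_SPACE):04d}"
            if code not in used:
                used.add(code)
                return code
    # Deterministic fallback: scan the code space in order.
    for number in range(1, MAX_CODE_SPACE + 1):
        code = f"S{number:04d}"
        if code not in used:
            used.add(code)
            return code
    raise RuntimeError("store code space exhausted")


nearly_full = {f"S{n:04d}" for n in range(1, MAX_CODE_SPACE + 1)} - {"S0042"}
print(generate_unique_code(random.Random(7), nearly_full))  # S0042
```

The linear scan costs at most 9,999 set lookups, which is negligible next to the database inserts, and it guarantees a result whenever any code is still free.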

In `@scripts/seed_random.py`:
- Around line 84-87: yaml.safe_load(path.open()) result stored in the local
variable data may be None or a non-dict, causing an AttributeError when later
code calls data.get(); add an explicit type check after the safe_load call
(check that data is an instance of dict) and if not raise a ValueError with a
clear message about invalid/empty YAML config so subsequent code that uses
data.get(...) is safe; reference the variable name data and the
yaml.safe_load(...) call in scripts/seed_random.py to locate the change.
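
A sketch of the type check, kept independent of PyYAML so it runs on its own. The helper name is hypothetical; it would wrap the value returned by `yaml.safe_load`, which is `None` for an empty document and can be a list or scalar for other inputs.

```python
from typing import Any


def require_mapping(data: Any, source: str) -> dict[str, Any]:
    """Ensure parsed YAML is a top-level mapping before calling .get() on it."""
    if not isinstance(data, dict):
        raise ValueError(
            f"{source}: expected a YAML mapping at the top level, "
            f"got {type(data).__name__} (is the file empty or malformed?)"
        )
    return data


print(require_mapping({"dimensions": {"stores": 5}}, "config.yaml"))
try:
    require_mapping(None, "empty.yaml")  # what an empty file parses to
except ValueError as exc:
    print(exc)
```

In the CLI this would be used as `data = require_mapping(yaml.safe_load(fh), str(path))`, where `fh` is the open config file.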

Comment thread app/shared/seeder/core.py
Comment thread app/shared/seeder/generators/store.py Outdated
Comment thread scripts/seed_random.py
w7-learn and others added 2 commits February 2, 2026 11:31
- Format core.py with ruff
- Update test guard to also check APP_ENV=testing (used in CI)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Store/Product generators: add deterministic fallback when near capacity
  to guarantee code generation instead of raising RuntimeError
- seed_random.py: validate yaml.safe_load returns a dict before using .get()

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@w7-mgfcode w7-mgfcode merged commit 85a098c into main Feb 2, 2026
11 of 12 checks passed

2 participants