feat(seeder): Data Seeder REST API + Dashboard UI (The Forge)#68
Conversation
- Add seeder feature to README.md (features, project structure, CLI section) - Update docs/ARCHITECTURE.md with seeder module and Phase 12 completion - Create docs/DATA-SEEDER.md with comprehensive seeder documentation - Update Phase 10 Dashboard status to implemented Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add new feature slice providing REST API access to the Data Seeder: Endpoints: - GET /seeder/status - Database row counts and date range - GET /seeder/scenarios - List 6 scenario presets - POST /seeder/generate - Generate new dataset (with dry_run) - POST /seeder/append - Append data to existing dataset - DELETE /seeder/data - Delete with scope (all/facts/dimensions) - POST /seeder/verify - Run 5 data integrity checks Features: - Production guard blocks operations in production by default - Dry run mode previews changes without executing - Integrates with existing DataSeeder from app/shared/seeder/ - 55 unit tests covering schemas, service, and routes Also adds INITIAL-13.md planning document for Forge UI frontend. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds visual interface for The Forge data seeder in the Admin panel, allowing users to generate, delete, and verify synthetic datasets without using the CLI. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
📝 WalkthroughWalkthroughIntroduces "The Forge," a comprehensive synthetic data generation feature spanning backend REST API endpoints (status, scenarios, generate, append, delete, verify), Pydantic schemas, service layer logic, and a frontend React admin panel with data-fetching hooks. Includes documentation, comprehensive test coverage, and integration with existing admin dashboard. Changes
Sequence DiagramsequenceDiagram
participant Admin as Admin UI<br/>(SeederPanel)
participant Hook as React Query<br/>Hooks
participant API as FastAPI Routes<br/>(/seeder/*)
participant Service as Service Layer<br/>(seeder.service)
participant Seeder as DataSeeder<br/>Orchestrator
participant DB as Database
Admin->>Hook: useGenerateData()
activate Hook
Hook->>API: POST /seeder/generate<br/>{scenario, stores, ...}
deactivate Hook
activate API
API->>Service: service.generate_data(params, db)
deactivate API
activate Service
Service->>Service: _build_config_from_params()
Service->>Seeder: DataSeeder.generate_full()
activate Seeder
Seeder->>DB: Create stores, products,<br/>calendar, sales, etc.
activate DB
DB-->>Seeder: Row counts
deactivate DB
Seeder-->>Service: GenerateResult
deactivate Seeder
Service-->>API: GenerateResult<br/>{success, records_created}
deactivate Service
activate API
API-->>Hook: 200 OK + Result
deactivate API
Hook->>Hook: Invalidate caches<br/>(status, analytics, KPI)
Hook-->>Admin: Update UI<br/>with result
Note over Admin,DB: SeederPanel refreshes<br/>status & displays results
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Sorry @w7-mgfcode, your pull request is larger than the review limit of 150000 diff characters
There was a problem hiding this comment.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/ARCHITECTURE.md (1)
850-861:⚠️ Potential issue | 🟡 MinorDuplicate section numbering: two sections labeled "13".
Section 13 appears twice — once for "Quality, CI, and Review Rules" (line 850) and again for "Roadmap" (line 861). Renumber one of these sections.
Proposed fix
-## 13) Quality, CI, and Review Rules +## 14) Quality, CI, and Review Rules The repo standards live in `docs/validation/` and are treated as merge gates:Or alternatively, renumber the Roadmap section to 14.
🤖 Fix all issues with AI agents
In `@app/features/seeder/routes.py`:
- Around line 107-117: The handler currently returns raw exception text in the
HTTPException detail (see the block using logger.error and raising HTTPException
with status.HTTP_500_INTERNAL_SERVER_ERROR); change the raised HTTPException to
use a generic message (e.g., "Internal server error" or "Generation failed")
instead of f"Generation failed: {e}" and keep the existing structured
logger.error(exc_info=True, error=str(e), error_type=type(e).__name__) so
diagnostics remain in logs; apply the same change to the other handlers
referenced (append/delete/verify) that raise HTTPException with interpolated
exception messages.
In `@app/features/seeder/schemas.py`:
- Around line 44-98: Add validation to GenerateParams and AppendParams to ensure
end_date is not earlier than start_date: implement a Pydantic validator (e.g., a
`@root_validator` or a `@validator` that inspects both fields) inside each class to
check start_date <= end_date and raise a ValueError with a clear message (e.g.,
"end_date must be on or after start_date") when the check fails; reference the
GenerateParams and AppendParams classes and their start_date/end_date fields so
the validation runs during model instantiation and prevents negative day counts
downstream.
In `@app/features/seeder/service.py`:
- Around line 53-78: The preset returned by _get_scenario_preset is being fully
overwritten by GenerateParams defaults; change the merge logic so only fields
explicitly provided on params override the preset: call
SeederConfig.from_scenario(preset, seed=...) to get base config, then check
params.model_fields_set (Pydantic v2) for each explicit field (e.g., "seed",
"start_date", "end_date", "stores", "products", "sparsity") and only assign into
config when present — for dimensions, update DimensionConfig(stores=...,
products=...) only if respective fields are in model_fields_set (merge with
existing preset dimensions otherwise), and only set config.sparsity =
SparsityConfig(...) when "sparsity" is explicitly provided and > 0; leave all
other preset values intact.
In `@app/features/seeder/tests/test_routes.py`:
- Around line 29-35: The current mock_db fixture replaces get_db with a regular
return_value instead of an async generator and is unused by tests; replace it by
implementing an async override for get_db (e.g., override_get_db) that yields an
AsyncMock AsyncSession and register it via app.dependency_overrides[get_db]
before creating the test AsyncClient (using ASGITransport) and clear
app.dependency_overrides after the test; update or remove the old mock_db
fixture and ensure the client fixture is an async generator that yields the
AsyncClient so FastAPI dependency injection receives the mocked async DB
session.
In `@app/features/seeder/tests/test_service.py`:
- Around line 171-187: test_empty_database's mock_db.execute.side_effect only
supplies 7 scalar-returning mocks but service.get_status calls additional
queries for date range and last-updated; extend the side_effect to include two
more AsyncMock/MagicMock results for those queries (e.g., mocks whose scalar()
return_value is None for date_range_start/date_range_end and updated_at) so
mock_db.execute has the full set of responses the get_status function expects;
update the list creation in test_empty_database (the side_effect passed to
mock_db.execute) to include these two extra mocks to avoid index/stop iteration
errors when service.get_status runs.
In `@frontend/src/pages/admin.tsx`:
- Around line 391-399: The UI currently disables the Delete/Verify actions based
only on the sales count; change the logic to compute a boolean hasData that
aggregates all relevant counts (e.g., salesCount, dimensionCounts, factCounts or
whatever props/state represent dimensions/facts) and use hasData instead of the
sales-only check to control the disabled state of the Delete and Verify controls
(update the button/trigger usages referenced around selectedScenario,
deleteDialogOpen and verifyResult and also the other usage in the block
originally noted at lines ~498-533). Ensure the new hasData is computed where
the component derives its summary counts so both Delete and Verify reference the
same aggregated flag.
- Around line 401-422: handleGenerate and handleDelete don't clear the existing
verifyResult after modifying data; after a successful
generateMutation.mutateAsync and deleteMutation.mutateAsync respectively, reset
the verification state (e.g., call setVerifyResult(null) or
setVerifyResult(undefined)) so stale verifyResult isn't shown; update both
handlers immediately after the successful result handling (before showing
success toast) and reference generateMutation, deleteMutation, handleGenerate,
handleDelete, and verifyResult (and its setter) when making the change.
In `@INITIAL-13.md`:
- Around line 122-135: The fenced ASCII diagram blocks use bare triple backticks
and should specify a language to satisfy markdownlint MD040; update each diagram
block (e.g., the block that begins with the box showing "Current Data Summary"
and the other similar ASCII-art blocks) to use ```text instead of ``` so the
fence becomes ```text ... ```; apply this change consistently to all
diagram/code fences mentioned in the review.
- Around line 27-44: Add blank lines before and after the Markdown table that
starts with the header row "| Component | Use Case in Forge UI |" so the table
is separated from surrounding text (fixes markdownlint MD058); edit the
README/markdown content where the table is defined and insert a single empty
line above the table header and a single empty line after the table's closing
row.
🧹 Nitpick comments (6)
docs/ARCHITECTURE.md (2)
739-747: Add language specifier to fenced code block.The directory structure code block should specify a language (e.g.,
textorplaintext) for consistency and to satisfy markdown linting.Proposed fix
-``` +```text frontend/src/components/ ├── ui/ # shadcn/ui components (26 components)
795-808: Add language specifier to fenced code block.Same as above — specify
textorplaintextfor the directory tree.Proposed fix
-``` +```text app/shared/seeder/ ├── config.py # Configuration dataclasses, scenario presetsapp/features/seeder/tests/test_routes.py (1)
156-163: Validation test forstores=0is good coverage.This correctly verifies that the API rejects invalid parameters with 422. Consider adding a similar test for
products=0or boundary values likestores=101(above max).app/features/seeder/tests/test_service.py (1)
297-298: Hardcoded assertiontotal_checks == 5couples test to implementation.If the number of verification checks changes, this test will fail. Consider a more flexible assertion or document why exactly 5 checks are expected.
Suggestion
- assert result.total_checks == 5 + assert result.total_checks >= 1 # At least one check should run + # Or document: 5 checks = FK integrity, date ranges, etc.Alternatively, keep the exact assertion but add a comment explaining what the 5 checks are.
PRPs/PRP-13-data-seeder-dashboard.md (2)
410-410: Unused import:formatfromdate-fnsis imported but not used in the shown code.The import is listed but the
formatfunction doesn't appear in the SeederPanel implementation. Either remove it or document where it should be used (e.g., formatting date range display).Suggestion
If date formatting is needed for the date range display, use it in the CardDescription:
<CardDescription> {status?.date_range_start && status?.date_range_end ? `${format(new Date(status.date_range_start), 'MMM d, yyyy')} → ${format(new Date(status.date_range_end), 'MMM d, yyyy')}` : 'No data yet'} </CardDescription>Otherwise, remove the import.
649-656: SelectItem with nested<div>elements may cause accessibility issues.Radix Select expects specific content structure. Nested block elements inside SelectItem can break keyboard navigation. Consider using
<span>with flexbox instead.Proposed fix
<SelectItem key={s.name} value={s.name}> - <div className="flex flex-col"> - <span>{formatScenarioLabel(s.name)}</span> - <span className="text-xs text-muted-foreground">{s.description}</span> - </div> + <span className="flex flex-col items-start"> + <span>{formatScenarioLabel(s.name)}</span> + <span className="text-xs text-muted-foreground">{s.description}</span> + </span> </SelectItem>
| except Exception as e: | ||
| logger.error( | ||
| "seeder.generate.failed", | ||
| error=str(e), | ||
| error_type=type(e).__name__, | ||
| exc_info=True, | ||
| ) | ||
| raise HTTPException( | ||
| status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, | ||
| detail=f"Generation failed: {e}", | ||
| ) from e |
There was a problem hiding this comment.
Avoid leaking internal exception details in 500 responses.
Line 115/Line 170/Line 225/Line 261 include raw exception messages in detail, which can expose internals. Prefer a generic error message and rely on structured logs for diagnostics.
🔒 Suggested fix (apply similarly to append/delete/verify)
- raise HTTPException(
- status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
- detail=f"Generation failed: {e}",
- ) from e
+ raise HTTPException(
+ status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+ detail="Generation failed.",
+ ) from eAlso applies to: 161-171, 216-226, 252-262
🤖 Prompt for AI Agents
In `@app/features/seeder/routes.py` around lines 107 - 117, The handler currently
returns raw exception text in the HTTPException detail (see the block using
logger.error and raising HTTPException with
status.HTTP_500_INTERNAL_SERVER_ERROR); change the raised HTTPException to use a
generic message (e.g., "Internal server error" or "Generation failed") instead
of f"Generation failed: {e}" and keep the existing structured
logger.error(exc_info=True, error=str(e), error_type=type(e).__name__) so
diagnostics remain in logs; apply the same change to the other handlers
referenced (append/delete/verify) that raise HTTPException with interpolated
exception messages.
| class GenerateParams(BaseModel): | ||
| """Parameters for generating a new dataset.""" | ||
|
|
||
| scenario: str = Field( | ||
| default="retail_standard", | ||
| description="Scenario preset name", | ||
| ) | ||
| seed: int = Field( | ||
| default=42, | ||
| ge=0, | ||
| description="Random seed for reproducibility", | ||
| ) | ||
| stores: int = Field( | ||
| default=10, | ||
| ge=1, | ||
| le=100, | ||
| description="Number of stores to generate", | ||
| ) | ||
| products: int = Field( | ||
| default=50, | ||
| ge=1, | ||
| le=500, | ||
| description="Number of products to generate", | ||
| ) | ||
| start_date: date = Field( | ||
| default_factory=lambda: date(2024, 1, 1), | ||
| description="Start of date range", | ||
| ) | ||
| end_date: date = Field( | ||
| default_factory=lambda: date(2024, 12, 31), | ||
| description="End of date range", | ||
| ) | ||
| sparsity: float = Field( | ||
| default=0.0, | ||
| ge=0.0, | ||
| le=1.0, | ||
| description="Fraction of missing store/product combinations", | ||
| ) | ||
| dry_run: bool = Field( | ||
| default=False, | ||
| description="Preview only, do not execute", | ||
| ) | ||
|
|
||
|
|
||
| class AppendParams(BaseModel): | ||
| """Parameters for appending data to existing dataset.""" | ||
|
|
||
| start_date: date = Field(description="Start of new date range") | ||
| end_date: date = Field(description="End of new date range") | ||
| seed: int = Field( | ||
| default=43, | ||
| ge=0, | ||
| description="Random seed for reproducibility", | ||
| ) | ||
|
|
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
rg -n "pydantic" -g "pyproject.toml" -g "requirements*.txt" -g "poetry.lock" -g "Pipfile*" -g "setup.cfg"Repository: w7-mgfcode/ForecastLabAI
Length of output: 291
🏁 Script executed:
cat -n app/features/seeder/schemas.py | head -120Repository: w7-mgfcode/ForecastLabAI
Length of output: 4812
🏁 Script executed:
find app/features/seeder -name "*.py" -type f | grep -v __pycache__Repository: w7-mgfcode/ForecastLabAI
Length of output: 392
🏁 Script executed:
cat -n app/features/seeder/service.pyRepository: w7-mgfcode/ForecastLabAI
Length of output: 20823
🏁 Script executed:
cat -n app/features/seeder/tests/test_schemas.pyRepository: w7-mgfcode/ForecastLabAI
Length of output: 10543
Add date ordering validation to GenerateParams and AppendParams.
Both classes can accept end_date < start_date, causing negative day counts in dry-run (line 252 in service.py) and breaking generation/append flows.
Suggested fix
-from pydantic import BaseModel, Field
+from pydantic import BaseModel, Field, model_validator
class GenerateParams(BaseModel):
"""Parameters for generating a new dataset."""
scenario: str = Field(...)
# ... other fields ...
dry_run: bool = Field(
default=False,
description="Preview only, do not execute",
)
+
+ `@model_validator`(mode="after")
+ def _validate_date_range(self):
+ if self.end_date < self.start_date:
+ raise ValueError("end_date must be on or after start_date")
+ return self
class AppendParams(BaseModel):
"""Parameters for appending data to existing dataset."""
start_date: date = Field(description="Start of new date range")
end_date: date = Field(description="End of new date range")
seed: int = Field(...)
+
+ `@model_validator`(mode="after")
+ def _validate_date_range(self):
+ if self.end_date < self.start_date:
+ raise ValueError("end_date must be on or after start_date")
+ return self🤖 Prompt for AI Agents
In `@app/features/seeder/schemas.py` around lines 44 - 98, Add validation to
GenerateParams and AppendParams to ensure end_date is not earlier than
start_date: implement a Pydantic validator (e.g., a `@root_validator` or a
`@validator` that inspects both fields) inside each class to check start_date <=
end_date and raise a ValueError with a clear message (e.g., "end_date must be on
or after start_date") when the check fails; reference the GenerateParams and
AppendParams classes and their start_date/end_date fields so the validation runs
during model instantiation and prevents negative day counts downstream.
| preset = _get_scenario_preset(params.scenario) | ||
|
|
||
| if preset: | ||
| # Start from scenario preset and override with explicit params | ||
| config = SeederConfig.from_scenario(preset, seed=params.seed) | ||
| # Override dimensions if explicitly set (different from defaults) | ||
| config.dimensions = DimensionConfig( | ||
| stores=params.stores, | ||
| products=params.products, | ||
| ) | ||
| config.start_date = params.start_date | ||
| config.end_date = params.end_date | ||
| if params.sparsity > 0: | ||
| config.sparsity = SparsityConfig(missing_combinations_pct=params.sparsity) | ||
| else: | ||
| # Use default config with provided params | ||
| config = SeederConfig( | ||
| seed=params.seed, | ||
| start_date=params.start_date, | ||
| end_date=params.end_date, | ||
| dimensions=DimensionConfig( | ||
| stores=params.stores, | ||
| products=params.products, | ||
| ), | ||
| sparsity=SparsityConfig(missing_combinations_pct=params.sparsity), | ||
| ) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's locate and examine the GenerateParams and SeederConfig classes
fd "service.py" app/features/seeder/ -x cat {}Repository: w7-mgfcode/ForecastLabAI
Length of output: 29067
🏁 Script executed:
# Search for GenerateParams definition
rg "class GenerateParams" -A 20Repository: w7-mgfcode/ForecastLabAI
Length of output: 2272
🏁 Script executed:
# Search for SeederConfig definition and from_scenario method
rg "class SeederConfig" -A 30Repository: w7-mgfcode/ForecastLabAI
Length of output: 2342
🏁 Script executed:
# Look for scenario preset definitions
rg "scenario.*preset\|_get_scenario_preset" -A 10Repository: w7-mgfcode/ForecastLabAI
Length of output: 50
🏁 Script executed:
# Check if Pydantic is being used and what version
fd "requirements.txt|setup.py|pyproject.toml" -x cat {}Repository: w7-mgfcode/ForecastLabAI
Length of output: 5244
🏁 Script executed:
# Get full GenerateParams definition
rg "class GenerateParams" -A 30 app/features/seeder/schemas.pyRepository: w7-mgfcode/ForecastLabAI
Length of output: 932
🏁 Script executed:
# Get SeederConfig.from_scenario implementation
rg "def from_scenario" -A 50 app/shared/seeder/config.pyRepository: w7-mgfcode/ForecastLabAI
Length of output: 2039
🏁 Script executed:
# Check if SeederConfig is a dataclass and what its defaults are
rg "class SeederConfig|^from dataclasses|^@dataclass" -B 2 -A 30 app/shared/seeder/config.pyRepository: w7-mgfcode/ForecastLabAI
Length of output: 5916
🏁 Script executed:
# Look for any tests that verify scenario preset behavior
rg "holiday_rush|new_launches|from_scenario" -B 3 -A 10 app/features/seeder/tests/Repository: w7-mgfcode/ForecastLabAI
Length of output: 11126
🏁 Script executed:
# Check if GenerateParams actually exposes model_fields_set in Pydantic v2
rg "model_fields_set|__fields_set__" app/features/seeder/Repository: w7-mgfcode/ForecastLabAI
Length of output: 50
🏁 Script executed:
# Look at actual test for _build_config_from_params to see if the issue is caught
rg "test_custom_scenario|test_date_range_override" -A 20 app/features/seeder/tests/test_service.pyRepository: w7-mgfcode/ForecastLabAI
Length of output: 1533
🏁 Script executed:
# Check if there's any test that creates holiday_rush without explicit dates
rg "GenerateParams.*scenario.*holiday_rush" -B 2 -A 15 app/features/seeder/tests/Repository: w7-mgfcode/ForecastLabAI
Length of output: 50
Scenario presets are unconditionally overridden by GenerateParams defaults, silencing preset-specific configuration.
Lines 55–64 unconditionally rewrite preset values (e.g., holiday_rush sets start_date=2024-10-01, end_date=2024-12-31, exponential trend) with GenerateParams defaults. When a client requests a scenario without explicitly setting start_date, end_date, stores, or products, the generated config ignores the preset values and uses param defaults instead. This defeats the purpose of scenario presets.
Recommended fix: Use Pydantic v2's model_fields_set to detect which fields were explicitly provided, and only override preset values for those fields:
🛠️ Suggested fix (only override explicitly provided fields)
if preset:
# Start from scenario preset and override with explicit params
config = SeederConfig.from_scenario(preset, seed=params.seed)
- # Override dimensions if explicitly set (different from defaults)
- config.dimensions = DimensionConfig(
- stores=params.stores,
- products=params.products,
- )
- config.start_date = params.start_date
- config.end_date = params.end_date
- if params.sparsity > 0:
- config.sparsity = SparsityConfig(missing_combinations_pct=params.sparsity)
+ # Only override fields explicitly provided by client
+ fields_set = params.model_fields_set
+ if "stores" in fields_set or "products" in fields_set:
+ config.dimensions = DimensionConfig(
+ stores=params.stores if "stores" in fields_set else config.dimensions.stores,
+ products=params.products if "products" in fields_set else config.dimensions.products,
+ )
+ if "start_date" in fields_set:
+ config.start_date = params.start_date
+ if "end_date" in fields_set:
+ config.end_date = params.end_date
+ if "sparsity" in fields_set and params.sparsity > 0:
+ config.sparsity = SparsityConfig(missing_combinations_pct=params.sparsity)🤖 Prompt for AI Agents
In `@app/features/seeder/service.py` around lines 53 - 78, The preset returned by
_get_scenario_preset is being fully overwritten by GenerateParams defaults;
change the merge logic so only fields explicitly provided on params override the
preset: call SeederConfig.from_scenario(preset, seed=...) to get base config,
then check params.model_fields_set (Pydantic v2) for each explicit field (e.g.,
"seed", "start_date", "end_date", "stores", "products", "sparsity") and only
assign into config when present — for dimensions, update
DimensionConfig(stores=..., products=...) only if respective fields are in
model_fields_set (merge with existing preset dimensions otherwise), and only set
config.sparsity = SparsityConfig(...) when "sparsity" is explicitly provided and
> 0; leave all other preset values intact.
| @pytest.fixture | ||
| def mock_db(): | ||
| """Mock database session.""" | ||
| with patch("app.features.seeder.routes.get_db") as mock: | ||
| mock_session = AsyncMock() | ||
| mock.return_value = mock_session | ||
| yield mock_session |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify how get_db is defined and used in routes
echo "=== Checking get_db definition ==="
rg -n "def get_db" --type py -A 5
echo "=== Checking route dependency injection ==="
ast-grep --pattern 'Depends(get_db)'
echo "=== Looking for get_db in seeder routes ==="
rg -n "get_db" app/features/seeder/routes.py -B 2 -A 2Repository: w7-mgfcode/ForecastLabAI
Length of output: 4815
🏁 Script executed:
#!/bin/bash
# Check how mock_db fixture is used in tests
echo "=== Viewing test file to see mock_db usage ==="
cat -n app/features/seeder/tests/test_routes.py | head -80
echo ""
echo "=== Checking if tests reference or use mock_db ==="
rg -n "mock_db" app/features/seeder/tests/test_routes.py
echo ""
echo "=== Check for app.dependency_overrides usage elsewhere in tests ==="
rg -n "dependency_overrides" --type pyRepository: w7-mgfcode/ForecastLabAI
Length of output: 4794
🏁 Script executed:
#!/bin/bash
# Check one of the proper implementations
echo "=== Check agents conftest for dependency_overrides pattern ==="
cat -n app/features/agents/tests/conftest.py | sed -n '60,85p'
echo ""
echo "=== Check if override_get_db is an async generator ==="
rg -B 5 "override_get_db" app/features/agents/tests/conftest.py | head -20Repository: w7-mgfcode/ForecastLabAI
Length of output: 1546
🏁 Script executed:
#!/bin/bash
# Let's look at how mock_db is actually used in one of the tests
echo "=== View a test that uses mock_db to see how it's used ==="
cat -n app/features/seeder/tests/test_routes.py | sed -n '98,120p'
echo ""
echo "=== Check if mock_db is used directly in test or just passed ==="
cat -n app/features/seeder/tests/test_routes.py | sed -n '179,215p'
echo ""
echo "=== Check imports in test file ==="
head -15 app/features/seeder/tests/test_routes.pyRepository: w7-mgfcode/ForecastLabAI
Length of output: 3298
🏁 Script executed:
#!/bin/bash
# Check one route implementation to see how db is used
echo "=== Check get_status route implementation ==="
cat -n app/features/seeder/routes.py | sed -n '36,65p'
echo ""
echo "=== Check generate_data route implementation ==="
cat -n app/features/seeder/routes.py | sed -n '68,105p'Repository: w7-mgfcode/ForecastLabAI
Length of output: 2845
🏁 Script executed:
#!/bin/bash
# Let's check if tests are actually passing with the current setup
echo "=== Looking for test runner output or pytest results ==="
if [ -f pytest.ini ]; then cat pytest.ini; fi
echo ""
echo "=== Check if there are other seeder tests or if these are all the tests ==="
find app/features/seeder/tests -name "*.py" -type f
echo ""
echo "=== Verify the actual async generator behavior by checking service functions ==="
echo "Do service functions actually use the db parameter?"
rg -n "async def get_status" app/features/seeder/ -A 3
echo ""
echo "=== Check if db is actually used in service functions ==="
rg -n "def generate_data" app/features/seeder/service.py -A 5Repository: w7-mgfcode/ForecastLabAI
Length of output: 1326
🏁 Script executed:
#!/bin/bash
# Critical check: Do the route tests actually use the mocked db or just mock the service?
echo "=== Detailed view of test that accepts mock_db ==="
cat -n app/features/seeder/tests/test_routes.py | sed -n '41,63p'
echo ""
echo "=== Another test with mock_db to confirm pattern ==="
cat -n app/features/seeder/tests/test_routes.py | sed -n '275,305p'
echo ""
echo "=== Check if mock_db is ACTUALLY used or just accepted as parameter ==="
rg -n "mock_db\." app/features/seeder/tests/test_routes.py
echo ""
echo "=== Check conftest to see if there's anything that auto-uses mock_db ==="
cat app/features/seeder/tests/conftest.pyRepository: w7-mgfcode/ForecastLabAI
Length of output: 2918
🏁 Script executed:
#!/bin/bash
# Final verification: Check if the patched get_db is actually called by FastAPI
# when using TestClient by looking at how other tests properly implement this
echo "=== Compare with registry tests that use app.dependency_overrides ==="
cat -n app/features/registry/tests/conftest.py | sed -n '55,82p'
echo ""
echo "=== Check if TestClient actually invokes dependencies ==="
echo "Looking for any test that verifies mock_db was called..."
rg -n "mock\." app/features/seeder/tests/test_routes.py | grep -E "assert|call"Repository: w7-mgfcode/ForecastLabAI
Length of output: 1213
🏁 Script executed:
#!/bin/bash
# Final verification: Confirm that TestClient doesn't actually invoke the dependency
echo "=== Check what happens when you DON'T mock service (to confirm db is never used) ==="
cat -n app/features/seeder/tests/test_routes.py | sed -n '65,80p'
echo ""
echo "=== One more confirmation: Do any tests actually use the mocked db object? ==="
# The mock_db is a parameter but is it used?
python3 << 'EOF'
import re
with open('app/features/seeder/tests/test_routes.py', 'r') as f:
content = f.read()
# Find all test methods that have mock_db parameter
tests_with_mock_db = re.findall(r'def (test_\w+)\(self.*mock_db', content)
print(f"Tests with mock_db parameter: {len(tests_with_mock_db)}")
print(tests_with_mock_db[:5])
# Check if mock_db is ever accessed with dot notation in tests
if re.search(r'mock_db\.', content):
print("\nmock_db is used in tests: YES")
matches = re.findall(r'.*mock_db\..*', content)
for match in matches[:5]:
print(f" {match.strip()}")
else:
print("\nmock_db is used in tests: NO (accepted as parameter but never used)")
EOFRepository: w7-mgfcode/ForecastLabAI
Length of output: 1131
The mock_db fixture doesn't properly implement the async generator protocol and is inconsistent with how dependencies are mocked elsewhere in the codebase.
get_db is an async generator (async def get_db() -> AsyncGenerator[AsyncSession, None]), but the current fixture patches it with a simple return_value. This doesn't implement the generator protocol correctly. More importantly, the mock_db parameter is accepted by 11 tests but never actually used—tests work only because they mock the service layer functions directly, never invoking FastAPI's dependency injection.
Other feature tests (registry, agents, rag, backtesting, ingest) use the correct pattern with app.dependency_overrides and a proper async generator override. Align this with that standard:
`@pytest.fixture`
async def client() -> AsyncGenerator[AsyncClient, None]:
"""Create test client with database dependency override."""
async def override_get_db() -> AsyncGenerator[AsyncSession, None]:
mock_session = AsyncMock()
yield mock_session
app.dependency_overrides[get_db] = override_get_db
async with AsyncClient(
transport=ASGITransport(app=app),
base_url="http://test",
) as ac:
yield ac
app.dependency_overrides.clear()This ensures proper test isolation and dependency injection behavior.
🤖 Prompt for AI Agents
In `@app/features/seeder/tests/test_routes.py` around lines 29 - 35, The current
mock_db fixture replaces get_db with a regular return_value instead of an async
generator and is unused by tests; replace it by implementing an async override
for get_db (e.g., override_get_db) that yields an AsyncMock AsyncSession and
register it via app.dependency_overrides[get_db] before creating the test
AsyncClient (using ASGITransport) and clear app.dependency_overrides after the
test; update or remove the old mock_db fixture and ensure the client fixture is
an async generator that yields the AsyncClient so FastAPI dependency injection
receives the mocked async DB session.
| @pytest.mark.asyncio | ||
| async def test_empty_database(self): | ||
| """Test status for empty database.""" | ||
| mock_db = AsyncMock() | ||
|
|
||
| # Mock empty counts | ||
| mock_db.execute.side_effect = [ | ||
| *[MagicMock(scalar=MagicMock(return_value=0)) for _ in range(7)], | ||
| ] | ||
|
|
||
| status = await service.get_status(mock_db) | ||
|
|
||
| assert status.stores == 0 | ||
| assert status.products == 0 | ||
| assert status.sales == 0 | ||
| assert status.date_range_start is None | ||
| assert status.date_range_end is None |
There was a problem hiding this comment.
Empty database test mock setup appears incomplete.
The mock for test_empty_database only provides 7 count query results, but test_returns_status shows the service also queries date range and updated_at. If the service expects 9 total queries but only 7 are mocked, this could cause issues.
Proposed fix
`@pytest.mark.asyncio`
async def test_empty_database(self):
"""Test status for empty database."""
mock_db = AsyncMock()
# Mock empty counts
mock_db.execute.side_effect = [
*[MagicMock(scalar=MagicMock(return_value=0)) for _ in range(7)],
+ # Date range query (returns None for empty)
+ MagicMock(fetchone=MagicMock(return_value=None)),
+ # Updated_at query
+ MagicMock(scalar=MagicMock(return_value=None)),
]
status = await service.get_status(mock_db)🤖 Prompt for AI Agents
In `@app/features/seeder/tests/test_service.py` around lines 171 - 187,
test_empty_database's mock_db.execute.side_effect only supplies 7
scalar-returning mocks but service.get_status calls additional queries for date
range and last-updated; extend the side_effect to include two more
AsyncMock/MagicMock results for those queries (e.g., mocks whose scalar()
return_value is None for date_range_start/date_range_end and updated_at) so
mock_db.execute has the full set of responses the get_status function expects;
update the list creation in test_empty_database (the side_effect passed to
mock_db.execute) to include these two extra mocks to avoid index/stop iteration
errors when service.get_status runs.
| const [selectedScenario, setSelectedScenario] = useState('retail_standard') | ||
| const [deleteDialogOpen, setDeleteDialogOpen] = useState(false) | ||
| const [verifyResult, setVerifyResult] = useState<{ | ||
| passed: boolean | ||
| checks: VerifyCheck[] | ||
| passed_count: number | ||
| warning_count: number | ||
| failed_count: number | ||
| } | null>(null) |
There was a problem hiding this comment.
Don’t disable Delete/Verify solely on sales count.
If only dimensions exist (or facts are empty), the UI blocks delete/verify even though there’s still data to manage. Consider a hasData flag that aggregates all counts.
🛠️ Suggested update
const [verifyResult, setVerifyResult] = useState<{
passed: boolean
checks: VerifyCheck[]
passed_count: number
warning_count: number
failed_count: number
} | null>(null)
+ const hasData =
+ (status?.stores ?? 0) +
+ (status?.products ?? 0) +
+ (status?.calendar ?? 0) +
+ (status?.sales ?? 0) +
+ (status?.inventory ?? 0) +
+ (status?.price_history ?? 0) +
+ (status?.promotions ?? 0) >
+ 0
@@
<Button
variant="destructive"
- disabled={deleteMutation.isPending || (status?.sales ?? 0) === 0}
+ disabled={deleteMutation.isPending || !hasData}
>
@@
<Button
variant="outline"
onClick={handleVerify}
- disabled={verifyMutation.isPending || (status?.sales ?? 0) === 0}
+ disabled={verifyMutation.isPending || !hasData}
>Also applies to: 498-533
🤖 Prompt for AI Agents
In `@frontend/src/pages/admin.tsx` around lines 391 - 399, The UI currently
disables the Delete/Verify actions based only on the sales count; change the
logic to compute a boolean hasData that aggregates all relevant counts (e.g.,
salesCount, dimensionCounts, factCounts or whatever props/state represent
dimensions/facts) and use hasData instead of the sales-only check to control the
disabled state of the Delete and Verify controls (update the button/trigger
usages referenced around selectedScenario, deleteDialogOpen and verifyResult and
also the other usage in the block originally noted at lines ~498-533). Ensure
the new hasData is computed where the component derives its summary counts so
both Delete and Verify reference the same aggregated flag.
| const handleGenerate = async () => { | ||
| try { | ||
| const result = await generateMutation.mutateAsync({ | ||
| scenario: selectedScenario, | ||
| }) | ||
| toast.success( | ||
| `Generated ${result.records_created.sales?.toLocaleString() ?? 0} sales records in ${result.duration_seconds.toFixed(1)}s` | ||
| ) | ||
| } catch (err) { | ||
| toast.error(err instanceof Error ? err.message : 'Generation failed') | ||
| } | ||
| } | ||
|
|
||
| const handleDelete = async () => { | ||
| try { | ||
| const result = await deleteMutation.mutateAsync({ scope: 'all' }) | ||
| setDeleteDialogOpen(false) | ||
| toast.success(result.message) | ||
| } catch (err) { | ||
| toast.error(err instanceof Error ? err.message : 'Delete failed') | ||
| } | ||
| } |
There was a problem hiding this comment.
Clear stale verification results after data changes.
verifyResult remains visible after generate/delete, which can mislead users. Reset it on successful mutations.
🛠️ Suggested update
try {
const result = await generateMutation.mutateAsync({
scenario: selectedScenario,
})
+ setVerifyResult(null)
toast.success(
`Generated ${result.records_created.sales?.toLocaleString() ?? 0} sales records in ${result.duration_seconds.toFixed(1)}s`
)
@@
try {
const result = await deleteMutation.mutateAsync({ scope: 'all' })
+ setVerifyResult(null)
setDeleteDialogOpen(false)
toast.success(result.message)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const handleGenerate = async () => { | |
| try { | |
| const result = await generateMutation.mutateAsync({ | |
| scenario: selectedScenario, | |
| }) | |
| toast.success( | |
| `Generated ${result.records_created.sales?.toLocaleString() ?? 0} sales records in ${result.duration_seconds.toFixed(1)}s` | |
| ) | |
| } catch (err) { | |
| toast.error(err instanceof Error ? err.message : 'Generation failed') | |
| } | |
| } | |
| const handleDelete = async () => { | |
| try { | |
| const result = await deleteMutation.mutateAsync({ scope: 'all' }) | |
| setDeleteDialogOpen(false) | |
| toast.success(result.message) | |
| } catch (err) { | |
| toast.error(err instanceof Error ? err.message : 'Delete failed') | |
| } | |
| } | |
| const handleGenerate = async () => { | |
| try { | |
| const result = await generateMutation.mutateAsync({ | |
| scenario: selectedScenario, | |
| }) | |
| setVerifyResult(null) | |
| toast.success( | |
| `Generated ${result.records_created.sales?.toLocaleString() ?? 0} sales records in ${result.duration_seconds.toFixed(1)}s` | |
| ) | |
| } catch (err) { | |
| toast.error(err instanceof Error ? err.message : 'Generation failed') | |
| } | |
| } | |
| const handleDelete = async () => { | |
| try { | |
| const result = await deleteMutation.mutateAsync({ scope: 'all' }) | |
| setVerifyResult(null) | |
| setDeleteDialogOpen(false) | |
| toast.success(result.message) | |
| } catch (err) { | |
| toast.error(err instanceof Error ? err.message : 'Delete failed') | |
| } | |
| } |
🤖 Prompt for AI Agents
In `@frontend/src/pages/admin.tsx` around lines 401 - 422, handleGenerate and
handleDelete don't clear the existing verifyResult after modifying data; after a
successful generateMutation.mutateAsync and deleteMutation.mutateAsync
respectively, reset the verification state (e.g., call setVerifyResult(null) or
setVerifyResult(undefined)) so stale verifyResult isn't shown; update both
handlers immediately after the successful result handling (before showing
success toast) and reference generateMutation, deleteMutation, handleGenerate,
handleDelete, and verifyResult (and its setter) when making the change.
| **Available shadcn/ui Components (Already Installed):** | ||
| | Component | Use Case in Forge UI | | ||
| |-----------|---------------------| | ||
| | `Card` | KPI summary cards, configuration panels | | ||
| | `Tabs` | Switch between Operations/Config/Logs views | | ||
| | `Button` | Action triggers (Generate, Delete, Append) | | ||
| | `AlertDialog` | Confirmation for destructive operations | | ||
| | `Dialog` | Configuration modals | | ||
| | `Select` | Scenario picker, scope selector | | ||
| | `Input` | Seed, stores, products inputs | | ||
| | `Calendar` | Date range picker | | ||
| | `Progress` | Operation progress indicator | | ||
| | `Badge` | Status indicators | | ||
| | `Table` | Row count summary, verification results | | ||
| | `Accordion` | Collapsible log sections | | ||
| | `Skeleton` | Loading states | | ||
| | `Sonner` | Toast notifications | | ||
|
|
There was a problem hiding this comment.
Add blank lines around the table (markdownlint MD058).
The table at Line 28 isn’t surrounded by blank lines, which triggers MD058.
✍️ Suggested fix
**Available shadcn/ui Components (Already Installed):**
+
| Component | Use Case in Forge UI |
|-----------|---------------------|
| `Card` | KPI summary cards, configuration panels |
@@
| `Skeleton` | Loading states |
| `Sonner` | Toast notifications |
+📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| **Available shadcn/ui Components (Already Installed):** | |
| | Component | Use Case in Forge UI | | |
| |-----------|---------------------| | |
| | `Card` | KPI summary cards, configuration panels | | |
| | `Tabs` | Switch between Operations/Config/Logs views | | |
| | `Button` | Action triggers (Generate, Delete, Append) | | |
| | `AlertDialog` | Confirmation for destructive operations | | |
| | `Dialog` | Configuration modals | | |
| | `Select` | Scenario picker, scope selector | | |
| | `Input` | Seed, stores, products inputs | | |
| | `Calendar` | Date range picker | | |
| | `Progress` | Operation progress indicator | | |
| | `Badge` | Status indicators | | |
| | `Table` | Row count summary, verification results | | |
| | `Accordion` | Collapsible log sections | | |
| | `Skeleton` | Loading states | | |
| | `Sonner` | Toast notifications | | |
| **Available shadcn/ui Components (Already Installed):** | |
| | Component | Use Case in Forge UI | | |
| |-----------|---------------------| | |
| | `Card` | KPI summary cards, configuration panels | | |
| | `Tabs` | Switch between Operations/Config/Logs views | | |
| | `Button` | Action triggers (Generate, Delete, Append) | | |
| | `AlertDialog` | Confirmation for destructive operations | | |
| | `Dialog` | Configuration modals | | |
| | `Select` | Scenario picker, scope selector | | |
| | `Input` | Seed, stores, products inputs | | |
| | `Calendar` | Date range picker | | |
| | `Progress` | Operation progress indicator | | |
| | `Badge` | Status indicators | | |
| | `Table` | Row count summary, verification results | | |
| | `Accordion` | Collapsible log sections | | |
| | `Skeleton` | Loading states | | |
| | `Sonner` | Toast notifications | | |
🧰 Tools
🪛 markdownlint-cli2 (0.20.0)
[warning] 28-28: Tables should be surrounded by blank lines
(MD058, blanks-around-tables)
🤖 Prompt for AI Agents
In `@INITIAL-13.md` around lines 27 - 44, Add blank lines before and after the
Markdown table that starts with the header row "| Component | Use Case in Forge
UI |" so the table is separated from surrounding text (fixes markdownlint
MD058); edit the README/markdown content where the table is defined and insert a
single empty line above the table header and a single empty line after the
table's closing row.
| ``` | ||
| ┌─────────────────────────────────────────────────────────────┐ | ||
| │ Current Data Summary [Refresh] │ | ||
| ├─────────────────────────────────────────────────────────────┤ | ||
| │ ┌─────────┬─────────┬─────────┬─────────┬─────────┐ │ | ||
| │ │ Stores │Products │Calendar │ Sales │Inventory│ │ | ||
| │ │ 10 │ 50 │ 365 │ 127,450 │ 182,500 │ │ | ||
| │ │ +0% │ +0% │ +0% │ +12% │ +8% │ │ | ||
| │ └─────────┴─────────┴─────────┴─────────┴─────────┘ │ | ||
| │ │ | ||
| │ Date Range: 2024-01-01 → 2024-12-31 (365 days) │ | ||
| │ Last Updated: 2 hours ago │ | ||
| └─────────────────────────────────────────────────────────────┘ | ||
| ``` |
There was a problem hiding this comment.
Specify languages for fenced code blocks (markdownlint MD040).
Multiple fenced blocks use bare ```; lint expects a language. For ASCII diagrams, text is a good fit.
✍️ Suggested fix (apply similarly to the other blocks)
-```
+```text
┌─────────────────────────────────────────────────────────────┐
│ Current Data Summary [Refresh] │
@@
└─────────────────────────────────────────────────────────────┘
-```
+```Also applies to: 146-159, 170-188, 201-216, 228-244, 257-288
🧰 Tools
🪛 markdownlint-cli2 (0.20.0)
[warning] 122-122: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
In `@INITIAL-13.md` around lines 122 - 135, The fenced ASCII diagram blocks use
bare triple backticks and should specify a language to satisfy markdownlint
MD040; update each diagram block (e.g., the block that begins with the box
showing "Current Data Summary" and the other similar ASCII-art blocks) to use
```text instead of ``` so the fence becomes ```text ... ```; apply this change
consistently to all diagram/code fences mentioned in the review.
Summary
/seeder/*)Changes
Backend API (
app/features/seeder/)GET /seeder/status- Get database row counts and date rangeGET /seeder/scenarios- List available scenario presetsPOST /seeder/generate- Generate new synthetic datasetPOST /seeder/append- Append data to existing datasetDELETE /seeder/data- Delete data with scope selectionPOST /seeder/verify- Run data integrity checksFrontend Dashboard (
frontend/src/pages/admin.tsx)Documentation
docs/DATA-SEEDER.md- Comprehensive feature documentationREADME.mdanddocs/ARCHITECTURE.mdTest plan
uv run pytest app/features/seeder/tests/ -v)cd frontend && pnpm build)cd frontend && pnpm tsc --noEmit)/adminand click "Data Seeder" tab🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes
New Features
Documentation