fix(data): anchor seeded data window to the current date (#181)#182
Conversation
Reviewer's GuideAnchors seeded data, seeder API defaults, and demo pipelines to a moving date window ending on the current UTC date instead of hard‑coded 2024 ranges, while explicitly leaving the holiday_rush scenario calendar‑pinned; updates scripts and tests accordingly. Sequence diagram for today-anchored demo seeding flowsequenceDiagram
actor Developer
participant RunDemo as run_demo.step_seed
participant HttpClient as HttpClient.request
participant SeederAPI as POST_/seeder/generate
Developer->>RunDemo: invoke step_seed()
RunDemo->>RunDemo: datetime.now(UTC).date()
RunDemo->>RunDemo: seed_start = seed_end - DEMO_SEED_SPAN_DAYS
RunDemo->>HttpClient: request("seed", "POST", start_date, end_date)
HttpClient->>SeederAPI: POST /seeder/generate
SeederAPI-->>HttpClient: seeded data response
HttpClient-->>RunDemo: response
RunDemo-->>Developer: StepOutcome
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- Several places compute
default_seed_start_date()anddefault_seed_end_date()independently (e.g. inlist_scenarios, tests, etc.); consider deriving the start date from a singletodayvalue per call/site to avoid edge-case inconsistencies around day boundaries and to make the anchoring semantics clearer. - The
*_SPAN_DAYSconstants and comments mix the notion of a delta (timedelta(days=SPAN_DAYS)) with the inclusive calendar length (e.g. "92 days inclusive" forDEMO_MINIMAL_SPAN_DAYS = 91); it would be clearer either to rename these to reflect that they represent a backward offset, or to adjust the comments/docstrings to describe the exactend_date - start_datesemantics.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Several places compute `default_seed_start_date()` and `default_seed_end_date()` independently (e.g. in `list_scenarios`, tests, etc.); consider deriving the start date from a single `today` value per call/site to avoid edge-case inconsistencies around day boundaries and to make the anchoring semantics clearer.
- The `*_SPAN_DAYS` constants and comments mix the notion of a delta (`timedelta(days=SPAN_DAYS)`) with the inclusive calendar length (e.g. "92 days inclusive" for `DEMO_MINIMAL_SPAN_DAYS = 91`); it would be clearer either to rename these to reflect that they represent a backward offset, or to adjust the comments/docstrings to describe the exact `end_date - start_date` semantics.
## Individual Comments
### Comment 1
<location path="app/shared/seeder/tests/test_config.py" line_range="50-54" />
<code_context>
+ The default date window is anchored to *today* and runs
+ DEFAULT_SEED_SPAN_DAYS backwards, so seeded data always looks current.
+ """
config = SeederConfig()
assert config.seed == 42
- assert config.start_date == date(2024, 1, 1)
- assert config.end_date == date(2024, 12, 31)
+ assert config.end_date == default_seed_end_date()
+ assert config.start_date == default_seed_start_date()
assert config.dimensions.stores == 10
assert config.dimensions.products == 50
</code_context>
<issue_to_address>
**suggestion (testing):** Avoid time-dependent flakiness by stabilizing the "today" value used in default date range assertions
Because `SeederConfig()` computes its dates using `default_seed_*` and the assertions call `default_seed_end_date()` / `default_seed_start_date()` again, the test can fail if it crosses midnight between these calls. To avoid this flakiness, either:
- Patch `default_seed_end_date` / `default_seed_start_date` to return fixed values in this test, or
- Compute a single `today = default_seed_end_date()` (patched or real) and reuse it for both the config construction and the assertions.
Freezing time (via monkeypatch or a dedicated helper) would make this fully deterministic.
Suggested implementation:
```python
def test_default_values(self, monkeypatch):
"""Test default configuration values.
The default date window is anchored to *today* and runs
DEFAULT_SEED_SPAN_DAYS backwards, so seeded data always looks current.
To avoid time-dependent flakiness, we freeze the default date helpers
to a stable "today" value for this test.
"""
fixed_today = date(2024, 1, 31)
monkeypatch.setattr(
"app.shared.seeder.config.default_seed_end_date",
lambda: fixed_today,
)
monkeypatch.setattr(
"app.shared.seeder.config.default_seed_start_date",
lambda: fixed_today - timedelta(days=DEFAULT_SEED_SPAN_DAYS),
)
config = SeederConfig()
assert config.seed == 42
assert config.end_date == default_seed_end_date()
assert config.start_date == default_seed_start_date()
assert config.dimensions.stores == 10
assert config.dimensions.products == 50
assert config.batch_size == 1000
def test_from_scenario_demo_minimal(self):
"""Test demo_minimal scenario preset.
```
To make this compile and behave correctly you may also need to:
1. Ensure `date` and `timedelta` are imported, e.g.:
- `from datetime import date, timedelta`
2. Ensure `DEFAULT_SEED_SPAN_DAYS` is imported into this test module if it is not already, e.g.:
- `from app.shared.seeder.config import DEFAULT_SEED_SPAN_DAYS`
3. Confirm the monkeypatch target string `"app.shared.seeder.config.default_seed_end_date"` / `"app.shared.seeder.config.default_seed_start_date"` matches the actual module path where `SeederConfig` calls these helpers. If the helpers live in a different module, adjust the target accordingly.
</issue_to_address>
### Comment 2
<location path="app/features/seeder/tests/test_service.py" line_range="30-42" />
<code_context>
+ """
scenarios = service.list_scenarios()
demo = next(s for s in scenarios if s.name == "demo_minimal")
assert demo.stores == 3
assert demo.products == 10
- assert demo.start_date == date(2024, 10, 1)
- assert demo.end_date == date(2024, 12, 31)
+ assert demo.end_date == default_seed_end_date()
+ assert demo.start_date == default_seed_end_date() - timedelta(days=DEMO_MINIMAL_SPAN_DAYS)
def test_scenario_info_structure(self):
</code_context>
<issue_to_address>
**suggestion:** Use a single, stable "today" reference when asserting demo_minimal scenario dates
This test calls `service.list_scenarios()` (which uses `default_seed_end_date()` internally) and then calls `default_seed_end_date()` twice more in assertions. If the test crosses midnight, these calls can disagree and cause a flaky failure. To avoid this, either:
- Patch `default_seed_end_date` (and `DEMO_MINIMAL_SPAN_DAYS` if needed) so both the service and assertions share a fixed value, or
- Capture `today = default_seed_end_date()` once at the top (after any patching) and assert against `today`.
That preserves the “anchored to today” behavior without time-boundary flakes.
```suggestion
def test_demo_minimal_dimensions(self, monkeypatch):
"""Test that demo_minimal is the small preset for the make demo target.
The window is anchored to *today* so the scenario picker reflects what
the seeder will actually produce.
"""
# Use a single, stable "today" reference for both the service and assertions.
today = date(2024, 10, 1)
monkeypatch.setattr(service, "default_seed_end_date", lambda: today)
scenarios = service.list_scenarios()
demo = next(s for s in scenarios if s.name == "demo_minimal")
assert demo.stores == 3
assert demo.products == 10
assert demo.end_date == today
assert demo.start_date == today - timedelta(days=DEMO_MINIMAL_SPAN_DAYS)
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| config = SeederConfig() | ||
|
|
||
| assert config.seed == 42 | ||
| assert config.start_date == date(2024, 1, 1) | ||
| assert config.end_date == date(2024, 12, 31) | ||
| assert config.end_date == default_seed_end_date() | ||
| assert config.start_date == default_seed_start_date() |
There was a problem hiding this comment.
suggestion (testing): Avoid time-dependent flakiness by stabilizing the "today" value used in default date range assertions
Because SeederConfig() computes its dates using default_seed_* and the assertions call default_seed_end_date() / default_seed_start_date() again, the test can fail if it crosses midnight between these calls. To avoid this flakiness, either:
- Patch
default_seed_end_date/default_seed_start_dateto return fixed values in this test, or - Compute a single
today = default_seed_end_date()(patched or real) and reuse it for both the config construction and the assertions.
Freezing time (via monkeypatch or a dedicated helper) would make this fully deterministic.
Suggested implementation:
def test_default_values(self, monkeypatch):
"""Test default configuration values.
The default date window is anchored to *today* and runs
DEFAULT_SEED_SPAN_DAYS backwards, so seeded data always looks current.
To avoid time-dependent flakiness, we freeze the default date helpers
to a stable "today" value for this test.
"""
fixed_today = date(2024, 1, 31)
monkeypatch.setattr(
"app.shared.seeder.config.default_seed_end_date",
lambda: fixed_today,
)
monkeypatch.setattr(
"app.shared.seeder.config.default_seed_start_date",
lambda: fixed_today - timedelta(days=DEFAULT_SEED_SPAN_DAYS),
)
config = SeederConfig()
assert config.seed == 42
assert config.end_date == default_seed_end_date()
assert config.start_date == default_seed_start_date()
assert config.dimensions.stores == 10
assert config.dimensions.products == 50
assert config.batch_size == 1000
def test_from_scenario_demo_minimal(self):
"""Test demo_minimal scenario preset.To make this compile and behave correctly you may also need to:
- Ensure
dateandtimedeltaare imported, e.g.:from datetime import date, timedelta
- Ensure
DEFAULT_SEED_SPAN_DAYSis imported into this test module if it is not already, e.g.:from app.shared.seeder.config import DEFAULT_SEED_SPAN_DAYS
- Confirm the monkeypatch target string
"app.shared.seeder.config.default_seed_end_date"/"app.shared.seeder.config.default_seed_start_date"matches the actual module path whereSeederConfigcalls these helpers. If the helpers live in a different module, adjust the target accordingly.
| def test_demo_minimal_dimensions(self): | ||
| """Test that demo_minimal is the small preset for the make demo target.""" | ||
| """Test that demo_minimal is the small preset for the make demo target. | ||
|
|
||
| The window is anchored to *today* so the scenario picker reflects what | ||
| the seeder will actually produce. | ||
| """ | ||
| scenarios = service.list_scenarios() | ||
|
|
||
| demo = next(s for s in scenarios if s.name == "demo_minimal") | ||
| assert demo.stores == 3 | ||
| assert demo.products == 10 | ||
| assert demo.start_date == date(2024, 10, 1) | ||
| assert demo.end_date == date(2024, 12, 31) | ||
| assert demo.end_date == default_seed_end_date() | ||
| assert demo.start_date == default_seed_end_date() - timedelta(days=DEMO_MINIMAL_SPAN_DAYS) |
There was a problem hiding this comment.
suggestion: Use a single, stable "today" reference when asserting demo_minimal scenario dates
This test calls service.list_scenarios() (which uses default_seed_end_date() internally) and then calls default_seed_end_date() twice more in assertions. If the test crosses midnight, these calls can disagree and cause a flaky failure. To avoid this, either:
- Patch
default_seed_end_date(andDEMO_MINIMAL_SPAN_DAYSif needed) so both the service and assertions share a fixed value, or - Capture
today = default_seed_end_date()once at the top (after any patching) and assert againsttoday.
That preserves the “anchored to today” behavior without time-boundary flakes.
| def test_demo_minimal_dimensions(self): | |
| """Test that demo_minimal is the small preset for the make demo target.""" | |
| """Test that demo_minimal is the small preset for the make demo target. | |
| The window is anchored to *today* so the scenario picker reflects what | |
| the seeder will actually produce. | |
| """ | |
| scenarios = service.list_scenarios() | |
| demo = next(s for s in scenarios if s.name == "demo_minimal") | |
| assert demo.stores == 3 | |
| assert demo.products == 10 | |
| assert demo.start_date == date(2024, 10, 1) | |
| assert demo.end_date == date(2024, 12, 31) | |
| assert demo.end_date == default_seed_end_date() | |
| assert demo.start_date == default_seed_end_date() - timedelta(days=DEMO_MINIMAL_SPAN_DAYS) | |
| def test_demo_minimal_dimensions(self, monkeypatch): | |
| """Test that demo_minimal is the small preset for the make demo target. | |
| The window is anchored to *today* so the scenario picker reflects what | |
| the seeder will actually produce. | |
| """ | |
| # Use a single, stable "today" reference for both the service and assertions. | |
| today = date(2024, 10, 1) | |
| monkeypatch.setattr(service, "default_seed_end_date", lambda: today) | |
| scenarios = service.list_scenarios() | |
| demo = next(s for s in scenarios if s.name == "demo_minimal") | |
| assert demo.stores == 3 | |
| assert demo.products == 10 | |
| assert demo.end_date == today | |
| assert demo.start_date == today - timedelta(days=DEMO_MINIMAL_SPAN_DAYS) |
Summary
The data seeder,
POST /seeder/generate, and the end-to-end demo showcase all hard-coded calendar-year 2024 date ranges, so every generated dataset, forecast, and showcase run looked frozen in 2024. This anchors the seeded window to today — it now ends on the current date and runs backwards.Closes #181.
Changes
app/shared/seeder/config.pydefault_seed_start_date()/default_seed_end_date()+ spansDEFAULT_SEED_SPAN_DAYS=365,DEMO_MINIMAL_SPAN_DAYS=91.SeederConfigdefaults and thedemo_minimalscenario are now today-anchoredapp/features/seeder/schemas.pyGenerateParams.start_date/end_datedefaults →today − 365d … todayapp/features/seeder/service.pylist_scenarios()picker dates today-anchoredapp/features/demo/pipeline.pyDEMO_SEED_START/DEMO_SEED_END→DEMO_SEED_SPAN_DAYS, computed at run time (92-day window ending today)scripts/run_demo.pyscripts/seed_random.py--start-date/--end-datedefaults + JSON-config fallback today-anchoredDeliberate exception
holiday_rushis left 2024-pinned — its Black Friday / Christmas holiday dates and Q4 monthly seasonality model a specific calendar window; re-anchoring would push the holidays outside the window. Pass explicitstart_date/end_dateto shift it. (Commented in code.)Verification
ruff check+ruff format --check— cleanmypy app/+pyright app/(changed modules) — 0 errorspytest -m "not integration"— 1090 passedUses the repo's
datetime.now(UTC).date()convention (ruffDTZ011forbidsdate.today()).Summary by Sourcery
Anchor seeded data and demo windows to the current date instead of hard-coded 2024 ranges.
Bug Fixes:
Enhancements:
Tests: