
feat(forecasting): add baseline model zoo with security validations #30

Merged
w7-mgfcode merged 7 commits into main from dev
Feb 1, 2026
Conversation

@w7-mgfcode
Owner

Summary

  • Baseline model zoo: naive, seasonal_naive, moving_average
  • Unified BaseForecaster interface following scikit-learn conventions
  • ModelBundle persistence with joblib serialization
  • Security: path traversal prevention, constructor validation
  • REST endpoints: POST /forecasting/train and POST /forecasting/predict
  • 85 unit tests covering schemas, models, persistence, the service, and the new security validations

What's New

Models

  • NaiveForecaster - predicts last observed value
  • SeasonalNaiveForecaster - predicts value from same season in previous cycle
  • MovingAverageForecaster - predicts mean of last N observations
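As a rough illustration of the three baselines, here is a minimal pure-Python sketch. It is not the code from this PR: the method signatures, the default values, and the trailing-underscore attribute names are assumptions made for the example; only the class names, `season_length`, and `window_size` come from the description above.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class BaseForecaster(ABC):
    """Hypothetical stand-in for the unified interface: fit() returns self."""

    @abstractmethod
    def fit(self, y: Sequence[float]) -> "BaseForecaster": ...

    @abstractmethod
    def predict(self, horizon: int) -> List[float]: ...


class NaiveForecaster(BaseForecaster):
    def fit(self, y: Sequence[float]) -> "NaiveForecaster":
        if len(y) == 0:
            raise ValueError("y must contain at least one observation")
        self.last_ = float(y[-1])  # last observed value
        return self

    def predict(self, horizon: int) -> List[float]:
        return [self.last_] * horizon


class SeasonalNaiveForecaster(BaseForecaster):
    def __init__(self, season_length: int = 7) -> None:
        if season_length < 1:
            raise ValueError("season_length must be >= 1")
        self.season_length = season_length

    def fit(self, y: Sequence[float]) -> "SeasonalNaiveForecaster":
        if len(y) < self.season_length:
            raise ValueError("need at least one full season of history")
        # Keep the most recent full cycle.
        self.last_season_ = [float(v) for v in y[-self.season_length:]]
        return self

    def predict(self, horizon: int) -> List[float]:
        # Step h repeats the value from the same position in the last cycle.
        return [self.last_season_[h % self.season_length] for h in range(horizon)]


class MovingAverageForecaster(BaseForecaster):
    def __init__(self, window_size: int = 7) -> None:
        if window_size < 1:
            raise ValueError("window_size must be >= 1")
        self.window_size = window_size

    def fit(self, y: Sequence[float]) -> "MovingAverageForecaster":
        if len(y) == 0:
            raise ValueError("y must contain at least one observation")
        window = y[-self.window_size:]  # last N observations (or fewer)
        self.mean_ = sum(window) / len(window)
        return self

    def predict(self, horizon: int) -> List[float]:
        return [self.mean_] * horizon
```

With `y = [1, 2, 3, 4, 5, 6]`, `SeasonalNaiveForecaster(season_length=3).fit(y).predict(4)` returns `[4.0, 5.0, 6.0, 4.0]` in this sketch.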

Security

  • Constructor validation for season_length >= 1 and window_size >= 1
  • Path traversal prevention in predict endpoint
  • .joblib extension validation
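The path checks listed above could look roughly like the following. This is a sketch under assumptions, not the actual `ForecastingService.predict()` code: the function name `resolve_model_path` and its parameters are hypothetical, and only the two rules (`.joblib` extension, containment in the artifacts directory) come from this PR.

```python
from pathlib import Path


def resolve_model_path(model_path: str, artifacts_dir: str) -> Path:
    """Return the resolved path, or raise ValueError if it is not allowed.

    Hypothetical helper: name and signature are assumptions for this sketch.
    """
    base = Path(artifacts_dir).resolve()
    # Resolving collapses ".." segments and symlinks before the check.
    candidate = (base / model_path).resolve()

    if candidate.suffix != ".joblib":
        raise ValueError("model path must have a .joblib extension")

    try:
        # relative_to() raises ValueError when candidate is outside base.
        candidate.relative_to(base)
    except ValueError:
        raise ValueError("model path escapes the artifacts directory") from None

    return candidate
```

Resolving before comparing matters here: a plain string prefix check would accept `artifacts/../secrets/x.joblib`, whereas the resolved path no longer sits under the artifacts directory.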

Documentation

  • Updated ARCHITECTURE.md with correct config parameter names
  • Fixed README example to use season_length

Test plan

  • 85 unit tests pass
  • Type checking passes (mypy, pyright)
  • Linting passes (ruff)

🤖 Generated with Claude Code

w7-mgfcode and others added 7 commits February 1, 2026 00:32
* docs: update INITIAL-4 and INITIAL-5 with additional references

- Add scikit-learn, mlforecast, and sktime documentation links
- Add considerations for imputation logic, agent tooling, and computation overhead
- Add model persistence documentation references

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(featuresets): implement time-safe feature engineering layer

Add complete feature engineering module with:
- Pydantic schemas for feature configuration (lag, rolling, calendar, exogenous, imputation)
- FeatureEngineeringService with CRITICAL leakage prevention:
  - Lag features use positive shift() only
  - Rolling features use shift(1) BEFORE rolling to exclude current observation
  - Group-aware operations prevent cross-series leakage
  - Cutoff date filtering before any computation
- FastAPI endpoints: POST /featuresets/compute and /featuresets/preview
- Comprehensive test suite (55 tests) including leakage prevention tests
- Example demo script
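The leakage rules listed in this commit can be sketched with pandas as below. The column names (`store_id`, `date`, `sales`), the function name, and the window of 2 are assumptions for illustration; this is not the `FeatureEngineeringService` code.

```python
import pandas as pd


def add_time_safe_features(df: pd.DataFrame, cutoff: pd.Timestamp) -> pd.DataFrame:
    # 1. Apply the cutoff before any computation.
    out = df[df["date"] <= cutoff].sort_values(["store_id", "date"]).copy()

    # 2. Group-aware operations so one series never sees another's history.
    grouped = out.groupby("store_id")["sales"]

    # 3. Lag feature: positive shift only (looks backwards in time).
    out["sales_lag_1"] = grouped.shift(1)

    # 4. Rolling feature: shift(1) BEFORE rolling, so the current
    #    observation is excluded from its own window.
    out["sales_roll_mean_2"] = grouped.transform(
        lambda s: s.shift(1).rolling(2).mean()
    )
    return out


dates = pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"])
df = pd.DataFrame(
    {
        "store_id": [1] * 4 + [2] * 4,
        "date": list(dates) * 2,
        "sales": [10.0, 20.0, 30.0, 40.0, 100.0, 200.0, 300.0, 400.0],
    }
)
features = add_time_safe_features(df, pd.Timestamp("2024-01-04"))
```

For store 1, the rolling mean on 2024-01-03 is 15.0 (the mean of 10 and 20), not 25.0, because the value of 30 observed that day is shifted out of the window.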

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: update INITIAL-5.md

* docs: update documentation for Phase 3 Feature Engineering

- README.md: Add featuresets module to project structure and API endpoints
- docs/ARCHITECTURE.md: Add Feature Engineering section (section 6)
- docs/PHASE-index.md: Mark Phase 3 as completed with summary
- docs/PHASE/3-FEATURE_ENGINEERING.md: Create detailed phase documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(featuresets): address code review feedback and prevent data leakage

Routes:
- Validate store_id/product_id presence (no silent defaults to 0)
- Convert ValueError for unsupported date types to HTTP 400

Service:
- Add expanding_mean imputation strategy (time-safe alternative)
- Add warnings when bfill/mean strategies are used (leakage risk)
- Fix price_pct_change_7d to use shift(1) before pct_change
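One plausible reading of the `expanding_mean` strategy is "fill each gap with the mean of the values observed strictly before it". The sketch below assumes that reading; the function name is hypothetical and the PR's actual implementation may differ.

```python
import pandas as pd


def impute_expanding_mean(s: pd.Series) -> pd.Series:
    # Mean of all non-missing values up to and including each row,
    # then shifted by one so each row only sees strictly earlier values.
    prior_mean = s.expanding(min_periods=1).mean().shift(1)
    return s.fillna(prior_mean)


sales = pd.Series([1.0, 3.0, None, 5.0, None])
time_safe = impute_expanding_mean(sales)   # gap at index 2 filled with 2.0
leaky = sales.fillna(sales.mean())         # gap at index 2 filled with 3.0
```

The global-mean fill gives 3.0 at index 2 because it already includes the later value 5.0, which is the leakage risk the new warnings point at. A gap with no earlier observations stays missing under the expanding variant.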

Schemas:
- Add expanding_mean to ImputationConfig Literal type
- Document time-safety of each imputation strategy
- Fix PreviewFeaturesRequest docstring: GET → POST

Documentation:
- Convert bare URLs to markdown links in INITIAL-4.md, INITIAL-5.md
- Fix PRP-4 to show POST for preview endpoint

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* style: format schemas.py

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Service:
- Fix min_periods falsy check to explicit None check (preserves 0)
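The difference between the two checks is easy to miss, so here is a small sketch. The function names and the fallback to `window` are assumptions; only the falsy-versus-`None` distinction comes from the commit message.

```python
from typing import Optional


def resolve_min_periods_buggy(min_periods: Optional[int], window: int) -> int:
    # `or` treats 0 as "not set", so an explicit 0 is silently replaced.
    return min_periods or window


def resolve_min_periods(min_periods: Optional[int], window: int) -> int:
    # Only a missing value falls back to the default; 0 is preserved.
    return window if min_periods is None else min_periods
```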

Tests:
- Add expanding_mean to test_valid_strategies, expect 6 strategies

Documentation:
- Update PR reference from #24 to #25
- Fix all GET /featuresets/preview to POST in PRP-4

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs: update DAILY-FLOW.md for Phase 4 Forecasting

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: add PRP-5 for Forecasting module

Comprehensive PRP including:
- Model zoo (naive, seasonal naive, moving average)
- Unified BaseForecaster interface (fit/predict/serialize)
- ModelBundle persistence with joblib
- 15 ordered implementation tasks
- 40+ test cases specified
- Integration with FeatureEngineeringService

Confidence: 8/10

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
…#28)

* feat(forecasting): implement baseline model zoo and unified interface

Add forecasting module (PRP-5) with:
- BaseForecaster ABC with scikit-learn-style interface (fit/predict)
- NaiveForecaster, SeasonalNaiveForecaster, MovingAverageForecaster
- ModelBundle persistence with joblib serialization
- POST /forecasting/train and /forecasting/predict endpoints
- ForecastingService for orchestration
- 81 unit tests covering schemas, models, persistence, and service
- Example scripts demonstrating each baseline model
- LightGBM placeholder (feature-flagged, not yet implemented)
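For readers unfamiliar with joblib persistence, a bundle round trip might look like the sketch below. It represents the bundle as a plain dict for brevity, which is an assumption; the real `ModelBundle` class, its fields, and the helper names here are not taken from the PR.

```python
from pathlib import Path
from typing import Any, Dict

import joblib


def save_bundle(bundle: Dict[str, Any], path: str) -> Path:
    target = Path(path)
    if target.suffix != ".joblib":
        raise ValueError("bundle path must end in .joblib")
    target.parent.mkdir(parents=True, exist_ok=True)
    joblib.dump(bundle, str(target))
    return target


def load_bundle(path: str) -> Dict[str, Any]:
    # joblib.load unpickles the file, so it should only be called on
    # artifacts the service wrote itself.
    return joblib.load(path)
```

Because loading a joblib file can execute arbitrary pickled code, restricting which paths reach `joblib.load` is a sensible companion to this kind of persistence.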

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: Update documentation for forecasting module (PRP-5)

- Add forecasting API endpoints to README.md with examples
- Update ARCHITECTURE.md with forecasting implementation details
- Add scikit-learn and joblib to dependencies list
- Add forecasting config variables to .env.example
- Mark forecasting module as IMPLEMENTED in architecture docs

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: address CI lint and type check failures

- Add type: ignore for intentional type mismatch in frozen config test
- Add S101 ignore for examples/ to allow assert statements

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Security improvements:
- Add constructor validation for season_length >= 1 in SeasonalNaiveForecaster
- Add constructor validation for window_size >= 1 in MovingAverageForecaster
- Add path traversal prevention in ForecastingService.predict()
- Validate .joblib extension and artifacts directory containment
- Log rejection reasons for security auditing

Test improvements:
- Fix get_settings patching to wrap ForecastingService construction
- Add tests for constructor validation
- Add tests for path traversal and extension validation

Documentation fixes:
- Fix config parameter names in ARCHITECTURE.md (season_length, window_size)
- Fix README example to use season_length instead of seasonal_period
- Fix markdown issues in PRP-5 (code fences, ATX headings)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@w7-mgfcode w7-mgfcode merged commit 3da7783 into main Feb 1, 2026
11 checks passed