diff --git a/PRPs/INITIAL/INITIAL-MLZOO-A-foundation-feature-frames.md b/PRPs/INITIAL/INITIAL-MLZOO-A-foundation-feature-frames.md
new file mode 100644
index 00000000..23309f57
--- /dev/null
+++ b/PRPs/INITIAL/INITIAL-MLZOO-A-foundation-feature-frames.md
@@ -0,0 +1,124 @@
+# INITIAL-MLZOO-A-foundation-feature-frames.md - Feature-Aware Forecasting Foundation
+
+## FEATURE:
+
+Create the foundation for feature-aware forecasting in ForecastLabAI.
+
+This is the first MLZOO PRP input and should become PRP-29. It must not implement LightGBM, XGBoost, Prophet-like models, frontend UI, explainability UI, hyperparameter search, or portfolio/global orchestration. Its job is to make the existing forecasting layer capable of supporting future advanced ML models without breaking current baseline forecasters.
+
+Goals:
+
+- Define a feature-aware forecasting contract that supports `fit(y, X=None)` and `predict(horizon, X=None)`.
+- Preserve existing target-only baseline models: `naive`, `seasonal_naive`, and `moving_average`.
+- Define historical training feature-frame requirements.
+- Define future prediction feature-frame requirements.
+- Add or document leakage-safe feature-frame generation rules.
+- Add load-bearing leakage tests that prove future rows do not use future target values.
+- Make future advanced models possible without adding their dependencies yet.
+
+Expected user value:
+
+- ForecastLabAI gains a safe foundation for serious ML forecasting.
+- Future LightGBM/XGBoost/Prophet-like work can build on a tested frame contract.
+- Scenario simulation and explainability can later depend on a consistent feature-frame interface.
+
+Recommended user story:
+
+As a forecasting engineer,
+I want a leakage-safe feature-frame contract for training and prediction,
+So that advanced ML models can be added without breaking baseline models or leaking future data.
+
+Out of scope:
+
+- LightGBM implementation.
+- XGBoost implementation.
+- Prophet-like implementation.
+- New database migrations unless absolutely required.
+- Frontend pages.
+- Agent tools.
+- Hyperparameter search.
+
+## EXAMPLES:
+
+Read these before PRP creation:
+
+- `docs/optional-features/05-advanced-ml-model-zoo.md`
+  - Full feature vision and risks.
+
+- `PRPs/INITIAL/INITIAL-5.md`
+  - Existing forecasting model brief.
+
+- `docs/PHASE/4-FORECASTING.md`
+  - Current forecasting layer documentation.
+
+- `app/features/forecasting/models.py`
+  - Existing `BaseForecaster` and baseline model implementations.
+
+- `app/features/forecasting/schemas.py`
+  - Existing model config schemas and discriminated union pattern.
+
+- `app/features/forecasting/service.py`
+  - Existing train/predict orchestration.
+
+- `app/features/forecasting/persistence.py`
+  - Existing `ModelBundle` persistence.
+
+- `app/features/featuresets/service.py`
+  - Existing time-safe feature computation.
+
+- `app/features/featuresets/schemas.py`
+  - Feature configuration schemas.
+
+- `app/features/featuresets/tests/test_leakage.py`
+  - Existing leakage tests to mirror and extend.
+
+- `app/features/backtesting/service.py`
+  - Current backtesting integration points.
+
+Potential example artifacts:
+
+- `examples/models/feature_frame_contract.md`
+  - Describes historical and future frame shape, required columns, safe/unsafe feature classes.
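Editorial sketch (not part of the diff): the frame contract that `examples/models/feature_frame_contract.md` is meant to describe could look roughly like the following. Every name here (`FrameContract`, `validate_training_frame`, `validate_future_frame`, the `date`/`y` column names) is hypothetical, and rows are plain dicts only to keep the sketch dependency-free; the real implementation would presumably work on pandas frames.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FrameContract:
    """Hypothetical contract object; every name here is illustrative."""

    required_columns: tuple  # feature columns the model was trained on
    date_column: str = "date"
    target_column: str = "y"


def _missing(row: dict, columns: tuple) -> list:
    # A column counts as missing when it is absent or explicitly None.
    return [c for c in columns if row.get(c) is None]


def validate_training_frame(rows: list, contract: FrameContract, cutoff: date) -> None:
    """Reject historical rows that are incomplete or dated after the cutoff."""
    needed = (contract.date_column, contract.target_column) + tuple(contract.required_columns)
    for i, row in enumerate(rows):
        missing = _missing(row, needed)
        if missing:
            raise ValueError(f"training row {i} is missing {missing}")
        if row[contract.date_column] > cutoff:
            raise ValueError(f"training row {i} is after cutoff {cutoff}")


def validate_future_frame(rows: list, contract: FrameContract, cutoff: date, horizon: int) -> None:
    """Reject future frames that carry the target or lack required features."""
    if len(rows) != horizon:
        raise ValueError(f"expected {horizon} future rows, got {len(rows)}")
    needed = (contract.date_column,) + tuple(contract.required_columns)
    for i, row in enumerate(rows):
        if contract.target_column in row:
            raise ValueError(f"future row {i} must not carry the target column")
        missing = _missing(row, needed)
        if missing:
            # Fail loudly instead of filling a misleading default such as zero.
            raise ValueError(f"future row {i} is missing {missing}")
        if row[contract.date_column] <= cutoff:
            raise ValueError(f"future row {i} is not after cutoff {cutoff}")
```

The two validators mirror the brief's gotchas: no training rows after the cutoff, no target values in future frames, and no silent defaults for missing future features.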
+
+## DOCUMENTATION:
+
+- scikit-learn estimator interface conventions: https://scikit-learn.org/stable/developers/develop.html
+- scikit-learn Pipeline composition: https://scikit-learn.org/stable/modules/compose.html
+- scikit-learn TimeSeriesSplit: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.TimeSeriesSplit.html
+- Pandas time series documentation: https://pandas.pydata.org/docs/user_guide/timeseries.html
+- Pydantic documentation: https://docs.pydantic.dev/latest/
+- Joblib persistence documentation: https://joblib.readthedocs.io/en/stable/persistence.html
+
+## OTHER CONSIDERATIONS:
+
+This PRP is primarily about contracts and leakage safety.
+
+Required decisions:
+
+- How to represent feature-aware models without forcing every baseline model to require `X`.
+- Whether to introduce a `FeatureAwareForecaster` protocol/base class or extend the existing base interface only.
+- Where historical training frames are built.
+- Where future prediction frames are built.
+- Which feature classes are safe for future frames:
+  - Safe: calendar features known in advance.
+  - Conditionally safe: lag/rolling features generated from historical tail and prior predictions.
+  - Unsafe unless explicitly supplied: future price, promotion, inventory, markdown, exogenous signals.
+- How to reject missing future features instead of silently filling misleading defaults.
+
+Validation expectations:
+
+- Existing baseline forecasting tests still pass.
+- New feature-frame contract tests exist.
+- New leakage tests prove future target values are not used.
+- Backtesting remains time-safe.
+- `uv run pytest -q -m "not integration"` should pass.
+- `uv run ruff check app tests` should pass for touched Python code.
+
+Important gotchas:
+
+- Do not break current target-only baseline forecasters.
+- Do not add LightGBM or other heavy ML dependencies in this PRP.
+- Do not silently convert unknown future exogenous values into zeros.
+- Do not let training frames include rows after the cutoff date.
+- Do not let future prediction frames read true future targets.
+
diff --git a/PRPs/INITIAL/INITIAL-MLZOO-B-lightgbm-first-model.md b/PRPs/INITIAL/INITIAL-MLZOO-B-lightgbm-first-model.md
new file mode 100644
index 00000000..0131295d
--- /dev/null
+++ b/PRPs/INITIAL/INITIAL-MLZOO-B-lightgbm-first-model.md
@@ -0,0 +1,101 @@
+# INITIAL-MLZOO-B-lightgbm-first-model.md - LightGBM First Advanced Model
+
+## FEATURE:
+
+Add the first advanced feature-aware model to ForecastLabAI after the MLZOO foundation is merged.
+
+Preferred model: LightGBM.
+
+Fallback model: sklearn `HistGradientBoostingRegressor` or another sklearn-native gradient boosting model if LightGBM creates unacceptable dependency or CI risk.
+
+This PRP must depend on `INITIAL-MLZOO-A-foundation-feature-frames.md` being implemented first.
+
+Goals:
+
+- Add one advanced model config schema.
+- Add one feature-aware model implementation.
+- Support deterministic training.
+- Integrate with forecasting train/predict.
+- Integrate with backtesting.
+- Persist model metadata needed for reproducibility.
+- Preserve all existing baseline model behavior.
+
+Out of scope:
+
+- XGBoost.
+- Prophet-like models.
+- Hyperparameter search.
+- Portfolio/global models.
+- Frontend model administration.
+- Explainability UI.
+
+## EXAMPLES:
+
+Read these before PRP creation:
+
+- `PRPs/INITIAL/INITIAL-MLZOO-A-foundation-feature-frames.md`
+  - Required prerequisite.
+
+- `docs/optional-features/05-advanced-ml-model-zoo.md`
+  - Full advanced model vision.
+
+- `app/features/forecasting/models.py`
+  - Model factory and baseline model patterns.
+
+- `app/features/forecasting/schemas.py`
+  - Model config schema patterns.
+
+- `app/features/forecasting/service.py`
+  - Training/prediction service integration.
+
+- `app/features/forecasting/persistence.py`
+  - Model bundle save/load behavior.
+
+- `app/features/backtesting/service.py`
+  - Backtesting orchestration.
+
+- `app/features/registry/service.py`
+  - Registry run metadata patterns.
+
+Potential example artifacts:
+
+- `examples/models/advanced_lightgbm.py`
+  - Minimal training/prediction example.
+
+## DOCUMENTATION:
+
+- LightGBM documentation: https://lightgbm.readthedocs.io/
+- LightGBM Python API: https://lightgbm.readthedocs.io/en/stable/Python-API.html
+- LightGBM parameters: https://lightgbm.readthedocs.io/en/stable/Parameters.html
+- scikit-learn HistGradientBoostingRegressor: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.HistGradientBoostingRegressor.html
+- scikit-learn model persistence: https://scikit-learn.org/stable/model_persistence.html
+- Joblib persistence documentation: https://joblib.readthedocs.io/en/stable/persistence.html
+- Pydantic documentation: https://docs.pydantic.dev/latest/
+
+## OTHER CONSIDERATIONS:
+
+Dependency strategy is the main open risk.
+
+Required decisions:
+
+- Whether to add LightGBM as a hard dependency, optional dependency group, or defer to sklearn fallback.
+- Exact advanced model config fields.
+- How model dependency versions are captured in registry/runtime metadata.
+- How prediction rejects missing future feature frames.
+
+Recommended defaults:
+
+- Use fixed `random_state` from settings.
+- Start with single store/product training.
+- Keep the first config conservative.
+- Avoid hyperparameter search.
+- Persist feature column order.
+
+Validation expectations:
+
+- Config schema tests.
+- Deterministic training tests.
+- Save/load persistence tests.
+- Backtesting integration test comparing baseline and advanced model path.
+- Tests proving baselines still work unchanged.
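Editorial sketch (not part of the diff): the "persist feature column order" and "capture dependency versions" items above could be handled with metadata along these lines. The function names and the metadata keys are assumptions for illustration; the existing `ModelBundle` may store this differently. Only `importlib.metadata.version` and `PackageNotFoundError` are real standard-library APIs here.

```python
import json
import sys
from importlib import metadata
from typing import Optional


def capture_dependency_versions(packages: list) -> dict:
    """Record installed versions; None marks a package that is not installed."""
    versions: dict = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            found: Optional[str] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        versions[name] = found
    return versions


def build_bundle_metadata(feature_columns: list, random_state: int, packages: list) -> dict:
    """Hypothetical metadata payload stored next to the fitted model."""
    return {
        "feature_columns": list(feature_columns),  # order matters at predict time
        "random_state": random_state,
        "dependency_versions": capture_dependency_versions(packages),
    }


def align_to_training_order(row: dict, feature_columns: list) -> list:
    """Reorder one feature row to the persisted column order, rejecting gaps."""
    missing = [c for c in feature_columns if c not in row]
    if missing:
        raise ValueError(f"missing required features: {missing}")
    return [row[c] for c in feature_columns]


def to_json(bundle_metadata: dict) -> str:
    # sort_keys keeps the serialized form stable across runs; list order
    # (the feature column order) is preserved by JSON.
    return json.dumps(bundle_metadata, sort_keys=True)
```

Tree-based libraries generally consume positional feature matrices, so reordering inputs against the persisted column list at predict time is one cheap guard against train/serve skew.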
+
diff --git a/PRPs/INITIAL/INITIAL-MLZOO-C-xgboost-prophet-extensions.md b/PRPs/INITIAL/INITIAL-MLZOO-C-xgboost-prophet-extensions.md
new file mode 100644
index 00000000..ee41a887
--- /dev/null
+++ b/PRPs/INITIAL/INITIAL-MLZOO-C-xgboost-prophet-extensions.md
@@ -0,0 +1,59 @@
+# INITIAL-MLZOO-C-xgboost-prophet-extensions.md - XGBoost and Prophet-like Extensions
+
+## FEATURE:
+
+Extend the Advanced ML Model Zoo after the feature-frame foundation and first advanced model path are stable.
+
+This INITIAL is for later work, not PRP-29.
+
+Goals:
+
+- Add XGBoost as a second tree-based feature-aware model.
+- Add a Prophet-like additive model path or choose the real Prophet dependency if justified.
+- Support holiday/regressor-style features where appropriate.
+- Add model-family-specific validation and metadata.
+
+Out of scope:
+
+- Foundation feature-frame work.
+- First advanced model architecture.
+- Frontend/explainability polish unless explicitly needed.
+- Hyperparameter search unless separately scoped.
+
+## EXAMPLES:
+
+Read these before PRP creation:
+
+- `PRPs/INITIAL/INITIAL-MLZOO-A-foundation-feature-frames.md`
+  - Foundation dependency.
+
+- `PRPs/INITIAL/INITIAL-MLZOO-B-lightgbm-first-model.md`
+  - First advanced model pattern to follow.
+
+- `app/features/forecasting/models.py`
+  - Model factory and advanced model pattern.
+
+- `app/features/forecasting/schemas.py`
+  - Config schema pattern.
+
+- `app/features/featuresets/service.py`
+  - Regressor and calendar feature source.
+
+## DOCUMENTATION:
+
+- XGBoost documentation: https://xgboost.readthedocs.io/en/stable/
+- XGBoost Python package documentation: https://xgboost.readthedocs.io/en/stable/python/
+- XGBoost parameters: https://xgboost.readthedocs.io/en/stable/parameter.html
+- Prophet documentation: https://facebook.github.io/prophet/docs/quick_start.html
+- Prophet seasonality, holidays, and regressors: https://facebook.github.io/prophet/docs/seasonality,_holiday_effects,_and_regressors.html
+- scikit-learn model persistence: https://scikit-learn.org/stable/model_persistence.html
+- Pandas time series documentation: https://pandas.pydata.org/docs/user_guide/timeseries.html
+
+## OTHER CONSIDERATIONS:
+
+- XGBoost should mirror the first advanced model path where possible.
+- Prophet-like work should be carefully evaluated because dependency weight and API shape differ from sklearn-style regressors.
+- Real Prophet support should be chosen only if install/runtime constraints are acceptable.
+- A lightweight additive sklearn model may be safer than the real Prophet dependency.
+- Holiday/regressor support must use known-in-advance or explicitly supplied future values.
+
diff --git a/PRPs/INITIAL/INITIAL-MLZOO-D-frontend-registry-explainability.md b/PRPs/INITIAL/INITIAL-MLZOO-D-frontend-registry-explainability.md
new file mode 100644
index 00000000..e65591b7
--- /dev/null
+++ b/PRPs/INITIAL/INITIAL-MLZOO-D-frontend-registry-explainability.md
@@ -0,0 +1,69 @@
+# INITIAL-MLZOO-D-frontend-registry-explainability.md - Frontend, Registry, and Explainability Polish
+
+## FEATURE:
+
+Expose Advanced ML Model Zoo capabilities in the product after backend model contracts and at least one advanced model are stable.
+
+This INITIAL is for later work, not PRP-29.
+
+Goals:
+
+- Add model selection UI where useful.
+- Surface advanced model metadata in run detail and comparison pages.
+- Show feature config, feature columns, dependency versions, and model family metadata.
+- Add basic feature importance or explanation hooks where available.
+- Update docs/admin surfaces so operators understand advanced model constraints.
+
+Out of scope:
+
+- Core feature-frame foundation.
+- First advanced model backend implementation.
+- XGBoost/Prophet backend implementation.
+- Full SHAP explainability unless separately scoped.
+
+## EXAMPLES:
+
+Read these before PRP creation:
+
+- `PRPs/INITIAL/INITIAL-MLZOO-A-foundation-feature-frames.md`
+  - Foundation dependency.
+
+- `PRPs/INITIAL/INITIAL-MLZOO-B-lightgbm-first-model.md`
+  - First advanced model dependency.
+
+- `frontend/src/pages/explorer/runs.tsx`
+  - Existing run table.
+
+- `frontend/src/pages/explorer/run-detail.tsx`
+  - Existing run detail surface.
+
+- `frontend/src/pages/explorer/run-compare.tsx`
+  - Existing comparison surface.
+
+- `frontend/src/pages/visualize/forecast.tsx`
+  - Forecast visualization page.
+
+- `frontend/src/pages/visualize/backtest.tsx`
+  - Backtest visualization page.
+
+- `app/features/registry/schemas.py`
+  - Backend response contracts for run metadata.
+
+## DOCUMENTATION:
+
+- React Router documentation: https://reactrouter.com/home
+- TanStack Query documentation: https://tanstack.com/query/latest/docs/framework/react/overview
+- TanStack Table documentation: https://tanstack.com/table/latest/docs/overview
+- shadcn/ui documentation: https://ui.shadcn.com/docs
+- Recharts documentation: https://recharts.org/en-US/
+- SHAP documentation: https://shap.readthedocs.io/en/stable/
+- scikit-learn permutation importance: https://scikit-learn.org/stable/modules/permutation_importance.html
+
+## OTHER CONSIDERATIONS:
+
+- Do not create frontend controls before backend contracts are stable.
+- Avoid adding a large admin panel if run detail and comparison pages are enough.
+- Keep advanced model metadata readable and compact.
+- Feature importance must be clearly labeled as model-derived, not causal truth.
+- Browser QA is required for all frontend additions.
+
diff --git a/PRPs/INITIAL/INITIAL-MLZOO-index.md b/PRPs/INITIAL/INITIAL-MLZOO-index.md
new file mode 100644
index 00000000..2a07a99b
--- /dev/null
+++ b/PRPs/INITIAL/INITIAL-MLZOO-index.md
@@ -0,0 +1,76 @@
+# INITIAL-MLZOO-index.md - Advanced ML Model Zoo Roadmap
+
+## FEATURE:
+
+Split the Advanced ML Model Zoo into multiple INITIAL briefs so each future PRP can remain small, reviewable, and implementation-safe.
+
+This index is the roadmap for the MLZOO sequence. Do not create one PRP that implements the full model zoo. The correct flow is:
+
+1. Use this index to understand the full architecture.
+2. Use `INITIAL-MLZOO-A-foundation-feature-frames.md` as the first PRP input.
+3. Implement and merge the foundation before creating PRPs for later parts.
+4. Promote B, C, and D into PRPs only after their prerequisites are stable.
+
+Recommended PRP sequence:
+
+| Order | INITIAL | Intended PRP | Purpose |
+| --- | --- | --- | --- |
+| 1 | `INITIAL-MLZOO-A-foundation-feature-frames.md` | PRP-29 | Feature-aware forecasting foundation and leakage-safe frame contracts |
+| 2 | `INITIAL-MLZOO-B-lightgbm-first-model.md` | Future PRP | First advanced model path with LightGBM or sklearn fallback |
+| 3 | `INITIAL-MLZOO-C-xgboost-prophet-extensions.md` | Future PRP | XGBoost and Prophet-like extensions |
+| 4 | `INITIAL-MLZOO-D-frontend-registry-explainability.md` | Future PRP | UI, registry surfacing, and explanation polish |
+
+Dependency graph:
+
+```text
+A. Foundation feature frames
+  -> B. LightGBM first model
+    -> C. XGBoost / Prophet-like extensions
+      -> D. Frontend / registry / explainability
+```
+
+The full vision is documented in `docs/optional-features/05-advanced-ml-model-zoo.md`.
+
+## EXAMPLES:
+
+Read these before creating any MLZOO PRP:
+
+- `docs/optional-features/05-advanced-ml-model-zoo.md`
+  - Full optional-feature concept and documentation links.
+
+- `PRPs/INITIAL/INITIAL-5.md`
+  - Earlier forecasting model brief, including baseline model zoo and global ML hooks.
+
+- `docs/PHASE/4-FORECASTING.md`
+  - Completed forecasting phase, model interface, configs, persistence, service, and API behavior.
+
+- `app/features/forecasting/models.py`
+  - Current baseline model interface.
+
+- `app/features/featuresets/service.py`
+  - Existing time-safe feature engineering.
+
+- `app/features/featuresets/tests/test_leakage.py`
+  - Existing leakage-safety testing pattern.
+
+## DOCUMENTATION:
+
+- LightGBM documentation: https://lightgbm.readthedocs.io/
+- XGBoost documentation: https://xgboost.readthedocs.io/en/stable/
+- Prophet documentation: https://facebook.github.io/prophet/docs/quick_start.html
+- scikit-learn model persistence: https://scikit-learn.org/stable/model_persistence.html
+- scikit-learn TimeSeriesSplit: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.TimeSeriesSplit.html
+- scikit-learn Pipeline composition: https://scikit-learn.org/stable/modules/compose.html
+- Pandas time series documentation: https://pandas.pydata.org/docs/user_guide/timeseries.html
+- Joblib persistence documentation: https://joblib.readthedocs.io/en/stable/persistence.html
+- Pydantic documentation: https://docs.pydantic.dev/latest/
+- FastAPI documentation: https://fastapi.tiangolo.com/
+
+## OTHER CONSIDERATIONS:
+
+- The first PRP should be generated from `INITIAL-MLZOO-A-foundation-feature-frames.md`.
+- Do not implement LightGBM before the feature-frame contracts and leakage tests are stable.
+- Do not implement XGBoost or Prophet-like models before the first advanced model path proves the architecture.
+- Do not add frontend/explainability scope before backend metadata and persistence contracts are stable.
+- Keep each PRP to one branch and one reviewable unit.
+
diff --git a/docs/optional-features/10-baseforecaster-feature-contract.md b/docs/optional-features/10-baseforecaster-feature-contract.md
new file mode 100644
index 00000000..4d6f89a1
--- /dev/null
+++ b/docs/optional-features/10-baseforecaster-feature-contract.md
@@ -0,0 +1,116 @@
+# BaseForecaster Feature Contract
+
+## Summary
+
+Formalize the existing `BaseForecaster` interface as the canonical model contract for both target-only baseline models and feature-aware ML models. Add a `requires_features` class attribute or property so services can branch on model capability without `isinstance` checks or a new `FeatureAwareForecaster` subclass.
+
+This is a small but important foundation item for the Advanced ML Model Zoo.
+
+## Why It Fits ForecastLabAI
+
+ForecastLabAI already has a forecasting model interface where models expose:
+
+- `fit(y, X=None)`
+- `predict(horizon, X=None)`
+
+Baseline models can ignore `X`; regression and future advanced models need `X`. Introducing a second base class too early would create inheritance churn without solving the harder platform problems: feature-frame contracts, future feature availability, leakage safety, and train/serve skew.
+
+## User Value
+
+- Keeps current baseline behavior stable.
+- Makes feature-aware model support explicit.
+- Prepares LightGBM/XGBoost/Prophet-like work without API churn.
+- Avoids brittle `isinstance` checks in services.
+- Reduces persistence risk for existing joblib model bundles.
+
+## Proposed Design
+
+Keep `BaseForecaster` as the single canonical model interface.
+
+Add a class-level capability flag:
+
+```python
+requires_features: ClassVar[bool] = False
+```
+
+Baseline models:
+
+```python
+class NaiveForecaster(BaseForecaster):
+    requires_features = False
+```
+
+Feature-aware models:
+
+```python
+class RegressionForecaster(BaseForecaster):
+    requires_features = True
+```
+
+Service code branches on the model contract:
+
+```python
+if model.requires_features:
+    # require and validate X / X_future
+else:
+    # y-only baseline path
+```
+
+## Backend Design
+
+Likely files:
+
+- `app/features/forecasting/models.py`
+- `app/features/forecasting/service.py`
+- `app/features/forecasting/tests/test_models.py`
+- `examples/models/model_interface.md`
+- `docs/PHASE/4-FORECASTING.md`
+
+The change should document that:
+
+- `fit(y, X=None)` is the universal train contract.
+- `predict(horizon, X=None)` is the universal predict contract.
+- `requires_features = False` models may ignore `X`.
+- `requires_features = True` models must receive valid feature frames.
+- A `FeatureAwareForecaster` subclass should be revisited only after multiple advanced model families need shared behavior beyond the flag.
+
+## MVP Scope
+
+- Add `requires_features` to the model interface.
+- Set it explicitly on existing baseline and regression models.
+- Update service branching where it currently relies on model type checks.
+- Add tests proving baseline models ignore `X` and regression requires it.
+- Update model interface documentation.
+
+## Full Version
+
+- Add richer capability flags if needed:
+  - `supports_prediction_intervals`
+  - `supports_feature_importance`
+  - `supports_exogenous_future`
+  - `supports_recursive_prediction`
+- Introduce a `FeatureAwareForecaster` subclass only when shared advanced-model behavior justifies the abstraction.
+
+## Risks
+
+- Adding a flag without tests can become another implicit contract.
+- Service code must not silently pass `None` into feature-aware models.
+- Documentation must be precise so future LightGBM work does not reinterpret the contract.
+
+## Validation Plan
+
+- Unit tests for each existing model's `requires_features` value.
+- Unit tests proving baseline models still fit/predict with `X=None`.
+- Unit tests proving feature-aware models reject missing required features.
+- Regression tests for existing forecasting service behavior.
+- `uv run pytest -q -m "not integration"`
+- `uv run ruff check app tests`
+
+## Documentation
+
+- scikit-learn estimator development guide: https://scikit-learn.org/stable/developers/develop.html
+- scikit-learn Pipeline composition: https://scikit-learn.org/stable/modules/compose.html
+- scikit-learn model persistence: https://scikit-learn.org/stable/model_persistence.html
+- Joblib persistence documentation: https://joblib.readthedocs.io/en/stable/persistence.html
+- Pydantic documentation: https://docs.pydantic.dev/latest/
+
diff --git a/docs/optional-features/11-feature-aware-predict-serving.md b/docs/optional-features/11-feature-aware-predict-serving.md
new file mode 100644
index 00000000..1c745bd0
--- /dev/null
+++ b/docs/optional-features/11-feature-aware-predict-serving.md
@@ -0,0 +1,103 @@
+# Feature-Aware Forecasting Predict Serving
+
+## Summary
+
+Extend `POST /forecasting/predict` so feature-aware models can produce forecasts outside `/scenarios/simulate` when a leakage-safe future feature frame can be constructed or supplied.
+
+Today feature-aware regression models are rejected by `/forecasting/predict` because the endpoint cannot supply future `X`. That is correct for the current foundation, but it becomes a missing serving capability once MLZOO-B introduces the first advanced model.
+
+## Why It Fits ForecastLabAI
+
+ForecastLabAI is evolving from y-only baseline forecasting toward ML forecasting with `y + X`. Scenario simulation already provides a context where future assumptions can produce `X_future`. The standard forecasting endpoint needs a safe, explicit serving path for feature-aware models too, but only after the shared feature-frame contract is stable.
+
+## User Value
+
+- Advanced models become usable through the normal forecast API.
+- Forecast visualization can load predictions from feature-aware jobs.
+- LightGBM and future XGBoost models can serve without requiring the scenario UI.
+- The product can distinguish baseline forecasts, scenario forecasts, and assumptions-free ML forecasts.
+
+## Proposed Design
+
+Add feature-aware predict support in a later PRP, not in the foundation-only MLZOO-A work.
+
+Supported future-frame modes:
+
+1. **Calendar-only / history-tail mode**
+   - Use known future calendar features.
+   - Use historical tail for lag and rolling seeds.
+   - Generate recursive target-derived features only from history and prior predictions.
+   - Reject unsafe feature columns that require explicit future assumptions.
+
+2. **Supplied future-frame mode**
+   - Client or service supplies validated `X_future`.
+   - API verifies required columns, order, dtypes, horizon length, and no target leakage.
+
+3. **Scenario-backed mode**
+   - Reuse saved scenario assumptions to construct `X_future`.
+   - Clearly mark the result as scenario-conditioned.
+
+## Backend Design
+
+Likely files:
+
+- `app/features/forecasting/routes.py`
+- `app/features/forecasting/service.py`
+- `app/features/forecasting/schemas.py`
+- `app/shared/feature_frames/`
+- `app/features/jobs/service.py`
+- `frontend/src/pages/visualize/forecast.tsx`
+
+Possible request additions:
+
+- `feature_mode`: `baseline`, `history_calendar`, `supplied_frame`, `scenario`
+- `future_frame`: optional structured future features
+- `scenario_id`: optional saved scenario reference
+- `history_tail_days`: optional bounded history window
+
+The endpoint must reject feature-aware predictions when required future features are unavailable.
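Editorial sketch (not part of the diff): the request additions listed above might validate along these lines. A standard-library dataclass stands in for the project's Pydantic schema so the sketch has no dependencies; `PredictRequest`, `validate_predict_request`, the `horizon` field, and the `str` type for `scenario_id` are assumptions, not the repository's actual schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FeatureMode(str, Enum):
    BASELINE = "baseline"
    HISTORY_CALENDAR = "history_calendar"
    SUPPLIED_FRAME = "supplied_frame"
    SCENARIO = "scenario"


@dataclass(frozen=True)
class PredictRequest:
    """Hypothetical request shape mirroring the fields proposed above."""

    horizon: int
    feature_mode: FeatureMode = FeatureMode.BASELINE
    future_frame: Optional[list] = None  # one dict of features per horizon step
    scenario_id: Optional[str] = None  # identifier type is an assumption
    history_tail_days: Optional[int] = None


def validate_predict_request(request: PredictRequest, requires_features: bool) -> None:
    """Raise ValueError when the request cannot yield a usable future frame."""
    mode = request.feature_mode
    if requires_features and mode is FeatureMode.BASELINE:
        raise ValueError("feature-aware model needs a future feature source")
    if mode is FeatureMode.SUPPLIED_FRAME:
        if request.future_frame is None:
            raise ValueError("supplied_frame mode requires future_frame")
        if len(request.future_frame) != request.horizon:
            raise ValueError("future_frame length must equal horizon")
    if mode is FeatureMode.SCENARIO and request.scenario_id is None:
        raise ValueError("scenario mode requires scenario_id")
    if request.history_tail_days is not None and request.history_tail_days <= 0:
        raise ValueError("history_tail_days must be positive")
```

Keeping one explicit `feature_mode` discriminator, rather than inferring the mode from which optional fields are present, is one way to limit the "API shape too broad" risk the document raises.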
+
+## MVP Scope
+
+- Keep current rejection behavior until the first advanced model lands.
+- Add a dedicated PRP later for `history_calendar` mode.
+- Support only safe known-ahead features and recursive target-derived features.
+- Return metadata that states how `X_future` was built.
+
+## Full Version
+
+- Supplied future-frame mode.
+- Scenario-backed mode.
+- Prediction interval support where available.
+- Feature availability diagnostics.
+- UI warnings when forecasts are assumptions-free vs scenario-conditioned.
+
+## Risks
+
+- Assumptions-free future frames can be misleading if users expect promotions, inventory, or exogenous events to be included.
+- Recursive lag generation can leak future targets if implemented incorrectly.
+- Train/serve skew can silently degrade advanced model quality.
+- API shape can become too broad if scenario, supplied-frame, and history-calendar modes are mixed without clear validation.
+
+## Validation Plan
+
+- Unit tests for future-frame validation.
+- Leakage tests proving `X_future` never reads true future targets.
+- API tests:
+  - baseline model predict still works
+  - feature-aware predict rejects missing future features
+  - feature-aware predict accepts valid history-calendar frame
+  - unsafe future feature requirements produce clear errors
+- Job result metadata tests.
+- Browser QA for forecast visualization using a feature-aware prediction job.
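Editorial sketch (not part of the diff): the "recursive target-derived features only from history and prior predictions" rule, together with a leakage check of the kind the validation plan asks for, could be expressed as follows. `recursive_forecast`, `forecast_from_cutoff`, and the `predict_one` callable are hypothetical names; a real model's single-step predict would replace the toy callable.

```python
from typing import Callable, Sequence


def recursive_forecast(
    history: Sequence[float],
    horizon: int,
    lags: Sequence[int],
    predict_one: Callable[[list], float],
) -> list:
    """Forecast step by step, seeding lag features from history and prior predictions only."""
    if not lags or len(history) < max(lags):
        raise ValueError("history is too short for the requested lags")
    series = list(history)  # working copy; true future values are never appended
    predictions = []
    for _ in range(horizon):
        features = [series[-lag] for lag in lags]  # lag k = value k steps back
        y_hat = predict_one(features)
        predictions.append(y_hat)
        series.append(y_hat)  # later steps see the prediction, not the actual
    return predictions


def forecast_from_cutoff(
    full_series: Sequence[float],
    cutoff: int,
    horizon: int,
    lags: Sequence[int],
    predict_one: Callable[[list], float],
) -> list:
    """Slice at the cutoff first so values at index >= cutoff are never visible."""
    return recursive_forecast(full_series[:cutoff], horizon, lags, predict_one)
```

A leakage test in this style perturbs the values after the cutoff and asserts the forecast is unchanged; if any lag feature were read from the true future, the two forecasts would differ.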
+
+## Documentation
+
+- FastAPI documentation: https://fastapi.tiangolo.com/
+- Pydantic documentation: https://docs.pydantic.dev/latest/
+- scikit-learn TimeSeriesSplit: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.TimeSeriesSplit.html
+- Pandas time series documentation: https://pandas.pydata.org/docs/user_guide/timeseries.html
+- LightGBM Python API: https://lightgbm.readthedocs.io/en/stable/Python-API.html
+- XGBoost Python package documentation: https://xgboost.readthedocs.io/en/stable/python/
+- Recharts documentation: https://recharts.org/en-US/
+
diff --git a/docs/optional-features/README.md b/docs/optional-features/README.md
index 6a8a309e..33c7f6b9 100644
--- a/docs/optional-features/README.md
+++ b/docs/optional-features/README.md
@@ -15,6 +15,8 @@ This folder contains implementation-oriented product and architecture notes for
 | Agent Experiment Workbench | [07-agent-experiment-workbench.md](07-agent-experiment-workbench.md) | Strategic | High |
 | Demand Anomaly and Data Quality Monitor | [08-demand-anomaly-data-quality-monitor.md](08-demand-anomaly-data-quality-monitor.md) | Medium-term | Medium |
 | Model Champion/Challenger Governance | [09-model-champion-challenger-governance.md](09-model-champion-challenger-governance.md) | Medium-term | High |
+| BaseForecaster Feature Contract | [10-baseforecaster-feature-contract.md](10-baseforecaster-feature-contract.md) | MLZOO foundation | Low |
+| Feature-Aware Forecasting Predict Serving | [11-feature-aware-predict-serving.md](11-feature-aware-predict-serving.md) | MLZOO follow-up | Medium |
 
 ## Promotion Criteria
 
diff --git a/uv.lock b/uv.lock
index 2abaae1e..6fb0e783 100644
--- a/uv.lock
+++ b/uv.lock
@@ -821,7 +821,7 @@ wheels = [
 
 [[package]]
 name = "forecastlabai"
-version = "0.2.8"
+version = "0.2.15"
 source = { editable = "." }
 dependencies = [
     { name = "alembic" },