feat: RAG Knowledge Base, Serving Layer, and Model Registry #50
…ents (#36)
* docs: expand INITIAL-7 with lifecycle, lineage, and artifact integrity details
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* feat(registry): implement model registry for run tracking and deployments
Add model registry feature (PRP-7) with:
- ORM models: ModelRun with JSONB columns (model_config, metrics, runtime_info), DeploymentAlias for mutable deployment pointers
- Storage: LocalFSProvider with SHA-256 integrity verification and path traversal prevention, abstract interface for future S3/GCS support
- Service: RegistryService with state machine validation, duplicate detection, config hashing, and run comparison
- API endpoints: CRUD for runs and aliases, artifact verification, run comparison with config/metrics diffs
- Database: Alembic migration with GIN indexes for JSONB containment queries
- Tests: 103 unit tests (schemas, storage, service) + 24 integration tests
- Example: registry_demo.py demonstrating full workflow
Run lifecycle: PENDING → RUNNING → SUCCESS/FAILED → ARCHIVED
Aliases can only point to SUCCESS runs for deployment safety.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs: update documentation for model registry implementation
- README.md: Add registry to project structure, API endpoints section, and example reference
- docs/ARCHITECTURE.md: Update section 7.6 with full implementation details, add registry endpoints to section 8, mark Phase 1 complete
- docs/PHASE-index.md: Mark phases 4-6 as completed, add detailed completion entries for Forecasting, Backtesting, and Registry
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs: add PHASE documentation for forecasting, backtesting, and registry
Create missing phase documentation files to complete the project's implementation records:
- 4-FORECASTING.md: Model zoo with BaseForecaster interface, train/predict endpoints, and joblib persistence
- 5-BACKTESTING.md: Time-series CV with expanding/sliding strategies, metrics calculation, and baseline comparisons
- 6-MODEL_REGISTRY.md: Run tracking with state machine, deployment aliases, and SHA-256 artifact integrity verification
Update PHASE-index.md to link to the new documentation files.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix(registry): resolve type checking issues with Pydantic model_config alias
- Add pydantic.mypy plugin to pyproject.toml for proper Pydantic type checking
- Use model_config_data instead of model_config alias in tests to avoid collision with Pydantic's reserved model_config attribute
- Update _model_to_response to use model_validate() for proper alias handling
- Change docker-compose postgres port to 5433 to avoid conflicts
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix: resolve CI failures for registry PR
- Import registry models in alembic/env.py for schema validation
- Fix import order and remove extraneous f-strings in registry_demo.py
- Add type: ignore comments for frozen model tests with pydantic.mypy plugin
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix: prevent db_session fixtures from dropping all tables
The data_platform and root conftest.py db_session fixtures were dropping all tables after each test, causing subsequent integration tests to fail when they couldn't find migrated tables.
Changes:
- Remove Base.metadata.drop_all from db_session fixtures
- Tests now rely on migrations for table creation
- Each test just rolls back its own changes
Also fixes ruff format issue in examples/registry_demo.py.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix: add proper test data cleanup to db_session fixtures
Update data_platform and ingest test fixtures to clean up test data explicitly instead of dropping all tables or just rolling back.
- data_platform: delete test stores, products, calendar entries
- ingest: delete test stores, products, sales, calendar entries
This ensures test isolation while preserving migrated tables.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix: use separate session for test cleanup to avoid transaction issues
When tests cause integrity errors, the session enters a failed state. Use a fresh session for cleanup to avoid PendingRollbackError.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix: use contextlib.suppress instead of try-except-pass
Replace try-except-pass patterns with contextlib.suppress to satisfy ruff S110 linting rule.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
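The run lifecycle named in the registry commit (PENDING → RUNNING → SUCCESS/FAILED → ARCHIVED, with aliases restricted to SUCCESS runs) could be sketched roughly as follows. The names `RunStatus`, `ALLOWED_TRANSITIONS`, `validate_transition`, and `can_alias` are hypothetical, and the exact transition table used by `RegistryService` may differ from the one assumed here.

```python
from enum import Enum


class RunStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCESS = "success"
    FAILED = "failed"
    ARCHIVED = "archived"


# Assumed transition table derived from the lifecycle line in the commit
# message; the real service may allow additional edges.
ALLOWED_TRANSITIONS: dict[RunStatus, set[RunStatus]] = {
    RunStatus.PENDING: {RunStatus.RUNNING},
    RunStatus.RUNNING: {RunStatus.SUCCESS, RunStatus.FAILED},
    RunStatus.SUCCESS: {RunStatus.ARCHIVED},
    RunStatus.FAILED: {RunStatus.ARCHIVED},
    RunStatus.ARCHIVED: set(),
}


def validate_transition(current: RunStatus, new: RunStatus) -> None:
    """Raise ValueError if the lifecycle does not allow current -> new."""
    if new not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"Invalid transition: {current.value} -> {new.value}")


def can_alias(status: RunStatus) -> bool:
    """Deployment aliases may only point at successful runs."""
    return status is RunStatus.SUCCESS
```

Keeping the table as data rather than branching logic makes it easy to check a transition in one place before any database write.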
- Add date range filter to SalesDaily cleanup in ingest tests
- Enforce artifact_hash presence before verification in registry routes
- Compute SHA256 from saved file instead of source in storage
- Fix override_get_db to mirror production transaction semantics
- Filter DeploymentAlias cleanup to only test runs
- Update database port to 5433 in config and .env.example
- Add language identifiers to fenced code blocks (MD040)
- Fix table formatting for markdownlint MD060
- Update PR reference in PHASE/6-MODEL_REGISTRY.md
- Convert bare URLs to markdown links in INITIAL-7.md
- Wrap `__init__.py` in backticks in PRP-7
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
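The fix "Compute SHA256 from saved file instead of source" can be illustrated with a minimal sketch: hash the bytes that actually landed on disk, so the recorded digest reflects the stored artifact. The helper names `sha256_of_file`, `save_artifact`, and `verify_artifact` are assumptions for illustration and are not the `LocalFSProvider` API.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Stream the file in blocks so large artifacts need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def save_artifact(source: Path, dest: Path) -> str:
    """Copy the artifact, then hash the saved copy (not the source)."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, dest)
    return sha256_of_file(dest)


def verify_artifact(path: Path, expected_hash: str) -> bool:
    """Recompute the digest and compare it with the recorded one."""
    return sha256_of_file(path) == expected_hash
```

Hashing the destination means a truncated or corrupted copy is caught at the first verification rather than being masked by a digest taken from the pristine source.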
* chore: release v0.2.0 (#37) * feat(registry): implement model registry for run tracking and deployments (#36) * docs: expand INITIAL-7 with lifecycle, lineage, and artifact integrity details Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat(registry): implement model registry for run tracking and deployments Add model registry feature (PRP-7) with: - ORM models: ModelRun with JSONB columns (model_config, metrics, runtime_info), DeploymentAlias for mutable deployment pointers - Storage: LocalFSProvider with SHA-256 integrity verification and path traversal prevention, abstract interface for future S3/GCS support - Service: RegistryService with state machine validation, duplicate detection, config hashing, and run comparison - API endpoints: CRUD for runs and aliases, artifact verification, run comparison with config/metrics diffs - Database: Alembic migration with GIN indexes for JSONB containment queries - Tests: 103 unit tests (schemas, storage, service) + 24 integration tests - Example: registry_demo.py demonstrating full workflow Run lifecycle: PENDING → RUNNING → SUCCESS/FAILED → ARCHIVED Aliases can only point to SUCCESS runs for deployment safety. 
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * docs: update documentation for model registry implementation - README.md: Add registry to project structure, API endpoints section, and example reference - docs/ARCHITECTURE.md: Update section 7.6 with full implementation details, add registry endpoints to section 8, mark Phase 1 complete - docs/PHASE-index.md: Mark phases 4-6 as completed, add detailed completion entries for Forecasting, Backtesting, and Registry Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * docs: add PHASE documentation for forecasting, backtesting, and registry Create missing phase documentation files to complete the project's implementation records: - 4-FORECASTING.md: Model zoo with BaseForecaster interface, train/predict endpoints, and joblib persistence - 5-BACKTESTING.md: Time-series CV with expanding/sliding strategies, metrics calculation, and baseline comparisons - 6-MODEL_REGISTRY.md: Run tracking with state machine, deployment aliases, and SHA-256 artifact integrity verification Update PHASE-index.md to link to the new documentation files. 
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix(registry): resolve type checking issues with Pydantic model_config alias - Add pydantic.mypy plugin to pyproject.toml for proper Pydantic type checking - Use model_config_data instead of model_config alias in tests to avoid collision with Pydantic's reserved model_config attribute - Update _model_to_response to use model_validate() for proper alias handling - Change docker-compose postgres port to 5433 to avoid conflicts Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: resolve CI failures for registry PR - Import registry models in alembic/env.py for schema validation - Fix import order and remove extraneous f-strings in registry_demo.py - Add type: ignore comments for frozen model tests with pydantic.mypy plugin Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: prevent db_session fixtures from dropping all tables The data_platform and root conftest.py db_session fixtures were dropping all tables after each test, causing subsequent integration tests to fail when they couldn't find migrated tables. Changes: - Remove Base.metadata.drop_all from db_session fixtures - Tests now rely on migrations for table creation - Each test just rolls back its own changes Also fixes ruff format issue in examples/registry_demo.py. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: add proper test data cleanup to db_session fixtures Update data_platform and ingest test fixtures to clean up test data explicitly instead of dropping all tables or just rolling back. - data_platform: delete test stores, products, calendar entries - ingest: delete test stores, products, sales, calendar entries This ensures test isolation while preserving migrated tables. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: use separate session for test cleanup to avoid transaction issues When tests cause integrity errors, the session enters a failed state. 
Use a fresh session for cleanup to avoid PendingRollbackError. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: use contextlib.suppress instead of try-except-pass Replace try-except-pass patterns with contextlib.suppress to satisfy ruff S110 linting rule. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Gabe@w7dev <gabor@w7-7.net> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> * fix: code improvements and documentation fixes - Add date range filter to SalesDaily cleanup in ingest tests - Enforce artifact_hash presence before verification in registry routes - Compute SHA256 from saved file instead of source in storage - Fix override_get_db to mirror production transaction semantics - Filter DeploymentAlias cleanup to only test runs - Update database port to 5433 in config and .env.example - Add language identifiers to fenced code blocks (MD040) - Fix table formatting for markdownlint MD060 - Update PR reference in PHASE/6-MODEL_REGISTRY.md - Convert bare URLs to markdown links in INITIAL-7.md - Wrap __init__.py in backticks in PRP-7 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Gabe@w7dev <gabor@w7-7.net> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> * chore(main): release 0.2.0 (#38) Release-As: 0.2.0 Co-authored-by: Gabe@w7dev <gabor@w7-7.net> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> * chore(main): release 0.2.0 (#39) * chore(main): release 0.2.0 * chore: trigger CI --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Gabe@w7dev <gabor@w7-7.net> --------- Co-authored-by: Gabe@w7dev <gabor@w7-7.net> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
sync: align dev with main after v0.2.0 release
* docs(initial-8): expand serving layer requirements
Add specifications for job-driven orchestration, dimension discovery endpoints, standardized API protocols (filtering/pagination), and agent-first API design patterns for LLM tool-calling optimization.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs(prp-8): add serving layer implementation spec
Comprehensive PRP for FastAPI serving layer including:
- Dimensions module for store/product discovery endpoints
- Analytics module for KPI/drilldown queries
- Jobs module for async-ready task orchestration
- RFC 7807 problem details for semantic error responses
- OpenAPI export optimization for LLM tool-calling
26 tasks with validation gates and 8.5/10 confidence score.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* feat(serving-layer): implement PRP-8 agent-first API design
Add RFC 7807 Problem Details for semantic error responses:
- ProblemDetail schema with type URIs and error codes
- application/problem+json content type
- Validation exception handler with field-level errors
Add dimensions module for store/product discovery:
- GET /dimensions/stores with pagination, filtering, search
- GET /dimensions/products with pagination, filtering, search
- LLM-optimized Field descriptions for tool-calling
Add analytics module for KPI aggregations:
- GET /analytics/kpis with date range and dimension filters
- GET /analytics/drilldowns for store/product/category/region/date
- Revenue share and ranking calculations
Add jobs module for async-ready task orchestration:
- POST /jobs for train/predict/backtest operations
- Job model with JSONB params/results
- Status transitions: pending → running → completed/failed
Integration:
- New settings: analytics_max_rows, jobs_retention_days
- Register routers in main.py
- Alembic migration for jobs table
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs: update documentation for PRP-8 serving layer
Update README.md:
- Add dimensions, analytics, jobs modules to project structure
- Document new API endpoints with examples
- Add RFC 7807 error response documentation
Update docs/ARCHITECTURE.md:
- Mark serving layer section as implemented
- Add configuration settings for new modules
- Update roadmap with Phase-2 completion
Update docs/PHASE-index.md:
- Add Phase 7 (Serving Layer) as completed
- Update phase overview table
- Add version history entry
Create docs/PHASE/7-SERVING_LAYER.md:
- Comprehensive phase documentation
- API endpoint specifications
- Database schema and migration details
- Usage examples and test coverage
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* style: fix ruff formatting
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
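For readers unfamiliar with RFC 7807, the error body mentioned in the serving-layer commits might look like the sketch below. The standard members (`type`, `title`, `status`, `detail`) come from the RFC; the helper name `problem_detail`, the example type URI, and the extension members `code` and `errors` are assumptions and not the PR's `ProblemDetail` schema.

```python
from typing import Any

# Media type defined by RFC 7807 for problem responses.
PROBLEM_JSON = "application/problem+json"


def problem_detail(
    status: int,
    title: str,
    detail: str,
    type_uri: str = "about:blank",  # RFC 7807 default when no specific type applies
    **extensions: Any,  # extension members, e.g. an error code or field errors
) -> dict[str, Any]:
    """Build an RFC 7807 problem document as a plain dict."""
    body: dict[str, Any] = {
        "type": type_uri,
        "title": title,
        "status": status,
        "detail": detail,
    }
    body.update(extensions)
    return body


# Hypothetical validation error with field-level details.
example = problem_detail(
    422,
    "Validation Error",
    "end_date must not precede start_date",
    type_uri="/errors/validation",
    code="VALIDATION_ERROR",
    errors=[{"field": "end_date", "message": "must be >= start_date"}],
)
```

A response built this way would be sent with the `application/problem+json` content type so that clients, including LLM tool callers, can branch on `type` or `code` instead of parsing free text.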
…ling
- Add validate_date_range helper to analytics routes for reusable date validation
- Apply date range validation to both get_kpis and get_drilldowns endpoints
- Fix total_revenue_all calculation to use full dataset before limiting
- Add run_id to train job result for downstream predict jobs
- Fix predict job to resolve run_id to model metadata from bundle
- Update test fixtures to use 32-char hex IDs per schema requirements
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
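The commit names a `validate_date_range` helper but does not show it. A minimal sketch of what such a helper could do is given here; the signature, the optional `max_days` limit, and the use of `ValueError` are assumptions, and the real helper may instead raise an HTTP or problem-details error.

```python
from datetime import date
from typing import Optional


def validate_date_range(
    start_date: date,
    end_date: date,
    max_days: Optional[int] = None,  # hypothetical cap on the span
) -> None:
    """Reject inverted ranges and, optionally, ranges longer than max_days."""
    if start_date > end_date:
        raise ValueError("start_date must be on or before end_date")
    if max_days is not None and (end_date - start_date).days > max_days:
        raise ValueError(f"date range must not exceed {max_days} days")
```

Calling one helper from both `get_kpis` and `get_drilldowns` keeps the two endpoints from drifting apart in how they treat bad ranges.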
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
sync: merge main (v0.2.1) into dev
…IAL-9/10/11) (#47)
* docs: restructure INITIAL-9 into modular three-phase roadmap
Decompose monolithic INITIAL-9 into three specialized technical phases:
- INITIAL-9: RAG Knowledge Base ("The Memory")
  - pgvector + OpenAI embeddings
  - Markdown/OpenAPI-aware chunking
  - Semantic retrieval endpoints
- INITIAL-10: Agentic Layer ("The Brain")
  - PydanticAI agents (Experiment Orchestrator, RAG Assistant)
  - Tool orchestration with structured outputs
  - Human-in-the-loop approval workflow
- INITIAL-11: ForecastLab Dashboard ("The Face")
  - React 19 + Vite + shadcn/ui
  - TanStack Table/Query for data management
  - Recharts for time series visualization
  - Agent chat interface with streaming
Update PHASE-index.md and DAILY-FLOW.md to align with new structure.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs(prp): add PRP-9 RAG Knowledge Base implementation plan
Comprehensive PRP for INITIAL-9 RAG Knowledge Base feature:
- pgvector + SQLAlchemy 2.0 integration patterns
- Markdown-aware and OpenAPI-aware chunking
- Async OpenAI embeddings with batch processing
- HNSW index for cosine similarity search
- 15 ordered implementation tasks
- 5-level validation loop (syntax → types → unit → integration → smoke)
- Full ORM models and Pydantic schemas
- Known gotchas and anti-patterns documented
Confidence score: 8.5/10
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs(prp): add PRP-10 Agentic Layer implementation plan
Comprehensive PRP for INITIAL-10 Agentic Layer feature:
- PydanticAI agent framework integration
- Experiment Orchestrator Agent (backtest → compare → deploy)
- RAG Assistant Agent (query → retrieve → answer with citations)
- Human-in-the-loop approval workflow for sensitive actions
- WebSocket streaming for real-time token delivery
- Session persistence with JSONB message history
- 17 ordered implementation tasks
- Tool definitions for registry, backtesting, forecasting, RAG
- Full Pydantic schemas and ORM models
Confidence score: 7.5/10
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Comprehensive PRP for INITIAL-11 (The Face) with:
- 24 implementation tasks across 6 phases
- React 19 + Vite + shadcn/ui + TanStack Table/Query
- TypeScript types matching all backend API schemas
- Reusable DataTable with server-side pagination
- TimeSeriesChart component with Recharts
- WebSocket hook for agent chat streaming
- Complete documentation links and gotchas
Confidence score: 7.5/10 (chat depends on INITIAL-10)
Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* feat(rag): implement PRP-9 RAG Knowledge Base with pgvector
Add RAG (Retrieval-Augmented Generation) knowledge base feature for
semantic document indexing and retrieval using PostgreSQL pgvector.
Key components:
- Document indexing with markdown-aware and OpenAPI-aware chunking
- Semantic retrieval using cosine similarity with configurable thresholds
- Idempotent re-indexing via SHA-256 content hash comparison
- OpenAI text-embedding-3-small for embeddings (1536 dimensions)
- HNSW index for fast approximate nearest neighbor search
API endpoints:
- POST /rag/index - Index documents with automatic chunking
- POST /rag/retrieve - Semantic search with relevance scoring
- GET /rag/sources - List indexed sources with statistics
- DELETE /rag/sources/{source_id} - Remove source and chunks
Includes:
- ORM models: DocumentSource, DocumentChunk with Vector column
- Pydantic v2 schemas with strict validation
- 68 unit tests + 14 integration tests
- Migration for pgvector extension and RAG tables
- Examples and environment configuration
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
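As a rough illustration of what "markdown-aware chunking" can mean, the sketch below splits a document at ATX headings and keeps each heading with its section body. It is a simplified stand-in written for this note, not the chunker in `app/features/rag/chunkers.py`; the function name `chunk_markdown` and the output shape are assumptions, and it ignores size limits, overlap, and `#` lines inside fenced code.

```python
import re

# Matches ATX headings such as "# Title" or "### Subsection".
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_markdown(text: str) -> list[dict[str, str]]:
    """Split markdown into one chunk per heading-delimited section."""
    chunks: list[dict[str, str]] = []
    heading = ""  # text before the first heading gets an empty heading
    lines: list[str] = []

    for line in text.splitlines():
        match = HEADING_RE.match(line)
        if match:
            # Close the previous section before starting a new one.
            body = "\n".join(lines).strip()
            if body:
                chunks.append({"heading": heading, "content": body})
            heading = match.group(2).strip()
            lines = []
        else:
            lines.append(line)

    body = "\n".join(lines).strip()
    if body:
        chunks.append({"heading": heading, "content": body})
    return chunks
```

Splitting at headings keeps each chunk topically coherent, which tends to help retrieval more than cutting at a fixed character count.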
* feat(rag): add Ollama embedding provider with OpenAI-compatible API
- Add EmbeddingProvider abstract base class with provider pattern
- Refactor existing OpenAI code to OpenAIEmbeddingProvider
- Add OllamaEmbeddingProvider using /v1/embeddings endpoint
  - Supports configurable dimensions parameter
  - Uses OpenAI-compatible response format
- Add config settings: rag_embedding_provider, ollama_base_url, ollama_embedding_model
- Add migration for dynamic embedding dimension support
- Update tests for both providers (25 tests)
Enables local/LAN embedding generation without OpenAI API dependency.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
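The provider pattern described in this commit could take roughly the following shape. `EmbeddingProvider`, `embed_texts`, and `embed_query` are names that appear in the PR; the `dimension` property, the method signatures, and the `FakeEmbeddingProvider` test double are assumptions made for this sketch, and no real OpenAI or Ollama call is shown.

```python
import asyncio
from abc import ABC, abstractmethod


class EmbeddingProvider(ABC):
    """Common interface that concrete providers would implement."""

    @property
    @abstractmethod
    def dimension(self) -> int:
        """Length of each embedding vector."""

    @abstractmethod
    async def embed_texts(self, texts: list[str]) -> list[list[float]]:
        """Embed a batch of texts, one vector per input."""

    async def embed_query(self, query: str) -> list[float]:
        """Embed a single query by delegating to the batch method."""
        return (await self.embed_texts([query]))[0]


class FakeEmbeddingProvider(EmbeddingProvider):
    """Deterministic stand-in for tests; not a real provider."""

    def __init__(self, dimension: int = 4) -> None:
        self._dimension = dimension

    @property
    def dimension(self) -> int:
        return self._dimension

    async def embed_texts(self, texts: list[str]) -> list[list[float]]:
        # Encode the text length so results are predictable in assertions.
        return [[float(len(t))] * self._dimension for t in texts]


if __name__ == "__main__":
    print(asyncio.run(FakeEmbeddingProvider(3).embed_query("abcd")))
```

With an interface like this, a setting such as `rag_embedding_provider` only has to select which subclass is constructed; the service code calling `embed_texts` stays unchanged.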
* docs: add Ollama embedding provider documentation
- Update .env.example with Ollama configuration options
- Add RAG Knowledge Base section to README with:
  - Embedding provider options (OpenAI/Ollama)
  - Example index and retrieve requests
  - Configuration examples for both providers
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs: add Phase 8 RAG Knowledge Base documentation
- Create docs/PHASE/8-RAG_KNOWLEDGE_BASE.md with full phase details
- Update docs/PHASE-index.md:
  - Mark Phase 8 as Completed in overview table
  - Add Phase 8 summary to Completed Phases section
  - Add entry to Version History
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix(ci): add RAG models import to alembic env and format tests
- Add rag models import to alembic/env.py for schema validation
- Format test_embeddings.py to pass ruff format check
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Sorry @w7-mgfcode, your pull request is larger than the review limit of 150000 diff characters
📝 Walkthrough
Adds a new RAG (Retrieval-Augmented Generation) subsystem: database models and migrations with pgvector/HNSW, chunkers for Markdown/OpenAPI, pluggable embedding providers (OpenAI/Ollama), RAG service and API routes, tests, docs, examples, and configuration + FastAPI router integration.
Sequence Diagram(s)
sequenceDiagram
participant Client
participant RAGRouter as RAG Router
participant RAGService as RAG Service
participant EmbeddingProv as Embedding Provider
participant Database
Client->>RAGRouter: POST /rag/index (source_type, source_path, content)
RAGRouter->>RAGService: index_document(request)
RAGService->>RAGService: compute_content_hash()
RAGService->>Database: find_existing_source(source_type, source_path)
alt content unchanged
Database-->>RAGService: existing_source (matching hash)
RAGService-->>RAGRouter: IndexResponse(status="unchanged")
else new/updated
RAGService->>RAGService: chunk_content()
RAGService->>EmbeddingProv: embed_texts(chunk_texts)
EmbeddingProv-->>RAGService: embeddings[]
RAGService->>Database: upsert_source_and_chunks(embeddings)
Database-->>RAGService: upsert_result
RAGService-->>RAGRouter: IndexResponse(status="indexed"/"updated")
end
RAGRouter-->>Client: 201 IndexResponse
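The branch in the indexing diagram above (unchanged versus new or updated) reduces to a hash comparison. A minimal sketch follows; `compute_content_hash` is named in the diagram, while `index_status` and its arguments are hypothetical. The status strings are the ones shown in the diagram.

```python
import hashlib
from typing import Optional


def compute_content_hash(content: str) -> str:
    """SHA-256 hex digest of the document content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def index_status(existing_hash: Optional[str], content: str) -> str:
    """Decide what an index request should do for this content.

    existing_hash is the stored hash for the same (source_type, source_path),
    or None when the source has never been indexed.
    """
    new_hash = compute_content_hash(content)
    if existing_hash is None:
        return "indexed"    # first time: chunk, embed, insert
    if existing_hash == new_hash:
        return "unchanged"  # skip chunking and embedding entirely
    return "updated"        # content changed: re-chunk and re-embed
```

The "unchanged" path is what makes re-indexing idempotent: repeated requests with identical content never reach the embedding provider.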
sequenceDiagram
participant Client
participant RAGRouter as RAG Router
participant RAGService as RAG Service
participant EmbeddingProv as Embedding Provider
participant Database
Client->>RAGRouter: POST /rag/retrieve (query, top_k, similarity_threshold)
RAGRouter->>RAGService: retrieve(request)
RAGService->>EmbeddingProv: embed_query(query)
EmbeddingProv-->>RAGService: query_embedding
RAGService->>Database: search_similar_chunks(query_embedding, top_k)
Database-->>RAGService: chunks_with_scores
RAGService->>RAGService: filter_by_threshold(), format_results()
RAGService-->>RAGRouter: RetrieveResponse(results, timings)
RAGRouter-->>Client: 200 RetrieveResponse
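The retrieval flow above (embed the query, fetch the `top_k` nearest chunks, then filter by `similarity_threshold`) can be mimicked in pure Python for intuition. In the PR the search runs inside PostgreSQL via pgvector with an HNSW index; this sketch replaces that with an in-memory scan, and the function names and data shapes are assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(
    query_embedding: list[float],
    chunks: list[tuple[str, list[float]]],  # (chunk_id, embedding) pairs
    top_k: int = 5,
    similarity_threshold: float = 0.7,
) -> list[tuple[str, float]]:
    """Rank chunks by similarity, keep top_k, then drop those below threshold."""
    scored = [(cid, cosine_similarity(query_embedding, emb)) for cid, emb in chunks]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(cid, score) for cid, score in scored[:top_k] if score >= similarity_threshold]
```

Because the threshold is applied after the `top_k` cut, a query can legitimately return fewer than `top_k` results, or none at all.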
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 10
🤖 Fix all issues with AI agents
In `@alembic/versions/c5d9e1f2g345_rag_dynamic_embedding_dimension.py`:
- Around line 26-51: The migration's upgrade() reads RAG_EMBEDDING_DIMENSION
from the environment making the migration non-deterministic; fix by making the
applied dimension explicit and reversible: either hardcode the intended
dimension in this migration (replace the env read in upgrade() and create a new
migration for any future dimension change) and keep downgrade() symmetric, or if
you must parameterize, have upgrade() write the applied dimension into a
persistent metadata table (create or update a small table via op.create_table /
op.execute) and modify downgrade() to read that recorded value to restore the
previous vector size; update references in this file to the functions upgrade(),
downgrade(), and the op.execute/op.create_index/op.drop_index calls so the
dimension used is deterministic and recorded.
In `@app/features/rag/chunkers.py`:
- Around line 406-420: The code uses PyYAML in the YAML fallback for parsing
OpenAPI specs (yaml.safe_load in the try/except that falls back to
MarkdownChunker().chunk), but PyYAML is not declared as a dependency; add
"pyyaml" to the project's dependencies in pyproject.toml (or equivalent
dependency manifest) so importing yaml at runtime is guaranteed to succeed and
YAML parsing is supported for the code path that calls yaml.safe_load and the
MarkdownChunker fallback.
In `@app/features/rag/embeddings.py`:
- Around line 156-217: In embed_texts, capture the original token count before
you truncate each text and use that value in the logger.warning call so the
pre-truncation token count is reported; specifically, call
self.count_tokens(text) once into a variable (e.g., original_token_count) before
calling truncate_to_tokens, update token_count after truncation, and pass
original_token_count to logger.warning instead of recomputing with the
already-truncated text (refer to embed_texts, truncate_to_tokens, count_tokens,
and the logger.warning call).
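The logging fix requested for `embed_texts` comes down to ordering: record the token count first, truncate second. The sketch below shows that order with a whitespace tokenizer as a stand-in. `count_tokens` and `truncate_to_tokens` are named in the review, but their bodies here, the `prepare_text` wrapper, and the `MAX_TOKENS` value are assumptions.

```python
import logging

logger = logging.getLogger("rag.embeddings")

MAX_TOKENS = 8  # illustrative limit only


def count_tokens(text: str) -> int:
    """Stand-in tokenizer: counts whitespace-separated words."""
    return len(text.split())


def truncate_to_tokens(text: str, max_tokens: int) -> str:
    """Keep only the first max_tokens whitespace-separated words."""
    return " ".join(text.split()[:max_tokens])


def prepare_text(text: str, max_tokens: int = MAX_TOKENS) -> tuple[str, int]:
    """Return (possibly truncated text, token count before truncation)."""
    original_token_count = count_tokens(text)  # capture BEFORE truncating
    if original_token_count > max_tokens:
        text = truncate_to_tokens(text, max_tokens)
        logger.warning(
            "text truncated from %d to %d tokens",
            original_token_count,
            count_tokens(text),
        )
    return text, original_token_count
```

If the count were recomputed after truncation, the warning would always report a value at or below the limit and hide how much text was lost.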
In `@app/features/rag/service.py`:
- Around line 84-99: The _read_content_from_path method currently reads any
user-provided source_path; add path traversal protection by resolving the
incoming path and ensuring it is contained within an allowed base directory
(e.g., a class-level or injected Path like base_dir) using
Path(source_path).resolve() and then calling .relative_to(base_dir.resolve()) —
if relative_to raises ValueError or the resolved path is outside base_dir, raise
FileNotFoundError or a SecurityError; otherwise proceed to
read_text(encoding="utf-8"). Update references to Path and
_read_content_from_path accordingly and mirror the containment approach used in
app/features/registry/storage.py and app/features/forecasting/persistence.py.
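The containment check described above can be sketched with `pathlib` alone. The function name `read_content_from_path`, the `base_dir` parameter, and the choice of `PermissionError` are assumptions; the review itself suggests `FileNotFoundError` or a custom security error.

```python
from pathlib import Path


def read_content_from_path(source_path: str, base_dir: Path) -> str:
    """Read a file only if it resolves to a location inside base_dir."""
    base = base_dir.resolve()
    # Joining with an absolute source_path discards base, so absolute
    # paths are also caught by the relative_to check below.
    candidate = (base / source_path).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise PermissionError(f"Path escapes base directory: {source_path}") from None
    return candidate.read_text(encoding="utf-8")
```

Resolving before comparing matters: `..` segments and symlinks are collapsed first, so the check is made against the location that would actually be read.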
In `@docs/DAILY-FLOW.md`:
- Around line 165-175: The fenced code block showing the phase list (lines
containing "Phase 8: RAG Knowledge Base", "Phase 9: Agentic Layer", "Phase 10:
ForecastLab Dashboard") lacks a language tag and triggers MD040; update the
opening fence from ``` to ```text (i.e., add the text language tag) so the block
becomes a ```text fenced block to satisfy markdownlint and improve readability
tooling.
In `@examples/rag/index_docs.py`:
- Around line 68-86: The logic treating any non-201 response from the /rag/index
call as a failure causes idempotent 200 responses to be counted as failures;
update the response handling in the block that checks response.status_code so
that it treats 200 the same as 201 (parse response.json(), read result["status"]
and update unchanged/indexed/total_chunks/total_tokens accordingly) and only
increment failed and print the FAILED message for other status codes; look for
the response.status_code checks and the variables
unchanged/indexed/failed/total_chunks/total_tokens to locate where to adjust the
conditional.
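The status-code handling requested for `examples/rag/index_docs.py` might look like the sketch below, which treats 200 and 201 alike and only counts other codes as failures. The `Tally` dataclass, the `record_response` helper, and the response fields `chunks_created` and `tokens_processed` are hypothetical; the counter names come from the review text.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Tally:
    indexed: int = 0
    unchanged: int = 0
    failed: int = 0
    total_chunks: int = 0
    total_tokens: int = 0


def record_response(tally: Tally, status_code: int, body: dict[str, Any]) -> None:
    """Update counters from one /rag/index response."""
    if status_code in (200, 201):
        if body.get("status") == "unchanged":
            tally.unchanged += 1
        else:
            # "indexed" and "updated" both mean chunks were (re)written.
            tally.indexed += 1
            tally.total_chunks += body.get("chunks_created", 0)    # assumed field
            tally.total_tokens += body.get("tokens_processed", 0)  # assumed field
    else:
        tally.failed += 1
```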
In `@INITIAL-10.md`:
- Around line 55-112: Several fenced code blocks in the provided Markdown
snippet are missing language identifiers (MD040); update each ``` block to
include the appropriate language specifier (e.g., ```json for the JSON payload
and response, ```python for any Python examples, or ```text for plain text) so
the code fences are tagged consistently; search for the untagged triple-backtick
blocks shown in the diff (the request/response JSON blocks and the other blocks
called out in the comment) and add the correct language token to each opening
fence.
- Around line 17-23: Fix the Markdown table pipe spacing in INITIAL-10.md by
adding spaces around the pipe delimiters for both table blocks (e.g., the header
row, separator row, and each content row shown in the diff) so they conform to
MD060; update the first table (the one with "Component | Technology | Purpose")
and the other table referenced at lines 390-397 to use " | " spacing between
columns consistently.
In `@PRPs/PRP-10-agentic-layer.md`:
- Around line 36-43: The markdown table fails MD060 due to inconsistent pipe
spacing; fix each table row (header and data rows) so there is a single space
after and before every pipe delimiter (e.g., for rows containing `POST
/agents/experiment/run`, `POST /agents/experiment/approve`, `POST
/agents/rag/query`, `GET /agents/status/{session_id}`, `WS /agents/stream`)
ensuring uniform " | " spacing around cells and a matching header separator row
so the linter passes.
- Around line 107-163: Two fenced code blocks listing the project tree are
missing language tags; update both triple-backtick fences that enclose the
"app/…" tree and the "app/features/agents/…" tree to include a language (e.g.,
```text) so they comply with MD040 and improve highlighting; locate the two
blocks in PRP-10-agentic-layer.md that start with "app/" and
"app/features/agents/" and change their opening fences from ``` to ```text
(leave the content and closing fences unchanged).
🧹 Nitpick comments (15)
app/features/rag/chunkers.py (2)
302-337: Sentence splitting regex may have edge cases.
The regex `(?<=[.!?])\s+` for sentence splitting will incorrectly split after abbreviations such as "Dr. Smith" or "U.S.A." when they occur mid-sentence. Decimal numbers like "3.14 meters" are not affected, because no whitespace follows the period. While this is acceptable for RAG chunking (minor semantic boundary issues), consider documenting this limitation.
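A quick check of the quoted pattern makes the edge cases concrete. The wrapper `split_sentences` is a hypothetical name; only the regular expression is taken from the review. Note that the decimal stays intact, because the pattern requires whitespace immediately after the punctuation mark.

```python
import re

# Pattern quoted in the review comment: split on whitespace that
# follows a period, exclamation mark, or question mark.
SENTENCE_SPLIT_RE = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text: str) -> list[str]:
    return SENTENCE_SPLIT_RE.split(text)


if __name__ == "__main__":
    for part in split_sentences("Dr. Smith measured 3.14 meters. It fit!"):
        print(repr(part))
```

Here "Dr." is split off as its own fragment while "3.14" survives, so abbreviations are the realistic source of bad boundaries.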
597-601: Schema truncation may produce invalid JSON.
Truncating JSON to 500 characters with `[:500]` can cut mid-structure, producing invalid JSON in the chunk content. This is acceptable for RAG context (humans/LLMs can understand partial schemas), but consider truncating at a higher level or adding `"..."` to indicate truncation.
💡 Optional improvement
     try:
-        return json.dumps(schema, indent=2)[:500]  # Limit size
+        formatted = json.dumps(schema, indent=2)
+        if len(formatted) > 500:
+            return formatted[:497] + "..."
+        return formatted
     except (TypeError, ValueError):
-        return str(schema)[:500]
+        schema_str = str(schema)
+        if len(schema_str) > 500:
+            return schema_str[:497] + "..."
+        return schema_str
docs/PHASE/8-RAG_KNOWLEDGE_BASE.md (2)
268-273: Add language specifier to fenced code block.
The code block lacks a language specifier. Adding one improves syntax highlighting and satisfies markdownlint.
📝 Suggested fix
-```
+```text
 Ruff: All checks passed
 MyPy: 0 errors (117 source files)
 Pyright: 0 errors
 Pytest: 82 unit tests passed + 14 integration tests
279-307: Add language specifier for directory structure code block.
📝 Suggested fix
-```
+```text
 app/
 ├── core/
 ...
app/features/rag/tests/test_service.py (1)
210-224: Consider adding test for list_sources with existing sources.
The test only covers the empty sources scenario. Adding a test with mocked sources would improve coverage of the response mapping logic.
📝 Additional test suggestion
    @pytest.mark.asyncio
    async def test_list_sources_with_data(self):
        """Test listing sources when sources exist."""
        service = RAGService()
        mock_source = MagicMock()
        mock_source.source_id = "src123"
        mock_source.source_type = "markdown"
        mock_source.source_path = "docs/test.md"
        mock_source.content_hash = "abc123"
        mock_source.indexed_at = datetime.now()
        mock_source.metadata_ = {}
        mock_db = AsyncMock()
        mock_result = MagicMock()
        mock_result.all.return_value = [(mock_source, 5)]  # source, chunk_count
        mock_db.execute = AsyncMock(return_value=mock_result)
        response = await service.list_sources(db=mock_db)
        assert response.total_sources == 1
        assert response.total_chunks == 5
        assert response.sources[0].source_id == "src123"
app/features/rag/tests/test_schemas.py (1)
211-243: Consider adding a test for negative relevance_score.
The test validates the upper bound (1.5 raises ValidationError) but doesn't test the lower bound. Based on the schema in `app/features/rag/schemas.py` (lines 89-113), `relevance_score` has `ge=0.0, le=1.0` constraints.
💡 Suggested addition
    # Add after line 243
    def test_relevance_score_negative_rejected(self):
        """Test negative relevance score is rejected."""
        with pytest.raises(ValidationError):
            ChunkResult(
                chunk_id="c1",
                source_id="s1",
                source_path="test.md",
                source_type="markdown",
                content="test",
                relevance_score=-0.1,
            )
PRPs/PRP-9-rag-knowledge-base.md (2)
34-39: Minor markdown table formatting.
Static analysis flagged table column spacing inconsistencies (MD060). This is purely cosmetic and doesn't affect readability.
📝 Optional table formatting fix
-| Method | Path | Description |
-|--------|------|-------------|
-| `POST` | `/rag/index` | Index document (markdown/openapi) |
+| Method | Path                       | Description                      |
+| ------ | -------------------------- | -------------------------------- |
+| `POST` | `/rag/index`               | Index document (markdown/openapi)|
104-125: Add language specifier to fenced code blocks.
Static analysis flagged missing language specifiers (MD040). Adding a language hint improves syntax highlighting and accessibility.
📝 Suggested fix
-```
+```text
 app/
 ├── core/
app/features/rag/routes.py (1)
238-271: Consider adding error handling for list_sources.
Unlike other endpoints, list_sources doesn't wrap database errors. While less likely to fail, consistency would improve robustness.
♻️ Optional: Add SQLAlchemyError handling
 async def list_sources(
     db: AsyncSession = Depends(get_db),
 ) -> SourceListResponse:
     """List all indexed sources."""
     service = RAGService()
-    response = await service.list_sources(db=db)
+    try:
+        response = await service.list_sources(db=db)
+    except SQLAlchemyError as e:
+        logger.error(
+            "rag.list_sources_failed",
+            error=str(e),
+            error_type=type(e).__name__,
+            exc_info=True,
+        )
+        raise DatabaseError(
+            message="Failed to list sources",
+            details={"error": str(e)},
+        ) from e

     logger.info(
         "rag.list_sources_completed",
app/features/rag/service.py (1)
168-173: Redundant assignment after logging warning.
If chunks is empty/falsy, the warning is logged, but then chunks = [] is assigned, which is redundant since chunks would already be falsy (empty list from chunker).
♻️ Minor cleanup
 if not chunks:
     logger.warning(
         "rag.index_document_no_chunks",
         source_path=request.source_path,
     )
-    chunks = []
app/features/rag/models.py (1)
31-32: Empty TYPE_CHECKING block.
The TYPE_CHECKING guard is imported but the block is empty. This can be removed unless forward references are planned.
♻️ Remove unused import
-from typing import TYPE_CHECKING, Any
+from typing import Any

-if TYPE_CHECKING:
-    pass
app/features/rag/tests/conftest.py (2)
241-251: Fixture source_id length is inconsistent with model constraint.
sample_document_source has source_id="test123456789012345678901234", which is 28 characters, but the model defines String(32) and expects a UUID hex (32 chars). While this may work for testing, it's inconsistent.
💡 Fix source_id length
 @pytest.fixture
 def sample_document_source() -> DocumentSource:
     """Sample DocumentSource ORM object for testing."""
     return DocumentSource(
-        source_id="test123456789012345678901234",
+        source_id="test1234567890123456789012345678",  # 32 chars
         source_type="markdown",
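Hand-typed placeholder IDs are easy to miscount. As a minimal sketch, assuming the fixtures are free to use generated IDs, a 32-character value can be produced with `uuid.uuid4().hex` and checked against the column width. The helper names `new_hex_id` and `is_valid_hex_id` are hypothetical and do not come from the PR.

```python
import uuid

ID_LENGTH = 32  # width of the String(32) column described in the review
HEX_CHARS = frozenset("0123456789abcdef")


def new_hex_id() -> str:
    """Return a random UUID4 rendered as 32 lowercase hex characters."""
    return uuid.uuid4().hex


def is_valid_hex_id(value: str) -> bool:
    """Check that a value has the expected length and only hex characters."""
    return len(value) == ID_LENGTH and all(c in HEX_CHARS for c in value)


if __name__ == "__main__":
    generated = new_hex_id()
    print(generated, is_valid_hex_id(generated))
    # The literal from the fixture fails both the length and the hex check.
    print(is_valid_hex_id("test123456789012345678901234"))
```

A fixture could call the generator directly, or keep a fixed literal and assert `is_valid_hex_id` on it so that a miscounted string fails fast.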
254-265: Fixture chunk_id length is also inconsistent.
sample_document_chunk has chunk_id="chunk12345678901234567890123", which is 28 characters instead of 32.
💡 Fix chunk_id length
 @pytest.fixture
 def sample_document_chunk() -> DocumentChunk:
     """Sample DocumentChunk ORM object for testing."""
     return DocumentChunk(
-        chunk_id="chunk12345678901234567890123",
+        chunk_id="chunk123456789012345678901234567",  # 32 chars
         source_id=1,
app/features/rag/schemas.py (2)
46-65: Consider non-negative constraints for response metrics.
This is a small validation hardening to prevent negative counts or durations from leaking.
💡 Suggested validation additions
-    chunks_created: int
-    tokens_processed: int
-    duration_ms: float
+    chunks_created: int = Field(..., ge=0)
+    tokens_processed: int = Field(..., ge=0)
+    duration_ms: float = Field(..., ge=0)
116-129: Add non-negative bounds for timing/counters (optional).
Helps catch regressions if negative metrics ever appear.
💡 Suggested validation additions
-    query_embedding_time_ms: float
-    search_time_ms: float
-    total_chunks_searched: int
+    query_embedding_time_ms: float = Field(..., ge=0)
+    search_time_ms: float = Field(..., ge=0)
+    total_chunks_searched: int = Field(..., ge=0)
| Method | Path | Description |
|--------|------|-------------|
| `POST` | `/agents/experiment/run` | Execute experiment workflow |
| `POST` | `/agents/experiment/approve` | Approve pending action |
| `POST` | `/agents/rag/query` | Query with answer generation |
| `GET` | `/agents/status/{session_id}` | Check session status |
| `WS` | `/agents/stream` | WebSocket for streaming |
Fix table pipe spacing to satisfy MD060.
Keeps markdownlint clean and consistent table styling.
✅ Suggested fix
-| Method | Path | Description |
-|--------|------|-------------|
-| `POST` | `/agents/experiment/run` | Execute experiment workflow |
-| `POST` | `/agents/experiment/approve` | Approve pending action |
-| `POST` | `/agents/rag/query` | Query with answer generation |
-| `GET` | `/agents/status/{session_id}` | Check session status |
-| `WS` | `/agents/stream` | WebSocket for streaming |
+| Method | Path | Description |
+| ------ | ---- | ----------- |
+| `POST` | `/agents/experiment/run` | Execute experiment workflow |
+| `POST` | `/agents/experiment/approve` | Approve pending action |
+| `POST` | `/agents/rag/query` | Query with answer generation |
+| `GET` | `/agents/status/{session_id}` | Check session status |
+| `WS` | `/agents/stream` | WebSocket for streaming |
🧰 Tools
🪛 markdownlint-cli2 (0.20.0)
[warning] 37-37: Table column style
Table pipe is missing space to the right for style "compact"
(MD060, table-column-style)
[warning] 37-37: Table column style
Table pipe is missing space to the left for style "compact"
(MD060, table-column-style)
🤖 Prompt for AI Agents
In `@PRPs/PRP-10-agentic-layer.md` around lines 36 - 43, The markdown table fails
MD060 due to inconsistent pipe spacing; fix each table row (header and data
rows) so there is a single space after and before every pipe delimiter (e.g.,
for rows containing `POST /agents/experiment/run`, `POST
/agents/experiment/approve`, `POST /agents/rag/query`, `GET
/agents/status/{session_id}`, `WS /agents/stream`) ensuring uniform " | "
spacing around cells and a matching header separator row so the linter passes.
- Make migration deterministic by hardcoding dimension values instead of reading from environment (alembic migration)
- Add pyyaml dependency for YAML parsing in OpenAPI chunker
- Fix token count logging to capture original count before truncation
- Add path traversal protection to RAG service _read_content_from_path (mirrors registry/storage.py pattern)
- Fix markdown linting issues:
  - Add language tags to fenced code blocks (MD040)
  - Fix table pipe spacing (MD060)
- Fix index_docs.py to treat 200 same as 201 for idempotent responses
- Add test for path traversal protection

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
w7-learn
left a comment
The fixes have been pushed to the dev branch. PR #50 now includes these changes.
Summary of Fixes Applied
┌────────────────────────────────────────┬────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ File │ Fix │
├────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ alembic/versions/c5d9e1f2g345_...py │ Made migration deterministic by hardcoding PREVIOUS_DIMENSION and TARGET_DIMENSION instead of reading from environment │
├────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ pyproject.toml │ Added pyyaml>=6.0.0 dependency for YAML parsing in OpenAPI chunker │
├────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ app/features/rag/embeddings.py │ Fixed token count logging to capture original count before truncation │
├────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ app/features/rag/service.py │ Added path traversal protection with injectable base_dir parameter (mirrors registry/storage.py pattern) │
├────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ app/features/rag/tests/test_service.py │ Updated test to use base_dir parameter; added path traversal test │
├────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ docs/DAILY-FLOW.md │ Added text language tag to fenced code block (MD040) │
├────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ examples/rag/index_docs.py │ Fixed response handling to treat 200 same as 201 for idempotent responses │
├────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ INITIAL-10.md │ Fixed code block language tags and table pipe spacing (MD040, MD060) │
├────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ PRPs/PRP-10-agentic-layer.md │ Fixed table pipe spacing and added language tags to code blocks │
└────────────────────────────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
All tests pass (83 unit tests) and type checking is clean (mypy + pyright).
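The path traversal fix listed above is not shown in this thread, so the following is only a minimal sketch of the general technique, assuming an injectable base directory as the summary describes. The function name `resolve_within` and the error message are hypothetical. The actual `_read_content_from_path` may differ.

```python
from pathlib import Path


def resolve_within(base_dir: Path, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting anything that escapes it."""
    base = base_dir.resolve()
    # Joining with an absolute user_path discards base, so the check below
    # also rejects absolute paths that point outside the base directory.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):  # Python 3.9+
        raise ValueError(f"Path escapes base directory: {user_path}")
    return candidate


if __name__ == "__main__":
    base = Path.cwd()
    print(resolve_within(base, "docs/ARCHITECTURE.md"))
    try:
        resolve_within(base, "../../etc/passwd")
    except ValueError as exc:
        print(f"rejected: {exc}")
```

Resolving both paths before comparing them means `..` segments and symlinks are normalised first, which is why the check is done on the resolved candidate rather than on the raw string.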
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@examples/rag/index_docs.py`:
- Around line 126-130: index_readme currently treats anything other than 201 as
failure; make it consistent with index_markdown_docs by accepting both 200 and
201 as successful responses. In the index_readme function, change the
response.status_code check to allow 200 or 201, parse response.json() for
success, and update the success print to use result['chunks_created'] and
result['status'] (or an appropriate field) similarly to index_markdown_docs so
previously-indexed/unchanged README returns a success message instead of "Failed
to index".
🧹 Nitpick comments (3)
PRPs/PRP-10-agentic-layer.md (1)
500-508: Update to latest stable dependency versions and consider exact pinning for production stability.
The doc's minimum versions (pydantic-ai≥0.1.0, anthropic≥0.40.0, websockets≥13.0) are significantly outdated. Latest stable releases are:
- pydantic-ai: 1.47.0 (Jan 24, 2026)
- anthropic: 0.76.0 (Jan 13, 2026)
- websockets: 16.0 (Jan 10, 2026)
Given the rapid iteration of these libraries, pinning exact versions or using narrower ranges (e.g., `~=1.47.0`) is recommended to ensure tested compatibility rather than relying on loose `>=` ranges that may pull in breaking changes.
app/features/rag/service.py (1)
553-612: Consider the trade-off in similarity search result limiting.
The query fetches top_k * 2 results then filters by threshold, which is a reasonable optimization. However, if many results fall below the threshold, the final result set could be smaller than top_k. This is acceptable behavior but worth documenting in the method docstring for clarity.
📝 Optional: Document the behavior
 async def _search_similar_chunks(
     self,
     db: AsyncSession,
     query_embedding: list[float],
     top_k: int,
     threshold: float,
     filters: dict[str, Any] | None,
 ) -> list[ChunkResult]:
     """Search for similar chunks using cosine distance.

     Args:
         db: Database session.
         query_embedding: Query embedding vector.
         top_k: Maximum results to return.
         threshold: Minimum similarity threshold.
         filters: Optional metadata filters.

     Returns:
-        List of chunk results with relevance scores.
+        List of chunk results with relevance scores. May return fewer than
+        top_k results if insufficient chunks meet the similarity threshold.
     """
app/features/rag/embeddings.py (1)
495-536: Consider async cleanup for the Ollama provider singleton.
The OllamaEmbeddingProvider has an async close() method, but reset_embedding_service() doesn't call it. If the singleton is reset while using Ollama, the HTTP client won't be properly closed.
♻️ Proposed fix to handle async cleanup
-def reset_embedding_service() -> None:
+async def reset_embedding_service() -> None:
     """Reset the singleton embedding provider.

     Useful for testing or reconfiguration.
     """
     global _embedding_provider
+    if _embedding_provider is not None:
+        if isinstance(_embedding_provider, OllamaEmbeddingProvider):
+            await _embedding_provider.close()
     _embedding_provider = None
Alternatively, keep the sync version and add a separate async cleanup function if the sync reset is needed for simpler test fixtures:
    def reset_embedding_service() -> None:
        """Reset the singleton embedding provider (sync).

        Note: Does not close async resources. Use reset_embedding_service_async()
        for proper cleanup of OllamaEmbeddingProvider.
        """
        global _embedding_provider
        _embedding_provider = None


    async def reset_embedding_service_async() -> None:
        """Reset the singleton embedding provider with async cleanup."""
        global _embedding_provider
        if _embedding_provider is not None:
            if isinstance(_embedding_provider, OllamaEmbeddingProvider):
                await _embedding_provider.close()
        _embedding_provider = None
    if response.status_code == 201:
        result = response.json()
        print(f"README.md indexed: {result['chunks_created']} chunks ({result['status']})")
    else:
        print(f"Failed to index README.md: {response.status_code}")
Inconsistent response handling compared to index_markdown_docs.
The index_readme function only checks for status code 201, while index_markdown_docs (line 68) correctly handles both 200 and 201. If the README was previously indexed (unchanged content), the endpoint returns 200, causing this function to incorrectly report failure.
🐛 Proposed fix for consistent response handling
-    if response.status_code == 201:
+    if response.status_code in (200, 201):
         result = response.json()
         print(f"README.md indexed: {result['chunks_created']} chunks ({result['status']})")
     else:
         print(f"Failed to index README.md: {response.status_code}")
🤖 Prompt for AI Agents
In `@examples/rag/index_docs.py` around lines 126 - 130, index_readme currently
treats anything other than 201 as failure; make it consistent with
index_markdown_docs by accepting both 200 and 201 as successful responses. In
the index_readme function, change the response.status_code check to allow 200 or
201, parse response.json() for success, and update the success print to use
result['chunks_created'] and result['status'] (or an appropriate field)
similarly to index_markdown_docs so previously-indexed/unchanged README returns
a success message instead of "Failed to index".
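One way to keep index_readme and index_markdown_docs from drifting apart again is to route both through a single helper. The sketch below is an assumption about how that might look and is not code from examples/rag/index_docs.py. The helper name `describe_index_response` and the sample `status` values are hypothetical. Only the `chunks_created` and `status` keys come from the snippet under review.

```python
from http import HTTPStatus

# 200 is returned for unchanged content, 201 for newly indexed content,
# per the review comment above.
SUCCESS_CODES = frozenset({HTTPStatus.OK, HTTPStatus.CREATED})


def describe_index_response(name: str, status_code: int, payload: dict) -> str:
    """Build the message printed after an indexing request."""
    if status_code in SUCCESS_CODES:
        return (
            f"{name} indexed: {payload['chunks_created']} chunks "
            f"({payload['status']})"
        )
    return f"Failed to index {name}: {status_code}"


if __name__ == "__main__":
    # The "status" strings here are placeholders, not the API's real values.
    print(describe_index_response("README.md", 201, {"chunks_created": 7, "status": "indexed"}))
    print(describe_index_response("README.md", 200, {"chunks_created": 0, "status": "unchanged"}))
    print(describe_index_response("README.md", 500, {}))
```

Because `HTTPStatus` members are integers, the membership test works directly on `response.status_code` without conversion.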
* feat: RAG Knowledge Base, Serving Layer, and Model Registry (#50) * feat(registry): implement model registry for run tracking and deployments (#36) * docs: expand INITIAL-7 with lifecycle, lineage, and artifact integrity details Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat(registry): implement model registry for run tracking and deployments Add model registry feature (PRP-7) with: - ORM models: ModelRun with JSONB columns (model_config, metrics, runtime_info), DeploymentAlias for mutable deployment pointers - Storage: LocalFSProvider with SHA-256 integrity verification and path traversal prevention, abstract interface for future S3/GCS support - Service: RegistryService with state machine validation, duplicate detection, config hashing, and run comparison - API endpoints: CRUD for runs and aliases, artifact verification, run comparison with config/metrics diffs - Database: Alembic migration with GIN indexes for JSONB containment queries - Tests: 103 unit tests (schemas, storage, service) + 24 integration tests - Example: registry_demo.py demonstrating full workflow Run lifecycle: PENDING → RUNNING → SUCCESS/FAILED → ARCHIVED Aliases can only point to SUCCESS runs for deployment safety. 
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * docs: update documentation for model registry implementation - README.md: Add registry to project structure, API endpoints section, and example reference - docs/ARCHITECTURE.md: Update section 7.6 with full implementation details, add registry endpoints to section 8, mark Phase 1 complete - docs/PHASE-index.md: Mark phases 4-6 as completed, add detailed completion entries for Forecasting, Backtesting, and Registry Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * docs: add PHASE documentation for forecasting, backtesting, and registry Create missing phase documentation files to complete the project's implementation records: - 4-FORECASTING.md: Model zoo with BaseForecaster interface, train/predict endpoints, and joblib persistence - 5-BACKTESTING.md: Time-series CV with expanding/sliding strategies, metrics calculation, and baseline comparisons - 6-MODEL_REGISTRY.md: Run tracking with state machine, deployment aliases, and SHA-256 artifact integrity verification Update PHASE-index.md to link to the new documentation files. 
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix(registry): resolve type checking issues with Pydantic model_config alias - Add pydantic.mypy plugin to pyproject.toml for proper Pydantic type checking - Use model_config_data instead of model_config alias in tests to avoid collision with Pydantic's reserved model_config attribute - Update _model_to_response to use model_validate() for proper alias handling - Change docker-compose postgres port to 5433 to avoid conflicts Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: resolve CI failures for registry PR - Import registry models in alembic/env.py for schema validation - Fix import order and remove extraneous f-strings in registry_demo.py - Add type: ignore comments for frozen model tests with pydantic.mypy plugin Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: prevent db_session fixtures from dropping all tables The data_platform and root conftest.py db_session fixtures were dropping all tables after each test, causing subsequent integration tests to fail when they couldn't find migrated tables. Changes: - Remove Base.metadata.drop_all from db_session fixtures - Tests now rely on migrations for table creation - Each test just rolls back its own changes Also fixes ruff format issue in examples/registry_demo.py. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: add proper test data cleanup to db_session fixtures Update data_platform and ingest test fixtures to clean up test data explicitly instead of dropping all tables or just rolling back. - data_platform: delete test stores, products, calendar entries - ingest: delete test stores, products, sales, calendar entries This ensures test isolation while preserving migrated tables. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: use separate session for test cleanup to avoid transaction issues When tests cause integrity errors, the session enters a failed state. 
Use a fresh session for cleanup to avoid PendingRollbackError. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: use contextlib.suppress instead of try-except-pass Replace try-except-pass patterns with contextlib.suppress to satisfy ruff S110 linting rule. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Gabe@w7dev <gabor@w7-7.net> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> * fix: code improvements and documentation fixes - Add date range filter to SalesDaily cleanup in ingest tests - Enforce artifact_hash presence before verification in registry routes - Compute SHA256 from saved file instead of source in storage - Fix override_get_db to mirror production transaction semantics - Filter DeploymentAlias cleanup to only test runs - Update database port to 5433 in config and .env.example - Add language identifiers to fenced code blocks (MD040) - Fix table formatting for markdownlint MD060 - Update PR reference in PHASE/6-MODEL_REGISTRY.md - Convert bare URLs to markdown links in INITIAL-7.md - Wrap __init__.py in backticks in PRP-7 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * sync: update dev from phase-6 (#40) * chore: release v0.2.0 (#37) * feat(registry): implement model registry for run tracking and deployments (#36) * docs: expand INITIAL-7 with lifecycle, lineage, and artifact integrity details Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat(registry): implement model registry for run tracking and deployments Add model registry feature (PRP-7) with: - ORM models: ModelRun with JSONB columns (model_config, metrics, runtime_info), DeploymentAlias for mutable deployment pointers - Storage: LocalFSProvider with SHA-256 integrity verification and path traversal prevention, abstract interface for future S3/GCS support - Service: RegistryService with state machine validation, duplicate detection, config hashing, and run comparison - API endpoints: CRUD for runs and aliases, artifact 
verification, run comparison with config/metrics diffs - Database: Alembic migration with GIN indexes for JSONB containment queries - Tests: 103 unit tests (schemas, storage, service) + 24 integration tests - Example: registry_demo.py demonstrating full workflow Run lifecycle: PENDING → RUNNING → SUCCESS/FAILED → ARCHIVED Aliases can only point to SUCCESS runs for deployment safety. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * docs: update documentation for model registry implementation - README.md: Add registry to project structure, API endpoints section, and example reference - docs/ARCHITECTURE.md: Update section 7.6 with full implementation details, add registry endpoints to section 8, mark Phase 1 complete - docs/PHASE-index.md: Mark phases 4-6 as completed, add detailed completion entries for Forecasting, Backtesting, and Registry Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * docs: add PHASE documentation for forecasting, backtesting, and registry Create missing phase documentation files to complete the project's implementation records: - 4-FORECASTING.md: Model zoo with BaseForecaster interface, train/predict endpoints, and joblib persistence - 5-BACKTESTING.md: Time-series CV with expanding/sliding strategies, metrics calculation, and baseline comparisons - 6-MODEL_REGISTRY.md: Run tracking with state machine, deployment aliases, and SHA-256 artifact integrity verification Update PHASE-index.md to link to the new documentation files. 
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix(registry): resolve type checking issues with Pydantic model_config alias - Add pydantic.mypy plugin to pyproject.toml for proper Pydantic type checking - Use model_config_data instead of model_config alias in tests to avoid collision with Pydantic's reserved model_config attribute - Update _model_to_response to use model_validate() for proper alias handling - Change docker-compose postgres port to 5433 to avoid conflicts Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: resolve CI failures for registry PR - Import registry models in alembic/env.py for schema validation - Fix import order and remove extraneous f-strings in registry_demo.py - Add type: ignore comments for frozen model tests with pydantic.mypy plugin Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: prevent db_session fixtures from dropping all tables The data_platform and root conftest.py db_session fixtures were dropping all tables after each test, causing subsequent integration tests to fail when they couldn't find migrated tables. Changes: - Remove Base.metadata.drop_all from db_session fixtures - Tests now rely on migrations for table creation - Each test just rolls back its own changes Also fixes ruff format issue in examples/registry_demo.py. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: add proper test data cleanup to db_session fixtures Update data_platform and ingest test fixtures to clean up test data explicitly instead of dropping all tables or just rolling back. - data_platform: delete test stores, products, calendar entries - ingest: delete test stores, products, sales, calendar entries This ensures test isolation while preserving migrated tables. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: use separate session for test cleanup to avoid transaction issues When tests cause integrity errors, the session enters a failed state. 
Use a fresh session for cleanup to avoid PendingRollbackError. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: use contextlib.suppress instead of try-except-pass Replace try-except-pass patterns with contextlib.suppress to satisfy ruff S110 linting rule. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Gabe@w7dev <gabor@w7-7.net> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> * fix: code improvements and documentation fixes - Add date range filter to SalesDaily cleanup in ingest tests - Enforce artifact_hash presence before verification in registry routes - Compute SHA256 from saved file instead of source in storage - Fix override_get_db to mirror production transaction semantics - Filter DeploymentAlias cleanup to only test runs - Update database port to 5433 in config and .env.example - Add language identifiers to fenced code blocks (MD040) - Fix table formatting for markdownlint MD060 - Update PR reference in PHASE/6-MODEL_REGISTRY.md - Convert bare URLs to markdown links in INITIAL-7.md - Wrap __init__.py in backticks in PRP-7 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Gabe@w7dev <gabor@w7-7.net> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> * chore(main): release 0.2.0 (#38) Release-As: 0.2.0 Co-authored-by: Gabe@w7dev <gabor@w7-7.net> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> * chore(main): release 0.2.0 (#39) * chore(main): release 0.2.0 * chore: trigger CI --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Gabe@w7dev <gabor@w7-7.net> --------- Co-authored-by: Gabe@w7dev <gabor@w7-7.net> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * feat(serving-layer): implement PRP-8 agent-first API design (#42) * docs(initial-8): expand serving layer requirements Add specifications for 
job-driven orchestration, dimension discovery endpoints, standardized API protocols (filtering/pagination), and agent-first API design patterns for LLM tool-calling optimization.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs(prp-8): add serving layer implementation spec

Comprehensive PRP for FastAPI serving layer including:
- Dimensions module for store/product discovery endpoints
- Analytics module for KPI/drilldown queries
- Jobs module for async-ready task orchestration
- RFC 7807 problem details for semantic error responses
- OpenAPI export optimization for LLM tool-calling

26 tasks with validation gates and 8.5/10 confidence score.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(serving-layer): implement PRP-8 agent-first API design

Add RFC 7807 Problem Details for semantic error responses:
- ProblemDetail schema with type URIs and error codes
- application/problem+json content type
- Validation exception handler with field-level errors

Add dimensions module for store/product discovery:
- GET /dimensions/stores with pagination, filtering, search
- GET /dimensions/products with pagination, filtering, search
- LLM-optimized Field descriptions for tool-calling

Add analytics module for KPI aggregations:
- GET /analytics/kpis with date range and dimension filters
- GET /analytics/drilldowns for store/product/category/region/date
- Revenue share and ranking calculations

Add jobs module for async-ready task orchestration:
- POST /jobs for train/predict/backtest operations
- Job model with JSONB params/results
- Status transitions: pending → running → completed/failed

Integration:
- New settings: analytics_max_rows, jobs_retention_days
- Register routers in main.py
- Alembic migration for jobs table

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: update documentation for PRP-8 serving layer

Update README.md:
- Add dimensions, analytics, jobs modules to project structure
- Document new API endpoints with examples
- Add RFC 7807 error response documentation

Update docs/ARCHITECTURE.md:
- Mark serving layer section as implemented
- Add configuration settings for new modules
- Update roadmap with Phase-2 completion

Update docs/PHASE-index.md:
- Add Phase 7 (Serving Layer) as completed
- Update phase overview table
- Add version history entry

Create docs/PHASE/7-SERVING_LAYER.md:
- Comprehensive phase documentation
- API endpoint specifications
- Database schema and migration details
- Usage examples and test coverage

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* style: fix ruff formatting

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* fix(serving-layer): improve analytics validation and jobs run_id handling

- Add validate_date_range helper to analytics routes for reusable date validation
- Apply date range validation to both get_kpis and get_drilldowns endpoints
- Fix total_revenue_all calculation to use full dataset before limiting
- Add run_id to train job result for downstream predict jobs
- Fix predict job to resolve run_id to model metadata from bundle
- Update test fixtures to use 32-char hex IDs per schema requirements

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* style: format jobs service

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: restructure roadmap into modular three-phase architecture (INITIAL-9/10/11) (#47)

* docs: restructure INITIAL-9 into modular three-phase roadmap

Decompose monolithic INITIAL-9 into three specialized technical phases:
- INITIAL-9: RAG Knowledge Base ("The Memory")
  - pgvector + OpenAI embeddings
  - Markdown/OpenAPI-aware chunking
  - Semantic retrieval endpoints
- INITIAL-10: Agentic Layer ("The Brain")
  - PydanticAI agents (Experiment Orchestrator, RAG Assistant)
  - Tool orchestration with structured outputs
  - Human-in-the-loop approval workflow
- INITIAL-11: ForecastLab Dashboard ("The Face")
  - React 19 + Vite + shadcn/ui
  - TanStack Table/Query for data management
  - Recharts for time series visualization
  - Agent chat interface with streaming

Update PHASE-index.md and DAILY-FLOW.md to align with new structure.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs(prp): add PRP-9 RAG Knowledge Base implementation plan

Comprehensive PRP for INITIAL-9 RAG Knowledge Base feature:
- pgvector + SQLAlchemy 2.0 integration patterns
- Markdown-aware and OpenAPI-aware chunking
- Async OpenAI embeddings with batch processing
- HNSW index for cosine similarity search
- 15 ordered implementation tasks
- 5-level validation loop (syntax → types → unit → integration → smoke)
- Full ORM models and Pydantic schemas
- Known gotchas and anti-patterns documented

Confidence score: 8.5/10

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs(prp): add PRP-10 Agentic Layer implementation plan

Comprehensive PRP for INITIAL-10 Agentic Layer feature:
- PydanticAI agent framework integration
- Experiment Orchestrator Agent (backtest → compare → deploy)
- RAG Assistant Agent (query → retrieve → answer with citations)
- Human-in-the-loop approval workflow for sensitive actions
- WebSocket streaming for real-time token delivery
- Session persistence with JSONB message history
- 17 ordered implementation tasks
- Tool definitions for registry, backtesting, forecasting, RAG
- Full Pydantic schemas and ORM models

Confidence score: 7.5/10

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* docs(prp): add PRP-11 ForecastLab Dashboard implementation plan (#48)

Comprehensive PRP for INITIAL-11 (The Face) with:
- 24 implementation tasks across 6 phases
- React 19 + Vite + shadcn/ui + TanStack Table/Query
- TypeScript types matching all backend API schemas
- Reusable DataTable with server-side pagination
- TimeSeriesChart component with Recharts
- WebSocket hook for agent chat streaming
- Complete documentation links and gotchas

Confidence score: 7.5/10 (chat depends on INITIAL-10)

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* feat(rag): implement PRP-9 RAG Knowledge Base with pgvector (#49)

* feat(rag): implement PRP-9 RAG Knowledge Base with pgvector

Add RAG (Retrieval-Augmented Generation) knowledge base feature for semantic document indexing and retrieval using PostgreSQL pgvector.

Key components:
- Document indexing with markdown-aware and OpenAPI-aware chunking
- Semantic retrieval using cosine similarity with configurable thresholds
- Idempotent re-indexing via SHA-256 content hash comparison
- OpenAI text-embedding-3-small for embeddings (1536 dimensions)
- HNSW index for fast approximate nearest neighbor search

API endpoints:
- POST /rag/index - Index documents with automatic chunking
- POST /rag/retrieve - Semantic search with relevance scoring
- GET /rag/sources - List indexed sources with statistics
- DELETE /rag/sources/{source_id} - Remove source and chunks

Includes:
- ORM models: DocumentSource, DocumentChunk with Vector column
- Pydantic v2 schemas with strict validation
- 68 unit tests + 14 integration tests
- Migration for pgvector extension and RAG tables
- Examples and environment configuration

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(rag): add Ollama embedding provider with OpenAI-compatible API

- Add EmbeddingProvider abstract base class with provider pattern
- Refactor existing OpenAI code to OpenAIEmbeddingProvider
- Add OllamaEmbeddingProvider using /v1/embeddings endpoint
  - Supports configurable dimensions parameter
  - Uses OpenAI-compatible response format
- Add config settings: rag_embedding_provider, ollama_base_url, ollama_embedding_model
- Add migration for dynamic embedding dimension support
- Update tests for both providers (25 tests)

Enables local/LAN embedding generation without OpenAI API dependency.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: add Ollama embedding provider documentation

- Update .env.example with Ollama configuration options
- Add RAG Knowledge Base section to README with:
  - Embedding provider options (OpenAI/Ollama)
  - Example index and retrieve requests
  - Configuration examples for both providers

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: add Phase 8 RAG Knowledge Base documentation

- Create docs/PHASE/8-RAG_KNOWLEDGE_BASE.md with full phase details
- Update docs/PHASE-index.md:
  - Mark Phase 8 as Completed in overview table
  - Add Phase 8 summary to Completed Phases section
  - Add entry to Version History

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(ci): add RAG models import to alembic env and format tests

- Add rag models import to alembic/env.py for schema validation
- Format test_embeddings.py to pass ruff format check

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* fix: address code review issues for RAG module and docs

- Make migration deterministic by hardcoding dimension values instead of reading from environment (alembic migration)
- Add pyyaml dependency for YAML parsing in OpenAPI chunker
- Fix token count logging to capture original count before truncation
- Add path traversal protection to RAG service _read_content_from_path (mirrors registry/storage.py pattern)
- Fix markdown linting issues:
  - Add language tags to fenced code blocks (MD040)
  - Fix table pipe spacing (MD060)
- Fix index_docs.py to treat 200 same as 201 for idempotent responses
- Add test for path traversal protection

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* ci: add release-please branch trigger and wire workflow_dispatch ref (#52)

- Add 'release-please--branches--**' pattern to match actual release-please branch naming (e.g., release-please--branches--main--components--forecastlabai)
- Add 'ref' input to workflow_dispatch with proper type declaration
- Wire ref input to all checkout steps via CHECKOUT_REF env var
- Use inputs.ref || github.ref for predictable fallback behavior
- Update concurrency group to respect manual ref input

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* chore(main): release 0.2.2 (#51)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Gabor Szabo <168316277+w7-mgfcode@users.noreply.github.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
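The review-fix commit above adds path traversal protection to the RAG service's _read_content_from_path, mirroring registry/storage.py. The actual implementation is not shown on this page, so the following is only a minimal sketch of the general technique. The names resolve_inside and PathTraversalError are hypothetical and do not come from the repository.

```python
from pathlib import Path


class PathTraversalError(ValueError):
    """Hypothetical error type: the requested path escapes the allowed base directory."""


def resolve_inside(base_dir: Path, user_path: str) -> Path:
    """Resolve user_path under base_dir and reject anything that ends up outside it."""
    base = base_dir.resolve()
    # Joining an absolute user_path discards base, and resolve() collapses ".."
    # segments, so both cases are caught by the containment check below.
    candidate = (base / user_path).resolve()
    # is_relative_to (Python 3.9+) compares whole path components, so a sibling
    # directory such as "/data-evil" is not mistaken for a child of "/data".
    if not candidate.is_relative_to(base):
        raise PathTraversalError(f"{user_path!r} resolves outside {base}")
    return candidate
```

Resolving before checking is the key design choice here: comparing raw strings would miss ".." segments and symlinks that only become visible after resolution.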
Summary
This PR merges the dev branch into main, bringing significant new features and improvements:
Key Changes
RAG Feature (app/features/rag/)
Serving Layer Improvements
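The serving-layer commits describe RFC 7807 Problem Details responses with type URIs, error codes, field-level validation errors, and the application/problem+json content type. As a rough illustration of what such a payload can look like, here is a standard-library sketch. The type, title, status, detail, and instance members come from RFC 7807. The code and errors extension members, the "/errors/validation" type URI, and the validation_problem helper are assumptions for illustration, not the project's actual ProblemDetail schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any

PROBLEM_CONTENT_TYPE = "application/problem+json"


@dataclass
class ProblemDetail:
    """Core RFC 7807 members plus two hypothetical extension members."""

    type: str
    title: str
    status: int
    detail: str
    instance: str
    code: str = ""  # assumed extension member: machine-readable error code
    errors: list[dict[str, Any]] = field(default_factory=list)  # assumed: field-level errors


def validation_problem(instance: str, field_errors: list[dict[str, Any]]) -> tuple[str, str]:
    """Build a JSON body and content type for a request-validation failure."""
    problem = ProblemDetail(
        type="/errors/validation",  # hypothetical type URI
        title="Validation Error",
        status=422,
        detail=f"{len(field_errors)} field(s) failed validation.",
        instance=instance,
        code="VALIDATION_ERROR",  # hypothetical error code
        errors=field_errors,
    )
    return json.dumps(asdict(problem)), PROBLEM_CONTENT_TYPE
```

A stable type URI and error code give an LLM tool-calling client something to branch on without having to parse the free-text detail message.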
Model Registry
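The registry commit in this PR describes the run lifecycle as PENDING → RUNNING → SUCCESS/FAILED → ARCHIVED and states that aliases can only point to SUCCESS runs. A minimal sketch of that state machine follows. The RunStatus enum, the transition table, and both helper functions are illustrative assumptions rather than the RegistryService code, and the table assumes that both SUCCESS and FAILED runs may be archived.

```python
from enum import Enum


class RunStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCESS = "success"
    FAILED = "failed"
    ARCHIVED = "archived"


# Assumed reading of the lifecycle: each status maps to the statuses it may move to.
ALLOWED_TRANSITIONS: dict[RunStatus, set[RunStatus]] = {
    RunStatus.PENDING: {RunStatus.RUNNING},
    RunStatus.RUNNING: {RunStatus.SUCCESS, RunStatus.FAILED},
    RunStatus.SUCCESS: {RunStatus.ARCHIVED},
    RunStatus.FAILED: {RunStatus.ARCHIVED},
    RunStatus.ARCHIVED: set(),  # terminal state
}


def can_transition(current: RunStatus, target: RunStatus) -> bool:
    """Return True if moving from current to target is an allowed lifecycle step."""
    return target in ALLOWED_TRANSITIONS[current]


def can_alias(status: RunStatus) -> bool:
    """Deployment aliases may only point at runs that finished successfully."""
    return status is RunStatus.SUCCESS
```

Keeping the transitions in one table means validation reduces to a single lookup, and the deployment-safety rule stays a separate check.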
Files Changed
Test plan
- pytest -m "not integration"
- pytest -m integration

🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation
Chores