chore(main): release Phase 2 - Ingest Layer#21
Conversation
- DAILY-FLOW.md: Daily development workflow (branch strategy, PR flow, commands) - PHASE-FLOW.md: Phase completion workflow (lifecycle, snapshot, sync) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ily (#19) * docs(prp): add PRP-3 for ingest layer implementation Comprehensive implementation plan for idempotent batch upsert endpoints: - POST /ingest/sales-daily with natural key resolution (store_code, sku) - PostgreSQL ON CONFLICT DO UPDATE for replay-safe ingestion - Partial success pattern with row-level error reporting - Configurable batch size and timeout settings - 13 ordered implementation tasks with validation gates Confidence score: 8/10 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat(ingest): implement idempotent batch upsert endpoint for sales_daily Implements PRP-3 Ingest Layer with: - POST /ingest/sales-daily endpoint accepting natural keys (store_code, sku) - KeyResolver service for store_code → store_id, sku → product_id resolution - Calendar date FK validation (dates must exist in calendar table) - PostgreSQL ON CONFLICT DO UPDATE for replay-safe upserts - Partial success pattern: valid rows processed, invalid rows returned as errors - Configurable batch size and timeout via Settings - Structured logging with ingest.sales_daily.{action}_{state} events New files: - app/features/ingest/ - Complete vertical slice (schemas, service, routes, tests) - examples/api/ingest_sales_daily.http - HTTP client examples Test coverage: 42 new tests (18 schema + 15 service + 9 integration) All validation gates passing: ruff, mypy, pyright, pytest (101 tests) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix(ingest): address Sourcery code review comments - Replace inserted_count/updated_count with processed_count (ON CONFLICT can't distinguish inserts vs updates without xmax check complexity) - Remove misleading updated_count field that was always 0 - Use result.rowcount instead of len(result.fetchall()) for efficiency - Change errors field from mutable default=[] to default_factory=list - Catch specific SQLAlchemyError instead of bare Exception - Wire db_session into FastAPI deps override in integration tests - Remove no-op validate_total_amount_consistency validator Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Gabe@w7dev <gabor@w7-7.net> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
- Update README.md with API Endpoints section and ingest examples - Mark Phase 2 as completed in PHASE-index.md - Add comprehensive Phase 2 documentation (2-INGEST_LAYER.md) Co-authored-by: Gabe@w7dev <gabor@w7-7.net> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Sorry @w7-mgfcode, you have reached your weekly rate limit of 500000 diff characters.
Please try again later or upgrade to continue using Sourcery
|
Caution Review failedThe pull request is closed. 📝 WalkthroughWalkthroughIntroduces a complete ingest layer (PRP-3) featuring batch upsert of sales-daily records using natural keys (store_code, sku) with PostgreSQL ON CONFLICT DO UPDATE for idempotency. Includes key resolution service for resolving store codes and SKUs to internal IDs, routes for POST /ingest/sales-daily endpoint, comprehensive test coverage, and documentation. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client
participant API as "POST /ingest/sales-daily"
participant KeyResolver as KeyResolver
participant DB as Database
participant Response as Response
Client->>API: POST SalesDailyIngestRequest
API->>KeyResolver: Extract unique store_codes, skus, dates
KeyResolver->>DB: Query Store by codes
DB-->>KeyResolver: Map code → store_id
KeyResolver->>DB: Query Product by sku
DB-->>KeyResolver: Map sku → product_id
KeyResolver->>DB: Query Calendar by dates
DB-->>KeyResolver: Return valid dates
API->>API: Validate rows vs resolved IDs (collect errors)
API->>DB: INSERT ... ON CONFLICT DO UPDATE (date, store_id, product_id)
DB-->>API: rowcount (processed)
API->>Response: SalesDailyIngestResponse (processed_count, rejected_count, errors, duration_ms)
Response-->>Client: HTTP 200 OK
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~35 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai help |
ChatThere are 3 ways to chat with CodeRabbit:
CodeRabbit commands
Other keywords and placeholders
CodeRabbit configuration file (
|
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@docs/PHASE/2-INGEST_LAYER.md`:
- Around line 402-409: In the "Lessons Learned" text, replace the phrase
"floating point" with the hyphenated form "floating-point" (e.g., in the
sentence about Decimal for Money or the "Decimal for Money" bullet) to correct
the compound adjective; update the markdown content so the wording reads
"floating-point precision issues".
- Around line 87-93: Add a blank line before the "**Error Codes:**" table and
another blank line after the table so the markdown table is separated from
surrounding content (fixes markdownlint MD058); edit the block containing the
"**Error Codes:**" heading and the subsequent table rows (`| Code | Description
|` ... `| `UNKNOWN_DATE` | Date not found in calendar table |`) to ensure there
is an empty line above the heading/table and an empty line following the final
table row.
- Around line 236-253: In the "Directory Structure" code fence that displays the
app/features/ingest/ tree (the fenced block starting with the line
"app/features/ingest/"), add a language identifier to the opening fence (e.g.,
change ``` to ```text) to satisfy markdownlint MD040; ensure the closing ```
remains unchanged so the snippet renders as a plain-text directory listing.
In `@examples/api/ingest_sales_daily.http`:
- Around line 41-50: Update the example HTTP response to match the
SalesDailyIngestResponse schema by replacing the separate inserted_count and
updated_count fields with a single processed_count field, and rename
total_processed to total_received; keep rejected_count, errors, and duration_ms
unchanged and preserve their example values (e.g., processed_count: 2,
rejected_count: 0, total_received: 2, errors: [], duration_ms: 45.23).
🧹 Nitpick comments (8)
examples/api/ingest_sales_daily.http (1)
74-84: Clarify idempotent behavior in expected response.The comment at line 84 correctly notes the row is updated due to
ON CONFLICT DO UPDATE, but the expected response showsinserted_count: 1andupdated_count: 0. If the implementation usesprocessed_count, this response example should be updated to match. Additionally, the wording could be clearer that PostgreSQL'sON CONFLICT DO UPDATEdoesn't distinguish between insert vs update in the row count.docs/DAILY-FLOW.md (1)
165-175: Update or remove the "next phase" section.This section references the Ingest Layer (PRP-3) as the next phase to work on, but this PR is releasing Phase 2 which includes the Ingest Layer. Consider updating this to reference the actual next phase (Phase 3) or removing this section to avoid confusion.
docs/PHASE-FLOW.md (1)
9-33: Consider adding language specifier to ASCII diagram code block.The markdownlint tool flagged this fenced code block (and the one at line 117) for missing a language specifier. For ASCII diagrams, you can use
textorplaintextas the language identifier.♻️ Suggested fix
-``` +```text ┌─────────────────────────────────────────────────────────────────┐ │ PHASE LIFECYCLE │app/features/ingest/tests/test_schemas.py (1)
140-164: Align total_amount test names/comments with the “accepted as-is” contract.These names/comments read as if there’s cross-field validation or tolerance, but the schema intentionally doesn’t enforce it. Renaming avoids confusion.
✏️ Suggested rename/wording updates
- def test_total_amount_consistency_validation_passes(self): - """Test that total amount consistency validation passes for matching values.""" + def test_total_amount_accepts_matching_value(self): + """Test that total_amount accepts matching values without extra validation.""" @@ - # Should pass even with slight mismatch + # No cross-field validation; value is accepted as provided @@ - def test_total_amount_allows_mismatch_within_tolerance(self): - """Test that small mismatches in total amount are allowed.""" - # Expected: 10 * 9.99 = 99.90, actual: 99.91 (within 0.01 tolerance) + def test_total_amount_accepts_mismatch(self): + """Test that total_amount accepts mismatches (no cross-field validation).""" + # Expected: 10 * 9.99 = 99.90, actual: 99.91 (accepted as-is)app/features/ingest/schemas.py (1)
25-32: Align Decimal constraints with DB precision to prevent DB-side rejections.
SalesDaily.unit_priceandSalesDaily.total_amountareNumeric(10,2)andNumeric(12,2); addingmax_digitskeeps API validation consistent with storage limits.🔧 Suggested constraint alignment
- unit_price: Decimal = Field(..., ge=0, decimal_places=2, description="Price per unit") + unit_price: Decimal = Field( + ..., ge=0, max_digits=10, decimal_places=2, description="Price per unit" + ) @@ - total_amount: Decimal = Field( - ..., - ge=0, - decimal_places=2, - description="Total sales amount (trusted as-is from source system)", - ) + total_amount: Decimal = Field( + ..., + ge=0, + max_digits=12, + decimal_places=2, + description="Total sales amount (trusted as-is from source system)", + )app/features/ingest/tests/conftest.py (1)
12-33: Avoidordefaults so empty maps/sets are respected.Using
oroverrides intentionally empty inputs, which can limit test scenarios.🔧 Suggested fix
- self._store_map = store_map or {"S001": 1, "S002": 2} - self._product_map = product_map or {"SKU-001": 101, "SKU-002": 102, "SKU-003": 103} - self._valid_dates = valid_dates or {date(2024, 1, 15), date(2024, 1, 16)} + self._store_map = {"S001": 1, "S002": 2} if store_map is None else store_map + self._product_map = ( + {"SKU-001": 101, "SKU-002": 102, "SKU-003": 103} + if product_map is None + else product_map + ) + self._valid_dates = ( + {date(2024, 1, 15), date(2024, 1, 16)} if valid_dates is None else valid_dates + )app/features/ingest/tests/test_routes.py (1)
111-130: Mirror get_db commit/rollback and restore overrides precisely.This keeps the override aligned with app/core/database.get_db semantics and avoids wiping unrelated overrides.
🔧 Suggested update
async def override_get_db(): - yield db_session + try: + yield db_session + await db_session.commit() + except Exception: + await db_session.rollback() + raise - app.dependency_overrides[get_db] = override_get_db + previous_overrides = app.dependency_overrides.copy() + app.dependency_overrides[get_db] = override_get_db @@ - app.dependency_overrides.clear() + app.dependency_overrides.clear() + app.dependency_overrides.update(previous_overrides)app/features/ingest/service.py (1)
217-222: Consider a more explicit rowcount check.The current check
if cursor_result.rowcounttreats 0 andNoneas falsy but would pass through -1 (which means "not available" in DB-API). While PostgreSQL ON CONFLICT should always return a valid count, a more defensive check would be clearer.♻️ Suggested improvement
- processed = int(cursor_result.rowcount) if cursor_result.rowcount else 0 # type: ignore[attr-defined] # pyright: ignore[reportAttributeAccessIssue,reportUnknownMemberType,reportUnknownArgumentType] + raw_rowcount = cursor_result.rowcount # type: ignore[attr-defined] # pyright: ignore[reportAttributeAccessIssue,reportUnknownMemberType] + processed = int(raw_rowcount) if raw_rowcount and raw_rowcount > 0 else len(valid_rows)This falls back to
len(valid_rows)if rowcount is unavailable (-1),None, or 0, which is a reasonable approximation for successful upserts.Alternatively, if you prefer to keep 0 distinct:
- processed = int(cursor_result.rowcount) if cursor_result.rowcount else 0 # type: ignore[attr-defined] # pyright: ignore[reportAttributeAccessIssue,reportUnknownMemberType,reportUnknownArgumentType] + raw_rowcount = cursor_result.rowcount # type: ignore[attr-defined] # pyright: ignore[reportAttributeAccessIssue,reportUnknownMemberType] + processed = int(raw_rowcount) if raw_rowcount is not None and raw_rowcount >= 0 else len(valid_rows)
| **Error Codes:** | ||
| | Code | Description | | ||
| |------|-------------| | ||
| | `UNKNOWN_STORE` | Store code not found in database | | ||
| | `UNKNOWN_PRODUCT` | SKU not found in database | | ||
| | `UNKNOWN_DATE` | Date not found in calendar table | | ||
|
|
There was a problem hiding this comment.
Add blank lines around the Error Codes table (markdownlint MD058).
🧹 Suggested formatting fix
-**Error Codes:**
-| Code | Description |
-|------|-------------|
-| `UNKNOWN_STORE` | Store code not found in database |
-| `UNKNOWN_PRODUCT` | SKU not found in database |
-| `UNKNOWN_DATE` | Date not found in calendar table |
+**Error Codes:**
+
+| Code | Description |
+|------|-------------|
+| `UNKNOWN_STORE` | Store code not found in database |
+| `UNKNOWN_PRODUCT` | SKU not found in database |
+| `UNKNOWN_DATE` | Date not found in calendar table |
+📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| **Error Codes:** | |
| | Code | Description | | |
| |------|-------------| | |
| | `UNKNOWN_STORE` | Store code not found in database | | |
| | `UNKNOWN_PRODUCT` | SKU not found in database | | |
| | `UNKNOWN_DATE` | Date not found in calendar table | | |
| **Error Codes:** | |
| | Code | Description | | |
| |------|-------------| | |
| | `UNKNOWN_STORE` | Store code not found in database | | |
| | `UNKNOWN_PRODUCT` | SKU not found in database | | |
| | `UNKNOWN_DATE` | Date not found in calendar table | | |
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
88-88: Tables should be surrounded by blank lines
(MD058, blanks-around-tables)
🤖 Prompt for AI Agents
In `@docs/PHASE/2-INGEST_LAYER.md` around lines 87 - 93, Add a blank line before
the "**Error Codes:**" table and another blank line after the table so the
markdown table is separated from surrounding content (fixes markdownlint MD058);
edit the block containing the "**Error Codes:**" heading and the subsequent
table rows (`| Code | Description |` ... `| `UNKNOWN_DATE` | Date not found in
calendar table |`) to ensure there is an empty line above the heading/table and
an empty line following the final table row.
- Change "floating point" to "floating-point" (compound adjective) - Add blank lines around Error Codes table (MD058) - Add language identifier to directory structure code fence (MD040) - Update HTTP examples to use processed_count/total_received schema Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
w7-learn
left a comment
There was a problem hiding this comment.
Done. CodeRabbit review fixes pushed to dev. PR #21 is now updated with:
Fixed:
- docs/PHASE/2-INGEST_LAYER.md:
- "floating point" → "floating-point" (compound adjective)
- Added blank lines around Error Codes table (MD058)
- Added text language identifier to directory structure code fence (MD040) - examples/api/ingest_sales_daily.http:
- All response examples updated to use processed_count instead of inserted_count/updated_count
- All response examples updated to use total_received instead of total_processed
Summary
Merge dev to main - Phase 2 (Ingest Layer) release.
What's Included
New Features
Configuration
INGEST_BATCH_SIZE(default: 1000)INGEST_TIMEOUT_SECONDS(default: 60)Documentation
Code Quality
Test plan
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Configuration
Documentation
Tests
✏️ Tip: You can customize this high-level summary in your review settings.