fix: add centralized error handling and strong input validation #153

RohitKumar717 wants to merge 1 commit into AOSSIE-Org:main
Conversation
📝 Walkthrough

The changes implement strong input validation and centralized error handling across the backend. URL fields are now validated using Pydantic's HttpUrl type, chat messages require non-empty content, and three new exception handlers (HTTP, validation, and global) standardize error responses with consistent JSON formatting and trace IDs.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 2
🧹 Nitpick comments (1)
backend/app/routes/routes.py (1)
74-75: Add exception chaining with `from e` (Ruff B904).

Preserve the original traceback by explicitly chaining exceptions when converting `ValueError` to `HTTPException` across all three affected endpoints:

- Lines 74-75: `/detect_bias` route
- Lines 87-88: `/process` route
- Lines 101-102: `/chat` route

♻️ Proposed fix

```diff
 except ValueError as e:
-    raise HTTPException(status_code=400, detail=str(e))
+    raise HTTPException(status_code=400, detail=str(e)) from e
```

Apply the same change to all three locations (lines 74-75, 87-88, 101-102).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/routes/routes.py` around lines 74 - 75, Three route handlers raise HTTPException from a caught ValueError without exception chaining; update each except block in the /detect_bias, /process, and /chat endpoints so the HTTPException is raised with explicit chaining (use "raise HTTPException(... ) from e") instead of plain "raise HTTPException(...)" to preserve the original traceback; locate the except ValueError as e: blocks in backend/app/routes/routes.py for the detect_bias, process, and chat handlers and add "from e" to each raise.
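The effect of `from e` can be seen in a standalone sketch; the `HTTPException` class below is a minimal stand-in used only for illustration, not FastAPI's actual import:

```python
# Minimal stand-in for FastAPI's HTTPException, used only to illustrate chaining.
class HTTPException(Exception):
    def __init__(self, status_code, detail):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail

def convert(func):
    try:
        func()
    except ValueError as e:
        # "from e" records the original error in __cause__, so the traceback
        # reads "The above exception was the direct cause of ..." instead of
        # "During handling of the above exception, another exception occurred".
        raise HTTPException(status_code=400, detail=str(e)) from e

def bad_input():
    raise ValueError("invalid URL")

try:
    convert(bad_input)
except HTTPException as exc:
    assert isinstance(exc.__cause__, ValueError)  # original error preserved
    assert exc.status_code == 400
```

Without `from e`, Python still sets `__context__` implicitly, but the explicit chain signals intent and satisfies Ruff's B904 check.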
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/routes/routes.py`:
- Around line 69-73: The scraper can return {"cleaned_text": "", "keywords": []}
so update the route handler around the run_scraper_pipeline and check_bias calls
to (1) call asyncio.to_thread without tuple-wrapping the argument (use
asyncio.to_thread(run_scraper_pipeline, str(request.url))) and (2) explicitly
validate the returned dict's cleaned_text (e.g., ensure
result.get("cleaned_text") is a non-empty string) before calling check_bias; if
cleaned_text is empty/missing, return a deterministic 400 response. Reference
run_scraper_pipeline, check_bias, and request.url when making these changes.
In `@backend/main.py`:
- Around line 28-31: The FastAPI app instance is created but never mounts the
routes router, so endpoints defined on router in backend/app/routes/routes.py
return 404; import the router (named router) from that module and call
app.include_router(router) on the FastAPI instance (app) after logger creation
to register the endpoints, ensuring any prefix or tags used in routes.py are
preserved when including the router.
---
Nitpick comments:
In `@backend/app/routes/routes.py`:
- Around line 74-75: Three route handlers raise HTTPException from a caught
ValueError without exception chaining; update each except block in the
/detect_bias, /process, and /chat endpoints so the HTTPException is raised with
explicit chaining (use "raise HTTPException(... ) from e") instead of plain
"raise HTTPException(...)" to preserve the original traceback; locate the except
ValueError as e: blocks in backend/app/routes/routes.py for the detect_bias,
process, and chat handlers and add "from e" to each raise.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9b96bc1e-c7c8-463c-8311-2ab4de6a7718
📒 Files selected for processing (2)

- backend/app/routes/routes.py
- backend/main.py
```python
try:
    content = await asyncio.to_thread(run_scraper_pipeline, (str(request.url)))
    bias_score = await asyncio.to_thread(check_bias, (content))
    logger.info(f"Bias detection result: {bias_score}")
    return bias_score
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Verify failure semantics in scraper/pipeline code paths used by /bias and /process.
rg -n -C3 'def run_scraper_pipeline|def extract\(|def _fetch_html|return ""|return \{"url": self\.url, "text": "", "error": "Failed to extract article\."\}|raise ValueError' backend/app/modules/pipeline.py backend/app/modules/scraper/extractor.py
```

Repository: AOSSIE-Org/Perspective
Length of output: 2427

🏁 Script executed:

```bash
# Check the full run_scraper_pipeline function
cat -n backend/app/modules/pipeline.py | head -80
```

Repository: AOSSIE-Org/Perspective
Length of output: 2975

🏁 Script executed:

```bash
# Check the routes.py file to see error handlers and context
cat -n backend/app/routes/routes.py
```

Repository: AOSSIE-Org/Perspective
Length of output: 4566

🏁 Script executed:

```bash
# Check what check_bias expects
rg -n -A10 'def check_bias' backend/
```

Repository: AOSSIE-Org/Perspective
Length of output: 1009

🏁 Script executed:

```bash
# Check what run_langgraph_workflow expects
rg -n -A10 'def run_langgraph_workflow' backend/
```

Repository: AOSSIE-Org/Perspective
Length of output: 462

🏁 Script executed:

```bash
# Check clean_extracted_text function
rg -n -A15 'def clean_extracted_text' backend/
```

Repository: AOSSIE-Org/Perspective
Length of output: 1238

🏁 Script executed:

```bash
# Check full check_bias function logic
cat -n backend/app/modules/bias_detection/check_bias.py | head -60
```

Repository: AOSSIE-Org/Perspective
Length of output: 2362
Add explicit validation for empty article content to ensure deterministic 400 handling.
The scraper pipeline can return `{"cleaned_text": "", "keywords": []}` on failure. Since `check_bias()` receives the entire dict and only checks `if not text:` (dict truthiness), an empty `cleaned_text` value bypasses validation and causes inconsistent behavior. Both endpoints must validate that `cleaned_text` is actually populated.
Also remove the redundant parentheses around `str(request.url)` when passing it to `asyncio.to_thread()`.
Suggested fix

```diff
 @router.post("/bias")
 async def bias_detection(request: URLRequest):
     try:
-        content = await asyncio.to_thread(run_scraper_pipeline, (str(request.url)))
+        content = await asyncio.to_thread(run_scraper_pipeline, str(request.url))
+        if not content.get("cleaned_text"):
+            raise ValueError("Unable to extract article content from URL")
         bias_score = await asyncio.to_thread(check_bias, (content))
         logger.info(f"Bias detection result: {bias_score}")
         return bias_score
@@
 @router.post("/process")
 async def run_pipelines(request: URLRequest):
     try:
-        article_text = await asyncio.to_thread(run_scraper_pipeline, (str(request.url)))
+        article_text = await asyncio.to_thread(run_scraper_pipeline, str(request.url))
+        if not article_text.get("cleaned_text"):
+            raise ValueError("Unable to extract article content from URL")
         logger.debug(f"Scraper output: {json.dumps(article_text, indent=2, ensure_ascii=False)}")
         data = await asyncio.to_thread(run_langgraph_workflow, (article_text))
         return data
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/routes/routes.py` around lines 69 - 73, The scraper can return
{"cleaned_text": "", "keywords": []} so update the route handler around the
run_scraper_pipeline and check_bias calls to (1) call asyncio.to_thread without
tuple-wrapping the argument (use asyncio.to_thread(run_scraper_pipeline,
str(request.url))) and (2) explicitly validate the returned dict's cleaned_text
(e.g., ensure result.get("cleaned_text") is a non-empty string) before calling
check_bias; if cleaned_text is empty/missing, return a deterministic 400
response. Reference run_scraper_pipeline, check_bias, and request.url when
making these changes.
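The dict-truthiness pitfall flagged here is easy to reproduce in isolation; this sketch uses the same failure payload the scraper can return:

```python
# Failure payload the scraper pipeline can return for an unextractable page.
result = {"cleaned_text": "", "keywords": []}

# A non-empty dict is truthy even when every value inside it is empty, so a
# guard like "if not text:" applied to the whole dict never fires here.
assert bool(result) is True

# Checking the specific field is what actually catches the failure case.
assert not result.get("cleaned_text")
```

This is why the suggested fix validates `result.get("cleaned_text")` rather than relying on the truthiness of the dict itself.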
```python
app = FastAPI()

logger = logging.getLogger(__name__)
```
Register the APIRouter; otherwise all API endpoints return 404.
backend/app/routes/routes.py defines the handlers on router, but this app instance never includes that router. In the current state, the API surface is effectively unreachable.
🔧 Proposed fix

```diff
+from app.routes.routes import router
+
 app = FastAPI()
+app.include_router(router)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
from app.routes.routes import router

app = FastAPI()
app.include_router(router)

logger = logging.getLogger(__name__)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/main.py` around lines 28 - 31, The FastAPI app instance is created
but never mounts the routes router, so endpoints defined on router in
backend/app/routes/routes.py return 404; import the router (named router) from
that module and call app.include_router(router) on the FastAPI instance (app)
after logger creation to register the endpoints, ensuring any prefix or tags
used in routes.py are preserved when including the router.
Summary

Fixes #143: adds centralized error handling and strong input validation to the FastAPI backend.
Problem

The backend had no error handling, causing unhandled 500 errors.
Changes Made

backend/app/routes/routes.py

- Replaced `url: str` with `url: HttpUrl` in the URLRequest model for automatic URL format validation
- Added `Field(min_length=1)` and a `@validator` to ChatQuery to reject empty/whitespace messages
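A minimal sketch of what these two models might look like; the field name `message` and the Pydantic v1-style `@validator` are assumptions, not taken from the diff:

```python
from pydantic import BaseModel, Field, HttpUrl, validator

class URLRequest(BaseModel):
    # HttpUrl rejects malformed URLs at parse time, before the route body runs.
    url: HttpUrl

class ChatQuery(BaseModel):
    # min_length=1 rejects the empty string; the validator below additionally
    # rejects whitespace-only content, which min_length alone would accept.
    message: str = Field(min_length=1)

    @validator("message")
    def message_not_blank(cls, v):
        if not v.strip():
            raise ValueError("message must not be empty or whitespace")
        return v
```

With these models, `URLRequest(url="not-a-url")` and `ChatQuery(message="   ")` both raise a `ValidationError`, which the new RequestValidationError handler can then turn into a consistent 400 response.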
backend/main.py

- Added `@app.exception_handler(Exception)` to catch all unhandled exceptions
- Added `@app.exception_handler(HTTPException)` for a consistent format
- Added `@app.exception_handler(RequestValidationError)` returning 400

All handlers return errors in the shape:

```json
{
  "error": "",
  "code": "<error_type>",
  "trace_id": ""  # for 500 errors
}
```
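That shape can be produced by a small helper shared by all three handlers. This is a hedged sketch: the helper name `error_body` and the use of `uuid4` for trace IDs are assumptions, not taken from the PR:

```python
import uuid

def error_body(message: str, code: str, status: int) -> dict:
    # Common payload for all three exception handlers (hypothetical helper).
    # A trace_id is attached only for 5xx responses so server logs can be
    # correlated with what the client saw, without leaking internals in the
    # message itself.
    body = {"error": message, "code": code}
    if status >= 500:
        body["trace_id"] = str(uuid.uuid4())
    return body

assert "trace_id" not in error_body("url must be a valid HTTP URL", "validation_error", 400)
assert "trace_id" in error_body("internal server error", "internal_error", 500)
```

Each handler would then wrap this dict in a `JSONResponse` with the matching status code.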
Testing
Summary by CodeRabbit
New Features
Improvements