Skip to content

fix : add centralized error handling and strong input validation#153

Open
RohitKumar717 wants to merge 1 commit intoAOSSIE-Org:mainfrom
RohitKumar717:fix/centralized-error-handling
Open

fix : add centralized error handling and strong input validation#153
RohitKumar717 wants to merge 1 commit intoAOSSIE-Org:mainfrom
RohitKumar717:fix/centralized-error-handling

Conversation

@RohitKumar717
Copy link

@RohitKumar717 RohitKumar717 commented Mar 18, 2026

Summary

Fixes #143 — Adds centralized error handling and strong input validation
to the FastAPI backend.

Problem

  • URLRequest accepted any string including invalid URLs
  • ChatQuery accepted empty/whitespace messages
  • Pipeline functions like run_scraper_pipeline, check_bias, ask_llm
    had no error handling, causing unhandled 500 errors
  • No consistent error response format for the frontend to parse

Changes Made

backend/app/routes/routes.py

  • Replaced url: str with url: HttpUrl in URLRequest model for
    automatic URL format validation
  • Added Field(min_length=1) and @validator to ChatQuery to reject
    empty/whitespace messages
  • Wrapped all 3 endpoints (/bias, /process, /chat) in try/except blocks
  • ValueError raised as structured 400 HTTP response
  • Other exceptions re-raised to global handler

backend/main.py

  • Added global @app.exception_handler(Exception) to catch all
    unhandled exceptions
  • Added @app.exception_handler(HTTPException) for consistent format
  • Added @app.exception_handler(RequestValidationError) returning 400
  • All errors return consistent JSON format:
    {
    "error": "",
    "code": "<error_type>",
    "trace_id": "" # for 500 errors
    }
  • Unhandled exceptions logged with unique trace IDs for debugging

Testing

  • Invalid URL (e.g. "not-a-url") → returns 400 VALIDATION_ERROR
  • Empty message (e.g. "") → returns 400 VALIDATION_ERROR
  • Whitespace message (e.g. " ") → returns 400 VALIDATION_ERROR
  • Server-side crash → returns 500 with trace_id for debugging
  • Valid inputs → work as expected, no regression

Summary by CodeRabbit

  • New Features

    • URLs now require proper format validation across all requests
    • Chat messages cannot be empty or contain only whitespace
  • Improvements

    • Error responses include structured error codes and details
    • Unhandled errors now include unique trace IDs for debugging

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 18, 2026

📝 Walkthrough

Walkthrough

The changes implement strong input validation and centralized error handling across the backend. URL fields are now validated using Pydantic's HttpUrl type, chat messages require non-empty content, and three new exception handlers (HTTP, validation, and global) standardize error responses with consistent JSON formatting and trace IDs.

Changes

Cohort / File(s) Summary
Input Validation in Routes
backend/app/routes/routes.py
Adds URLRequest model with HttpUrl validation, strengthens ChatQuery.message validation with min_length and whitespace stripping, updates bias detection and processing endpoints to accept validated URL requests, and wraps chat endpoint with error handling for ValueError.
Centralized Error Handling
backend/main.py
Introduces three exception handlers: http_exception_handler for HTTPException, validation_exception_handler for RequestValidationError (returns 400), and global_exception_handler for unhandled exceptions (logs with trace ID, returns 500). Simplifies app initialization and adds module-level logger.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Validation strong, URLs now bright,
Error handlers catch what's not quite right,
Trace IDs logged, in JSON so neat,
Empty messages bounce—no defeat!
Our API hops safely, ever so fleet! 🐇

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: adding centralized error handling and strong input validation for the FastAPI backend.
Linked Issues check ✅ Passed The PR fully addresses all coding requirements from issue #143: implements pydantic.HttpUrl for URL validation, adds Field constraints and validators for non-empty chat messages, and adds centralized exception handlers returning structured JSON responses with error codes and trace IDs.
Out of Scope Changes check ✅ Passed All changes are directly aligned with issue #143 objectives; no out-of-scope modifications were introduced beyond the required input validation and error handling implementations.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

CodeRabbit can use TruffleHog to scan for secrets in your code with verification capabilities.

Add a TruffleHog config file (e.g. trufflehog-config.yml, trufflehog.yml) to your project to customize detectors and scanning behavior. The tool runs only when a config file is present.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
backend/app/routes/routes.py (1)

74-75: Add exception chaining with from e (Ruff B904).

Preserve the original traceback by explicitly chaining exceptions when converting ValueError to HTTPException across all three affected endpoints:

  • Lines 74-75: /detect_bias route
  • Lines 87-88: /process route
  • Lines 101-102: /chat route
♻️ Proposed fix
 except ValueError as e:
-    raise HTTPException(status_code=400, detail=str(e))
+    raise HTTPException(status_code=400, detail=str(e)) from e

Apply the same change to all three locations (lines 74–75, 87–88, 101–102).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/routes/routes.py` around lines 74 - 75, Three route handlers
raise HTTPException from a caught ValueError without exception chaining; update
each except block in the /detect_bias, /process, and /chat endpoints so the
HTTPException is raised with explicit chaining (use "raise HTTPException(... )
from e") instead of plain "raise HTTPException(...)" to preserve the original
traceback; locate the except ValueError as e: blocks in
backend/app/routes/routes.py for the detect_bias, process, and chat handlers and
add "from e" to each raise.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/app/routes/routes.py`:
- Around line 69-73: The scraper can return {"cleaned_text": "", "keywords": []}
so update the route handler around the run_scraper_pipeline and check_bias calls
to (1) call asyncio.to_thread without tuple-wrapping the argument (use
asyncio.to_thread(run_scraper_pipeline, str(request.url))) and (2) explicitly
validate the returned dict's cleaned_text (e.g., ensure
result.get("cleaned_text") is a non-empty string) before calling check_bias; if
cleaned_text is empty/missing, return a deterministic 400 response. Reference
run_scraper_pipeline, check_bias, and request.url when making these changes.

In `@backend/main.py`:
- Around line 28-31: The FastAPI app instance is created but never mounts the
routes router, so endpoints defined on router in backend/app/routes/routes.py
return 404; import the router (named router) from that module and call
app.include_router(router) on the FastAPI instance (app) after logger creation
to register the endpoints, ensuring any prefix or tags used in routes.py are
preserved when including the router.

---

Nitpick comments:
In `@backend/app/routes/routes.py`:
- Around line 74-75: Three route handlers raise HTTPException from a caught
ValueError without exception chaining; update each except block in the
/detect_bias, /process, and /chat endpoints so the HTTPException is raised with
explicit chaining (use "raise HTTPException(... ) from e") instead of plain
"raise HTTPException(...)" to preserve the original traceback; locate the except
ValueError as e: blocks in backend/app/routes/routes.py for the detect_bias,
process, and chat handlers and add "from e" to each raise.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9b96bc1e-c7c8-463c-8311-2ab4de6a7718

📥 Commits

Reviewing files that changed from the base of the PR and between 68a64c3 and cbb22fe.

📒 Files selected for processing (2)
  • backend/app/routes/routes.py
  • backend/main.py

Comment on lines +69 to +73
try:
content = await asyncio.to_thread(run_scraper_pipeline, (str(request.url)))
bias_score = await asyncio.to_thread(check_bias, (content))
logger.info(f"Bias detection result: {bias_score}")
return bias_score
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify failure semantics in scraper/pipeline code paths used by /bias and /process.
rg -n -C3 'def run_scraper_pipeline|def extract\(|def _fetch_html|return ""|return \{"url": self\.url, "text": "", "error": "Failed to extract article\."\}|raise ValueError' backend/app/modules/pipeline.py backend/app/modules/scraper/extractor.py

Repository: AOSSIE-Org/Perspective

Length of output: 2427


🏁 Script executed:

# Check the full run_scraper_pipeline function
cat -n backend/app/modules/pipeline.py | head -80

Repository: AOSSIE-Org/Perspective

Length of output: 2975


🏁 Script executed:

# Check the routes.py file to see error handlers and context
cat -n backend/app/routes/routes.py

Repository: AOSSIE-Org/Perspective

Length of output: 4566


🏁 Script executed:

# Check what check_bias expects
rg -n -A10 'def check_bias' backend/

Repository: AOSSIE-Org/Perspective

Length of output: 1009


🏁 Script executed:

# Check what run_langgraph_workflow expects
rg -n -A10 'def run_langgraph_workflow' backend/

Repository: AOSSIE-Org/Perspective

Length of output: 462


🏁 Script executed:

# Check clean_extracted_text function
rg -n -A15 'def clean_extracted_text' backend/

Repository: AOSSIE-Org/Perspective

Length of output: 1238


🏁 Script executed:

# Check full check_bias function logic
cat -n backend/app/modules/bias_detection/check_bias.py | head -60

Repository: AOSSIE-Org/Perspective

Length of output: 2362


Add explicit validation for empty article content to ensure deterministic 400 handling.

The scraper pipeline can return {"cleaned_text": "", "keywords": []} on failure. Since check_bias() receives the entire dict and only checks if not text: (dict truthiness), an empty cleaned_text value bypasses validation and causes inconsistent behavior. Both endpoints must validate that cleaned_text is actually populated.

Also remove unnecessary tuple wrapping when passing str(request.url) to asyncio.to_thread().

Suggested fix
 `@router.post`("/bias")
 async def bias_detection(request: URLRequest):
     try:
-        content = await asyncio.to_thread(run_scraper_pipeline, (str(request.url)))
+        content = await asyncio.to_thread(run_scraper_pipeline, str(request.url))
+        if not content.get("cleaned_text"):
+            raise ValueError("Unable to extract article content from URL")
         bias_score = await asyncio.to_thread(check_bias, (content))
         logger.info(f"Bias detection result: {bias_score}")
         return bias_score
@@
 `@router.post`("/process")
 async def run_pipelines(request: URLRequest):
     try:
-        article_text = await asyncio.to_thread(run_scraper_pipeline, (str(request.url)))
+        article_text = await asyncio.to_thread(run_scraper_pipeline, str(request.url))
+        if not article_text.get("cleaned_text"):
+            raise ValueError("Unable to extract article content from URL")
         logger.debug(f"Scraper output: {json.dumps(article_text, indent=2, ensure_ascii=False)}")
         data = await asyncio.to_thread(run_langgraph_workflow, (article_text))
         return data
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/routes/routes.py` around lines 69 - 73, The scraper can return
{"cleaned_text": "", "keywords": []} so update the route handler around the
run_scraper_pipeline and check_bias calls to (1) call asyncio.to_thread without
tuple-wrapping the argument (use asyncio.to_thread(run_scraper_pipeline,
str(request.url))) and (2) explicitly validate the returned dict's cleaned_text
(e.g., ensure result.get("cleaned_text") is a non-empty string) before calling
check_bias; if cleaned_text is empty/missing, return a deterministic 400
response. Reference run_scraper_pipeline, check_bias, and request.url when
making these changes.

Comment on lines +28 to +31
app = FastAPI()

logger = logging.getLogger(__name__)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Register the APIRouter; otherwise all API endpoints return 404.

backend/app/routes/routes.py defines the handlers on router, but this app instance never includes that router. In the current state, the API surface is effectively unreachable.

🔧 Proposed fix
+from app.routes.routes import router
+
 app = FastAPI()
+app.include_router(router)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
app = FastAPI()
logger = logging.getLogger(__name__)
from app.routes.routes import router
app = FastAPI()
app.include_router(router)
logger = logging.getLogger(__name__)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/main.py` around lines 28 - 31, The FastAPI app instance is created
but never mounts the routes router, so endpoints defined on router in
backend/app/routes/routes.py return 404; import the router (named router) from
that module and call app.include_router(router) on the FastAPI instance (app)
after logger creation to register the endpoints, ensuring any prefix or tags
used in routes.py are preserved when including the router.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Centralized Error Handling + Strong Input Validation

1 participant