feat: add external provider mode for quiz generation #581
lakshsarda137 wants to merge 4 commits into AOSSIE-Org:main
Conversation
Add optional OpenAI/Anthropic provider usage in the extension UI and route backend generation endpoints through external providers when enabled, while preserving local generation defaults. Improve external MCQ output quality and fix all-questions parsing/runtime issues. Made-with: Cursor
📝 Walkthrough

Adds optional external LLM support (OpenAI, Anthropic): backend LLM orchestration, JSON extraction, normalization, deduplication, MCQ quality scoring and retrying; extension UI for provider toggle, API key, model loading/selection, storage persistence; dependency and manifest updates; local generation remains the fallback.
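To make the walkthrough's "JSON extraction" step concrete, here is a minimal sketch of the general technique, assuming the model may wrap its JSON in prose or markdown fences; it is not the PR's actual `_extract_json_object` implementation.

```python
import json
import re


def extract_json_object(raw: str) -> dict:
    """Pull the first {...} span out of an LLM completion and parse it.

    Handles completions that wrap the JSON in prose or markdown code fences.
    """
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))
```

A greedy match from the first `{` to the last `}` is enough when the completion contains a single top-level object, which is what the schema-hinted prompts here request.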
Sequence Diagram

```mermaid
sequenceDiagram
    actor User
    participant Extension as Extension UI
    participant Storage as Chrome Storage
    participant Backend as Backend Server
    participant Provider as LLM Provider<br/>(OpenAI / Anthropic)

    User->>Extension: Enable external LLM mode, enter API key, select provider/model
    Extension->>Storage: Save provider, API key, selected model
    User->>Extension: Click "Load Models"
    Extension->>Provider: Fetch model list (using API key)
    Provider-->>Extension: Return model list
    Extension->>Storage: Cache model options
    User->>Extension: Generate questions
    Extension->>Backend: POST /get_* with llm_provider, llm_model, llm_api_key
    Backend->>Provider: Call provider API to generate JSON output
    Provider-->>Backend: Return generated JSON
    Backend->>Backend: Extract JSON, normalize, dedupe, score, retry if needed
    Backend-->>Extension: Respond {"output": ..., "llm_used": true}
    Extension->>User: Display parsed questions
```
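To make the POST step concrete, here is a hypothetical request in Python. The field names `llm_provider`, `llm_model`, and `llm_api_key` come from this PR; the endpoint, port (5000, per the review below), and remaining fields are assumptions.

```python
import requests

resp = requests.post(
    "http://localhost:5000/get_mcq",  # assumed endpoint and port
    json={
        "input_text": "Photosynthesis converts light energy into chemical energy.",
        "max_questions": 5,
        "llm_provider": "openai",      # or "anthropic"
        "llm_model": "gpt-4o-mini",
        "llm_api_key": "sk-REDACTED",  # placeholder, not a real key
    },
    timeout=60,
)
data = resp.json()
print(data.get("llm_used"), len(data.get("output", [])))
```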
Estimated Code Review Effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 3
🧹 Nitpick comments (5)
extension/src/pages/text_input/TextInput.jsx (1)
325-338: Difficulty selection is ignored when using external providers. When `useExternalLlm` is true, the function always returns the base `questionType` endpoint, ignoring the difficulty setting. Users can still select "Medium Difficulty" or "Hard Difficulty" in the UI, but it has no effect with external providers. Consider either:

- Disabling the difficulty selector when external LLM is enabled, or
- Passing difficulty to the backend for inclusion in the LLM prompt (see the sketch after this list).
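A minimal backend-side sketch of option (B), assuming the extension adds a `difficulty` field to the request body; the helper name and hint strings are illustrative, not code from this PR.

```python
DIFFICULTY_HINTS = {
    "easy": "Ask straightforward recall questions.",
    "medium": "Ask questions that require light reasoning across sentences.",
    "hard": "Ask multi-step questions that combine several facts.",
}


def build_llm_prompt(input_text: str, difficulty: str = "easy") -> str:
    # Fold the user's difficulty choice into the external-provider prompt.
    hint = DIFFICULTY_HINTS.get(difficulty, DIFFICULTY_HINTS["easy"])
    return f"{hint}\nGenerate quiz questions from the following text:\n{input_text}"
```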
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@extension/src/pages/text_input/TextInput.jsx` around lines 325 - 338, getEndpoint currently returns questionType immediately when useExternalLlm is true, ignoring difficulty; update behavior so difficulty is respected when external providers are used. Either (A) disable/hide the difficulty selector in the UI when useExternalLlm is true (update the component state/JSX where the selector is rendered and reference useExternalLlm to conditionally render/disable it), or (B) change getEndpoint and the request-building code to include difficulty (pass the difficulty value alongside questionType to the backend or include it in the payload used for external LLM prompts) so the backend can adjust prompts; locate getEndpoint, useExternalLlm, and the code that sends the questionType payload and modify accordingly.

extension/src/pages/question/Question.jsx (1)
90-97: Use strict equality (`===`) for consistency. Line 90 uses loose equality `==` while line 78 uses strict equality `===`. For consistency and to avoid type-coercion surprises, prefer `===`.

♻️ Suggested fix

```diff
- if (questionType == "get_boolq" && Array.isArray(qaPairsFromStorage["output"])) {
+ if (questionType === "get_boolq" && Array.isArray(qaPairsFromStorage["output"])) {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@extension/src/pages/question/Question.jsx` around lines 90 - 97, Change the loose equality check in the Question.jsx branch to use strict equality: replace the `if (questionType == "get_boolq" && Array.isArray(qaPairsFromStorage["output"]))` condition so it uses `===` for `questionType` (i.e., compare `questionType === "get_boolq"`); this keeps behavior consistent with other checks such as the one using `questionType !== "get_mcq"` and avoids type-coercion issues with variables `questionType`, `qaPairsFromStorage`, and `combinedQaPairs`.

backend/server.py (2)
297-305: Consider more specific exception handling for LLM calls. Catching bare `Exception` is broad. Consider catching specific exceptions like `openai.AuthenticationError`, `openai.RateLimitError`, or `anthropic.AuthenticationError` to provide more targeted error messages to users (e.g., "Invalid API key" vs "Rate limit exceeded"). A hedged sketch follows the AI-agent prompt below.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 297 - 305, The current broad except in the block that calls _call_llm_for_questions should be replaced with targeted exception handling: catch provider-specific exceptions (e.g., openai.AuthenticationError, openai.RateLimitError, OpenAIError if available, and anthropic.AuthenticationError / RateLimit or their SDK equivalents) and return descriptive JSON errors (e.g., "Invalid API key" or "Rate limit exceeded") using the same llm_used flag; after those specific except clauses add a final broad Exception to handle unexpected errors and return a generic message. Update the try/except around the _call_llm_for_questions call and ensure you import the specific exception classes from the respective SDKs so you can map each to a clear error response while preserving the existing response shape.
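A hedged sketch of this suggestion, assuming the PR's `_call_llm_for_questions` helper and `llm_used` response flag; the exception classes are the documented ones from the official `openai` and `anthropic` SDKs, but the mapping to status codes is an assumption.

```python
import anthropic
import openai
from flask import jsonify


def generate_via_llm(provider, model, api_key, text):
    try:
        output = _call_llm_for_questions(provider, model, api_key, text)  # PR helper
        return jsonify({"output": output, "llm_used": True})
    except (openai.AuthenticationError, anthropic.AuthenticationError):
        return jsonify({"error": "Invalid API key", "llm_used": False}), 401
    except (openai.RateLimitError, anthropic.RateLimitError):
        return jsonify({"error": "Rate limit exceeded, please retry", "llm_used": False}), 429
    except (openai.OpenAIError, anthropic.AnthropicError) as exc:
        # Catch-all for other SDK/network failures, keeping the response shape stable.
        return jsonify({"error": f"External provider call failed: {exc}", "llm_used": False}), 400
```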
173-173: Silent input truncation may surprise users. Input text is silently truncated to 12,000 characters. For very long documents, users may not understand why generated questions don't cover all content. Consider adding a `truncated` flag to the response or logging when truncation occurs; a minimal helper sketch follows the prompt below.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` at line 173, The code silently truncates input_text into trimmed_input = input_text[:12000]; change this to detect truncation (e.g., truncated = len(input_text) > 12000), keep trimmed_input as the shortened string, and propagate the truncated flag into the function's response payload (and optionally a server log/processLogger.warn) so callers can see when truncation occurred. Locate the assignment to trimmed_input and the response-building code in server.py (where trimmed_input is used) and add the truncated boolean to the returned JSON object or response structure.
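A tiny self-contained sketch of that fix, assuming the PR's 12,000-character cap; the helper name is illustrative.

```python
def trim_input(input_text: str, max_chars: int = 12000) -> tuple[str, bool]:
    """Return (trimmed_text, truncated) so the endpoint can flag truncation."""
    truncated = len(input_text) > max_chars
    return input_text[:max_chars], truncated


# Usage inside the endpoint (sketch): surface the flag in the JSON response.
trimmed_input, truncated = trim_input("some very long document " * 1000)
response = {"output": [], "llm_used": True, "truncated": truncated}
```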
extension/package.json (1)

12-13: Document browser-side API security implications for users. Both dependencies are at their latest versions (0.78.0 and 6.29.0). The `dangerouslyAllowBrowser: true` flag in TextInput.jsx shows awareness of the security model, but documentation should explicitly clarify for users that API keys are stored locally and transmitted directly from the browser. A hedged proxy sketch follows the prompt below.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@extension/package.json` around lines 12 - 13, Add documentation clarifying browser-side API security implications: update the extension README or user docs to state that using the included "@anthropic-ai/sdk" and "openai" client libraries with dangerouslyAllowBrowser: true (referenced in TextInput.jsx) causes API keys to be stored locally and sent directly from the user's browser, potentially exposing them to client-side attacks; describe risks (key exfiltration, CSRF, XSS), recommend mitigations (use short-lived keys, restrict key scopes/IPs, prefer a server-side proxy for production, secure storage practices, and CORS protections), and show how to opt for a safer server-proxy setup instead of embedding keys in the browser.
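For the server-side proxy mitigation mentioned above, a hedged Flask sketch follows; the route, model choice, and environment-variable name are assumptions, and only the general pattern (the key stays on the server, the extension sends text only) reflects the recommendation.

```python
import os

from flask import Flask, jsonify, request
from openai import OpenAI

app = Flask(__name__)
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])  # key never leaves the server


@app.post("/proxy/generate")
def proxy_generate():
    text = (request.get_json() or {}).get("input_text", "")
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model
        messages=[{"role": "user", "content": f"Write quiz questions about:\n{text}"}],
    )
    return jsonify({"output": completion.choices[0].message.content})
```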
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/server.py`:
- Around line 228-258: The except block that swallows all Exceptions when
parsing LLM completions should be narrowed and should re-raise the original chat
exception with proper chaining; replace the broad "except Exception:" around the
parsing/sanitizing logic (which wraps calls to client.completions.create,
_extract_json_object, _sanitize_llm_output, _should_retry_mcq, and
_mcq_quality_score) with more specific exceptions (e.g., ValueError,
json.JSONDecodeError, or a custom parsing/validation exception you already use)
and when propagating the original chat exception use "raise chat_exc from None"
(or "raise chat_exc from <orig_exc>" if you want to preserve the original
parsing exception) instead of the current unchained "raise chat_exc".
In `@extension/public/manifest.json`:
- Around line 13-17: The manifest's host_permissions entry currently allows
http://127.0.0.1:8000/* but the backend runs on port 5000 and frontend API calls
target http://localhost:5000; update extension/public/manifest.json to replace
the 8000 host permission with the correct port (5000) so it matches
backend/server.py's app.run() and the API calls in TextInput.jsx, and ensure any
other localhost host_permissions (127.0.0.1 vs localhost) cover the URLs used by
the code.
In `@extension/src/pages/text_input/TextInput.jsx`:
- Around line 12-20: Update the OPENAI_HARDCODED_MODELS array in TextInput.jsx
by removing the non-existent entries "gpt-5.2" and "gpt-5.3" so only valid
models like "gpt-4", "gpt-4o", "gpt-4o-mini", "gpt-5.1", and "gpt-5.4" remain;
locate the OPENAI_HARDCODED_MODELS constant and delete those two strings to
prevent passing unsupported model names to the API.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: cbb547e9-0a76-4582-aef8-9c50467db1c4
⛔ Files ignored due to path filters (1)
`extension/package-lock.json` is excluded by `!**/package-lock.json`
📒 Files selected for processing (8)
- .gitignore
- backend/server.py
- extension/package.json
- extension/public/background.js
- extension/public/manifest.json
- extension/src/pages/question/Question.jsx
- extension/src/pages/text_input/TextInput.jsx
- requirements.txt
💤 Files with no reviewable changes (1)
- extension/public/background.js
Pushed fixes from all 3 actionable comments + listed nitpicks
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
extension/src/pages/question/Question.jsx (1)
53-76: ⚠️ Potential issue | 🟠 Major

Prevent duplicate question ingestion when both response shapes exist.

Lines 53-76 parse structured keys, but lines 78-108 can also parse legacy `output` in the same pass. If the backend returns both (common during schema transition), users will see duplicated questions.

Suggested fix (prefer structured shape, fall back to legacy only when missing)
```diff
+ const hasStructuredBool = Array.isArray(
+   qaPairsFromStorage?.output_boolq?.Boolean_Questions
+ );
+ const hasStructuredMcq = Array.isArray(
+   qaPairsFromStorage?.output_mcq?.questions
+ );
+ const hasStructuredShort = Array.isArray(
+   qaPairsFromStorage?.output_shortq?.questions
+ );
+
- if (Array.isArray(qaPairsFromStorage?.output_boolq?.Boolean_Questions)) {
+ if (hasStructuredBool) {
    qaPairsFromStorage["output_boolq"]["Boolean_Questions"].forEach(
      (question, index) => {
        combinedQaPairs.push({
          question,
          question_type: "Boolean",
          context: qaPairsFromStorage["output_boolq"]["Text"],
        });
      }
    );
  }
- if (Array.isArray(qaPairsFromStorage?.output_mcq?.questions)) {
+ if (hasStructuredMcq) {
    qaPairsFromStorage["output_mcq"]["questions"].forEach((qaPair) => {
      combinedQaPairs.push({
        question: qaPair.question_statement,
        question_type: "MCQ",
        options: qaPair.options,
        answer: qaPair.answer,
        context: qaPair.context,
      });
    });
  }
- if (Array.isArray(qaPairsFromStorage?.output_shortq?.questions)) {
+ if (hasStructuredShort) {
    qaPairsFromStorage["output_shortq"]["questions"].forEach((qaPair) => {
      combinedQaPairs.push({
        question: qaPair.question || qaPair.question_statement || qaPair.Question,
        options: qaPair.options,
        answer: qaPair.answer || qaPair.Answer,
        context: qaPair.context,
        question_type: "Short",
      });
    });
  }
- if (questionType === "get_mcq" && Array.isArray(qaPairsFromStorage["output"])) {
+ if (
+   questionType === "get_mcq" &&
+   !hasStructuredMcq &&
+   Array.isArray(qaPairsFromStorage["output"])
+ ) {
    qaPairsFromStorage["output"].forEach((qaPair) => {
      combinedQaPairs.push({
        question: qaPair.question_statement,
        question_type: "MCQ",
        options: qaPair.options,
        answer: qaPair.answer,
        context: qaPair.context,
      });
    });
  }
- if (questionType === "get_boolq" && Array.isArray(qaPairsFromStorage["output"])) {
+ if (
+   questionType === "get_boolq" &&
+   !hasStructuredBool &&
+   Array.isArray(qaPairsFromStorage["output"])
+ ) {
    qaPairsFromStorage["output"].forEach((qaPair) => {
      combinedQaPairs.push({
        question: qaPair,
        question_type: "Boolean",
      });
    });
- } else if (Array.isArray(qaPairsFromStorage["output"]) && questionType !== "get_mcq") {
+ } else if (
+   !hasStructuredShort &&
+   Array.isArray(qaPairsFromStorage["output"]) &&
+   questionType !== "get_mcq" &&
+   questionType !== "get_boolq"
+ ) {
    qaPairsFromStorage["output"].forEach((qaPair) => {
      combinedQaPairs.push({
        question: qaPair.question || qaPair.question_statement || qaPair.Question,
        options: qaPair.options,
        answer: qaPair.answer || qaPair.Answer,
        context: qaPair.context,
        question_type: "Short",
      });
    });
  }
```

Also applies to: 78-108
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@extension/src/pages/question/Question.jsx` around lines 53 - 76, When building combinedQaPairs from qaPairsFromStorage, prefer the structured shapes output_mcq and output_shortq and skip ingesting legacy output items if the corresponding structured arrays already provided questions; update the logic around combinedQaPairs population (the blocks referencing qaPairsFromStorage["output_mcq"], qaPairsFromStorage["output_shortq"], and the later legacy parsing of qaPairsFromStorage["output"]) to check for existing entries (e.g., if combinedQaPairs already contains items for that question_type or if output_mcq/output_shortq arrays are present and non-empty) and only fall back to parsing legacy output when the structured arrays are absent or empty so duplicates are not added.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/server.py`:
- Around line 84-142: The function _sanitize_llm_output currently assumes
llm_output is a mapping and calls llm_output.get(...) which will raise
AttributeError if llm_output is None or a non-dict; add a top-level type guard
at the start of _sanitize_llm_output to verify llm_output is an instance of dict
(or mapping) and return a safe empty structure matching the requested
question_type (e.g., {"questions": []} or {"Boolean_Questions": []}) when the
check fails; ensure this guard runs before any _norm_text or .get() calls so
downstream logic (get_mcq, get_boolq, get_shortq) always receives a well-typed
llm_output.
- Around line 223-285: The OpenAI and Anthropic SDK calls in
_call_llm_for_questions (notably run_openai_json and the fallback
client.completions.create usage, and run_anthropic_json / Anthropic
client.messages.create) can raise various SDK exceptions beyond APIError and are
currently uncaught, causing 500s; wrap OpenAI interactions to catch broad
SDK/network exceptions (e.g., APIError, OpenAIError, RequestExceptions) and map
them to a controlled 400 error with a helpful message, and add a try/except
around the Anthropic block (around client = Anthropic(...) and
run_anthropic_json invocation) to catch Anthropic SDK/network exceptions and
similarly raise/return a 400-friendly error; ensure any JSON parsing still
raises the original chat_exc where appropriate and preserve existing retry logic
(refer to _should_retry_mcq, _sanitize_llm_output, and _mcq_quality_score when
adding the handlers).
In `@extension/src/pages/question/Question.jsx`:
- Around line 90-95: When handling the boolean fallback in Question.jsx, don't
assume each item in qaPairsFromStorage["output"] is a string; normalize qaPair
before pushing to combinedQaPairs. Replace the direct question: qaPair with a
normalized value like: if typeof qaPair === "string" use it, else try
qaPair?.question or qaPair?.text, and if those are absent fall back to
JSON.stringify(qaPair) or skip empty values; ensure this normalizedQuestion is
non-empty before pushing and keep question_type as "Boolean". Use the existing
variables questionType, qaPairsFromStorage["output"], and combinedQaPairs to
locate and update the loop.
In `@extension/src/pages/text_input/TextInput.jsx`:
- Around line 250-260: The current checks in TextInput.jsx use useExternalLlm &&
!apiKey.trim() and useExternalLlm && !selectedModel to setBackendError,
setLoading(false) and return, which prevents the local/default fallback request
from running; instead, remove the early returns and change the logic to only
skip or abort the external-request path when useExternalLlm is true but apiKey
or selectedModel is missing (i.e., setBackendError and setLoading(false) but
continue to dispatch the local/fallback request), updating the external-request
branch (the code that uses useExternalLlm, apiKey, selectedModel) to be
conditional so fallback is executed; apply the same change to the analogous
checks around the other block that uses the same symbols.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1364eb7a-ea96-462b-90b0-0ee2885623ed
📒 Files selected for processing (5)
- README.md
- backend/server.py
- extension/public/manifest.json
- extension/src/pages/question/Question.jsx
- extension/src/pages/text_input/TextInput.jsx
✅ Files skipped from review due to trivial changes (1)
- README.md
Harden external LLM output handling for malformed payloads and provider SDK failures, improve boolean fallback normalization in question parsing, and preserve local fallback behavior when external config is incomplete. Made-with: Cursor
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/server.py (1)
441-463: ⚠️ Potential issue | 🔴 Critical

Response structures in get_problems are inconsistent between LLM and local paths.

The local path returns raw generator outputs that include extra fields:

- MCQ: `{"statement": str, "questions": [...], "time_taken": float}`
- BoolQ: `{"Text": str, "Count": int, "Boolean_Questions": [...]}`
- ShortQ: `{"statement": str, "questions": [...]}`

The LLM path returns only schema-hinted fields:

- MCQ: `{"questions": [...]}`
- BoolQ: `{"Boolean_Questions": [...]}` (missing the required `"Text"` field)
- ShortQ: `{"questions": [...]}`

The frontend requires both `"Boolean_Questions"` and `"Text"` in the BoolQ response (see lines 28, 33, 45, 50 in SidePanel.jsx and Question.jsx). The LLM's schema hint omits the `"Text"` field entirely, which will cause the LLM path to break the frontend.

Normalize both paths to return only the fields the frontend expects, ensuring BoolQ includes the input text:

```diff
 output1 = MCQGen.generate_mcq(
     {"input_text": input_text, "max_questions": max_questions_mcq}
 )
 output2 = BoolQGen.generate_boolq(
     {"input_text": input_text, "max_questions": max_questions_boolq}
 )
 output3 = ShortQGen.generate_shortq(
     {"input_text": input_text, "max_questions": max_questions_shortq}
 )
 return jsonify(
-    {"output_mcq": output1, "output_boolq": output2, "output_shortq": output3, "llm_used": False}
+    {
+        "output_mcq": {"questions": output1.get("questions", [])},
+        "output_boolq": {"Boolean_Questions": output2.get("Boolean_Questions", []), "Text": input_text},
+        "output_shortq": {"questions": output3.get("questions", [])},
+        "llm_used": False,
+    }
 )
```

Also fix the LLM path to include the `"Text"` field for BoolQ responses to match frontend expectations.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 441 - 463, The get_problems endpoint returns different response shapes between the LLM path (try/except) and the local generator path (calls to MCQGen.generate_mcq, BoolQGen.generate_boolq, ShortQGen.generate_shortq); normalize both to the frontend-required schemas: MCQ must return only {"questions": [...]}, BoolQ must return {"Text": <input_text>, "Boolean_Questions": [...]}, and ShortQ must return only {"questions": [...]}; update the LLM branch (where output_mcq/output_boolq/output_shortq are assembled) to inject/rename fields to these exact keys (remove "time_taken", "statement", "Count", etc.) and update the local path outputs (after MCQGen.generate_mcq, BoolQGen.generate_boolq, ShortQGen.generate_shortq) to map their raw outputs into the same shapes (use the request's input_text for BoolQ "Text"), and keep the existing "llm_used" boolean behavior.
🧹 Nitpick comments (3)
backend/server.py (2)
276-295: Add clarifying comment for the fallback exception handling.

Line 293 catches a blind `Exception`, which static analysis flags. The intent is clear (fallback if the SDK doesn't export specific exceptions), but a brief comment would help future maintainers understand this is deliberate.

🔧 Suggested fix

```diff
-        except Exception:
+        except Exception:  # Fallback if SDK version lacks these exception classes
             anthropic_errors = (Exception,)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 276 - 295, The fallback in the anthropic-specific import block currently catches a blind Exception to set anthropic_errors = (Exception,), which static analysis flags; update the except block in the llm_provider == "anthropic" section (the try/except that sets anthropic_errors) to include a short clarifying comment stating this is an intentional fallback for older/partial SDKs that don’t export the named exception classes so maintainers and linters understand the catch is deliberate and safe.
177-179: Consider surfacing truncation info to the user.

Input text is silently truncated to 12,000 characters with only a server-side log. Users with longer documents won't know their content was cut, potentially leading to incomplete quiz coverage.

Consider returning a `truncated: true` flag or warning in the response so the frontend can inform users.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 177 - 179, The handler currently silently truncates input_text to trimmed_input when len(input_text) > 12000 and only logs a server-side warning; update the request handler that produces the response JSON to include a truncated boolean (e.g., "truncated": True) or a warning message whenever len(input_text) > 12000 so the frontend can show users that their input was cut, while keeping the existing app.logger.warning call; locate the truncation logic around the input_text/trimmed_input variables and add the truncated flag into the response payload returned by that endpoint.

extension/src/pages/text_input/TextInput.jsx (1)
536-544: Consider adding error handling for the async storage write.

The inline `async` handler in `onChange` calls `setChromeStorage` without a `try/catch`. While storage failures are rare, unhandled rejections could cause silent failures.

🔧 Suggested fix

```diff
 onChange={async (e) => {
   const model = e.target.value;
   setSelectedModel(model);
-  await setChromeStorage({ llmModel: model });
+  try {
+    await setChromeStorage({ llmModel: model });
+  } catch (err) {
+    console.error("Failed to persist model selection:", err);
+  }
 }}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@extension/src/pages/text_input/TextInput.jsx` around lines 536 - 544, The onChange handler for the <select> sets selectedModel and awaits setChromeStorage without error handling; wrap the await call to setChromeStorage inside a try/catch in the onChange async function (referencing selectedModel, setSelectedModel, and setChromeStorage) and handle failures by reverting state or showing a user-visible error (e.g., reset selectedModel to previous value or call an error notification) so unhandled promise rejections are avoided and the UI remains consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@extension/src/pages/text_input/TextInput.jsx`:
- Around line 248-257: The current logic uses setBackendError to show an error
when external config is incomplete (in the block that checks useExternalLlm,
apiKey, and selectedModel) even though the app falls back to local generation;
change this to show a non-error informational/warning state instead of an error:
replace calls to setBackendError for the incomplete-external-config cases (the
checks using useExternalLlm, apiKey.trim(), and selectedModel) with a call that
sets an informational/warning message (e.g., setBackendWarning or
setBackendInfo) that clearly states “External config incomplete — falling back
to local generation” and only call setBackendError when you actually abort the
operation; keep hasExternalConfig logic intact so fallback proceeds.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 453163b7-0e31-4d87-b0f3-4e6d6c9eea16
📒 Files selected for processing (3)
- backend/server.py
- extension/src/pages/question/Question.jsx
- extension/src/pages/text_input/TextInput.jsx
🚧 Files skipped from review as they are similar to previous changes (1)
- extension/src/pages/question/Question.jsx
Align get_problems output schema across local and LLM paths, add truncation metadata, and switch incomplete external config UX from error to informational fallback. Made-with: Cursor
```js
chrome.action.setBadgeText({ text: "" });
});

chrome.storage.onChanged.addListener((changes, namespace) => {
```
There was a problem hiding this comment.
this might break existing features like extracting YouTube subtitles and generating questions from the context menu
```python
    BadRequestError,
)

client = OpenAI(api_key=llm_api_key)
```
There was a problem hiding this comment.
in the README, we mentioned that keys are sent "directly from the browser extension to provider APIs", but here we are sending keys to our backend. Either we need to update the README, or we might want to change our handling of API keys; if we are targeting a production launch of the backend, this will be a security risk
Description
This PR adds optional external provider support (OpenAI/Anthropic) for quiz generation while preserving existing local/default generation behavior.
What changed
- toggle between local/default and external provider modes
- enter provider API key in extension UI
- select provider (openai / anthropic)
- choose model (OpenAI curated list, Anthropic loaded dynamically)
- Preserved local generation paths when external mode is disabled.
- Improved external MCQ quality (a simplified sketch follows this list):
  - sanitize/dedupe questions and options
  - remove answer leakage into options
  - retry once with a stricter prompt when MCQ quality is weak
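A simplified sketch of the dedupe-and-retry idea described above. The helper names echo this PR's server.py (`_mcq_quality_score`, `_should_retry_mcq`), but the bodies here are assumed simplifications, not the merged implementation.

```python
def _dedupe_questions(questions):
    """Drop repeated question statements (case/whitespace-insensitive)."""
    seen, unique = set(), []
    for q in questions:
        key = q.get("question_statement", "").strip().lower()
        if key and key not in seen:
            seen.add(key)
            unique.append(q)
    return unique


def _mcq_quality_score(output):
    """Assumed scoring: fraction of questions with distinct options and no answer leakage."""
    questions = output.get("questions", [])
    if not questions:
        return 0.0
    good = sum(
        1
        for q in questions
        if len(set(q.get("options", []))) == len(q.get("options", []))
        and q.get("answer") not in q.get("options", [])
    )
    return good / len(questions)


def _should_retry_mcq(output, threshold=0.6):
    return _mcq_quality_score(output) < threshold


def generate_mcq_with_retry(call_llm, base_prompt, strict_prompt):
    output = call_llm(base_prompt)
    output["questions"] = _dedupe_questions(output.get("questions", []))
    if _should_retry_mcq(output):
        output = call_llm(strict_prompt)  # retry once with stricter instructions
        output["questions"] = _dedupe_questions(output.get("questions", []))
    return output
```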
Addressed Issues
Fixes #580
Screenshots/Recordings
AI Usage Disclosure
[ ] This PR does not contain AI-generated code at all.
[x] This PR contains AI-generated code. I have read the AI Usage Policy and this PR complies with this policy. I have tested the code locally and I am responsible for it.
I have used the following AI models and tools:
Cursor
ChatGPT
Checklist
[x] My PR addresses a single issue, fixes a single bug or makes a single improvement.
[x] My code follows the project's code style and conventions
[x] If applicable, I have made corresponding changes or additions to the documentation
[x] If applicable, I have made corresponding changes or additions to tests
[x] My changes generate no new warnings or errors
[x] I have joined the Discord server and I will share a link to this PR with the project maintainers there
[x] I have read the Contribution Guidelines
[x] Once I submit my PR, CodeRabbit AI will automatically review it and I will address CodeRabbit's comments.
[x] I have filled this PR template completely and carefully, and I understand that my PR may be closed without review otherwise.
Summary by CodeRabbit
New Features
Bug Fixes
Chores
Documentation