Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings. Use the following commands to manage reviews, or the checkboxes below for quick actions:
📝 Walkthrough

Adds a large backend tarball to .gitignore, documents the Python 3.11/3.12 requirement, relaxes the torch pin, adds a fontkit dependency, replaces the generic QA pipeline with explicit transformers model/tokenizer/device initialization and guarded startup, and implements asynchronous font loading and embedding for PDF generation, along with tightened file input handling and route names.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User
    participant Output as Output.jsx
    participant FontLoader as Font Loader
    participant Worker as pdfWorker.js
    participant PDFLib as pdf-lib + fontkit
    User->>Output: Click "Generate PDF"
    Output->>FontLoader: async fetch/import font bytes
    FontLoader-->>Output: fontBytes or error
    Output->>Worker: postMessage({qaPairs, mode, logoBytes, fontBytes})
    Worker->>PDFLib: register fontkit & embed fontBytes (or use fallback)
    PDFLib-->>Worker: generated PDF bytes
    Worker-->>Output: PDF Blob
    Output->>User: createObjectURL -> trigger download -> revoke
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~30 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 2
🧹 Nitpick comments (3)
eduaid_web/src/pages/Output.jsx (1)
19-34: Consider providing user feedback if font loading fails.

If the font fails to load (e.g., network error, missing asset), `fontBytes` remains `null` and the button stays disabled with "Loading Font..." indefinitely. Users won't understand why PDF generation is unavailable. Consider adding a fallback or error state:
💡 Proposed enhancement
```diff
+ const [fontLoadError, setFontLoadError] = useState(false);
  useEffect(() => {
    const loadFont = async () => {
      try {
        const fontModule = await import("../assets/fonts/NotoSans-Regular.ttf");
        const resp = await fetch(fontModule.default);
        if (!resp.ok) {
          throw new Error(`HTTP ${resp.status}: ${resp.statusText}`);
        }
        const arrayBuffer = await resp.arrayBuffer();
        setFontBytes(arrayBuffer);
      } catch (e) {
        console.error("Failed to load Noto Sans font:", e);
+       setFontLoadError(true);
      }
    };
    loadFont();
  }, []);
```

Then update the button text (around line 431):
```jsx
{fontLoadError ? 'Font Error' : !fontBytes ? 'Loading Font...' : 'Generate PDF'}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@eduaid_web/src/pages/Output.jsx` around lines 19 - 34, The font loader useEffect (loadFont) currently swallows failures and leaves fontBytes null; add a new state like fontLoadError with setter (e.g., [fontLoadError, setFontLoadError]) and in loadFont's catch setFontLoadError(true) (and optionally setFontBytes to a fallback or empty ArrayBuffer), then update the UI where fontBytes is used (the Generate PDF button logic) to show an explicit error label such as {fontLoadError ? 'Font Error' : !fontBytes ? 'Loading Font...' : 'Generate PDF'} and optionally expose a retry action that re-invokes loadFont; reference loadFont, setFontBytes, fontBytes, and the new fontLoadError/setFontLoadError in your changes.
backend/server.py (1)

50-64: Consider adding GPU support for inference consistency.

The `ExtractiveQA` class doesn't move the model to GPU when available, unlike `MCQGenerator` (in backend/Generator/main.py:30), which uses `torch.device("cuda" if torch.cuda.is_available() else "cpu")`. This may cause performance inconsistency on GPU-enabled systems.

💡 Proposed enhancement for GPU support
```diff
 class ExtractiveQA:
     def __init__(self, model_name="deepset/roberta-base-squad2"):
         self.tokenizer = AutoTokenizer.from_pretrained(model_name)
         self.model = AutoModelForQuestionAnswering.from_pretrained(model_name)
+        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+        self.model.to(self.device)

     def __call__(self, question, context):
-        inputs = self.tokenizer(question, context, return_tensors="pt")
+        inputs = self.tokenizer(question, context, return_tensors="pt").to(self.device)
         with torch.no_grad():
             outputs = self.model(**inputs)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 50 - 64, The ExtractiveQA class currently doesn't use GPU; modify its __init__ to set self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu") and move the model to that device (self.model.to(self.device)); in __call__, after tokenization, move input tensors to the device (e.g., inputs = {k: v.to(self.device) for k,v in inputs.items()}) before forwarding and computing logits, keeping the torch.no_grad() context; ensure token decoding stays the same so qa_model continues to work.
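For orientation, here is a minimal, self-contained sketch of what the device-aware class could look like end to end. It assumes the deepset/roberta-base-squad2 checkpoint named above and uses standard argmax span decoding; the actual decode logic in backend/server.py may differ.

```python
import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer


class ExtractiveQA:
    def __init__(self, model_name="deepset/roberta-base-squad2"):
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModelForQuestionAnswering.from_pretrained(model_name)
        # Mirror MCQGenerator: use the GPU when one is available.
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.model.to(self.device)
        self.model.eval()

    def __call__(self, question, context):
        inputs = self.tokenizer(question, context, return_tensors="pt")
        # Input tensors must live on the same device as the model.
        inputs = {k: v.to(self.device) for k, v in inputs.items()}
        with torch.no_grad():
            outputs = self.model(**inputs)
        # Standard extractive decoding: argmax over start/end logits.
        start = int(torch.argmax(outputs.start_logits))
        end = int(torch.argmax(outputs.end_logits)) + 1
        answer_ids = inputs["input_ids"][0][start:end]
        return self.tokenizer.decode(answer_ids, skip_special_tokens=True)
```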
requirements.txt (1)

1-1: Consider using a version constraint range instead of an open-ended minimum.

PyTorch 2.8.0 is available (latest stable is 2.10.0). However, using `torch>=2.8.0` without an upper bound introduces unnecessary risk: major version updates can introduce breaking changes in tensor operations, model behavior, or CUDA compatibility that could silently affect inference results. Consider either:
- Pinning to a specific version: `torch==2.8.0`
- Using a bounded range: `torch>=2.8.0,<3.0.0`

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@requirements.txt` at line 1, The requirement line currently uses an open-ended minimum "torch>=2.8.0"; change this to a safer constraint by either pinning to a specific version or adding an upper bound (for example replace "torch>=2.8.0" with a pinned "torch==2.8.0" or a bounded range like "torch>=2.8.0,<3.0.0") so future major releases cannot introduce breaking changes; update the single requirement line in requirements.txt accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/server.py`:
- Line 39: Update README.md to state that both Google Docs and Google Forms now
use a single service account file named credentials.json (the code's
SERVICE_ACCOUNT_FILE) and to show the required scopes (ensure FORMS_SCOPES /
auth configuration includes both documents.readonly and forms.body). Replace
references to separate files (service_account_key.json + credentials.json) with
one credentials.json, update any setup/installation steps and examples
accordingly, and note that the single service account must be created with both
Docs and Forms API access.
In `@README.md`:
- Around line 43-44: The Markdown list item is missing a space after the leading
hyphen on the line starting with "Note: Ensure you are using Python 3.11 or
3.12. Python 3.13 is currently incompatible with several required machine
learning libraries."; fix it by inserting a space after the hyphen so the line
becomes a properly formatted list item (leading "- " instead of "-").
---
Nitpick comments:
In `@backend/server.py`:
- Around line 50-64: The ExtractiveQA class currently doesn't use GPU; modify
its __init__ to set self.device = torch.device("cuda" if
torch.cuda.is_available() else "cpu") and move the model to that device
(self.model.to(self.device)); in __call__, after tokenization, move input
tensors to the device (e.g., inputs = {k: v.to(self.device) for k,v in
inputs.items()}) before forwarding and computing logits, keeping the
torch.no_grad() context; ensure token decoding stays the same so qa_model
continues to work.
In `@eduaid_web/src/pages/Output.jsx`:
- Around line 19-34: The font loader useEffect (loadFont) currently swallows
failures and leaves fontBytes null; add a new state like fontLoadError with
setter (e.g., [fontLoadError, setFontLoadError]) and in loadFont's catch
setFontLoadError(true) (and optionally setFontBytes to a fallback or empty
ArrayBuffer), then update the UI where fontBytes is used (the Generate PDF
button logic) to show an explicit error label such as {fontLoadError ? 'Font
Error' : !fontBytes ? 'Loading Font...' : 'Generate PDF'} and optionally expose
a retry action that re-invokes loadFont; reference loadFont, setFontBytes,
fontBytes, and the new fontLoadError/setFontLoadError in your changes.
In `@requirements.txt`:
- Line 1: The requirement line currently uses an open-ended minimum
"torch>=2.8.0"; change this to a safer constraint by either pinning to a
specific version or adding an upper bound (for example replace "torch>=2.8.0"
with a pinned "torch==2.8.0" or a bounded range like "torch>=2.8.0,<3.0.0") so
future major releases cannot introduce breaking changes; update the single
requirement line in requirements.txt accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 34282d80-ba8c-4453-b8a1-a72bb311ba41
⛔ Files ignored due to path filters (2)
- `eduaid_web/package-lock.json` is excluded by `!**/package-lock.json`
- `eduaid_web/src/assets/fonts/NotoSans-Regular.ttf` is excluded by `!**/*.ttf`
📒 Files selected for processing (9)
- `.gitignore`
- `README.md`
- `backend/server.py`
- `eduaid_web/package.json`
- `eduaid_web/src/pages/Output.jsx`
- `eduaid_web/src/pages/QuestionType.jsx`
- `eduaid_web/src/pages/TextInput.jsx`
- `eduaid_web/src/workers/pdfWorker.js`
- `requirements.txt`
| print("Starting Flask App...") | ||
|
|
||
| SERVICE_ACCOUNT_FILE = './service_account_key.json' | ||
| SERVICE_ACCOUNT_FILE = './credentials.json' |
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Check README references to credential files
rg -n 'credentials\.json|service_account_key\.json' README.md
```

Repository: AOSSIE-Org/EduAid
Length of output: 172
🏁 Script executed:
```bash
sed -n '70,92p' README.md
```

Repository: AOSSIE-Org/EduAid
Length of output: 831
🏁 Script executed:
```bash
cat -n .gitignore | grep -E 'credentials\.json|service_account_key\.json'
```

Repository: AOSSIE-Org/EduAid
Length of output: 132
🏁 Script executed:
```bash
rg -n 'service_account_key|credentials\.json' backend/ --type py -A 2 -B 2
```

Repository: AOSSIE-Org/EduAid
Length of output: 1092
🏁 Script executed:
```bash
rg -n 'forms\.' backend/server.py
```

Repository: AOSSIE-Org/EduAid
Length of output: 276
🏁 Script executed:
```bash
rg -n 'forms' backend/ --type py
```

Repository: AOSSIE-Org/EduAid
Length of output: 795
🏁 Script executed:
```bash
sed -n '37,45p' backend/server.py
```

Repository: AOSSIE-Org/EduAid
Length of output: 330
🏁 Script executed:
```bash
sed -n '220,250p' backend/server.py
```

Repository: AOSSIE-Org/EduAid
Length of output: 1502
Update README.md to reflect consolidated credentials file usage.
The code now uses credentials.json for both Google Docs and Google Forms APIs (see SERVICE_ACCOUNT_FILE at line 39, with FORMS_SCOPES applied in the same file). However, README.md (lines 78-85) still documents two separate credential files—service_account_key.json for Google Docs and credentials.json for Google Forms. Update the documentation to clarify that both APIs now use a single credentials.json service account file with the necessary scopes for both Google Docs (documents.readonly) and Google Forms (forms.body).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/server.py` at line 39, Update README.md to state that both Google
Docs and Google Forms now use a single service account file named
credentials.json (the code's SERVICE_ACCOUNT_FILE) and to show the required
scopes (ensure FORMS_SCOPES / auth configuration includes both
documents.readonly and forms.body). Replace references to separate files
(service_account_key.json + credentials.json) with one credentials.json, update
any setup/installation steps and examples accordingly, and note that the single
service account must be created with both Docs and Forms API access.
…orms backend service account auth
… remove external GitHub dependency
…licks
- Replaced the generic `!event.target.closest('button')` check with a specific `id` check (`pdfTriggerButton`).
- Added an `event.target instanceof Node` safeguard to prevent errors when the event target is not a standard DOM element.
Force-pushed from 1cb8b30 to 8eec6cf.
Actionable comments posted: 2
🧹 Nitpick comments (1)
eduaid_web/src/pages/TextInput.jsx (1)
188-192: Optional cleanup: remove legacy commented-out JSX.

These "Original" commented lines are now obsolete and can be removed to keep the component lean.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@eduaid_web/src/pages/TextInput.jsx` around lines 188 - 192, Remove the obsolete commented-out JSX in the TextInput component: delete the legacy "<p className="text-white text-lg">Choose a file (PDF, MP3 supported)</p>" and the commented "<input type="file" ref={fileInputRef} onChange={handleFileUpload} ..." lines so only the corrected label and the input with the accept attribute remain; this cleanup targets the commented legacy markup around the file picker (references: fileInputRef, handleFileUpload) in TextInput.jsx.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/server.py`:
- Line 20: Remove the stray merge marker ">>>>>>> Stashed changes" (it's a
syntax error) and properly initialize the imported qa_model so it exists when
endpoints use it; locate the import of qa_model and either assign it a valid
model instance at module startup (or set qa_model = None and lazily load/raise a
clear error before use), and ensure get_mcq_answer and get_answer check qa_model
is initialized (or rely on the startup initialization) to avoid NameError at
runtime.
- Around line 18-19: The endpoints get_mcq_answer and get_answer reference
qa_model but it is never defined at module scope; define and initialize qa_model
(and its tokenizer qa_tokenizer) during module import or lazily before first
use: load AutoModelForQuestionAnswering and AutoTokenizer with the chosen model
name, move the model to the appropriate device (CPU/GPU via torch.device), and
handle load errors so the global variables qa_model and qa_tokenizer are
available to both get_mcq_answer and get_answer; alternatively add a small init
function (e.g., init_qa_model) that sets these globals and call it at module
import or at the start of those endpoint functions if not yet initialized.
---
Nitpick comments:
In `@eduaid_web/src/pages/TextInput.jsx`:
- Around line 188-192: Remove the obsolete commented-out JSX in the TextInput
component: delete the legacy "<p className="text-white text-lg">Choose a file
(PDF, MP3 supported)</p>" and the commented "<input type="file"
ref={fileInputRef} onChange={handleFileUpload} ..." lines so only the corrected
label and the input with the accept attribute remain; this cleanup targets the
commented legacy markup around the file picker (references: fileInputRef,
handleFileUpload) in TextInput.jsx.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e05de6a8-788d-4a44-8472-2aa89513a8a1
⛔ Files ignored due to path filters (2)
- `eduaid_web/package-lock.json` is excluded by `!**/package-lock.json`
- `eduaid_web/src/assets/fonts/NotoSans-Regular.ttf` is excluded by `!**/*.ttf`
📒 Files selected for processing (7)
- `README.md`
- `backend/server.py`
- `eduaid_web/package.json`
- `eduaid_web/src/App.js`
- `eduaid_web/src/pages/Output.jsx`
- `eduaid_web/src/pages/TextInput.jsx`
- `eduaid_web/src/workers/pdfWorker.js`
🚧 Files skipped from review as they are similar to previous changes (2)
- README.md
- eduaid_web/src/workers/pdfWorker.js
Actionable comments posted: 1
♻️ Duplicate comments (1)
backend/server.py (1)
19-20: ⚠️ Potential issue | 🔴 Critical

Initialize the QA pipeline before first use (current code crashes at import time).

At line 47, `qa_model` references undefined symbols (`pipeline`, `qa_model_instance`, `qa_tokenizer`, `device`). This raises immediately when importing the module, so the server won't start.

💡 Proposed fix
```diff
-from transformers import AutoModelForQuestionAnswering, AutoTokenizer
+from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline
 import torch
@@
-qa_model = pipeline("question-answering", model=qa_model_instance, tokenizer=qa_tokenizer, device=device)
+QA_MODEL_NAME = "deepset/roberta-base-squad2"
+qa_tokenizer = AutoTokenizer.from_pretrained(QA_MODEL_NAME)
+qa_model_instance = AutoModelForQuestionAnswering.from_pretrained(QA_MODEL_NAME)
+device = 0 if torch.cuda.is_available() else -1
+qa_model = pipeline(
+    "question-answering",
+    model=qa_model_instance,
+    tokenizer=qa_tokenizer,
+    device=device,
+)
```

```bash
#!/bin/bash
# Verify QA pipeline symbols are defined in backend/server.py
rg -nP 'from transformers import .*pipeline' backend/server.py
rg -nP '^\s*(qa_tokenizer|qa_model_instance|device|qa_model)\s*=' backend/server.py -n
rg -nP '\bqa_model\s*\(' backend/server.py -C2
```

Also applies to: 47-47
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 19 - 20, The module currently references undefined symbols (pipeline, qa_model_instance, qa_tokenizer, device, qa_model) at import time; fix by importing pipeline from transformers and initializing the QA artifacts lazily or at startup: define device (e.g., via torch.device and GPU index), create qa_tokenizer using AutoTokenizer.from_pretrained, create qa_model_instance using AutoModelForQuestionAnswering.from_pretrained, then build qa_model with pipeline('question-answering', model=qa_model_instance, tokenizer=qa_tokenizer, device=...) — alternatively move this initialization into a function like get_qa_model() that performs the above on first call and caches the result to avoid raising during import.
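If the lazy route suggested above is taken, a minimal sketch could look like the following; `get_qa_model` is the reviewer's suggested name, and the functools-based caching is an assumption about one reasonable implementation:

```python
from functools import lru_cache

import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

QA_MODEL_NAME = "deepset/roberta-base-squad2"


@lru_cache(maxsize=1)
def get_qa_model():
    # Loaded on first call and cached afterwards, so a download or
    # disk failure surfaces in a request handler, not at import time.
    tokenizer = AutoTokenizer.from_pretrained(QA_MODEL_NAME)
    model = AutoModelForQuestionAnswering.from_pretrained(QA_MODEL_NAME)
    device = 0 if torch.cuda.is_available() else -1
    return pipeline(
        "question-answering",
        model=model,
        tokenizer=tokenizer,
        device=device,
    )
```

Note that `lru_cache` does not cache exceptions, so a failed load is retried on the next call rather than poisoning the cache.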
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@eduaid_web/src/pages/Output.jsx`:
- Around line 418-443: The dropdown with id "pdfDropdown" is placed outside the
relative wrapper that contains the trigger button (id "pdfTriggerButton"), so
its absolute positioning isn't anchored correctly; move the <div
id="pdfDropdown"> so it lives inside the same relative container that wraps the
button (the div with class "relative w-full sm:w-auto") or extend that wrapper
to include both elements, ensuring the dropdown remains a child of the relative
element so its absolute positioning works as intended.
---
Duplicate comments:
In `@backend/server.py`:
- Around line 19-20: The module currently references undefined symbols
(pipeline, qa_model_instance, qa_tokenizer, device, qa_model) at import time;
fix by importing pipeline from transformers and initializing the QA artifacts
lazily or at startup: define device (e.g., via torch.device and GPU index),
create qa_tokenizer using AutoTokenizer.from_pretrained, create
qa_model_instance using AutoModelForQuestionAnswering.from_pretrained, then
build qa_model with pipeline('question-answering', model=qa_model_instance,
tokenizer=qa_tokenizer, device=...) — alternatively move this initialization
into a function like get_qa_model() that performs the above on first call and
caches the result to avoid raising during import.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ad76f502-08a8-4cab-8815-e2b5c653220a
📒 Files selected for processing (4)
- `README.md`
- `backend/server.py`
- `eduaid_web/src/pages/Output.jsx`
- `requirements.txt`
🚧 Files skipped from review as they are similar to previous changes (2)
- README.md
- requirements.txt
Reintroduced LLMQuestionGenerator and configured QA model with tokenizer and device settings.
Actionable comments posted: 1
♻️ Duplicate comments (1)
backend/server.py (1)
19-20: ⚠️ Potential issue | 🔴 Critical

Import `pipeline` from transformers before using it at line 57.

The code constructs a QA pipeline at module scope (line 57) using `pipeline(...)`, but `pipeline` is not imported from transformers on line 19. This causes a `NameError` during module import and prevents the server from starting.

Suggested fix
```diff
-from transformers import AutoModelForQuestionAnswering, AutoTokenizer
+from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 19 - 20, The module currently uses pipeline(...) at import/module scope (the QA pipeline creation near where the model/tokenizer are used, e.g., in the code that constructs the QA pipeline) but never imports pipeline from transformers, causing a NameError; add an import for pipeline from transformers alongside the existing imports (e.g., include "pipeline" with AutoModelForQuestionAnswering and AutoTokenizer) so the pipeline symbol is defined before the QA pipeline creation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/server.py`:
- Around line 52-62: Wrap the model/tokenizer loading so a failure doesn't abort
import: catch exceptions around AutoTokenizer.from_pretrained(QA_MODEL_NAME) and
AutoModelForQuestionAnswering.from_pretrained(QA_MODEL_NAME) (and the
pipeline(...) call), log the error and set qa_tokenizer, qa_model_instance and
qa_model to None on failure; ensure qa_model is only created when all loads
succeed. Then update the two QA endpoints that use qa_model (the handlers for
/get_mcq_answer and /get_shortq_answer) to check if qa_model is None and return
HTTP 503 Service Unavailable with a clear error message instead of proceeding.
Apply the same defensive pattern (try/except on load and 503 when missing) to
any other routes that rely on optional external models or resources.
---
Duplicate comments:
In `@backend/server.py`:
- Around line 19-20: The module currently uses pipeline(...) at import/module
scope (the QA pipeline creation near where the model/tokenizer are used, e.g.,
in the code that constructs the QA pipeline) but never imports pipeline from
transformers, causing a NameError; add an import for pipeline from transformers
alongside the existing imports (e.g., include "pipeline" with
AutoModelForQuestionAnswering and AutoTokenizer) so the pipeline symbol is
defined before the QA pipeline creation.
```python
QA_MODEL_NAME = "deepset/roberta-base-squad2"
qa_tokenizer = AutoTokenizer.from_pretrained(QA_MODEL_NAME)
qa_model_instance = AutoModelForQuestionAnswering.from_pretrained(QA_MODEL_NAME)
device = 0 if torch.cuda.is_available() else -1

qa_model = pipeline(
    "question-answering",
    model=qa_model_instance,
    tokenizer=qa_tokenizer,
    device=device
)
```
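An aside on the device argument in this block: transformers' `pipeline()` takes a plain int (a GPU index, or -1 for CPU), unlike the `torch.device` objects used with raw models. A minimal illustration, assuming the same checkpoint:

```python
import torch
from transformers import pipeline

# pipeline() takes an int device index: 0 = first GPU, -1 = CPU.
device = 0 if torch.cuda.is_available() else -1
qa = pipeline(
    "question-answering",
    model="deepset/roberta-base-squad2",
    device=device,
)
print(qa(question="Who wrote the book?", context="The book was written by Ada."))
```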
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Show the import-time QA initialization and its call sites.
sed -n '52,62p' backend/server.py
rg -nP '\bqa_model\s*\(' backend/server.py -C2
```

Repository: AOSSIE-Org/EduAid
Length of output: 865
🏁 Script executed:
```bash
# Check if qa_model initialization is already wrapped in try-except
cat -n backend/server.py | sed -n '50,70p'
```

Repository: AOSSIE-Org/EduAid
Length of output: 767
🏁 Script executed:
```bash
# Get full context around the qa_model call sites
cat -n backend/server.py | sed -n '205,215p'
cat -n backend/server.py | sed -n '230,245p'
```

Repository: AOSSIE-Org/EduAid
Length of output: 1342
🏁 Script executed:
```bash
# Check if there are any route definitions that use qa_model and how they handle errors
rg -nP '@(app|.*\.route)' backend/server.py -B1 -A5 | head -50
```

Repository: AOSSIE-Org/EduAid
Length of output: 1623
🏁 Script executed:
```bash
# Find the function containing line 210
cat -n backend/server.py | sed -n '180,220p'
```

Repository: AOSSIE-Org/EduAid
Length of output: 2137
🏁 Script executed:
```bash
# Check all routes and which ones use qa_model
rg -n 'def get_' backend/server.py | head -20
```

Repository: AOSSIE-Org/EduAid
Length of output: 450
🏁 Script executed:
```bash
# Search for any error handling around qa_model usage
rg -n 'try:|except:|qa_model' backend/server.py | grep -A2 -B2 'qa_model'
```

Repository: AOSSIE-Org/EduAid
Length of output: 413
🏁 Script executed:
```bash
# Check get_mcq_answer function completely
cat -n backend/server.py | sed -n '197,230p'
```

Repository: AOSSIE-Org/EduAid
Length of output: 1631
🏁 Script executed:
```bash
# Check get_answer and get_boolean_answer functions
cat -n backend/server.py | sed -n '231,262p'
```

Repository: AOSSIE-Org/EduAid
Length of output: 1226
🏁 Script executed:
```bash
# Check if there's any try-except or error handling for qa_model anywhere
rg -n 'get_mcq_answer|get_answer|get_boolean_answer' backend/server.py -A15 | head -60
```

Repository: AOSSIE-Org/EduAid
Length of output: 1850
Guard QA model initialization so one failed model load doesn't take down the whole API.
Lines 53–57 run from_pretrained() during module import without error handling. If the model is missing from cache or the load fails, backend/server.py never finishes importing and the entire server fails to start—even unrelated routes go down. Additionally, the two endpoints that use qa_model (/get_mcq_answer at line 210 and /get_shortq_answer at line 237) have no error handling and will crash if the model is unavailable.
Catch initialization failures and have the QA endpoints return 503 when qa_model is unavailable. For consistency, apply a similar pattern to other error-prone routes.
Suggested direction
QA_MODEL_NAME = "deepset/roberta-base-squad2"
-qa_tokenizer = AutoTokenizer.from_pretrained(QA_MODEL_NAME)
-qa_model_instance = AutoModelForQuestionAnswering.from_pretrained(QA_MODEL_NAME)
device = 0 if torch.cuda.is_available() else -1
-
-qa_model = pipeline(
- "question-answering",
- model=qa_model_instance,
- tokenizer=qa_tokenizer,
- device=device
-)
+qa_model = None
+qa_model_init_error = None
+
+try:
+ qa_tokenizer = AutoTokenizer.from_pretrained(QA_MODEL_NAME)
+ qa_model_instance = AutoModelForQuestionAnswering.from_pretrained(QA_MODEL_NAME)
+ qa_model = pipeline(
+ "question-answering",
+ model=qa_model_instance,
+ tokenizer=qa_tokenizer,
+ device=device,
+ )
+except Exception as exc:
+ qa_model_init_error = exc
+ app.logger.exception("Failed to initialize QA model: %s", exc)Then have the two QA answer routes check if qa_model is None and return 503 Service Unavailable with an appropriate error message.
🧰 Tools
🪛 Ruff (0.15.6)
[error] 57-57: Undefined name `pipeline` (F821)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/server.py` around lines 52 - 62, Wrap the model/tokenizer loading so
a failure doesn't abort import: catch exceptions around
AutoTokenizer.from_pretrained(QA_MODEL_NAME) and
AutoModelForQuestionAnswering.from_pretrained(QA_MODEL_NAME) (and the
pipeline(...) call), log the error and set qa_tokenizer, qa_model_instance and
qa_model to None on failure; ensure qa_model is only created when all loads
succeed. Then update the two QA endpoints that use qa_model (the handlers for
/get_mcq_answer and /get_shortq_answer) to check if qa_model is None and return
HTTP 503 Service Unavailable with a clear error message instead of proceeding.
Apply the same defensive pattern (try/except on load and 503 when missing) to
any other routes that rely on optional external models or resources.
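As a sketch of the requested 503 guard inside a handler (the route name and globals come from the review; the request payload keys are assumptions):

```python
from flask import jsonify, request


@app.route("/get_shortq_answer", methods=["POST"])
def get_answer():
    # Degrade gracefully instead of raising when the model never loaded.
    if qa_model is None:
        details = str(qa_model_init_error) if qa_model_init_error else "not initialized"
        return jsonify({"error": "QA model unavailable", "details": details}), 503
    data = request.get_json()
    result = qa_model(question=data["question"], context=data["context"])
    return jsonify({"answer": result["answer"]})
```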
Refactor QA model initialization with error handling.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
eduaid_web/src/pages/Output.jsx (1)
162-182: ⚠️ Potential issue | 🔴 Critical

Guard `output` access and tighten the MCQ source condition.

At lines 163 and 175, `.forEach` can run on an undefined `qaPairsFromStorage["output"]`, causing a runtime crash. Also, line 162 currently treats `output` as MCQ whenever `output_mcq` exists, which can produce mixed/duplicate datasets in non-MCQ flows.

💡 Proposed fix
```diff
-    if (qaPairsFromStorage["output_mcq"] || questionType === "get_mcq") {
-      qaPairsFromStorage["output"].forEach((qaPair) => {
+    if (
+      questionType === "get_mcq" &&
+      Array.isArray(qaPairsFromStorage["output"])
+    ) {
+      qaPairsFromStorage["output"].forEach((qaPair) => {
         combinedQaPairs.push({
           question: qaPair.question_statement,
           question_type: "MCQ",
           options: qaPair.options,
           answer: qaPair.answer,
           context: qaPair.context,
         });
       });
     }
-    if (questionType === "get_boolq") {
-      qaPairsFromStorage["output"].forEach((qaPair) => {
+    if (
+      questionType === "get_boolq" &&
+      Array.isArray(qaPairsFromStorage["output"])
+    ) {
+      qaPairsFromStorage["output"].forEach((qaPair) => {
         combinedQaPairs.push({
           question: qaPair,
           question_type: "Boolean",
         });
       });
-    } else if (qaPairsFromStorage["output"] && questionType !== "get_mcq") {
+    } else if (
+      Array.isArray(qaPairsFromStorage["output"]) &&
+      questionType !== "get_mcq"
+    ) {
       qaPairsFromStorage["output"].forEach((qaPair) => {
         combinedQaPairs.push({
           question:
             qaPair.question || qaPair.question_statement || qaPair.Question,
           options: qaPair.options,
           answer: qaPair.answer || qaPair.Answer,
           context: qaPair.context,
           question_type: "Short",
         });
       });
     }
```
eduaid_web/src/pages/Output.jsx,output_mcqandoutputmust be coordinated as separate MCQ sources only in the intendedquestionType === "get_mcq"flow to avoid duplicates/order inconsistencies.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@eduaid_web/src/pages/Output.jsx` around lines 162 - 182, Guard access to qaPairsFromStorage["output"] and only treat MCQ data as coming from "output" when questionType === "get_mcq": check that qaPairsFromStorage && Array.isArray(qaPairsFromStorage["output"]) before calling .forEach, and change the condition that currently uses qaPairsFromStorage["output_mcq"] to require questionType === "get_mcq" (i.e. only push MCQ items into combinedQaPairs from qaPairsFromStorage["output"] when questionType === "get_mcq"); similarly wrap the boolean and fallback branches so they only iterate when qaPairsFromStorage["output"] is a defined array, ensuring combinedQaPairs, qaPairsFromStorage, questionType, output_mcq, and output logic stays consistent and avoids duplicate/undefined iterations.
♻️ Duplicate comments (1)
backend/server.py (1)
51-69: ⚠️ Potential issue | 🟠 Major

QA initialization is guarded, but QA usage is still unguarded.

If model loading fails, `qa_model` remains `None`; calls at lines 216 and 243 then crash with a callable error (500) instead of a controlled 503.

🔧 Suggested follow-up guard
```diff
@@
 qa_model = None
 qa_model_init_error = None
@@
 except Exception as exc:
     qa_model_init_error = exc
     app.logger.exception("Failed to initialize QA model: %s", exc)
+
+def _qa_unavailable_response():
+    details = str(qa_model_init_error) if qa_model_init_error else "QA model unavailable"
+    return jsonify({"error": "QA model unavailable", "details": details}), 503
@@
 def get_mcq_answer():
+    if qa_model is None:
+        return _qa_unavailable_response()
@@
 def get_answer():
+    if qa_model is None:
+        return _qa_unavailable_response()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 51 - 69, The QA pipeline (qa_model) can be None if initialization failed (qa_model_init_error set), so before any use of qa_model in the request handlers replace direct calls with a guard: check qa_model is not None (or check qa_model_init_error) and if the model is unavailable return a 503 response with a clear message and log the qa_model_init_error via app.logger.exception or app.logger.error; update the handlers that call qa_model (where pipeline(...) is used downstream) to perform this check and avoid invoking qa_model when None.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/server.py`:
- Around line 27-28: The /generate_gform route currently uses legacy symbols
(file.Storage, client.flow_from_clientsecrets, tools.run_flow, discovery.build,
Http) that are no longer imported and will raise NameError; update the handler
to use the new imports by replacing
file.Storage()/client.flow_from_clientsecrets()/tools.run_flow() flow with
sa.Credentials.from_service_account_file(...) to create credentials, replace
discovery.build(...) with google_build(...), and pass the resulting credentials
via the credentials parameter (e.g., google_build(service, version,
credentials=creds)) instead of using Http() authorization; modify
functions/variables in the route (look for generate_gform handler and any
variables named creds/credentials, storage, flow, or service) accordingly so the
code uses sa.Credentials.from_service_account_file and google_build with
credentials.
In `@eduaid_web/src/pages/Output.jsx`:
- Line 239: The direct DOM access document.getElementById('pdfDropdown') can be
null if the component unmounted before the async callback runs; guard this call
by first retrieving the element into a variable (e.g., const pdfDropdown =
document.getElementById('pdfDropdown')) and only call
pdfDropdown.classList.add('hidden') when pdfDropdown is non-null (truthy).
Update the code around the existing
document.getElementById('pdfDropdown').classList.add('hidden') invocation in
Output.jsx to perform this null-check to avoid runtime exceptions.
---
Outside diff comments:
In `@eduaid_web/src/pages/Output.jsx`:
- Around line 162-182: Guard access to qaPairsFromStorage["output"] and only
treat MCQ data as coming from "output" when questionType === "get_mcq": check
that qaPairsFromStorage && Array.isArray(qaPairsFromStorage["output"]) before
calling .forEach, and change the condition that currently uses
qaPairsFromStorage["output_mcq"] to require questionType === "get_mcq" (i.e.
only push MCQ items into combinedQaPairs from qaPairsFromStorage["output"] when
questionType === "get_mcq"); similarly wrap the boolean and fallback branches so
they only iterate when qaPairsFromStorage["output"] is a defined array, ensuring
combinedQaPairs, qaPairsFromStorage, questionType, output_mcq, and output logic
stays consistent and avoids duplicate/undefined iterations.
---
Duplicate comments:
In `@backend/server.py`:
- Around line 51-69: The QA pipeline (qa_model) can be None if initialization
failed (qa_model_init_error set), so before any use of qa_model in the request
handlers replace direct calls with a guard: check qa_model is not None (or check
qa_model_init_error) and if the model is unavailable return a 503 response with
a clear message and log the qa_model_init_error via app.logger.exception or
app.logger.error; update the handlers that call qa_model (where pipeline(...) is
used downstream) to perform this check and avoid invoking qa_model when None.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 64d8034b-2822-4acb-bd71-121ccb9c5025
📒 Files selected for processing (2)
- `backend/server.py`
- `eduaid_web/src/pages/Output.jsx`
```python
from googleapiclient.discovery import build as google_build
from google.oauth2 import service_account as sa
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Verify import/call-site mismatch for Google Forms auth client symbols.
rg -nP '^\s*from\s+googleapiclient\.discovery\s+import\s+build\s+as\s+google_build|^\s*from\s+google\.oauth2\s+import\s+service_account\s+as\s+sa|^\s*from\s+oauth2client\s+import|^\s*from\s+httplib2\s+import\s+Http' backend/server.py
# Show legacy symbols still used by generate_gform.
rg -nP '\b(file\.Storage|client\.flow_from_clientsecrets|tools\.run_flow|discovery\.build|Http\()' backend/server.py -C2
# Confirm usage of newly imported aliases.
rg -nP '\bgoogle_build\s*\(|\bsa\.' backend/server.py -C2
```

Repository: AOSSIE-Org/EduAid
Length of output: 699
🏁 Script executed:
```bash
# Check if the legacy symbols are imported anywhere in the file
rg -n '^\s*(from\s+(oauth2client|httplib2|googleapiclient)\s+import|import\s+(oauth2client|httplib2|googleapiclient))' backend/server.py
# Get more context around the generate_gform function to understand its scope
rg -n 'def generate_gform|@app\.route.*generate_gform' backend/server.py -A 50 | head -80
```

Repository: AOSSIE-Org/EduAid
Length of output: 2027
🏁 Script executed:
```bash
# Check if file, client, tools, discovery, Http are referenced anywhere else in the file
rg -n '\b(file|client|tools|discovery|Http)\b' backend/server.py
# Verify the file structure and imports at the top
head -50 backend/server.py
```

Repository: AOSSIE-Org/EduAid
Length of output: 2566
/generate_gform references undefined OAuth symbols—code will crash at runtime.
Lines 294–305 use legacy oauth2client and httplib2 symbols (file, client, tools, discovery, Http) that are not imported. After switching to the new imports at lines 27–28, these symbols are undefined and will raise NameError when the route is hit.
Update the handler to use the new imports:
- Replace `file.Storage()`, `client.flow_from_clientsecrets()`, and `tools.run_flow()` with `sa.Credentials.from_service_account_file()`
- Replace `discovery.build()` with `google_build()`
- Replace the `Http()` authorization pattern with the `credentials` parameter
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/server.py` around lines 27 - 28, The /generate_gform route currently
uses legacy symbols (file.Storage, client.flow_from_clientsecrets,
tools.run_flow, discovery.build, Http) that are no longer imported and will
raise NameError; update the handler to use the new imports by replacing
file.Storage()/client.flow_from_clientsecrets()/tools.run_flow() flow with
sa.Credentials.from_service_account_file(...) to create credentials, replace
discovery.build(...) with google_build(...), and pass the resulting credentials
via the credentials parameter (e.g., google_build(service, version,
credentials=creds)) instead of using Http() authorization; modify
functions/variables in the route (look for generate_gform handler and any
variables named creds/credentials, storage, flow, or service) accordingly so the
code uses sa.Credentials.from_service_account_file and google_build with
credentials.
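Once `creds` exists (see the auth sketch earlier in this review), the handler's Form creation could reduce to something like this; the form title and variable names are placeholders, not the PR's actual code:

```python
forms_service = google_build("forms", "v1", credentials=creds)
form = forms_service.forms().create(
    body={"info": {"title": "EduAid generated quiz"}}
).execute()
# responderUri is the share link returned by the Forms API.
share_link = form["responderUri"]
```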
```js
document.body.removeChild(link);
URL.revokeObjectURL(url);

document.getElementById('pdfDropdown').classList.add('hidden');
```
Null-check `#pdfDropdown` before `classList.add`.
In async worker callbacks, the component may already be unmounted; direct access can throw.
💡 Proposed fix
```diff
-  document.getElementById('pdfDropdown').classList.add('hidden');
+  const dropdown = document.getElementById("pdfDropdown");
+  if (dropdown) {
+    dropdown.classList.add("hidden");
+  }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```js
const dropdown = document.getElementById("pdfDropdown");
if (dropdown) {
  dropdown.classList.add("hidden");
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@eduaid_web/src/pages/Output.jsx` at line 239, The direct DOM access
document.getElementById('pdfDropdown') can be null if the component unmounted
before the async callback runs; guard this call by first retrieving the element
into a variable (e.g., const pdfDropdown =
document.getElementById('pdfDropdown')) and only call
pdfDropdown.classList.add('hidden') when pdfDropdown is non-null (truthy).
Update the code around the existing
document.getElementById('pdfDropdown').classList.add('hidden') invocation in
Output.jsx to perform this null-check to avoid runtime exceptions.
Description:
This PR fixes two issues in the Output component:
PDF Unicode ("Tofu") Fix: Embeds the NotoSans font into the PDF worker so foreign languages, math symbols, and complex characters render correctly instead of showing missing character boxes.
Dropdown Bug Fix: Fixes a UI bug where clicking unrelated buttons kept the PDF dropdown open. It scopes the outside-click listener strictly to the pdfTriggerButton ID instead of generically targeting all buttons.
Addressed Issues:
Fixes #(TODO:issue number)
Screenshots/Recordings:
TODO: If applicable, add screenshots or recordings that demonstrate the interface before and after the changes.
Additional Notes:
Replaced the generic !event.target.closest('button') check with a strict ID check and Node verification to prevent DOM errors. Embedding the custom font ensures pdf-lib does not crash when users generate quizzes with non-English text.
AI Usage Disclosure:
We encourage contributors to use AI tools responsibly when creating Pull Requests. While AI can be a valuable aid, it is essential to ensure that your contributions meet the task requirements, build successfully, include relevant tests, and pass all linters. Submissions that do not meet these standards may be closed without warning to maintain the quality and integrity of the project. Please take the time to understand the changes you are proposing and their impact. AI slop is strongly discouraged and may lead to banning and blocking. Do not spam our repos with AI slop.
Check one of the checkboxes below:
I have used the following AI models and tools: TODO
Checklist
Summary by CodeRabbit

- Documentation
- New Features
- Improvements
- Chores