Skip to content

Feat/hoagie requirements data#514

Merged
angelina-ji merged 2 commits intomasterfrom
feat/hoagie-requirements-data
Mar 27, 2026
Merged

Feat/hoagie requirements data#514
angelina-ji merged 2 commits intomasterfrom
feat/hoagie-requirements-data

Conversation

@angelina-ji
Copy link
Copy Markdown
Contributor

Bundle Hoagie requirements data locally, serve at page load, and fix dist_area scraping

Problem

  • TigerPath fetched requirements YAML from the old branch of Princeton-Departmental-Data over HTTP — that data was outdated and the network fetch caused the requirements sidebar to flash in after a delay
  • The course scraper wasn't populating dist_area on courses because the modernization PR dropped the per-course /courses/details call, and the old MobileApp endpoint now returns 404

Changes

Requirements data:

  • Copied HoagieClub/plan's MIT-licensed YAML data into tigerpath/requirements_data/ (majors, minors, certificates, degrees)
  • Replaced remote HTTP fetch in verifier.py with local file reads
  • Pre-compute requirements in the Django index view and inject into the HTML template
  • Frontend checks for preloaded data before falling back to the API fetch

Scraper fix:

  • Added registrar front-end API integration (same approach as PrincetonCourses) to fetch distribution_area_short for each course
  • No OIT API subscription needed — token is scraped from the public registrar.princeton.edu page

Testing

  1. Log in, select a major, reload — requirements sidebar appears instantly with no delay
  2. DevTools → Network → filter get_requirements → hard refresh — zero requests on initial load
  3. Run make seed-courses — courses now have dist_area populated
  4. Drag courses into semesters — distribution requirements update correctly

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3398942cc1

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@@ -111,6 +111,11 @@
</script>
{% endif %}

{% if preloaded_requirements_json %}
<script>
window.__TIGERPATH_PRELOADED_REQUIREMENTS__ = {{ preloaded_requirements_json|safe }};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Escape preloaded requirements before embedding in HTML

The template writes preloaded_requirements_json into a <script> using |safe, but that JSON contains user-controlled external credit names propagated through populate_user_schedule() and check_major/check_degree in views.py. A crafted value containing </script> can break out of the script and execute arbitrary JavaScript on page load (stored XSS for that account). Use Django’s json_script (or equivalent escaping of <, >, and &) instead of raw safe output.

Useful? React with 👍 / 👎.

api = MobileApp()

print("[scrape] Fetching registrar token...", flush=True)
registrar_token = get_registrar_token()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Continue scraping when registrar token retrieval fails

The new registrar dependency is fetched unconditionally at startup, and any exception from get_registrar_token() now aborts scrape_parse_semester before any courses are parsed. That means a transient registrar outage or page-format change causes seed-courses to fail for the entire term, whereas course ingestion previously still worked (without dist-area enrichment). Catch this failure and fall back to parsing courses with empty distribution data.

Useful? React with 👍 / 👎.

Comment on lines +83 to +87
if instance.major and instance.major.supported and instance.year:
schedule = populate_user_schedule(instance.user_schedule)
try:
preloaded_requirements.append(
check_major(instance.major.code, schedule, instance.year)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Remove unused server precompute or wire it into initial state

This block eagerly runs check_major/check_degree on every index request, but the frontend still does its normal /api/v1/get_requirements/ fetch and there is no code reading window.__TIGERPATH_PRELOADED_REQUIREMENTS__. The result is duplicate verifier work and larger HTML payloads without the intended first-render speedup. Either consume the preloaded payload during app initialization or skip this server-side computation.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Member

@joshuamotoaki joshuamotoaki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@angelina-ji angelina-ji merged commit 4e7891e into master Mar 27, 2026
1 check passed
@angelina-ji angelina-ji deleted the feat/hoagie-requirements-data branch March 27, 2026 04:59
DIodide added a commit that referenced this pull request Mar 28, 2026
The hoagie requirements data update (#514) changed category names in the
YAML files, breaking the verifier's settled path matching for existing
users. This adds a management command to clear stale settled paths and
also:

- Adds POR/SPA to AB_CONCENTRATIONS (new local YAML files)
- Cleans up verifier imports (removes unused `requests`, fixes `pathlib` placement)
- Management command also creates Major DB records for POR/SPA

Run on production after deploy:
  docker compose -f docker-compose.prod.yml exec -T web python manage.py clear_settled_paths --apply

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
DIodide added a commit that referenced this pull request Mar 28, 2026
The hoagie requirements data update (#514) changed category names in the
YAML files, breaking the verifier's settled path matching for existing
users. This adds a management command to clear stale settled paths and
also:

- Adds POR/SPA to AB_CONCENTRATIONS (new local YAML files)
- Cleans up verifier imports (removes unused `requests`, fixes `pathlib` placement)
- Management command also creates Major DB records for POR/SPA

Run on production after deploy:
  docker compose -f docker-compose.prod.yml exec -T web python manage.py clear_settled_paths --apply

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants