Problem
Many question concepts either have no Khan Academy course or link to spurious matches (e.g., similar keywords but wrong theme/topic). When a user clicks the Khan Academy button after answering a question, they may land on irrelevant or empty search results.
Exploration suggests that Khan Academy search results cannot be counted automatically on the client side: there is no public API for search result counts, and CORS blocks scraping from the browser.
Proposed Solution
1. Local validation script
Write a script (scripts/validate_khan_links.py) that:
- Reads all domain JSON files and extracts the unique concepts_tested values
- For each concept, runs a search on Khan Academy (e.g., https://www.khanacademy.org/search?page_search_query=<concept>)
- Counts the number of search results (using a headless browser, or Khan Academy's internal API if one is discoverable)
- Outputs a report: concept → result count → recommended action
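The steps above could be sketched roughly as follows. This is only an outline under stated assumptions: the domain files are assumed to look like {"questions": [{"concepts_tested": [...]}]}, the min_results threshold of 3 is an arbitrary placeholder, and count_results is a hypothetical callable (headless browser or otherwise) that the real script would still have to supply, since how to count results is still an open question.

```python
import json
from pathlib import Path
from urllib.parse import urlencode

SEARCH_BASE = "https://www.khanacademy.org/search"


def collect_concepts(domain_dir):
    """Return the sorted unique concepts_tested values across all domain JSON files."""
    concepts = set()
    for path in sorted(Path(domain_dir).glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        # Assumed layout: {"questions": [{"concepts_tested": [...]}, ...]}
        for question in data.get("questions", []):
            concepts.update(question.get("concepts_tested", []))
    return sorted(concepts)


def search_url(concept):
    """Build the Khan Academy search URL for one concept (URL-encoded)."""
    return f"{SEARCH_BASE}?{urlencode({'page_search_query': concept})}"


def recommend(count, min_results=3):
    """Map a result count to a recommended khan_academy_mode (threshold is a guess)."""
    return "search" if count >= min_results else "generic"


def build_report(concepts, count_results):
    """Return (concept, result count, recommended action) rows.

    count_results is a hypothetical callable: URL in, integer result count out.
    """
    rows = []
    for concept in concepts:
        count = count_results(search_url(concept))
        rows.append((concept, count, recommend(count)))
    return rows
```

Keeping the counting step behind a callable means the report logic can be tested with a stub before the scraping approach is settled.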
2. Add khan_academy_mode flag to question JSON
For each question, add a field:
{
"khan_academy_mode": "search" | "generic"
}
"search": The Khan Academy button initiates a search for the question's specific concept(s) (current behavior)
"generic": The Khan Academy button links to a generic course page for that sub-domain or domain (e.g., https://www.khanacademy.org/science/physics for physics questions)
3. Update quiz.js to respect the flag
In src/ui/quiz.js, when building the Khan Academy link:
- If khan_academy_mode === "search" → use the current search URL
- If khan_academy_mode === "generic" → link to the pre-configured domain course URL
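The branch itself is small. quiz.js would implement it in JavaScript; the sketch below uses Python, the language of the validation script, so the same rule could be exercised offline. The function name khan_link, the default of "search" when the flag is missing, and joining multiple concepts with spaces are all assumptions, not the current quiz.js behavior.

```python
from urllib.parse import urlencode

SEARCH_BASE = "https://www.khanacademy.org/search"


def khan_link(question, fallback_url):
    """Pick the Khan Academy URL for a question based on its khan_academy_mode."""
    # Assumption: a missing flag keeps the current behavior (search).
    mode = question.get("khan_academy_mode", "search")
    if mode == "generic":
        return fallback_url
    # Assumption: multiple concepts are combined into one space-separated query.
    query = " ".join(question.get("concepts_tested", []))
    return f"{SEARCH_BASE}?{urlencode({'page_search_query': query})}"
```

For example, khan_link({"khan_academy_mode": "generic"}, "https://www.khanacademy.org/science/biology") returns the fallback URL unchanged, while a question without the flag falls through to the search URL.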
4. Define generic fallback URLs per domain
In domain JSON or a config file, define fallback Khan Academy URLs:
{
"quantum-physics": "https://www.khanacademy.org/science/physics/quantum-physics",
"astrophysics": "https://www.khanacademy.org/science/cosmology-and-astronomy",
"biology": "https://www.khanacademy.org/science/biology"
}
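One check the validation script might also run is that every domain with a "generic" question actually has a fallback URL configured. A minimal sketch follows, assuming each question carries a "domain" key (a hypothetical field, not confirmed by the current JSON schema) and reusing the three example URLs above:

```python
FALLBACK_URLS = {
    "quantum-physics": "https://www.khanacademy.org/science/physics/quantum-physics",
    "astrophysics": "https://www.khanacademy.org/science/cosmology-and-astronomy",
    "biology": "https://www.khanacademy.org/science/biology",
}


def missing_fallbacks(questions, fallbacks=FALLBACK_URLS):
    """Return sorted domains that have a generic-mode question but no fallback URL."""
    return sorted(
        {
            q["domain"]  # assumed field; adjust to the real question schema
            for q in questions
            if q.get("khan_academy_mode") == "generic" and q["domain"] not in fallbacks
        }
    )
```

An empty list means every generic-mode question can be resolved to a course page.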
Tasks
- Write scripts/validate_khan_links.py: local script to check all concepts against Khan Academy search
- Add khan_academy_mode field to question generation pipeline
- Update quiz.js to use the flag when building Khan Academy URLs
Notes
This should be done after all 50 questions per domain are generated, since the concept list needs to be finalized first.