📋 TODO - Coding for MBA

Unified roadmap - app features, curriculum expansion, DX improvements, and long-horizon ideas.
Stack: Vite 7 · React 19 · TypeScript 5.9 · motion · zustand · zod · react-hot-toast · canvas-confetti

Last updated: Mar 2, 2026


✅ Completed

Audit update (2026-02-28): Re-validated items marked complete against current codebase and in-app surfacing. Tasks found partially implemented have been moved back to incomplete with notes.

Quick Wins - already shipped

  • Toast feedback on progress save, exercise submission, theme toggle
  • Confetti on quiz ace, phase unlock, lesson complete, full-curriculum finale
  • Zod schemas for lesson frontmatter + build-time validation
  • Page transitions (AnimatePresence fade + slide)
  • Staggered card entrance, spring-physics progress bar, hover lift
  • Parallax hero, scroll-triggered fade-in lessons, animated stats counters
  • layoutId shared transitions, masonry stagger for concept graph
  • Zustand progress store + persist middleware
  • Quiz attempt tracking, per-question analytics, spaced repetition surfacing
  • User preferences store (theme, sidebar, font-size, code lang, density)
  • Time-on-page / study streak / weekly chart / total learning time
  • Glassmorphism dark-mode cards, animated gradient mesh hero
  • Breadcrumb trail, "continue where you left off" banner
  • Keyboard shortcut overlay (?)
  • XP + badge gamification layer on Progress page
  • Vite 7 module preload optimisation, bundle analysis script
  • Playwright visual regression + reduced-motion tests
  • Zustand + Zod unit tests
  • Strict TypeScript satisfies patterns, error boundaries with toast fallbacks

🔴 Priority 1 - Curriculum (do next)

These fix the most critical content gaps identified in the curriculum audit.

Phase Overviews β€” Depth Upgrades

Validation note (2026-02-27): Cross-checked Agent-P2, Agent-P8, and Agent-P9 overview updates against underlying Day_*/README.md lesson scope and sequence. Section coverage is now explicitly evidenced in the overview docs (skills matrix, ROI tables, scenario walkthroughs, exam prompts, cloud-native/capstone bridges), and all listed Phase 2/8/9 overview depth criteria are now satisfied.

  • Phase 2 Overview - expanded to Phase 5 depth standard (300+ lines, ROI table, 3-tier skills matrix, 5+ pitfalls, 4+ exam Qs)
  • Phase 8 Overview - scenario walkthroughs, 2+ milestone exam questions, expert track, and extras/ folder are now present
  • Phase 9 Overview - Cloud-native SQL section and curriculum capstone preview are present; >15 KB length is acceptable for this overview

Gap-Filling Lesson Days

  • Day_37B_Probability_and_Statistics_for_ML - distributions, Bayes theorem, CLT (prerequisite for Phase 5 Day 54)
  • Day_37C_Sklearn_Pipelines - Pipeline, ColumnTransformer, custom transformers, CV
  • Day_36B_Docker_Fundamentals - containers, images, Compose for data apps (Phase 3 bonus)
  • Day_60B_LLM_Fine_Tuning_and_PEFT - LoRA, QLoRA, Hugging Face PEFT library
  • Day_60C_RAG_and_Vector_Databases - embeddings, ChromaDB, LangChain RAG pipeline
  • Day_84B_dbt_Fundamentals - models, refs, tests, docs, dbt Cloud
  • Day_96B_NoSQL_Deep_Dive - MongoDB, Redis, Cassandra - when to use each

Extras/ Folders

  • Add extras/ to Phase 2 (sample DataFrames, advanced Pandas notebooks)
  • Add extras/ to Phase 5 (PEFT configs, RAG starters)
  • Add extras/ to Phase 8 (DDL scripts, sample datasets)
  • Add extras/ to Phase 9 (capstone data + solution scaffold)

🟡 Priority 2 - New Phases & Major Content

Phase 10 - Generative AI & LLM Engineering (Days 109–120) ✅ Complete

Phase already implemented. Audit & polish pass needed.

  • Verify all 12 day files meet the content depth standard (500+ words, 3+ exercises, 5 Q&A)
  • Add quiz.json to each Phase 10 day
  • Phase 10 Overview polish - ensure ROI table and expert track are present
  • Add extras/ with LLM starter notebooks and prompt library

Phase 11 - Cloud Data Engineering (Days 121–132) ✅ Complete

  • Day_121_Cloud_Fundamentals - AWS/GCP/Azure architecture, IAM, cost management
  • Day_122_Object_Storage - S3, GCS, Delta Lake, Iceberg table formats
  • Day_123_Cloud_Data_Warehouses - BigQuery, Snowflake, Redshift architecture
  • Day_124_dbt_at_Scale - incremental models, snapshots, advanced patterns
  • Day_125_Orchestration - Apache Airflow, Prefect, Dagster
  • Day_126_Streaming_Pipelines - Kafka, Pub/Sub, Kinesis, real-time ETL
  • Day_127_Lakehouse_Architecture - Databricks, Unity Catalog, Delta Live Tables
  • Day_128_Data_Contracts_and_Quality - Great Expectations, Soda, data SLAs
  • Day_129_Cloud_Security_and_Compliance - VPC, encryption, PII handling
  • Day_130_Cost_Engineering - query optimisation for $/TB, slot management
  • Day_131_Platform_Engineering - Terraform for data infrastructure
  • Day_132_Capstone_Cloud_Data_Pipeline - end-to-end cloud pipeline project
  • Phase 11 Overview (300+ lines) - expanded to 513 lines with day-by-day journey, 3-tier skills matrix, 6 pitfalls, 5 exam questions, scenario walkthroughs, and expert track

Phase 12 - Analytics Engineering & Data Products (Days 133–140) ✅ Complete

  • Day_133_Analytics_Engineer_Role - vs Data Analyst, Data Scientist, DE
  • Day_134_Semantic_and_Metrics_Layers - dbt Metrics, Cube.js, LookML
  • Day_135_Self_Serve_Analytics - empowering stakeholders without SQL
  • Day_136_Data_Mesh_Principles - domain ownership, data products
  • Day_137_Product_Analytics_Deep_Dive - retention, funnels, cohort analysis
  • Day_138_AB_Testing_at_Scale - statistical rigor, experimentation platforms
  • Day_139_Data_Products_and_Monetization - API-first data, embedded analytics
  • Day_140_Capstone_Data_Product - design a data product for a business unit
  • Phase 12 Overview

Additional Gap-Filling Days

  • Day_68_AI_Agents_and_Tool_Use - LangChain/LlamaIndex agents, function calling, ReAct (Phase 6)
  • Day_69_Responsible_AI_in_Practice - model cards, Fairlearn, audit reporting (Phase 6)
  • Day_84C_Reverse_ETL_and_Semantic_Layer - Hightouch concepts, operational analytics (Phase 7)
  • Day_96C_Streaming_SQL_Fundamentals - Kafka concepts, ksqlDB basics, real-time aggregations (Phase 8)
  • Day_108C_Cloud_Native_SQL - BigQuery ML, Snowflake Cortex, Redshift ML
  • Day_108B_Curriculum_Capstone - ingest → clean → model → visualise → deploy (all 9+ phases)

🟢 Priority 3 - App Features

Structured Quiz JSON Integration

Each lesson day needs a quiz.json sidecar for the app's quiz engine:

{
  "day": 1,
  "questions": [
    {
      "id": "d01q01",
      "type": "multiple_choice",
      "question": "What is a variable in Python?",
      "options": ["A loop", "A labeled container for data", "A function", "A module"],
      "answer": 1,
      "explanation": "Variables are named storage locations..."
    }
  ]
}
  • Script: scripts/generate-quiz-stubs.js - scaffold quiz.json for all days that don't have one
  • App: QuizEngine component reads quiz.json, replaces markdown mastery-check section
  • App: wrong-answer analytics surfaced in the spaced-repetition store
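
Before the QuizEngine consumes a sidecar file, it may help to check that each question is internally consistent. The sketch below is one possible plain-TypeScript check based only on the fields in the sample quiz.json above; the `Quiz` and `QuizQuestion` types and the `validateQuiz` helper are hypothetical names, and it assumes `answer` is a zero-based index into `options`, as the sample suggests. The project's existing Zod setup could express the same rules as a schema instead.

```typescript
// Hypothetical types mirroring the sample quiz.json above; not the app's actual schema.
interface QuizQuestion {
  id: string;
  type: string;
  question: string;
  options: string[];
  answer: number; // assumed to be a zero-based index into options
  explanation: string;
}

interface Quiz {
  day: number;
  questions: QuizQuestion[];
}

// Returns a list of human-readable problems; an empty list means the quiz passed.
function validateQuiz(quiz: Quiz): string[] {
  const problems: string[] = [];
  const seenIds: string[] = [];

  if (Math.floor(quiz.day) !== quiz.day || quiz.day < 1) {
    problems.push("day must be a positive integer");
  }

  quiz.questions.forEach((q, i) => {
    const label = q.id || "question #" + (i + 1);
    if (seenIds.indexOf(q.id) !== -1) {
      problems.push(label + ": duplicate id");
    }
    seenIds.push(q.id);
    if (q.options.length < 2) {
      problems.push(label + ": needs at least two options");
    }
    const inRange =
      Math.floor(q.answer) === q.answer &&
      q.answer >= 0 &&
      q.answer < q.options.length;
    if (!inRange) {
      problems.push(label + ": answer index " + q.answer + " is out of range");
    }
  });

  return problems;
}

// The sample from this section, used as a smoke test.
const sampleQuiz: Quiz = {
  day: 1,
  questions: [
    {
      id: "d01q01",
      type: "multiple_choice",
      question: "What is a variable in Python?",
      options: ["A loop", "A labeled container for data", "A function", "A module"],
      answer: 1,
      explanation: "Variables are named storage locations...",
    },
  ],
};
```

Calling `validateQuiz(sampleQuiz)` returns an empty array. A stub generator or CI step could treat any non-empty result as a failure.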

Phase 1 quiz.json - completed (2026-03-02)

  • Day_01_Introduction - quiz.json (5 questions: print, arithmetic, exponentiation, print(), type conversion)
  • Day_02_Variables_Builtin_Functions - quiz.json (5 questions: data types, len(), naming rules, int(), reassignment)
  • Day_03_Operators - quiz.json (5 questions: PEMDAS, modulo, !=, +=, logical and)
  • Day_04_Strings - quiz.json (5 questions: slicing, f-strings, strip/lower, immutability, title())
  • Day_05_Lists - quiz.json (5 questions: slicing, aliasing, append(), sorted(), max())
  • Day_06_Tuples - quiz.json (5 questions: single-element tuple, immutability, unpacking, use cases, sum())
  • Day_07_Sets - quiz.json (5 questions: uniqueness, intersection, empty set, symmetric difference, hashability)
  • Day_08_Dictionaries - quiz.json (5 questions: get(), items(), comprehension, KeyError, len())
  • Day_09_Conditionals - quiz.json (5 questions: and logic, elif, falsy values, ternary, business scenario)
  • Day_10_Loops - quiz.json (5 questions: range(), break, enumerate(), while vs for, zip())
  • Day_11_Functions - quiz.json (5 questions: mutable default, **kwargs, pure functions, composition, lambda)
  • Day_11B_Generators_Iterators - quiz.json (5 questions: iterator protocol, memory advantage, yield, yield from, pipeline)
  • Day_11C_Debugging_Workflows - quiz.json (5 questions: traceback reading, KeyError, debug workflow, breakpoint(), logging)
  • Day_12_List_Comprehension - quiz.json (5 questions: basic, filter, generator expr, nested, readability)

Capstone Project Scaffolds (content/projects/)

  • 01_python_data_pipeline/ - Phase 1–2 skills showcase
  • 02_web_dashboard/ - Phase 3 Flask/Streamlit project
  • 03_ml_churn_predictor/ - Phase 4–5 end-to-end ML model
  • 04_bi_analytics_suite/ - Phase 6–7 Tableau/Power BI + SQL
  • 05_sql_data_warehouse/ - Phase 8–9 full DDL + ETL
  • 06_llm_data_assistant/ - Phase 10 RAG / agent demo

MBA Case Studies (content/case-studies/) ✅ Complete

Each case study includes a comprehensive README.md with hand-holding walkthrough (step-by-step guidance with checkpoints), a starter.py scaffold, and a data_generator.py for synthetic data.

  • 01 - Retail Customer Churn (Logistic Regression, XGBoost) - Phase 4–5
  • 02 - Finance Fraud Detection (Anomaly Detection, GNN) - Phase 5
  • 03 - Healthcare Patient Risk (Ensemble, Probabilistic) - Phase 5
  • 04 - E-Commerce Recommendations (Collaborative Filtering) - Phase 5
  • 05 - Marketing Campaign Attribution (A/B Testing, Causal Inference) - Phase 6
  • 06 - Operations Demand Forecasting (Time Series, ARIMA, Prophet) - Phase 5
  • 07 - HR Attrition Prediction (Classification, SHAP) - Phase 4–5
  • 08 - SaaS Growth Analytics (Cohorts, Product Analytics) - Phase 7
  • 09 - Supply Chain Inventory (LP, Simulation) - Phase 4
  • 10 - Banking Credit Scoring (Scorecard, Fairness) - Phase 6

Collaborative Features

  • Shareable progress link (base64-encoded state)
  • "Challenge a friend" - send quiz links
  • Discussion prompts at the end of each lesson
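
The shareable progress link could work roughly as sketched below: serialise a small slice of state to JSON, base64-encode it with a URL-safe alphabet, and decode defensively on the receiving side. The `SharedProgress` shape and both function names are hypothetical, since the real store's fields are not listed here. The sketch relies on the `btoa` and `atob` globals available in browsers and recent Node versions.

```typescript
// Hypothetical shape of the state worth sharing; the real store may hold more fields.
interface SharedProgress {
  completedDays: number[];
  xp: number;
}

// Encode: JSON -> percent-escape (keeps btoa input ASCII) -> base64 -> URL-safe alphabet.
function encodeProgress(state: SharedProgress): string {
  const json = JSON.stringify(state);
  return btoa(encodeURIComponent(json))
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}

// Decode: reverse the steps above; returns null for malformed or unexpected input.
function decodeProgress(token: string): SharedProgress | null {
  try {
    let base64 = token.replace(/-/g, "+").replace(/_/g, "/");
    while (base64.length % 4 !== 0) {
      base64 += "=";
    }
    const parsed = JSON.parse(decodeURIComponent(atob(base64)));
    if (!Array.isArray(parsed.completedDays) || typeof parsed.xp !== "number") {
      return null;
    }
    return { completedDays: parsed.completedDays, xp: parsed.xp };
  } catch (e) {
    return null;
  }
}
```

Base64 is an encoding, not a protection: anyone holding the link can read or edit the state, so the decoded value should be treated as untrusted input and displayed read-only.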

Offline / PWA Support

  • Service Worker for offline lesson reading
  • Cache lesson markdown on first visit
  • Offline progress tracking with sync-on-reconnect
  • PWA manifest for "Add to Home Screen"

🔵 Priority 4 - DX & Infrastructure

Content Quality Automation

  • scripts/audit-lessons.js - verify all days meet depth standard (500+ words, 3 exercises, 5 Q&A)
  • scripts/generate-quiz-stubs.js - scaffold missing quiz.json files
  • scripts/check-phase-overviews.js - flag overviews below 300 lines / 10 KB
  • CI gate: fail build if any lesson fails the depth audit
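
The core of the depth audit can be kept as a pure function so the CI gate only has to check whether any failures came back. The sketch below uses the thresholds stated above (500+ words, 3 exercises, 5 Q&A). The `LessonStats` input and the function names are hypothetical, and the exercise and Q&A counts are assumed to be computed elsewhere, because the markdown conventions for marking them are not specified here.

```typescript
// Thresholds taken from the roadmap: 500+ words, 3 exercises, 5 Q&A per lesson.
const DEPTH_STANDARD = { minWords: 500, minExercises: 3, minQA: 5 };

// Hypothetical input: how exercises and Q&A items are detected depends on the
// lesson markdown conventions, so the counts are passed in precomputed.
interface LessonStats {
  path: string;
  body: string;
  exerciseCount: number;
  qaCount: number;
}

// Counts whitespace-separated tokens; a rough proxy for prose length.
function countWords(text: string): number {
  return text.split(/\s+/).filter((token) => token.length > 0).length;
}

// Returns one message per unmet criterion; an empty list means the lesson passes.
function auditLesson(lesson: LessonStats): string[] {
  const failures: string[] = [];
  const words = countWords(lesson.body);
  if (words < DEPTH_STANDARD.minWords) {
    failures.push(`${lesson.path}: ${words} words (need ${DEPTH_STANDARD.minWords}+)`);
  }
  if (lesson.exerciseCount < DEPTH_STANDARD.minExercises) {
    failures.push(`${lesson.path}: ${lesson.exerciseCount} exercises (need ${DEPTH_STANDARD.minExercises}+)`);
  }
  if (lesson.qaCount < DEPTH_STANDARD.minQA) {
    failures.push(`${lesson.path}: ${lesson.qaCount} Q&A items (need ${DEPTH_STANDARD.minQA}+)`);
  }
  return failures;
}

// Aggregates failures across lessons; a CI script could exit non-zero if this is non-empty.
function auditAll(lessons: LessonStats[]): string[] {
  let failures: string[] = [];
  lessons.forEach((lesson) => {
    failures = failures.concat(auditLesson(lesson));
  });
  return failures;
}
```

Keeping file reading and process exit codes outside these functions makes the thresholds easy to unit-test alongside the existing Zustand and Zod tests.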

Cross-Phase "Career Tracks" Page

  • Design "career tracks" routing: Data Scientist / Analytics Engineer / ML Engineer tracks
  • App page linking days by specialisation
  • "What's Next" sidebar section on each phase overview page

Advanced Visualisations

  • Skill radar chart on Progress page
  • Heatmap calendar of study activity (GitHub-style)
  • Animated dependency tree of tech concepts
  • 3D concept graph (Three.js / React Three Fiber) - stretch goal

Testing

  • Increase unit test coverage to 80%+
  • Snapshot tests for all major page components
  • Accessibility audit (axe-core) in CI

2026 Market Alignment β€” Gap Tracker

| Skill | Coverage | 2026 Demand | Action |
|---|---|---|---|
| Python fundamentals | ✅ Phase 1–2 | High | - |
| Pandas / NumPy | ✅ Phase 2 | High | - |
| ML fundamentals | ✅ Phase 4–5 | High | - |
| Deep learning | ✅ Phase 5 | High | - |
| MLOps | ✅ Phase 5 (Day 50, 65) | Very High | Minor depth increase |
| LLMs / GPT APIs | ✅ Phase 10 | Critical | Polish pass |
| RAG & Vector DBs | ✅ Phase 10 (Day 112) | Critical | Add Day 60C cross-ref |
| AI Agents | ✅ Phase 10 (Day 115) | Critical | Add Day 67B Phase 6 bridge |
| dbt | ⚠️ Day 84B (planned) | High | Implement P1 |
| Cloud (AWS/GCP/Azure) | ⚠️ Phase 8 surface only | Very High | Add Phase 11 |
| Kafka / Streaming | ⚠️ Day 96C (planned) | High | Implement P2 |
| BI / Tableau | ✅ Phase 6–7 | High | - |
| SQL mastery | ✅ Phase 8–9 | High | - |
| Data governance | ✅ Phase 7–8 | High | - |
| Responsible AI | ⚠️ Day 62 (partial) | High | Add Day 67C Phase 6 |

🌌 Ambitious / Long-Horizon

  • Multi-user mode: instructor dashboard, cohort progress overview
  • AI-graded exercise submissions (code execution sandbox)
  • Video lesson stubs β€” embed Loom / YouTube per day
  • Phase 13: Financial Modelling & Quant Finance (Python, Monte Carlo, Black-Scholes)
  • Phase 14: Web3 & Decentralised Data (Solidity basics, on-chain analytics)
  • Localisation: Spanish & Mandarin translations of lesson summaries
  • Mobile app (React Native): offline-first lesson reader with push-notification streaks

This is a living document. Update priorities as phases are completed or the market shifts.