- Summary
- Key Features
- Installation
- Update Jupygrader
- Usage
- Create an autogradable notebook
- AI-Assisted Grading
- Utility functions
- License
## Summary

Jupygrader is a Python package for automated grading of Jupyter notebooks. It provides a framework to:
- Execute and grade Jupyter notebooks containing student work and test cases
- Generate comprehensive reports in multiple formats (JSON, HTML, TXT)
- Extract student code from notebooks into separate Python files
- Verify notebook integrity by computing hashes of test cases and submissions
- Grade with AI assistance: use an LLM to grade manual items, review failures, or evaluate notebooks entirely without execution
## Key Features

- Executes notebooks in a controlled, temporary environment
- Preserves the original notebook while creating graded versions
- Adds grader scripts to notebooks to evaluate test cases
- Supports multiple grading modes:
- Automatic grading via assertions and tests
- Manual grading
- Hybrid (automatic + manual)
- AI-assisted grading (full or partial)
- Generates detailed grading results including:
- Individual test case scores
- Overall scores and summaries
- Success/failure status of each test
- Produces multiple output formats for instructors to review:
- Graded notebook (.ipynb)
- HTML report
- JSON result data
- Plaintext summary
- Extracted Python code
- Includes metadata like Python version, platform, and file hashes for verification
## Installation

```shell
pip install jupygrader
```

## Update Jupygrader

```shell
pip install --upgrade jupygrader
```

## Usage

```python
from jupygrader import grade_notebooks

notebook_file_path = 'path/to/notebook.ipynb'
grade_notebooks([notebook_file_path])
```

Supplying a `pathlib.Path` object is also supported.
```python
from jupygrader import grade_notebooks
from pathlib import Path

notebook_path = Path('path/to/notebook.ipynb')
grade_notebooks([notebook_path])
```

If `output_path` is not specified, the output files are stored in the same directory as the notebook file.
During grading, Jupygrader preprocesses code cells and comments out lines that start with IPython shell/magic prefixes (! and %). This prevents notebook-only commands from causing syntax errors in the Python-based grading pipeline.
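The effect of this preprocessing step can be sketched as follows. This is a minimal illustration of the behavior described above, not Jupygrader's actual implementation:

```python
def comment_out_magics(source: str) -> str:
    """Comment out lines that start with IPython shell (!) or magic (%) prefixes."""
    out_lines = []
    for line in source.splitlines():
        if line.lstrip().startswith(("!", "%")):
            # Neutralize notebook-only commands so they don't break plain Python
            out_lines.append("# " + line)
        else:
            out_lines.append(line)
    return "\n".join(out_lines)

print(comment_out_magics("!pip install pandas\nx = 1"))
# → # !pip install pandas
#   x = 1
```

A cell such as `!pip install pandas` would therefore execute as a harmless comment during grading.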
```python
from jupygrader import grade_notebooks

grade_notebooks([{
    "notebook_path": 'path/to/notebook.ipynb',
    "output_path": 'path/to/output'
}])
```

## Create an autogradable notebook

The instructor authors only one "solution" notebook, which contains both the solution code and the test cases for all graded parts.
Jupygrader provides a simple drag-and-drop interface to generate a student-facing notebook that removes the solution code and obfuscates test cases if required.
Any code between `# YOUR CODE BEGINS` and `# YOUR CODE ENDS` is stripped in the student version.
```python
import pandas as pd

# YOUR CODE BEGINS
sample_series = pd.Series([-20, -10, 10, 20])
# YOUR CODE ENDS

print(sample_series)
```

nbgrader syntax (`### BEGIN SOLUTION`, `### END SOLUTION`) is also supported.
```python
import pandas as pd

### BEGIN SOLUTION
sample_series = pd.Series([-20, -10, 10, 20])
### END SOLUTION

print(sample_series)
```

In the student-facing notebook, the code cell will look like:
```python
import pandas as pd

# YOUR CODE BEGINS
# YOUR CODE ENDS

print(sample_series)
```

To keep setup notes or helper code in the instructor notebook only, start a cell with one of the following markers. The entire cell will be removed from the generated student version:
- `# GRADER_ONLY` (case-insensitive)
- `# grader_only` (case-insensitive)
- `! grader_only` (case-insensitive)
- `_grader_only = True` (case-sensitive; whitespace is ignored)
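A check for these markers could be approximated like this. This is an illustrative sketch of the rules listed above, not Jupygrader's internal logic:

```python
def is_grader_only(cell_source: str) -> bool:
    """Return True if a cell should be removed from the student version."""
    stripped = cell_source.strip()
    if not stripped:
        return False
    first_line = stripped.splitlines()[0]
    # Comment/shell markers are matched case-insensitively
    if first_line.lower().startswith(("# grader_only", "! grader_only")):
        return True
    # The assignment marker is case-sensitive; whitespace is ignored
    if first_line.replace(" ", "").startswith("_grader_only=True"):
        return True
    return False
```

For example, `is_grader_only("# GRADER_ONLY\nhelper_setup()")` returns `True`, while an ordinary cell such as `x = 1` is kept.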
A graded test case requires a test case name and an assigned point value.
- The `_test_case` variable should store the name of the test case.
- The `_points` variable should store the number of points, either as an integer or a float.
```python
_test_case = 'create-a-pandas-series'
_points = 2

pd.testing.assert_series_equal(sample_series, pd.Series([-20, -10, 10, 20]))
```

Mark a test case with `_grade_manually = True` to flag it for human (or AI) review instead of assertion-based grading.
```python
_test_case = 'explain-your-approach'
_points = 5
_grade_manually = True

# Students write a free-response answer here
```

If you want to prevent learners from seeing the test case code, you can optionally set `_obfuscate = True` to base64-encode the test cases.
Note that this provides only basic obfuscation, and students can easily decode the string to reveal the original code.
We may introduce an encryption method in the future.
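To see why this is only light obfuscation, note that the encoding is plain base64, which anyone can reverse with a single standard-library call. A quick round-trip demonstration (the variable names here are illustrative, not Jupygrader's):

```python
import base64

test_code = "pd.testing.assert_series_equal(sample_series, pd.Series([-20, -10, 10, 20]))"

# Encode the test case the way an obfuscated cell stores it
encoded = base64.b64encode(test_code.encode()).decode()

# A student can recover the original source just as easily
decoded = base64.b64decode(encoded).decode()
assert decoded == test_code
```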
**Instructor notebook**

```python
_test_case = 'create-a-pandas-series'
_points = 2
_obfuscate = True

pd.testing.assert_series_equal(sample_series, pd.Series([-20, -10, 10, 20]))
```

**Student notebook**
```python
# DO NOT CHANGE THE CODE IN THIS CELL
_test_case = 'create-a-pandas-series'
_points = 2
_obfuscate = True

import base64 as _b64
_64 = _b64.b64decode('cGQudGVzdGluZy5hc3NlcnRfc2VyaWVzX2VxdWFsKHNhbXBsZV9zZXJpZXMsIHBkLlNlcmllcyhbLT\
IwLCAtMTAsIDEwLCAyMF0pKQ==')
eval(compile(_64, '<string>', 'exec'))
```

### Add hidden test cases
Hidden test cases only run while grading.
```python
_test_case = 'create-a-pandas-series'
_points = 2

### BEGIN HIDDEN TESTS
pd.testing.assert_series_equal(sample_series, pd.Series([-20, -10, 10, 20]))
### END HIDDEN TESTS
```

Alternatively, check for the `is_jupygrader_env` global, which is only defined during grading:

```python
_test_case = 'create-a-pandas-series'
_points = 2

if 'is_jupygrader_env' in globals():
    pd.testing.assert_series_equal(sample_series, pd.Series([-20, -10, 10, 20]))
```

## AI-Assisted Grading

Jupygrader can use an OpenAI-compatible model to assist with grading. Set the `ai_mode` parameter to one of the following string values:
| `ai_mode` | Description |
|---|---|
| `"off"` | No AI grading (default) |
| `"full"` | AI grades all test cases based on notebook content; no execution required |
| `"manual_only"` | AI grades test cases marked `_grade_manually = True` |
| `"review_failed"` | AI reviews auto-graded test cases that failed |
| `"manual_and_failed"` | AI grades both manual items and failed test cases |
> **Note:** `openai_model` is required whenever `ai_mode` is not `"off"`. Omitting it raises a `ValueError`.
Use ai_mode="full" to have the AI evaluate every test case based solely on the notebook's content, without executing it. This is ideal for open-ended assignments, essay-style responses, or notebooks that include general instructions rather than assertion-based tests.
```python
import openai
from jupygrader import grade_notebooks

client = openai.OpenAI(api_key="your-api-key")

results = grade_notebooks(
    ["submissions/student1.ipynb", "submissions/student2.ipynb"],
    ai_mode="full",
    openai_client=client,
    openai_model="gpt-4o",
)
```

In `"full"` mode, test cases are parsed directly from the notebook's source cells (no execution). Notebooks without any test case cells are still processed, and output artifacts are generated.
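Identifying test-case cells from source alone can be sketched with a simple scan for the `_test_case` and `_points` assignments. This is a hypothetical approximation for illustration; Jupygrader's actual parser is internal:

```python
import re

def extract_test_cases(cell_sources):
    """Collect (name, points) pairs from cells that define _test_case and _points."""
    cases = []
    for src in cell_sources:
        name = re.search(r"_test_case\s*=\s*['\"]([^'\"]+)['\"]", src)
        points = re.search(r"_points\s*=\s*([0-9.]+)", src)
        if name and points:
            cases.append((name.group(1), float(points.group(1))))
    return cases

cells = [
    "_test_case = 'q1'\n_points = 2\nassert x == 1",
    "print('not a test cell')",
]
print(extract_test_cases(cells))  # → [('q1', 2.0)]
```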
Use ai_mode="review_failed" to have the AI explain why auto-graded test cases failed and optionally award partial credit.
```python
import openai
from jupygrader import grade_notebooks

client = openai.OpenAI(api_key="your-api-key")

results = grade_notebooks(
    ["submissions/student1.ipynb", "submissions/student2.ipynb"],
    ai_mode="review_failed",
    openai_client=client,
    openai_model="gpt-4o",
)
```

Use `ai_mode="manual_only"` to have the AI grade items marked `_grade_manually = True` in the notebook.
```python
import openai
from jupygrader import grade_notebooks

client = openai.OpenAI(api_key="your-api-key")

results = grade_notebooks(
    ["submissions/student1.ipynb", "submissions/student2.ipynb"],
    ai_mode="manual_only",
    openai_client=client,
    openai_model="gpt-4o",
)
```

Use `ai_mode="manual_and_failed"` to combine both workflows in a single pass.
```python
import openai
from jupygrader import grade_notebooks

client = openai.OpenAI(api_key="your-api-key")

results = grade_notebooks(
    ["submissions/student1.ipynb", "submissions/student2.ipynb"],
    ai_mode="manual_and_failed",
    openai_client=client,
    openai_model="gpt-4o",
)
```

Pass `custom_prompt` to give the AI model additional context or grading criteria. This works with all AI grading modes.
```python
import openai
from jupygrader import grade_notebooks

client = openai.OpenAI(api_key="your-api-key")

results = grade_notebooks(
    ["submissions/student1.ipynb"],
    ai_mode="full",
    openai_client=client,
    openai_model="gpt-4o",
    custom_prompt=(
        "This is a data analysis assignment. "
        "Award full points if the student produces a correct result, even if the approach differs. "
        "Deduct points for hard-coded values."
    ),
)
```

## Utility functions

If a test case needs to be updated before grading, use the `jupygrader.replace_test_case()` function.
This is useful when learners have already submitted their Jupyter notebooks, but the original notebook contains an incorrect test case.
```python
import nbformat
import jupygrader

nb = nbformat.read(notebook_path, as_version=4)
jupygrader.replace_test_case(nb, 'q1', '_test_case = "q1"\n_points = 6\n\nassert my_var == 3')
```

Below is a sample snippet demonstrating how to replace multiple test cases using a dictionary.
```python
import nbformat
import jupygrader

nb = nbformat.read(notebook_path, as_version=4)

new_test_cases = {
    'test_case_01': '_test_case = "test_case_01"\n_points = 6\n\npass',
    'test_case_02': '_test_case = "test_case_02"\n_points = 3\n\npass'
}

for tc_name, new_tc_code in new_test_cases.items():
    jupygrader.replace_test_case(nb, tc_name, new_tc_code)
```

## License

`jupygrader` is distributed under the terms of the MIT license.
