
# minicode

## Install

Install [uv](https://docs.astral.sh/uv/).
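If uv is not already installed, the uv docs offer a standalone installer (their documented one-liner for macOS/Linux; `pip install uv` also works):

```shell
# Official uv standalone installer (macOS/Linux);
# on Windows, use the PowerShell installer from the uv docs instead.
curl -LsSf https://astral.sh/uv/install.sh | sh
```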

## Creating benchmark splits locally

Run all commands from the repository root directory.

1. CodeContests

   ```sh
   uv run python -m minicode.setup_codecontests
   ```

2. Small repositories

   ```sh
   uv run python -m minicode.setup_repos
   ```

3. Large repositories

   ```sh
   uv run python -m minicode.setup_large_repos
   ```

## Run agent baseline

Make sure that a `.env` file exists in the repository root with `TOGETHER_API_KEY`, `OPENAI_API_KEY`, and `ANTHROPIC_API_KEY` set.
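For reference, the `.env` file is plain `KEY=value` lines (placeholders shown here; substitute your real keys):

```
TOGETHER_API_KEY=...
OPENAI_API_KEY=...
ANTHROPIC_API_KEY=...
```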

1. CodeContests

   ```sh
   bash scripts/codecontests/run_claude.sh
   # or
   bash scripts/codecontests/run_codex.sh
   ```

   To get the evaluation results, run:

   ```sh
   uv run python scripts/codecontests/summarize_eval.py
   ```

2. Small repositories

   ```sh
   bash scripts/small_repos/run_codex.sh
   # or
   bash scripts/small_repos/run_claude.sh
   ```
3. Large repositories

   At the moment, we run the agents locally. To get `<repo_name>`, first run the setup script for large repositories above, then check the `large_repos` directory.

   ```sh
   bash scripts/large_repos/run_claude.sh <repo_name>
   ```

   To get the evaluation results, run:

   ```sh
   uv run python scripts/large_repos/summarize_eval.py <repo_name>
   ```
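The `<repo_name>` lookup described above can be sketched as follows (the fallback message is just for illustration; the `large_repos` directory name comes from the setup step):

```shell
# Each subdirectory of large_repos/ is a valid <repo_name>
# to pass to the run and summarize scripts.
ls large_repos/ 2>/dev/null || echo "run minicode.setup_large_repos first"
```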

## Other

- Repositories were synthesized via Claude 3.7 and Claude Code here