
# minicode

## Install

Install [uv](https://docs.astral.sh/uv/).
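If uv is not already installed, the uv docs offer a standalone installer (their documented one-liner for macOS/Linux; `pip install uv` also works):

```shell
# Official uv standalone installer (macOS/Linux);
# on Windows, use the PowerShell installer from the uv docs instead.
curl -LsSf https://astral.sh/uv/install.sh | sh
```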

## Creating benchmark splits locally

Run all commands from the repository root directory.

1. CodeContests

   ```sh
   uv run python -m minicode.setup_codecontests
   ```

2. Small repositories

   ```sh
   uv run python -m minicode.setup_repos
   ```

3. Large repositories

   ```sh
   uv run python -m minicode.setup_large_repos
   ```

## Run agent baseline

Make sure that a `.env` file exists in the repository root with `TOGETHER_API_KEY`, `OPENAI_API_KEY`, and `ANTHROPIC_API_KEY` set.
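For reference, the `.env` file is plain `KEY=value` lines (placeholders shown here; substitute your real keys):

```
TOGETHER_API_KEY=...
OPENAI_API_KEY=...
ANTHROPIC_API_KEY=...
```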

1. CodeContests

   ```sh
   bash scripts/codecontests/run_claude.sh
   # or
   bash scripts/codecontests/run_codex.sh
   ```

   To get the evaluation results, run:

   ```sh
   uv run python scripts/codecontests/summarize_eval.py
   ```

2. Small repositories

   ```sh
   bash scripts/small_repos/run_codex.sh
   # or
   bash scripts/small_repos/run_claude.sh
   ```
3. Large repositories

   At the moment, we run the agents locally. To get `<repo_name>`, first run the setup script for large repositories above, then check the `large_repos` directory.

   ```sh
   bash scripts/large_repos/run_claude.sh <repo_name>
   ```

   To get the evaluation results, run:

   ```sh
   uv run python scripts/large_repos/summarize_eval.py <repo_name>
   ```
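The `<repo_name>` lookup described above can be sketched as follows (the fallback message is just for illustration; the `large_repos` directory name comes from the setup step):

```shell
# Each subdirectory of large_repos/ is a valid <repo_name>
# to pass to the run and summarize scripts.
ls large_repos/ 2>/dev/null || echo "run minicode.setup_large_repos first"
```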

## Other

- Repositories were synthesized via Claude 3.7 and Claude Code here