Add benchmarks workflow and CI for collector throughput #224
base: develop
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,78 @@ | ||
| name: Benchmarks | ||
|
|
||
| on: | ||
| workflow_dispatch: | ||
|
|
||
| concurrency: | ||
| group: ${{ github.workflow }}-${{ github.ref }} | ||
| cancel-in-progress: true | ||
|
|
||
| jobs: | ||
| benchmark: | ||
| runs-on: ubuntu-latest | ||
| timeout-minutes: 30 | ||
|
|
||
| services: | ||
| postgres: | ||
| image: postgres:16 | ||
| env: | ||
| POSTGRES_USER: postgres | ||
| POSTGRES_PASSWORD: postgres | ||
| POSTGRES_DB: postgres | ||
| ports: ["5432:5432"] | ||
| options: >- | ||
| --health-cmd pg_isready | ||
| --health-interval 10s | ||
| --health-timeout 5s | ||
| --health-retries 5 | ||
| --shm-size=256mb | ||
|
|
||
| steps: | ||
| - name: Checkout | ||
| uses: actions/checkout@v4 | ||
|
|
||
| - name: Install uv | ||
| uses: astral-sh/setup-uv@v7 | ||
| with: | ||
| python-version: "3.13" | ||
|
|
||
| - name: Cache uv | ||
| uses: actions/cache@v4 | ||
| with: | ||
| path: ~/.cache/uv | ||
| key: ${{ runner.os }}-uv-benchmark-${{ hashFiles('requirements-dev.lock') }} | ||
| restore-keys: | | ||
| ${{ runner.os }}-uv-benchmark- | ||
| ${{ runner.os }}-uv- | ||
|
|
||
| - name: Install dependencies | ||
| env: | ||
| SETUPTOOLS_SCM_WRITE_TO_SOURCE: "1" | ||
| run: | | ||
| uv venv | ||
| uv pip install -r requirements-dev.lock | ||
| uv pip install -e . | ||
|
|
||
| - name: Run benchmarks | ||
| env: | ||
| DATABASE_URL: postgres://postgres:postgres@127.0.0.1:5432/postgres | ||
| SECRET_KEY: for-testing-only | ||
| DJANGO_SETTINGS_MODULE: config.test_settings | ||
| RUN_BENCHMARKS: "1" | ||
| run: | | ||
| uv run pytest benchmarks/ -m benchmark --benchmark-only \ | ||
| --benchmark-json=bench.json -v \ | ||
| --benchmark-disable-gc | ||
|
|
||
| - name: Compare to baselines | ||
| if: success() | ||
| run: | | ||
| uv run python benchmarks/compare_to_baseline.py bench.json benchmarks/baselines.json | ||
|
|
||
| - name: Upload benchmark JSON | ||
| if: always() | ||
| uses: actions/upload-artifact@v4 | ||
| with: | ||
| name: benchmark-json | ||
| path: bench.json | ||
| retention-days: 30 | ||
| Original file line number | Diff line number | Diff line change | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -34,11 +34,11 @@ Each Django app that has **models** provides a **`services.py`** module. This is | |||||||||||
| | `cppa_slack_tracker` | `cppa_slack_tracker/services.py` | Slack teams, channels, messages, membership. | | ||||||||||||
| | `wg21_paper_tracker` | `wg21_paper_tracker/services.py` | WG21 papers, authors, mailings. | | ||||||||||||
|
|
||||||||||||
| For a full list of functions, parameter/return types, and validation (e.g. empty `name` raises `ValueError`), see **[Service_API.md](Service_API.md)** and the per-app docs in **[service_api/](service_api/)** (index: [service_api/README.md](service_api/README.md)). DTO protocols shared across trackers are documented in **[service_api/core_protocols.md](service_api/core_protocols.md)** (generated from `core/protocols.py`). | ||||||||||||
| For a full list of functions, parameter/return types, and validation (e.g. empty `name` raises `ValueError`), see **[docs/Service_API.md](docs/Service_API.md)** and the per-app docs in **[docs/service_api/](docs/service_api/)** (index: [docs/service_api/README.md](docs/service_api/README.md)). | ||||||||||||
|
|
||||||||||||
| ### Regenerating service API docs | ||||||||||||
|
|
||||||||||||
| Reference tables in `docs/service_api/*.md` are produced by **[`scripts/generate_service_docs.py`](../scripts/generate_service_docs.py)** from each app’s `services.py` and from `core/protocols.py`. | ||||||||||||
| Reference tables in `docs/service_api/*.md` are produced by **[`scripts/generate_service_docs.py`](scripts/generate_service_docs.py)** from each app’s `services.py` and from `core/protocols.py`. | ||||||||||||
|
|
||||||||||||
| - **Markers:** Each file contains `<!-- SERVICE_API:GENERATED:START -->` … `<!-- SERVICE_API:GENERATED:END -->`. The script replaces **only** that region. Put hand-written notes (usage, cross-app warnings, command help) **below** the `END` marker. | ||||||||||||
| - **Regenerate locally:** `python scripts/generate_service_docs.py` (optional: `--app <django_app_label>` for one module). | ||||||||||||
|
|
@@ -65,22 +65,47 @@ Reference tables in `docs/service_api/*.md` are produced by **[`scripts/generate | |||||||||||
|
|
||||||||||||
| ### Testing | ||||||||||||
|
|
||||||||||||
| - **Running tests:** From the project root, install dev deps (`pip install -r requirements-dev.lock` or `uv pip install -r requirements-dev.lock`), start the test database (`docker compose -f docker-compose.test.yml up -d`), set `DATABASE_URL` (and `SECRET_KEY` for the process) as in [README.md](../README.md#running-tests), then run `python -m pytest`. Tests **always use PostgreSQL** (`config.test_settings`); there is no SQLite fallback. | ||||||||||||
| - See [README.md](../README.md#running-tests) and [Development_guideline.md](Development_guideline.md#testing-workflow) for full commands and options. | ||||||||||||
| - **Running tests:** From the project root, install dev deps (`pip install -r requirements-dev.lock` or `uv pip install -r requirements-dev.lock`), start the test database (`docker compose -f docker-compose.test.yml up -d`), set `DATABASE_URL` (and `SECRET_KEY` for the process) as in [README.md](README.md#running-tests), then run `python -m pytest`. Tests **always use PostgreSQL** (`config.test_settings`); there is no SQLite fallback. | ||||||||||||
| - See [README.md](README.md#running-tests) and [docs/Development_guideline.md](docs/Development_guideline.md#testing-workflow) for full commands and options. | ||||||||||||
| - **Unit tests for `services.py`:** Call the service functions and assert on the database (or mocks) as needed. | ||||||||||||
| - **Other tests:** Prefer service functions when setting up data. If you must create models directly for tests, keep it in test code (e.g. fixtures or test helpers) and avoid doing the same in production code. | ||||||||||||
|
|
||||||||||||
| ### Performance benchmarks | ||||||||||||
|
|
||||||||||||
| Throughput checks live under [`benchmarks/`](benchmarks/) and use **`pytest-benchmark`**. They are **not** collected during normal `pytest` runs: set **`RUN_BENCHMARKS=1`** so the root [`conftest.py`](conftest.py) stops ignoring that directory (see `collect_ignore`). Tests are marked with **`@pytest.mark.benchmark`**. | ||||||||||||
|
|
||||||||||||
| **Prerequisites:** Same as unit tests: PostgreSQL, `DATABASE_URL`, `SECRET_KEY`, `DJANGO_SETTINGS_MODULE=config.test_settings` (see [README.md](README.md#running-tests)). | ||||||||||||
|
|
||||||||||||
| **Run locally** (from repo root, with Postgres up): | ||||||||||||
|
|
||||||||||||
| ```bash | ||||||||||||
| export RUN_BENCHMARKS=1 | ||||||||||||
| export DATABASE_URL=postgres://postgres:postgres@127.0.0.1:5433/postgres | ||||||||||||
| export SECRET_KEY=for-local-only | ||||||||||||
| export DJANGO_SETTINGS_MODULE=config.test_settings | ||||||||||||
| # Optional: batch size (default 50; match benchmarks/baselines.json "n") | ||||||||||||
| export BENCHMARK_COMMIT_N=50 | ||||||||||||
|
|
||||||||||||
| uv run pytest benchmarks/ -m benchmark --benchmark-only \ | ||||||||||||
| --benchmark-json=bench.json -v | ||||||||||||
|
Review comment on lines +89 to +90:

🧩 Analysis chain

🏁 Script executed:

    #!/bin/bash
    # Check consistency of benchmark GC flag between workflow and docs.
    rg -n -- '--benchmark-disable-gc' .github/workflows/benchmarks.yml CONTRIBUTING.md
    # Expected after fix: matches in both files

Repository: cppalliance/boost-data-collector
Length of output: 147

🏁 Script executed:

    #!/bin/bash
    # Get context around the benchmark command in CONTRIBUTING.md (lines 85-95)
    echo "=== CONTRIBUTING.md (lines 85-95) ==="
    sed -n '85,95p' CONTRIBUTING.md
    # Get context around the flag in the CI workflow (lines 60-70)
    echo -e "\n=== .github/workflows/benchmarks.yml (lines 60-70) ==="
    sed -n '60,70p' .github/workflows/benchmarks.yml

Repository: cppalliance/boost-data-collector
Length of output: 1371

Align local benchmark command with CI benchmark settings.

Lines 89–90 omit `--benchmark-disable-gc`, which the CI workflow passes.

Suggested doc update:

      uv run pytest benchmarks/ -m benchmark --benchmark-only \
    - --benchmark-json=bench.json -v
    + --benchmark-json=bench.json -v \
    + --benchmark-disable-gc
||||||||||||
| uv run python benchmarks/compare_to_baseline.py bench.json benchmarks/baselines.json | ||||||||||||
| ``` | ||||||||||||
|
|
||||||||||||
| **Baselines:** [`benchmarks/baselines.json`](benchmarks/baselines.json) stores maximum acceptable **median** seconds per scenario (for the configured `n`). The compare script fails if any median exceeds `baseline_median × 1.25` (more than 25% slower than the reference). After a deliberate performance change or a CI image upgrade, update `median_seconds` (and `n` if you change `BENCHMARK_COMMIT_N`) using `stats.median` from the generated JSON. | ||||||||||||
|
|
||||||||||||
| **CI:** The [`.github/workflows/benchmarks.yml`](.github/workflows/benchmarks.yml) workflow runs on **`workflow_dispatch`** only, uploads `bench.json` as an artifact, and runs the compare step on success. | ||||||||||||
|
|
||||||||||||
| ## Other guidelines | ||||||||||||
|
|
||||||||||||
| - **Branching:** Create feature branches from `develop`. Open pull requests against `develop`. See [Development_guideline.md](Development_guideline.md). | ||||||||||||
| - **Branching:** Create feature branches from `develop`. Open pull requests against `develop`. See [docs/Development_guideline.md](docs/Development_guideline.md). | ||||||||||||
| - **Code style:** Use Python 3.11+ and follow Django and project conventions. Use the project’s logging (`logging.getLogger(__name__)`). Before pushing, run **`uv run pyright`** (with dev deps) for the paths covered by **`pyrightconfig.json`**, and ensure CI’s **lint** / **pyright** / **test** jobs would pass. | ||||||||||||
| - **Database:** Use the Django ORM and migrations. Writes only through the service layer as above. | ||||||||||||
| - **Docs:** Update this doc (and app `services.py` docstrings) when adding new apps or changing the write rules. After changing `services.py` or `core/protocols.py`, run `python scripts/generate_service_docs.py` and commit the updated `docs/service_api/` files. | ||||||||||||
| - **Docs:** Update this file (and app `services.py` docstrings) when adding new apps or changing the write rules. After changing `services.py` or `core/protocols.py`, run `python scripts/generate_service_docs.py` and commit the updated `docs/service_api/` files. | ||||||||||||
|
|
||||||||||||
| ## Related documentation | ||||||||||||
|
|
||||||||||||
| - [Service_API.md](Service_API.md) – API reference for all service layer functions. | ||||||||||||
| - [Development_guideline.md](Development_guideline.md) – Setup, workflow, adding apps. | ||||||||||||
| - [Workflow.md](Workflow.md) – Execution order and collectors. | ||||||||||||
| - [Schema.md](Schema.md) – Database schema. | ||||||||||||
| - [cross-app-dependencies.md](cross-app-dependencies.md) – Complete map of every cross-app FK, MTI, ORM read, and Python import dependency, plus `import-linter` recommendations. | ||||||||||||
| - [docs/Service_API.md](docs/Service_API.md) – API reference for all service layer functions. | ||||||||||||
| - [docs/Development_guideline.md](docs/Development_guideline.md) – Setup, workflow, adding apps. | ||||||||||||
| - [docs/Workflow.md](docs/Workflow.md) – Execution order and collectors. | ||||||||||||
| - [docs/Schema.md](docs/Schema.md) – Database schema. | ||||||||||||
| - [docs/cross-app-dependencies.md](docs/cross-app-dependencies.md) – Complete map of every cross-app FK, MTI, ORM read, and Python import dependency, plus `import-linter` recommendations. | ||||||||||||
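The "Performance benchmarks" section added above says the root `conftest.py` ignores `benchmarks/` unless `RUN_BENCHMARKS=1` is set, via `collect_ignore`. The PR's actual `conftest.py` is not shown on this page, so the following is only a minimal sketch of how such a gate could look. The helper name `benchmark_collect_ignore` is hypothetical, and treating only the exact value `"1"` as "enabled" is an assumption.

```python
import os


def benchmark_collect_ignore(environ):
    """Return the paths pytest should skip, given an environment mapping.

    Hypothetical helper: the real conftest.py may be structured differently.
    """
    # Assumption: only the exact string "1" switches benchmarks on.
    if environ.get("RUN_BENCHMARKS") == "1":
        return []
    return ["benchmarks"]


# pytest reads the module-level name `collect_ignore` from conftest.py
# and skips the listed paths during collection.
collect_ignore = benchmark_collect_ignore(os.environ)
```

Keeping the decision in a small function makes the gate easy to unit-test without touching the real process environment.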
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,132 @@ | ||
| { | ||
| "machine_info": { | ||
| "node": "Leos-Mac-mini.local", | ||
| "processor": "arm", | ||
| "machine": "arm64", | ||
| "python_compiler": "Clang 21.1.4 ", | ||
| "python_implementation": "CPython", | ||
| "python_implementation_version": "3.13.12", | ||
| "python_version": "3.13.12", | ||
| "python_build": [ | ||
| "main", | ||
| "Mar 10 2026 18:26:32" | ||
| ], | ||
| "release": "25.4.0", | ||
| "system": "Darwin", | ||
| "cpu": { | ||
| "python_version": "3.13.12.final.0 (64 bit)", | ||
| "cpuinfo_version": [ | ||
| 9, | ||
| 0, | ||
| 0 | ||
| ], | ||
| "cpuinfo_version_string": "9.0.0", | ||
| "arch": "ARM_8", | ||
| "bits": 64, | ||
| "count": 10, | ||
| "arch_string_raw": "arm64", | ||
| "brand_raw": "Apple M4" | ||
| } | ||
| }, | ||
| "commit_info": { | ||
| "id": "7bf1b7ea6657990eef44fdb362b762abb16e41ba", | ||
| "time": "2026-05-18T20:05:08-04:00", | ||
| "author_time": "2026-05-18T20:05:08-04:00", | ||
| "dirty": true, | ||
| "project": "boost-data-collector", | ||
| "branch": "develop" | ||
| }, | ||
|
Review comment on lines +2 to +38:

Do not commit raw benchmark result artifacts with local machine metadata.

This file includes host-identifying data (e.g., line 3) and ephemeral local run state. It should be produced in CI/local runs and stored as an artifact, not tracked in the repo.
||
| "benchmarks": [ | ||
| { | ||
| "group": null, | ||
| "name": "test_process_commit_data_batch", | ||
| "fullname": "benchmarks/test_github_commits_throughput.py::test_process_commit_data_batch", | ||
| "params": null, | ||
| "param": null, | ||
| "extra_info": { | ||
| "n": 50 | ||
| }, | ||
| "options": { | ||
| "disable_gc": false, | ||
| "timer": "perf_counter", | ||
| "min_rounds": 5, | ||
| "max_time": 1.0, | ||
| "min_time": 5e-06, | ||
| "warmup": false | ||
| }, | ||
| "stats": { | ||
| "min": 0.13009395799599588, | ||
| "max": 0.16657558304723352, | ||
| "mean": 0.14227045823354273, | ||
| "stddev": 0.01457181655810832, | ||
| "rounds": 5, | ||
| "median": 0.13689958304166794, | ||
| "iqr": 0.01724434396601282, | ||
| "q1": 0.1326302083034534, | ||
| "q3": 0.14987455226946622, | ||
| "iqr_outliers": 0, | ||
| "stddev_outliers": 1, | ||
| "outliers": "1;0", | ||
| "ld15iqr": 0.13009395799599588, | ||
| "hd15iqr": 0.16657558304723352, | ||
| "ops": 7.0288660936092535, | ||
| "total": 0.7113522911677137, | ||
| "data": [ | ||
| 0.16657558304723352, | ||
| 0.1334756250726059, | ||
| 0.13009395799599588, | ||
| 0.13689958304166794, | ||
| 0.14430754201021045 | ||
| ], | ||
| "iterations": 1 | ||
| } | ||
| }, | ||
| { | ||
| "group": null, | ||
| "name": "test_service_bulk_commits_and_file_changes", | ||
| "fullname": "benchmarks/test_service_bulk_insert.py::test_service_bulk_commits_and_file_changes", | ||
| "params": null, | ||
| "param": null, | ||
| "extra_info": { | ||
| "n": 50 | ||
| }, | ||
| "options": { | ||
| "disable_gc": false, | ||
| "timer": "perf_counter", | ||
| "min_rounds": 5, | ||
| "max_time": 1.0, | ||
| "min_time": 5e-06, | ||
| "warmup": false | ||
| }, | ||
| "stats": { | ||
| "min": 0.10591337503865361, | ||
| "max": 0.1513816670048982, | ||
| "mean": 0.13538706267718226, | ||
| "stddev": 0.01819949434483927, | ||
| "rounds": 6, | ||
| "median": 0.14058843749808148, | ||
| "iqr": 0.02617037494201213, | ||
| "q1": 0.12384004204068333, | ||
| "q3": 0.15001041698269546, | ||
| "iqr_outliers": 0, | ||
| "stddev_outliers": 1, | ||
| "outliers": "1;0", | ||
| "ld15iqr": 0.10591337503865361, | ||
| "hd15iqr": 0.1513816670048982, | ||
| "ops": 7.386230118489284, | ||
| "total": 0.8123223760630935, | ||
| "data": [ | ||
| 0.1513816670048982, | ||
| 0.15001041698269546, | ||
| 0.13251695793587714, | ||
| 0.12384004204068333, | ||
| 0.1486599170602858, | ||
| 0.10591337503865361 | ||
| ], | ||
| "iterations": 1 | ||
| } | ||
| } | ||
| ], | ||
| "datetime": "2026-05-19T18:09:23.360634+00:00", | ||
| "version": "5.2.3" | ||
| } | ||
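CONTRIBUTING.md says baselines should be updated "using `stats.median` from the generated JSON". As a rough illustration of that step, the sketch below pulls `fullname`, `stats.median`, and `extra_info.n` out of a report shaped like the one above. The function name `medians_from_report` and the inline `sample_report` are hypothetical; the field names are the ones visible in the JSON in this diff.

```python
import json


def medians_from_report(report):
    """Map each benchmark's fullname to its median seconds and batch size n."""
    entries = {}
    for bench in report["benchmarks"]:
        entries[bench["fullname"]] = {
            "median_seconds": bench["stats"]["median"],
            # extra_info.n is present in the report above; tolerate its absence.
            "n": bench.get("extra_info", {}).get("n"),
        }
    return entries


# Trimmed-down report with the same shape as the JSON above.
sample_report = {
    "benchmarks": [
        {
            "fullname": "benchmarks/test_github_commits_throughput.py::test_process_commit_data_batch",
            "extra_info": {"n": 50},
            "stats": {"median": 0.13689958304166794},
        }
    ]
}

print(json.dumps(medians_from_report(sample_report), indent=2))
```

Whether to round the medians or add headroom before writing them to `benchmarks/baselines.json` is a maintainer decision that this sketch does not make.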
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,13 @@ | ||||||||||||||||||||||||||
| { | ||||||||||||||||||||||||||
| "description": "Maximum acceptable median wall time (seconds) per scenario at BENCHMARK_COMMIT_N. Update median_seconds after intentional perf work or when CI hardware changes; copy medians from --benchmark-json stats.median.", | ||||||||||||||||||||||||||
| "benchmarks": { | ||||||||||||||||||||||||||
| "benchmarks/test_github_commits_throughput.py::test_process_commit_data_batch": { | ||||||||||||||||||||||||||
| "median_seconds": 45.0, | ||||||||||||||||||||||||||
| "n": 50 | ||||||||||||||||||||||||||
| }, | ||||||||||||||||||||||||||
| "benchmarks/test_service_bulk_insert.py::test_service_bulk_commits_and_file_changes": { | ||||||||||||||||||||||||||
| "median_seconds": 35.0, | ||||||||||||||||||||||||||
| "n": 50 | ||||||||||||||||||||||||||
|
Review comment on lines +5 to +10:

Baseline medians are so high that regression checks become ineffective.

Line 5 (…

🔧 Proposed fix:

    - "median_seconds": 45.0,
    + "median_seconds": 0.137,
    @@
    - "median_seconds": 35.0,
    + "median_seconds": 0.141,
||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||
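To make the review comment above concrete, here is a small sketch of the rule CONTRIBUTING.md describes (fail when a median exceeds `baseline_median × 1.25`). It is not the PR's `benchmarks/compare_to_baseline.py`, which is not shown on this page; `find_regressions`, `TOLERANCE`, and the 100x-slower run are assumptions made for illustration.

```python
TOLERANCE = 1.25  # from CONTRIBUTING.md: fail when median > baseline_median * 1.25


def find_regressions(medians, baselines, tolerance=TOLERANCE):
    """Return (fullname, median, limit) for each benchmark over its limit."""
    failures = []
    for fullname, entry in baselines.items():
        median = medians.get(fullname)
        if median is None:
            continue  # assumption: benchmarks missing from the run are skipped
        limit = entry["median_seconds"] * tolerance
        if median > limit:
            failures.append((fullname, median, limit))
    return failures


name = "benchmarks/test_github_commits_throughput.py::test_process_commit_data_batch"
committed = {name: {"median_seconds": 45.0, "n": 50}}   # value in this PR
proposed = {name: {"median_seconds": 0.137, "n": 50}}   # value from the review

# Hypothetical run about 100x slower than the recorded ~0.137 s median.
slow_run = {name: 13.69}

print(find_regressions(slow_run, committed))  # limit is 45.0 * 1.25 = 56.25 s
print(find_regressions(slow_run, proposed))   # limit is about 0.171 s
```

With the committed baseline the hypothetical slowdown passes unnoticed, while the proposed baseline flags it. Note that the recorded medians come from a local Apple M4 machine, so values suitable for `ubuntu-latest` runners may differ.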
🧩 Analysis chain

🏁 Script executed:

Repository: cppalliance/boost-data-collector
Length of output: 230

Pin all workflow actions to full commit SHAs.

Lines 32, 35, 40, and 74 use floating version tags (`@v4`, `@v7`) instead of immutable commit SHAs, which violates supply-chain pinning policy and allows uncontrolled updates to action implementations.

Suggested fix: replace each `@v*` tag with the pinned commit SHA from the action's releases page.

🧰 Tools

🪛 zizmor (1.25.2)

[warning] 31-32: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false

[error] 32-32: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)