Add benchmarks workflow and CI for collector throughput #224

Open
leostar0412 wants to merge 2 commits into cppalliance:develop from leostar0412:feat/benchmarks-and-ci

Conversation

leostar0412 (Collaborator) commented May 20, 2026

Summary

  • Add benchmark suite and baseline comparison for key collector paths.
  • Add GitHub Actions workflow to run benchmarks in CI.

Test plan

  • CI workflow passes on this branch.
  • Benchmarks run locally (pytest / project benchmark command as documented).

Summary by CodeRabbit

  • New Features

    • Added automated performance benchmarking workflow that runs on demand in CI
    • Support for local benchmark testing with baseline comparison and regression detection
  • Documentation

    • Updated contributing guidelines with benchmarking instructions and prerequisites
    • Corrected documentation links throughout the project

@leostar0412 leostar0412 self-assigned this May 20, 2026
coderabbitai Bot commented May 20, 2026

Walkthrough

This pull request introduces a complete performance benchmarking infrastructure alongside documentation alignment. It adds pytest-based benchmark tests for measuring commit processing and service-layer write throughput, baseline tracking with automated comparison via CI, and updates all documentation references from legacy paths to a unified CONTRIBUTING.md location.

Changes

Benchmark Infrastructure and Documentation Updates

| Layer / File(s) | Summary |
|---|---|
| Benchmark framework, tests, and CI setup: benchmarks/conftest.py, benchmarks/test_github_commits_throughput.py, benchmarks/test_service_bulk_insert.py, benchmarks/compare_to_baseline.py, benchmarks/baselines.json, bench.json, .github/workflows/benchmarks.yml, pytest.ini, conftest.py, pyproject.toml, requirements-dev.in | New benchmark_commit_n fixture controls iterations; two benchmark tests measure _process_commit_data throughput over n payloads and service-layer batch commit+file-change writes. Baseline JSON tracks expected median timings with configurable regression ratio (1.25×). CI workflow provisions PostgreSQL, runs benchmarks with pytest-benchmark, executes comparison script on success, and uploads artifact. Configuration excludes benchmarks from package discovery and adds pytest marker. |
| Contributing guide with benchmark documentation: CONTRIBUTING.md | Service API documentation paths updated to docs/Service_API.md and docs/service_api/. Testing section aligned with README.md guidance. New "Performance benchmarks" section documents benchmark collection behavior, RUN_BENCHMARKS=1 requirement, local test execution, baseline/threshold mechanics, and CI workflow steps. |
| Service module docstring link corrections: boost_library_tracker/services.py, boost_mailing_list_tracker/services.py, boost_usage_tracker/services.py, cppa_pinecone_sync/services.py, cppa_slack_tracker/services.py, cppa_user_tracker/services.py, cppa_youtube_script_tracker/services.py, github_activity_tracker/services.py | Module-level docstrings updated from docs/Contributing.md to CONTRIBUTING.md for the project-wide rule that all model writes go through the service module. |
| Documentation file link corrections: README.md, docs/How_to_add_a_collector.md, docs/Onboarding.md, docs/README.md, docs/Service_API.md, docs/boost_library_docs_tracker.md, docs/cross-app-dependencies.md, docs/service_api/README.md, docs/service_api/boost_usage_tracker.md, docs/service_api/clang_github_tracker.md, docs/service_api/cppa_pinecone_sync.md, docs/service_api/cppa_user_tracker.md, docs/service_api/discord_activity_tracker.md, docs/service_api/github_activity_tracker.md | All internal documentation links corrected from legacy Contributing.md / docs/Contributing.md paths to CONTRIBUTING.md with appropriate relative paths (e.g., ../CONTRIBUTING.md or ../../CONTRIBUTING.md from nested docs directories). |

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly Related PRs

Suggested Reviewers

  • jonathanMLDev
  • wpak-ai

🐰 Benchmarks hop in with baseline grace,
Tests measure commits at steady pace,
Docs aligned from docs to root so clean,
Contributing guides where they should convene!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 37.50% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main changes: adding a benchmarks workflow and CI for collector throughput measurement. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/Service_API.md (1)

43-43: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the remaining stale Contributing.md link in this file.

Line 43 still points to Contributing.md, which is inconsistent with the canonical root CONTRIBUTING.md path and can break on case-sensitive environments.

Suggested fix
-Tables in each file are **generated** from source; see [Contributing.md](Contributing.md#regenerating-service-api-docs).
+Tables in each file are **generated** from source; see [CONTRIBUTING.md](../CONTRIBUTING.md#regenerating-service-api-docs).
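
The relative target depends on the directory of the file that contains the link, which is why docs/Service_API.md needs ../CONTRIBUTING.md while files under docs/service_api/ need ../../CONTRIBUTING.md. A minimal sketch of that computation, assuming CONTRIBUTING.md sits at the repository root; the helper name `relative_link` is hypothetical and not part of this PR:

```python
import posixpath


def relative_link(source_file: str, target_file: str) -> str:
    """Return the Markdown link target for target_file as seen from source_file.

    Both arguments are repository-relative paths using forward slashes.
    """
    # Links resolve against the directory that holds the linking file.
    source_dir = posixpath.dirname(source_file) or "."
    return posixpath.relpath(target_file, start=source_dir)


for source in (
    "README.md",
    "docs/Service_API.md",
    "docs/service_api/cppa_pinecone_sync.md",
):
    print(f"{source} -> {relative_link(source, 'CONTRIBUTING.md')}")
```

A check like this could be run over every Markdown file to confirm that no link still uses the legacy Contributing.md spelling, though the PR itself does not include such a script.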
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/Service_API.md` at line 43, Update the stale link that points to
"Contributing.md" to use the canonical root path "../CONTRIBUTING.md" in the
docs/Service_API.md content (the line that currently reads "see
[Contributing.md](Contributing.md#regenerating-service-api-docs)"); replace both
the link text and its target so the link becomes "see
[CONTRIBUTING.md](../CONTRIBUTING.md#regenerating-service-api-docs)" to avoid
broken links on case-sensitive environments.
🧹 Nitpick comments (1)
.github/workflows/benchmarks.yml (1)

31-33: ⚡ Quick win

Add persist-credentials: false to the checkout step for GitHub Actions.

The checkout action retains git credentials by default. Set persist-credentials: false to minimize token exposure in this workflow.

Suggested patch
       - name: Checkout
         uses: actions/checkout@v4
+        with:
+          persist-credentials: false
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/benchmarks.yml around lines 31 - 33, The Checkout step
using actions/checkout@v4 currently retains credentials by default; update the
"Checkout" step to include persist-credentials: false (i.e., add the key
persist-credentials with value false under the uses: actions/checkout@v4 entry)
so the checkout action does not persist git credentials into the workflow
environment.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/benchmarks.yml:
- Line 32: The workflow uses floating action tags (actions/checkout@v4,
astral-sh/setup-uv@v7, actions/cache@v4, and actions/upload-artifact@v4 at
around line 74) which must be replaced with immutable commit SHAs; locate each
uses: entry for actions/checkout, astral-sh/setup-uv, actions/cache, and
actions/upload-artifact and replace the `@v`* tag with the corresponding full
commit SHA from that action's GitHub releases/tags page so the workflow pins to
a specific commit SHA instead of a floating version.

In `@bench.json`:
- Around lines 2-38: The bench.json file contains local machine metadata under
machine_info and commit_info and should not be committed; remove bench.json from
the repo with git rm --cached, add its filename or pattern to .gitignore, and
update CI (the job that produces bench.json) to upload it as a build artifact
instead of committing it; if the host metadata must also be erased from
already-pushed history, rewrite history with a history-rewriting tool (git
filter-repo or BFG) and force-push.

In `@benchmarks/baselines.json`:
- Around line 5-10: The baseline median_seconds for the benchmarks are
unrealistically high and make regression checks useless; update the
"median_seconds" values for the affected entries (the first entry with
"median_seconds": 45.0 and the
"benchmarks/test_service_bulk_insert.py::test_service_bulk_commits_and_file_changes"
entry with "median_seconds": 35.0) to the actual recorded medians from this PR’s
bench.json (approximately 0.1369 and 0.1406 respectively) while leaving other
fields (like "n": 50) unchanged so CI regression thresholds are meaningful.

In `@benchmarks/compare_to_baseline.py`:
- Line 77: The error string uses a non-ASCII multiplication character "×" which
triggers Ruff RUF001; update the formatted message where f"(baseline
{float(ref):.6f}s × {args.regression_ratio})" is constructed (referencing
variables ref and args.regression_ratio) to use an ASCII character such as "x"
or "*" instead (e.g. "x") so the text becomes f"(baseline {float(ref):.6f}s x
{args.regression_ratio})".

In `@benchmarks/test_service_bulk_insert.py`:
- Line 28: The current commit hash generation in the 'hashes' list uses
f"svcbulk{i:056d}"[:40], which truncates the variable part and makes all entries
identical; update the expression that builds 'hashes' (the list comprehension
assigned to the variable hashes) so the varying suffix is preserved—for example
use the trailing slice f"svcbulk{i:056d}"[-40:] or otherwise include i in the
kept portion (or replace with a deterministic hash like
hashlib.sha1(f"svcbulk{i}".encode()).hexdigest()[:40]) so each iteration
produces a distinct commit_hash.

In `@CONTRIBUTING.md`:
- Around line 89-90: The documented local benchmark command (the pytest
invocation shown: "uv run pytest benchmarks/ -m benchmark --benchmark-only
--benchmark-json=bench.json -v") is missing the CI-only flag; update that
command in CONTRIBUTING.md to include --benchmark-disable-gc so local runs
disable the GC exactly like the CI workflow, ensuring comparable results.

In `@docs/service_api/cppa_pinecone_sync.md`:
- Line 5: In docs/service_api/cppa_pinecone_sync.md update the remaining legacy
CONTRIBUTING link by replacing the incorrect "../Contributing.md" occurrence
(the link text on Line 25) with the correct "../../CONTRIBUTING.md" so both
references use the canonical uppercase CONTRIBUTING.md path; search for the
string "../Contributing.md" and change it to "../../CONTRIBUTING.md" to ensure
consistency with the earlier fix.

---

Outside diff comments:
In `@docs/Service_API.md`:
- Line 43: Update the stale link that points to "Contributing.md" to use the
canonical root path "../CONTRIBUTING.md" in the docs/Service_API.md content
(the line that currently reads "see
[Contributing.md](Contributing.md#regenerating-service-api-docs)"); replace both
the link text and its target so the link becomes "see
[CONTRIBUTING.md](../CONTRIBUTING.md#regenerating-service-api-docs)" to avoid
broken links on case-sensitive environments.

---

Nitpick comments:
In @.github/workflows/benchmarks.yml:
- Around line 31-33: The Checkout step using actions/checkout@v4 currently
retains credentials by default; update the "Checkout" step to include
persist-credentials: false (i.e., add the key persist-credentials with value
false under the uses: actions/checkout@v4 entry) so the checkout action does not
persist git credentials into the workflow environment.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 17f058c6-9386-4b86-aedf-5a5e3b4e4528

📥 Commits

Reviewing files that changed from the base of the PR and between ba453d9 and 94302c7.

⛔ Files ignored due to path filters (1)
  • requirements-dev.lock is excluded by !**/*.lock
📒 Files selected for processing (34)
  • .github/workflows/benchmarks.yml
  • CONTRIBUTING.md
  • README.md
  • bench.json
  • benchmarks/baselines.json
  • benchmarks/compare_to_baseline.py
  • benchmarks/conftest.py
  • benchmarks/test_github_commits_throughput.py
  • benchmarks/test_service_bulk_insert.py
  • boost_library_tracker/services.py
  • boost_mailing_list_tracker/services.py
  • boost_usage_tracker/services.py
  • conftest.py
  • cppa_pinecone_sync/services.py
  • cppa_slack_tracker/services.py
  • cppa_user_tracker/services.py
  • cppa_youtube_script_tracker/services.py
  • docs/How_to_add_a_collector.md
  • docs/Onboarding.md
  • docs/README.md
  • docs/Service_API.md
  • docs/boost_library_docs_tracker.md
  • docs/cross-app-dependencies.md
  • docs/service_api/README.md
  • docs/service_api/boost_usage_tracker.md
  • docs/service_api/clang_github_tracker.md
  • docs/service_api/cppa_pinecone_sync.md
  • docs/service_api/cppa_user_tracker.md
  • docs/service_api/discord_activity_tracker.md
  • docs/service_api/github_activity_tracker.md
  • github_activity_tracker/services.py
  • pyproject.toml
  • pytest.ini
  • requirements-dev.in


steps:
- name: Checkout
uses: actions/checkout@v4
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify no floating action refs remain in this workflow.
rg -nP '^\s*uses:\s*[^@\s]+@(?!(?:[0-9a-f]{40})\b)' .github/workflows/benchmarks.yml
# Expected after fix: no output

Repository: cppalliance/boost-data-collector

Length of output: 230


Pin all workflow actions to full commit SHAs.

Lines 32, 35, 40, and 74 use floating version tags (@v4, @v7) instead of immutable commit SHAs, which violates supply-chain pinning policy and allows uncontrolled updates to action implementations.

Suggested fix
- uses: actions/checkout@v4
+ uses: actions/checkout@<40-char-sha>

- uses: astral-sh/setup-uv@v7
+ uses: astral-sh/setup-uv@<40-char-sha>

- uses: actions/cache@v4
+ uses: actions/cache@<40-char-sha>

- uses: actions/upload-artifact@v4
+ uses: actions/upload-artifact@<40-char-sha>

Replace each @v* tag with the pinned commit SHA from the action's releases page.
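
The same check that the `rg` command above performs can be sketched in Python, for example as a pre-commit helper. This is an illustration under assumptions: the function name `find_unpinned` is hypothetical, and the 40-character value in the sample is a placeholder, not a real commit SHA for any action.

```python
import re

# Matches a `uses:` entry whose ref is NOT a full 40-character lowercase hex SHA.
# The optional "- " covers steps written as `- uses: ...`.
UNPINNED = re.compile(
    r"^\s*(?:-\s*)?uses:\s*(\S+?)@(?![0-9a-f]{40}\b)(\S+)",
    re.MULTILINE,
)


def find_unpinned(workflow_text: str) -> list[tuple[str, str]]:
    """Return (action, ref) pairs for every `uses:` reference not pinned to a SHA."""
    return UNPINNED.findall(workflow_text)


# Hypothetical workflow excerpt; the SHA on the last line is a placeholder.
SAMPLE = """\
steps:
  - name: Checkout
    uses: actions/checkout@v4
  - uses: astral-sh/setup-uv@v7
  - uses: actions/cache@0123456789abcdef0123456789abcdef01234567  # placeholder SHA
"""

for action, ref in find_unpinned(SAMPLE):
    print(f"unpinned: {action}@{ref}")
```

An empty result corresponds to the "no output" expectation stated in the analysis script.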

🧰 Tools
🪛 zizmor (1.25.2)

[warning] 31-32: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false

(artipacked)


[error] 32-32: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/benchmarks.yml at line 32, The workflow uses floating
action tags (actions/checkout@v4, astral-sh/setup-uv@v7, actions/cache@v4, and
actions/upload-artifact@v4 at around line 74) which must be replaced with
immutable commit SHAs; locate each uses: entry for actions/checkout,
astral-sh/setup-uv, actions/cache, and actions/upload-artifact and replace the
`@v`* tag with the corresponding full commit SHA from that action's GitHub
releases/tags page so the workflow pins to a specific commit SHA instead of a
floating version.

Comment thread bench.json
Comment on lines +2 to +38
"machine_info": {
"node": "Leos-Mac-mini.local",
"processor": "arm",
"machine": "arm64",
"python_compiler": "Clang 21.1.4 ",
"python_implementation": "CPython",
"python_implementation_version": "3.13.12",
"python_version": "3.13.12",
"python_build": [
"main",
"Mar 10 2026 18:26:32"
],
"release": "25.4.0",
"system": "Darwin",
"cpu": {
"python_version": "3.13.12.final.0 (64 bit)",
"cpuinfo_version": [
9,
0,
0
],
"cpuinfo_version_string": "9.0.0",
"arch": "ARM_8",
"bits": 64,
"count": 10,
"arch_string_raw": "arm64",
"brand_raw": "Apple M4"
}
},
"commit_info": {
"id": "7bf1b7ea6657990eef44fdb362b762abb16e41ba",
"time": "2026-05-18T20:05:08-04:00",
"author_time": "2026-05-18T20:05:08-04:00",
"dirty": true,
"project": "boost-data-collector",
"branch": "develop"
},
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not commit raw benchmark result artifacts with local machine metadata.

This file includes host-identifying data (e.g., Line 3) and ephemeral local run state. It should be produced in CI/local runs and stored as an artifact, not tracked in the repo.
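
If a summary of a run does need to be shared outside CI artifacts, one option is to keep only the per-test medians and drop the host and commit metadata. The sketch below is an assumption-laden illustration: the snippet quoted above confirms the `machine_info` and `commit_info` keys, while the `benchmarks` list with `fullname` and `stats.median` fields is assumed from the usual pytest-benchmark JSON layout, and `extract_medians` is a hypothetical helper that is not part of this PR.

```python
import json


def extract_medians(bench_json_text: str) -> dict[str, float]:
    """Map each benchmark's fullname to its median, discarding host and commit metadata."""
    report = json.loads(bench_json_text)
    return {
        entry["fullname"]: entry["stats"]["median"]
        for entry in report.get("benchmarks", [])
    }


# Trimmed, made-up report for illustration only.
sample = json.dumps({
    "machine_info": {"node": "example-host.local"},
    "commit_info": {"dirty": True},
    "benchmarks": [
        {
            "fullname": "benchmarks/test_service_bulk_insert.py::test_service_bulk_commits_and_file_changes",
            "stats": {"median": 0.1406},
        },
    ],
})

print(extract_medians(sample))
```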

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bench.json` around lines 2 - 38, The bench.json file contains local machine
metadata under machine_info and commit_info and should not be committed; remove
bench.json from the repo with git rm --cached, add its filename or pattern to
.gitignore, and update CI (the job that produces bench.json) to upload it as a
build artifact instead of committing it; if the host metadata must also be
erased from already-pushed history, rewrite history with a history-rewriting
tool (git filter-repo or BFG) and force-push.

Comment thread benchmarks/baselines.json
Comment on lines +5 to +10
"median_seconds": 45.0,
"n": 50
},
"benchmarks/test_service_bulk_insert.py::test_service_bulk_commits_and_file_changes": {
"median_seconds": 35.0,
"n": 50
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Baseline medians are so high that regression checks become ineffective.

Line 5 (45.0) and Line 9 (35.0) are far above the recorded medians in this PR’s bench.json (~0.1369s and 0.1406s). With a 1.25 ratio, this would allow massive slowdowns without failing CI.
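
To make the size of the gap concrete, the arithmetic can be worked through with the figures quoted in this comment. This is a minimal sketch that assumes the check fails when the measured median exceeds baseline × ratio (the shape of the comparison shown in compare_to_baseline.py); the function names are hypothetical.

```python
def regression_limit(baseline_median: float, ratio: float = 1.25) -> float:
    """Largest median (in seconds) that still passes the regression check."""
    return baseline_median * ratio


def allowed_slowdown(baseline_median: float, measured_median: float,
                     ratio: float = 1.25) -> float:
    """Factor by which the measured median could grow before the check fails."""
    return regression_limit(baseline_median, ratio) / measured_median


# (committed baseline, median recorded in bench.json) as quoted in the review.
cases = {
    "first baseline entry": (45.0, 0.1369),
    "test_service_bulk_commits_and_file_changes": (35.0, 0.1406),
}

for name, (baseline, measured) in cases.items():
    print(
        f"{name}: limit {regression_limit(baseline):.2f}s, "
        f"about {allowed_slowdown(baseline, measured):.0f}x slowdown tolerated"
    )
```

With the committed baselines, both tests could become several hundred times slower before CI fails; with baselines near the recorded medians, the tolerated slowdown drops to roughly the 1.25 ratio itself.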

🔧 Proposed fix
-      "median_seconds": 45.0,
+      "median_seconds": 0.137,
@@
-      "median_seconds": 35.0,
+      "median_seconds": 0.141,
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-      "median_seconds": 45.0,
-      "n": 50
-    },
-    "benchmarks/test_service_bulk_insert.py::test_service_bulk_commits_and_file_changes": {
-      "median_seconds": 35.0,
-      "n": 50
+      "median_seconds": 0.137,
+      "n": 50
+    },
+    "benchmarks/test_service_bulk_insert.py::test_service_bulk_commits_and_file_changes": {
+      "median_seconds": 0.141,
+      "n": 50
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@benchmarks/baselines.json` around lines 5 - 10, The baseline median_seconds
for the benchmarks are unrealistically high and make regression checks useless;
update the "median_seconds" values for the affected entries (the first entry
with "median_seconds": 45.0 and the
"benchmarks/test_service_bulk_insert.py::test_service_bulk_commits_and_file_changes"
entry with "median_seconds": 35.0) to the actual recorded medians from this PR’s
bench.json (approximately 0.1369 and 0.1406 respectively) while leaving other
fields (like "n": 50) unchanged so CI regression thresholds are meaningful.

if median > limit:
failures.append(
f"{fullname}: median {median:.6f}s exceeds limit {limit:.6f}s "
f"(baseline {float(ref):.6f}s × {args.regression_ratio})"
⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Replace ambiguous multiplication symbol in error text.

Line 77 uses ×, which Ruff flags as RUF001; this can break CI linting.

🔧 Proposed fix
-                    f"(baseline {float(ref):.6f}s × {args.regression_ratio})"
+                    f"(baseline {float(ref):.6f}s x {args.regression_ratio})"
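
The two characters look alike but are different code points, which is what the lint rule reacts to. A short sketch using only the standard library shows the difference; `describe` is a hypothetical helper name.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point, Unicode name, and ASCII status of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} ascii={ch.isascii()}"


for ch in ("×", "x"):
    print(describe(ch))
```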
📝 Committable suggestion

Suggested change
-                    f"(baseline {float(ref):.6f}s × {args.regression_ratio})"
+                    f"(baseline {float(ref):.6f}s x {args.regression_ratio})"
🧰 Tools
🪛 Ruff (0.15.13)

[warning] 77-77: String contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?

(RUF001)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@benchmarks/compare_to_baseline.py` at line 77, The error string uses a
non-ASCII multiplication character "×" which triggers Ruff RUF001; update the
formatted message where f"(baseline {float(ref):.6f}s ×
{args.regression_ratio})" is constructed (referencing variables ref and
args.regression_ratio) to use an ASCII character such as "x" or "*" instead
(e.g. "x") so the text becomes f"(baseline {float(ref):.6f}s x
{args.regression_ratio})".

repo = github_repository
account = github_account
commit_at = datetime(2024, 6, 1, tzinfo=timezone.utc)
hashes = [f"svcbulk{i:056d}"[:40] for i in range(n)]
⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Commit hash generation is constant across iterations.

f"svcbulk{i:056d}"[:40] truncates away the changing suffix, so all entries collapse to the same commit_hash for typical n values. This turns the benchmark into repeated updates of one commit instead of N distinct commits.

🔧 Proposed fix
-    hashes = [f"svcbulk{i:056d}"[:40] for i in range(n)]
+    hashes = [f"{i:040x}" for i in range(n)]
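
The collapse is easy to reproduce in isolation. The sketch below compares the original expression with the proposed replacement for n = 50 (the value recorded in benchmarks/baselines.json); it only exercises the string formatting and does not touch the service layer.

```python
n = 50

# Original: 7-character prefix + 56 zero-padded digits = 63 characters.
# Keeping the first 40 drops the trailing digits, the only part that varies with i.
original = [f"svcbulk{i:056d}"[:40] for i in range(n)]

# Proposed fix from the review: 40 zero-padded hex digits, distinct for every i.
fixed = [f"{i:040x}" for i in range(n)]

print(len(set(original)), len(set(fixed)))  # prints: 1 50
```

The fixed values are also valid 40-character hexadecimal strings, which matches the shape of a real commit SHA more closely than the `svcbulk` prefix did.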
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@benchmarks/test_service_bulk_insert.py` at line 28, The current commit hash
generation in the 'hashes' list uses f"svcbulk{i:056d}"[:40], which truncates
the variable part and makes all entries identical; update the expression that
builds 'hashes' (the list comprehension assigned to the variable hashes) so the
varying suffix is preserved—for example use the trailing slice
f"svcbulk{i:056d}"[-40:] or otherwise include i in the kept portion (or replace
with a deterministic hash like
hashlib.sha1(f"svcbulk{i}".encode()).hexdigest()[:40]) so each iteration
produces a distinct commit_hash.

Comment thread CONTRIBUTING.md
Comment on lines +89 to +90
uv run pytest benchmarks/ -m benchmark --benchmark-only \
--benchmark-json=bench.json -v
⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check consistency of benchmark GC flag between workflow and docs.
rg -n -- '--benchmark-disable-gc' .github/workflows/benchmarks.yml CONTRIBUTING.md
# Expected after fix: matches in both files

Repository: cppalliance/boost-data-collector

Length of output: 147


🏁 Script executed:

#!/bin/bash
# Get context around the benchmark command in CONTRIBUTING.md (lines 85-95)
echo "=== CONTRIBUTING.md (lines 85-95) ==="
sed -n '85,95p' CONTRIBUTING.md

# Get context around the flag in the CI workflow (lines 60-70)
echo -e "\n=== .github/workflows/benchmarks.yml (lines 60-70) ==="
sed -n '60,70p' .github/workflows/benchmarks.yml

Repository: cppalliance/boost-data-collector

Length of output: 1371


Align local benchmark command with CI benchmark settings.

Lines 89–90 omit --benchmark-disable-gc, while the CI workflow includes it. This flag disables garbage collection during benchmarks, so adding it locally keeps local and CI runs measured under the same conditions.

Suggested doc update
 uv run pytest benchmarks/ -m benchmark --benchmark-only \
-  --benchmark-json=bench.json -v
+  --benchmark-json=bench.json -v \
+  --benchmark-disable-gc
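
Why the flag matters can be illustrated without pytest-benchmark. The sketch below pauses Python's cyclic garbage collector around a timed call so that a collection pass cannot land inside the measurement. It is an illustration of the idea only, not how pytest-benchmark implements the option; `time_call` and `workload` are hypothetical names.

```python
import gc
import time


def time_call(func, *, disable_gc: bool = False) -> float:
    """Time one call of func, optionally with the cyclic garbage collector paused."""
    was_enabled = gc.isenabled()
    if disable_gc:
        gc.disable()
    try:
        start = time.perf_counter()
        func()
        return time.perf_counter() - start
    finally:
        # Restore the collector to the state the caller had.
        if disable_gc and was_enabled:
            gc.enable()


def workload() -> None:
    # Allocates many small containers, the kind of work that can trigger GC passes.
    data = [{"i": i} for i in range(100_000)]
    del data


print(f"gc on:  {time_call(workload):.4f}s")
print(f"gc off: {time_call(workload, disable_gc=True):.4f}s")
```

The absolute numbers vary by machine; the point is that runs with and without collector pauses are not directly comparable, so the local command and the CI command should use the same setting.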
📝 Committable suggestion

Suggested change
-uv run pytest benchmarks/ -m benchmark --benchmark-only \
-  --benchmark-json=bench.json -v
+uv run pytest benchmarks/ -m benchmark --benchmark-only \
+  --benchmark-json=bench.json -v \
+  --benchmark-disable-gc
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@CONTRIBUTING.md` around lines 89 - 90, The documented local benchmark command
(the pytest invocation shown: "uv run pytest benchmarks/ -m benchmark
--benchmark-only --benchmark-json=bench.json -v") is missing the CI-only flag;
update that command in CONTRIBUTING.md to include --benchmark-disable-gc so
local runs disable the GC exactly like the CI workflow, ensuring comparable
results.

Module: `cppa_pinecone_sync.services`

-All creates/updates/deletes for `PineconeFailList` and `PineconeSyncStatus` must go through this module. See [Contributing.md](../Contributing.md).
+All creates/updates/deletes for `PineconeFailList` and `PineconeSyncStatus` must go through this module. See [CONTRIBUTING.md](../../CONTRIBUTING.md).
⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the remaining legacy Contributing link in this file.

Line 5 is corrected, but Line 25 still points to ../Contributing.md. Please switch it to ../../CONTRIBUTING.md so this page doesn’t keep one stale path.

Suggested patch
- - [Contributing](../Contributing.md)
+ - [CONTRIBUTING.md](../../CONTRIBUTING.md)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/service_api/cppa_pinecone_sync.md` at line 5, In
docs/service_api/cppa_pinecone_sync.md update the remaining legacy CONTRIBUTING
link by replacing the incorrect "../Contributing.md" occurrence (the link text
on Line 25) with the correct "../../CONTRIBUTING.md" so both references use the
canonical uppercase CONTRIBUTING.md path; search for the string
"../Contributing.md" and change it to "../../CONTRIBUTING.md" to ensure
consistency with the earlier fix.
