Add bencher.ci module for CI regression gating by blooop · Pull Request #941 · blooop/bencher

blooop · 2026-05-18T10:54:37Z

Summary

Extracts generic CI integration utilities into a new bencher.ci module. These functions bridge the gap between bencher's regression detection engine and CI workflow plumbing (GitHub Actions, Jenkins, etc.):

write_performance_summary() — serialize a RegressionReport to a pipe-delimited text file that CI can parse, with optional per-metric filtering and custom thresholds
render_regression_plots() — render diagnostic PNGs for regressed metrics (for embedding in PR comments)
warn_on_regressions() — one-call convenience wrapper: write summary + render plots + emit pytest warnings
parse_performance_summary() — parse the summary file back into structured dicts
generate_regression_comment() — produce a GitHub-flavored Markdown PR comment with status icons, table, plot links, and next-steps guidance
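The write/parse round trip listed above can be sketched with the standard library alone. This is not the bencher.ci implementation: the column layout in `FIELDS` and the helper names `write_rows` and `parse_rows` are assumptions made for illustration, since the PR text does not spell out the actual file format.

```python
import tempfile
from pathlib import Path

# Hypothetical column layout; the real bencher.ci format may differ.
FIELDS = ("metric", "method", "change", "threshold", "regressed")


def write_rows(rows, path, append=True):
    """Write one pipe-delimited line per row, creating parent dirs as needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    lines = ["|".join(str(row[f]) for f in FIELDS) for row in rows]
    with path.open("a" if append else "w", encoding="utf-8") as fh:
        for line in lines:
            fh.write(line + "\n")
    return lines


def parse_rows(path):
    """Read the file back into dicts, skipping blank or malformed lines."""
    path = Path(path)
    if not path.exists():
        return []
    parsed = []
    for line in path.read_text(encoding="utf-8").splitlines():
        parts = line.strip().split("|")
        if len(parts) != len(FIELDS):
            continue  # blank or malformed line
        parsed.append(dict(zip(FIELDS, parts)))
    return parsed


with tempfile.TemporaryDirectory() as tmp:
    summary = Path(tmp) / "reports" / "performance_summary.txt"
    # Two benchmark tests appending to the same file during one CI run.
    write_rows([{"metric": "bench_a/time", "method": "delta",
                 "change": 25.0, "threshold": 20.0, "regressed": True}], summary)
    write_rows([{"metric": "bench_b/time", "method": "delta",
                 "change": 1.5, "threshold": 20.0, "regressed": False}], summary)
    # A blank line and a malformed line, both ignored by the parser.
    with summary.open("a", encoding="utf-8") as fh:
        fh.write("\nnot a valid row\n")
    rows = parse_rows(summary)
```

Parsed values come back as strings in this sketch, so a consumer would still need to convert numeric and boolean fields before comparing them.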

Motivation

Projects using bencher for benchmark regression detection end up reimplementing the same CI glue: writing structured summaries, generating PR comments with regression tables, rendering plot PNGs. This was ~150 lines of fragile bash in CI workflows plus ~100 lines of Python in downstream projects. Moving it upstream means:

One tested implementation instead of per-project copies
New detection methods (adaptive, delta, absolute) automatically render correctly in comments
method_cells() stays the single source of truth for how each method is displayed

Design choices

No new dependencies — uses only stdlib + existing bencher imports
metrics_filter is optional — without it, every result in the report is written. With it, only matching metrics appear and per-metric thresholds override the report-level ones
bench_name prefix — allows qualified names like "bench_planning/planning_time" so identically-named metrics from different benchmarks are distinguishable
append=True default — multiple benchmark tests write to the same summary file during a CI run
Comment generation is pure Python — replaces bash + awk + sed in CI YAML, testable and maintainable
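The interaction between `metrics_filter` and the `bench_name` prefix described above can be illustrated with a small stand-alone sketch. The helper `select_metrics` is hypothetical and is not part of bencher; it assumes, as in the usage example in this PR, that filter keys are qualified names of the form `bench_name/metric`.

```python
def select_metrics(results, bench_name="", metrics_filter=None):
    """Return {qualified_name: threshold} for results that pass the filter.

    `results` maps a bare metric name to its report-level threshold.
    """
    selected = {}
    for name, report_threshold in results.items():
        # Qualify the name so same-named metrics from different benchmarks differ.
        qualified = f"{bench_name}/{name}" if bench_name else name
        if metrics_filter is None:
            # No filter: every result is kept with the report-level threshold.
            selected[qualified] = report_threshold
        elif qualified in metrics_filter:
            # Filter present: only listed metrics, with the per-metric threshold.
            selected[qualified] = metrics_filter[qualified]
    return selected


report_thresholds = {"planning_time": 10.0, "memory_mb": 15.0}

everything = select_metrics(report_thresholds, bench_name="bench_planning")
only_planning = select_metrics(
    report_thresholds,
    bench_name="bench_planning",
    metrics_filter={"bench_planning/planning_time": 20.0},
)
```

Here `everything` keeps both metrics at their report-level thresholds, while `only_planning` contains a single entry whose threshold has been overridden to 20.0.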

Example usage

from pathlib import Path

from bencher.ci import warn_on_regressions, generate_regression_comment

# In a pytest benchmark test:
warn_on_regressions(
    bench.results[-1].regression_report,
    summary_path=Path("reports/performance_summary.txt"),
    plot_dir=Path("reports/regression_plots"),
    bench_name="bench_robot_planning",
    metrics_filter={"bench_robot_planning/planning_time": 20.0},
)

# In a CI step (or Python script called from CI):
comment_md = generate_regression_comment(
    "reports/performance_summary.txt",
    report_url="https://reports.example.com/latest/index.html",
    plot_url_prefix="https://reports.example.com/latest/regression_plots",
)

Test plan

26 new tests covering all functions (test/test_ci.py)
Existing 103 regression tests still pass
Verify linting passes (pixi run ci)

🤖 Generated with Claude Code

Summary by Sourcery

Introduce a bencher.ci module providing reusable CI utilities for benchmark regression gating, along with comprehensive tests.

New Features:

Add write_performance_summary to serialize regression reports into CI-friendly pipe-delimited summaries with optional filtering and thresholds.
Add render_regression_plots to generate PNG diagnostics for regressed metrics, with optional metric filtering and bench name prefixing.
Add warn_on_regressions as a convenience wrapper to emit pytest warnings and optionally produce summaries and plots from a regression report.
Add parse_performance_summary to read performance summary files back into structured dictionaries for further processing.
Add generate_regression_comment to build GitHub-flavored Markdown PR comments summarizing benchmark regressions and linking reports and plots.

Tests:

Add dedicated test suite in test/test_ci.py covering summary writing, plot rendering, regression warnings, summary parsing, and comment generation behavior.

Generic CI integration utilities for benchmark regression workflows:

- write_performance_summary(): serialize RegressionReport to a pipe-delimited text file parseable by CI workflows (GitHub Actions, Jenkins, etc.), with optional per-metric filtering and thresholds
- render_regression_plots(): render diagnostic PNGs for regressed metrics, suitable for embedding in PR comments
- warn_on_regressions(): convenience wrapper that writes the summary, renders plots, and emits pytest warnings in one call
- parse_performance_summary(): parse the summary file back into structured dicts
- generate_regression_comment(): produce a GitHub-flavored Markdown PR comment with status icons, regression table, plot links, and actionable next steps, replacing fragile shell-script comment generation in CI workflows

All functions are exported from the top-level bencher package. No new dependencies required.

sourcery-ai · 2026-05-18T10:55:13Z

Reviewer's Guide

Introduces a new bencher.ci module that centralizes CI-facing regression utilities (summary serialization/parsing, plot rendering, warnings, and Markdown comment generation) and adds comprehensive tests covering the new functionality.

Sequence diagram for warn_on_regressions CI gating flow

sequenceDiagram
    actor PytestTest
    participant warn_on_regressions
    participant write_performance_summary
    participant render_regression_plots
    participant RegressionReport as report
    participant warnings

    PytestTest->>warn_on_regressions: warn_on_regressions(report, summary_path, plot_dir, metrics_filter, bench_name)

    alt summary_path is not None
        warn_on_regressions->>write_performance_summary: write_performance_summary(report, summary_path, metrics_filter, bench_name)
        write_performance_summary-->>warn_on_regressions: lines
    end

    alt plot_dir is not None
        warn_on_regressions->>render_regression_plots: render_regression_plots(report, plot_dir, metrics_filter, bench_name)
        render_regression_plots-->>warn_on_regressions: rendered
    end

    warn_on_regressions->>RegressionReport: has_regressions
    alt report.has_regressions
        warn_on_regressions->>RegressionReport: summary()
        RegressionReport-->>warn_on_regressions: summary
        warn_on_regressions->>warnings: warn("Benchmark regression detected:\n" + summary)
    end

    warn_on_regressions-->>PytestTest: return
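The final step of the diagram emits a Python warning rather than failing the test. A minimal sketch of how a caller could either record that warning or escalate it into a hard failure is shown below. The function `emit_regression_warning` is a stand-in, not bencher code, and the sketch assumes the warning uses the default `UserWarning` category, which the PR does not confirm.

```python
import warnings


def emit_regression_warning(has_regressions, summary):
    """Stand-in for the warning step in the diagram; not the bencher implementation."""
    if has_regressions:
        # warnings.warn defaults to the UserWarning category.
        warnings.warn("Benchmark regression detected:\n" + summary)


# Record warnings instead of printing them, as a test harness might.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    emit_regression_warning(True, "planning_time: +25% (threshold 20%)")

# Escalate warnings to exceptions to turn the warning into a hard failure.
failed = False
with warnings.catch_warnings():
    warnings.simplefilter("error")
    try:
        emit_regression_warning(True, "planning_time: +25% (threshold 20%)")
    except UserWarning:
        failed = True
```

Under the same assumption, running pytest with `-W error::UserWarning` should have a similar escalating effect without code changes.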

File-Level Changes

Change	Details	Files
Add bencher.ci module providing CI integration helpers around RegressionReport/RegressionResult.	Implement write_performance_summary to emit a pipe-delimited summary format with optional metric filtering, bench-name prefixing, threshold overrides, and append/overwrite control. Implement render_regression_plots to generate PNGs only for regressed (and optionally filtered) metrics, handling errors and returning a mapping from qualified metric name to file path. Implement warn_on_regressions as a convenience wrapper that optionally writes summaries/plots and always emits a pytest-visible warning when the report contains regressions. Implement parse_performance_summary to safely read the summary file into structured rows, skipping blank/malformed lines. Implement generate_regression_comment and helpers to turn summary rows into GitHub-flavored Markdown with status icons, regression categorization, optional plot links, and guidance text.	`bencher/ci.py`
Add tests validating CI utilities behavior and integration with regression detection.	Add fixtures and helper constructors for RegressionResult/RegressionReport used across tests. Test write_performance_summary behavior including filtering, threshold overrides, append vs overwrite, directory creation, bench-name prefixing, and handling of absolute methods and empty reports. Test render_regression_plots for correct rendering of regressed metrics, skipping non-regressed metrics, and respecting metric filters and bench-name prefixes. Test warn_on_regressions for warning emission, non-emission on clean reports, and side effects of writing summaries and plots when paths are provided. Test parse_performance_summary and generate_regression_comment for correct parsing, table generation, regression/improvement categorization, absolute-method rendering, plot-link generation, and empty-input handling.	`test/test_ci.py`
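The comment generation summarized in the table above (status icons, a results table, optional plot links) could look roughly like the following. This is an illustrative sketch only: `build_comment`, the row keys, the heading wording, and the plot file naming are all assumptions and do not reflect the actual output of `generate_regression_comment()`.

```python
def build_comment(rows, report_url=None, plot_url_prefix=None):
    """Render parsed summary rows as a Markdown comment (illustrative only)."""
    if not rows:
        return "No benchmark results found."
    regressed = [r for r in rows if r["regressed"]]
    header = (
        f"### :x: {len(regressed)} benchmark regression(s) detected"
        if regressed
        else "### :white_check_mark: No benchmark regressions detected"
    )
    lines = [
        header,
        "",
        "| Status | Metric | Change | Threshold |",
        "| --- | --- | --- | --- |",
    ]
    for r in rows:
        icon = ":x:" if r["regressed"] else ":white_check_mark:"
        lines.append(
            f"| {icon} | `{r['metric']}` | {r['change']:+.1f}% | {r['threshold']:.1f}% |"
        )
    if plot_url_prefix and regressed:
        lines.append("")
        for r in regressed:
            # Hypothetical naming: slashes are replaced so the name is one file name.
            file_name = r["metric"].replace("/", "_") + ".png"
            lines.append(f"![{r['metric']}]({plot_url_prefix}/{file_name})")
    if report_url:
        lines += ["", f"[Full report]({report_url})"]
    return "\n".join(lines)


rows = [
    {"metric": "bench_planning/planning_time", "change": 25.0,
     "threshold": 20.0, "regressed": True},
    {"metric": "bench_planning/memory_mb", "change": -3.2,
     "threshold": 15.0, "regressed": False},
]
comment = build_comment(
    rows,
    report_url="https://reports.example.com/latest/index.html",
    plot_url_prefix="https://reports.example.com/latest/regression_plots",
)
```

Keeping this logic in a pure function means it can be unit tested with plain string assertions, which is the maintainability argument the PR makes for replacing bash, awk, and sed.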

github-actions · 2026-05-18T10:57:13Z

Performance Report for `4ab2c9f`

Metric	Value
Total tests	1423
Total time	111.34s
Mean	0.0782s
Median	0.0010s

Top 10 slowest tests

Test	Time (s)
`test.test_bench_examples.TestBenchExamples::test_example_meta`	18.045
`test.test_over_time_save_perf::test_save_faster_without_aggregated_tab`	5.238
`test.test_hash_persistent.TestCrossProcessDeterminism::test_hash_stable_across_two_processes[ResultBool]`	4.332
`test.test_generated_examples::test_generated_example[cartesian_animation/example_cartesian_animation.py]`	3.151
`test.test_generated_examples::test_generated_example[result_types/result_image/example_result_image_to_video.py]`	2.710
`test.test_generated_examples::test_generated_example[regression/example_regression_tuning_noise.py]`	2.621
`test.test_generated_examples::test_generated_example[regression/example_regression_tuning_drift.py]`	2.611
`test.test_over_time_repeats.TestMaxSliderPoints::test_default_subsampling_caps_at_max`	2.448
`test.test_generated_examples::test_generated_example[regression/example_regression_tuning_step.py]`	2.257
`test.test_time_event_curve.TestTimeEventCurvePlot::test_curve_with_string_time_src_and_cat`	1.090

Full report

Updated by Performance Tracking workflow

blooop wants to merge 1 commit into main from feat/ci-utils-and-report-index
