Plot plugin infrastructure (tier 0, additive) by blooop · Pull Request #932 · blooop/bencher

blooop · 2026-04-26T17:53:25Z

Summary

Lays down the plot plugin registry, contract, and discovery — purely additive. The registry exists but no existing code path queries it yet. Built-in chart types migrate onto this mechanism in subsequent PRs.

This is the first piece of a larger plan to replace the inheritance-based rendering system in bencher/results/ with a plugin registry, so third parties can ship plot backends in their own repos and users can write one-off plugins inline without touching upstream.

Design decisions (resolved before implementation)

Plugin granularity: hybrid — one plugin = (chart-type × backend) pair with a match rule and a render function. Backend is a namespace string for grouping and optional-dep gating.
Output target: single — pn.viewable.Viewable / Panel HTML report. Plugins own internal composition (linked hv.Layout, plotly.subplots, full Rerun blueprints); bencher does Panel-level outer composition only. Non-HTML outputs (full-Rerun export, future PDF) consume bench.dataset / BenchData outside the plugin system.
Plugin contract: (name, backend, match: PlotFilter, priority, requires: frozenset[str], render(BenchData) -> pn.viewable.Viewable).
BenchData: frozen value type — dataset, input/result vars, plt_cnt_cfg, run_meta, plus optional optimizer_study, baseline_runs, cache. One contract covers chart and meta-views; plugins gate on availability via PlotFilter.requires.
Discovery: entry-point group bencher.plot_plugins (lazy, on first lookup) + bencher.register_plugin(...) for in-script plugins.
Failure modes: missing optional deps → skip with logged warning. Render exceptions → substitute a visible error pane (strict=True re-raises for development).
Migration: gradual, no flag, six tiers (trivial Holoviews charts → 3D/Plotly → distribution → data/media → schema-stretching like Optuna/regression/Explorer → Rerun). The existing to_<type>() API surface is preserved as thin shims dispatching through the registry; zero user-visible breakage.
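
As a rough illustration of the contract and capability gating described above, here is a minimal standard-library sketch. It is not the PR's code: `BenchDataSketch`, `PlotPluginSketch`, `FunctionPlugin`, and `is_eligible` are hypothetical names, `render` returns a plain string standing in for a `pn.viewable.Viewable`, and the `match: PlotFilter` field is omitted for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import Any, Callable, Protocol, runtime_checkable


@dataclass(frozen=True)
class BenchDataSketch:
    """Hypothetical stand-in for BenchData: required data plus optional capabilities."""

    dataset: Any
    input_vars: tuple = ()
    result_vars: tuple = ()
    optimizer_study: Any = None
    baseline_runs: tuple = ()
    cache: Any = None

    def has(self, capability: str) -> bool:
        # Assumption: a capability counts as available when the field of the
        # same name is set (not None, and not an empty tuple).
        value = getattr(self, capability, None)
        if isinstance(value, tuple):
            return len(value) > 0
        return value is not None

    def with_changes(self, **kwargs: Any) -> BenchDataSketch:
        # Frozen instances are never mutated; a modified copy is returned.
        return replace(self, **kwargs)


@runtime_checkable
class PlotPluginSketch(Protocol):
    """Structural contract: any object with these members qualifies."""

    name: str
    backend: str
    priority: int
    requires: frozenset[str]

    def render(self, data: BenchDataSketch) -> str: ...


@dataclass(frozen=True)
class FunctionPlugin:
    """Wraps a plain function so that it satisfies the protocol."""

    name: str
    backend: str
    fn: Callable[[BenchDataSketch], str]
    priority: int = 0
    requires: frozenset[str] = frozenset()

    def render(self, data: BenchDataSketch) -> str:
        return self.fn(data)


def is_eligible(plugin: PlotPluginSketch, data: BenchDataSketch) -> bool:
    # A plugin is offered only when every capability it requires is present.
    return all(data.has(cap) for cap in plugin.requires)
```

Under these assumptions, a meta-view such as an optimizer report would declare `requires=frozenset({"optimizer_study"})` and simply not be selected for runs that carry no study, which is how a single contract can cover both charts and meta-views.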

What's in this PR

File	Purpose
`bencher/plugins/bench_data.py`	`BenchData` (frozen), `RunMeta`, `CacheHandle` protocol, `BenchData.fake()` test constructor
`bencher/plugins/plugin.py`	`PlotPlugin` protocol, `@plot_plugin` decorator (function form)
`bencher/plugins/registry.py`	`PluginRegistry` — register, override-by-name, lazy entry-point discovery, `select(...)` with priority/include/exclude/backend/only filters, `render(...)` with error-pane substitution and `strict=True` opt-in
`bencher/plugins/__init__.py`	Public surface
`bencher/__init__.py`	Re-exports plugin API at the top level
`test/test_plugins.py`	23 unit tests covering registration, override semantics, selection, capability gating, render happy/error paths, lazy entry-point loading, factory loading

What is not changed

No existing result class (LineResult, HeatmapResult, etc.) is touched.
The inheritance MRO in bench_result.py is unchanged.
No existing call sites query the registry.
No public API has changed shape; only new symbols are exposed.

Test plan

pixi run format — clean
pixi run ruff-lint — clean
pixi run pylint — 10.00/10
pixi run ty — no diagnostics on new files
pixi run pytest test/test_plugins.py — 23/23 pass
Full test suite (1212 tests, excluding doc-generation): all pass
import bencher smoke test: plugin API reachable at the top level

Next up

Tier 1 — migrate LineResult, HeatmapResult, CurveResult, BarResult, ScatterResult, BandResult, TableResult onto the registry. The existing to_<type>() methods stay as thin shims; behavior unchanged. That migration validates the contract before tackling the schema-stretching cases (Optuna, regression reports, Explorer, Rerun).
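
To make the "thin shim" idea concrete, here is a hedged sketch of how a `to_<type>()` method could keep its public signature while delegating rendering to a registry. `REGISTRY`, `register`, `render_line`, and `ResultSketch` are all hypothetical; the real result classes and registry are not shown in this PR description.

```python
from typing import Any, Callable, Dict

# Hypothetical stand-in for the plugin registry: plugin name -> render callable.
REGISTRY: Dict[str, Callable[[Any], str]] = {}


def register(name: str) -> Callable[[Callable[[Any], str]], Callable[[Any], str]]:
    """Decorator that stores a render function under the given plugin name."""

    def decorator(fn: Callable[[Any], str]) -> Callable[[Any], str]:
        REGISTRY[name] = fn
        return fn

    return decorator


@register("line")
def render_line(data: Any) -> str:
    return f"line plot of {len(data)} points"


class ResultSketch:
    def __init__(self, data: Any) -> None:
        self._data = data

    def to_line(self) -> str:
        # Thin shim: the method name and signature stay as they were; the
        # rendering is delegated to whichever plugin currently owns "line".
        return REGISTRY["line"](self._data)
```

Because the shim looks the plugin up at call time, a user who registers their own `"line"` plugin changes what `to_line()` returns without subclassing anything.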

🤖 Generated with Claude Code

Summary by Sourcery

Introduce a new plot plugin infrastructure with a registry, plugin contract, and shared BenchData value type, without yet wiring it into existing plotting code paths.

New Features:

Add BenchData, RunMeta, and CacheHandle protocol as the stable data contract passed into plot plugins.
Define a PlotPlugin protocol and plot_plugin decorator for function-based plot plugins, with optional auto-registration.
Introduce a PluginRegistry with global accessors for registering, selecting, and rendering plot plugins, including entry-point–based discovery and error-pane substitution on failures.
Expose the new plugin infrastructure and plotting filter helpers via the bencher.plugins package public API.

Tests:

Add a comprehensive test suite for BenchData, plugin registration and overriding, selection and capability gating, rendering behavior, global registry helpers, and entry-point discovery.

Lays down the registry, contract, and discovery for the plot plugin system without changing any existing behavior. The registry is created but no existing code path queries it yet; built-in chart types migrate onto this mechanism in subsequent PRs.

Public surface (re-exported at the top level):
- BenchData: frozen value type handed to plugin render(), the stable contract
- PlotPlugin: protocol all plugins satisfy (name, backend, match, priority, requires, render)
- @plot_plugin: decorator for the function form
- register_plugin / unregister_plugin / get_registry: explicit registration

Runtime model:
- Lazy entry-point discovery via group `bencher.plot_plugins`
- Skip-with-warning on plugin import failure (tolerant)
- Override-by-name (user plugins replace built-ins by sharing a name)
- Selection by match filter, with include/exclude/backend/only controls and capability gating via PlotFilter requires set
- Render coordinator catches exceptions per-plugin and substitutes a visible error pane; strict=True opt-in re-raises for development

sourcery-ai · 2026-04-26T17:53:58Z

Reviewer's Guide

Introduces a new, fully additive plot plugin infrastructure: a frozen BenchData value type, a PlotPlugin protocol and decorator, a lazy-discovered PluginRegistry with selection and rendering (including error-substitution), top-level exports, and a focused test suite validating registration, discovery, selection, and error-handling behaviors.

Class diagram for new plot plugin infrastructure

classDiagram
    class BenchData {
        <<frozen dataclass>>
        +xr.Dataset dataset
        +tuple input_vars
        +tuple result_vars
        +PltCntCfg plt_cnt_cfg
        +RunMeta run_meta
        +Any optimizer_study
        +tuple~BenchData~ baseline_runs
        +CacheHandle cache
        +bool has(capability)
        +BenchData with_changes(**kwargs)
        +BenchData fake(dataset, input_vars, result_vars, plt_cnt_cfg, **overrides)
    }

    class RunMeta {
        <<frozen dataclass>>
        +str name
        +datetime timestamp
        +str sweep_hash
    }

    class CacheHandle {
        <<protocol>>
        +Any get(key)
        +None set(key, value)
    }

    class PlotPlugin {
        <<protocol>>
        +str name
        +str backend
        +PlotFilter match
        +int priority
        +frozenset~str~ requires
        +pn.viewable.Viewable render(data)
    }

    class _FunctionPlugin {
        <<dataclass>>
        +str name
        +str backend
        +PlotFilter match
        +int priority
        +frozenset~str~ requires
        -Callable fn
        +pn.viewable.Viewable render(data)
    }

    class PluginRegistry {
        -dict~str, PlotPlugin~ _plugins
        -bool _entry_points_loaded
        +register(plugin)
        +unregister(name)
        +clear()
        +mark_entry_points_loaded()
        +PlotPlugin get(name)
        +tuple~PlotPlugin~ all()
        -_ensure_entry_points_loaded()
        -_register_loaded(ep_name, obj)
        +tuple~PlotPlugin~ select(data, include, exclude, backend, only)
        +tuple~(str, pn.viewable.Viewable)~ render(data, include, exclude, backend, only, strict)
    }

    class plot_plugin {
        <<function decorator>>
        +_FunctionPlugin __call__(fn)
    }

    BenchData o-- RunMeta : run_meta
    BenchData o-- CacheHandle : cache
    BenchData o-- BenchData : baseline_runs

    _FunctionPlugin ..|> PlotPlugin

    PluginRegistry "*" o--> PlotPlugin : manages
    plot_plugin ..> _FunctionPlugin : creates
    plot_plugin ..> PluginRegistry : optional_register

File-Level Changes

Change	Details	Files
Introduce BenchData value type and related metadata/cache contracts for plugins.	Add CacheHandle protocol as the plugin-visible caching interface. Define frozen RunMeta dataclass with basic run identity fields. Define frozen BenchData dataclass encapsulating dataset, vars, plotting config, optional optimizer/baseline/cache, plus a capability-based has() helper and with_changes() copier. Add BenchData.fake() convenience constructor for tests with sensible defaults.	`bencher/plugins/bench_data.py`
Define the PlotPlugin protocol and a function-based plugin implementation/decorator.	Define PlotPlugin runtime-checkable protocol specifying name/backend/match/priority/requires and render() contract. Implement _FunctionPlugin dataclass that wraps a callable into a concrete PlotPlugin. Provide plot_plugin decorator to create _FunctionPlugin instances, with optional auto-registration into the global registry via register flag.	`bencher/plugins/plugin.py`
Implement PluginRegistry with lazy entry-point discovery, selection, and render orchestration plus global helpers.	Add _render_error_pane helper that turns exceptions into Markdown panes with tracebacks. Implement PluginRegistry managing in-process plugins, including register/override, unregister, clear, and test-only mark_entry_points_loaded. Implement lazy entry-point discovery for ENTRY_POINT_GROUP, supporting direct plugin instances, factories, and tolerant error handling/logging. Provide select() to filter plugins by name/backend, gate by BenchData capabilities and PlotFilter.matches_result, and order by priority/name. Provide render() to run selected plugins, optionally strict, substituting error panes on failure; expose a process-global registry with get_registry/register_plugin/unregister_plugin helpers.	`bencher/plugins/registry.py`
Expose plugin infrastructure as public API and verify via unit tests.	Create bencher.plugins package that re-exports BenchData/RunMeta/CacheHandle, PlotPlugin/plot_plugin, PluginRegistry helpers, and PlotFilter/VarRange/ENTRY_POINT_GROUP. Ensure plugin API is reachable at the bencher top level via init exports (per PR description, though diff is truncated). Add comprehensive unit tests covering BenchData behavior, registry registration/override/unregister, selection filters and capability gating, render error handling/strict mode, global registration helpers, and entry-point discovery behaviors.	`bencher/plugins/__init__.py` `bencher/__init__.py` `test/test_plugins.py`


github-actions · 2026-04-26T17:56:07Z

Performance Report for `0ae9f6c`

Metric	Value
Total tests	1379
Total time	115.41s
Mean	0.0837s
Median	0.0010s

Top 10 slowest tests

Test	Time (s)
`test.test_bench_examples.TestBenchExamples::test_example_meta`	19.081
`test.test_over_time_save_perf::test_save_faster_without_aggregated_tab`	5.070
`test.test_hash_persistent.TestCrossProcessDeterminism::test_hash_stable_across_two_processes[ResultBool]`	4.466
`test.test_generated_examples::test_generated_example[cartesian_animation/example_cartesian_animation.py]`	3.175
`test.test_over_time_repeats.TestMaxSliderPoints::test_default_subsampling_caps_at_max`	3.167
`test.test_generated_examples::test_generated_example[regression/example_regression_tuning_drift.py]`	2.961
`test.test_generated_examples::test_generated_example[regression/example_regression_tuning_step.py]`	2.858
`test.test_generated_examples::test_generated_example[result_types/result_image/example_result_image_to_video.py]`	2.814
`test.test_generated_examples::test_generated_example[regression/example_regression_tuning_noise.py]`	2.355
`test.test_bench_runner.TestBenchRunner::test_benchrunner_unified_interface`	1.511

Full report

Updated by Performance Tracking workflow

Captures the design that was settled before tier-0 infrastructure landed so contributors and agents continuing the work have the full context: the alternatives considered at each branch of the decision tree, the tensions surfaced, the pivotal moments where the design changed direction, and the reasoning behind every chosen position.

Sections:
- Goal and motivation (links to the existing TODO in bench_result_base.py)
- Resolved design at a glance (single-table summary)
- Decision walkthrough of the eight resolved questions: granularity, composition, output target, contract, discovery + migrate-built-ins, meta-views, lifecycle, phasing
- Contract surfaces with signatures
- Runtime model (discovery, selection, render, override semantics)
- Migration plan (six tiers + cleanup, simplest first)
- Tier-0 status
- Open tactical questions (deferred, not forgotten)
- Handoff guide with a step-by-step recipe for tier 1

github-actions · 2026-04-26T18:04:03Z

Performance Report for `534d33e`

Metric	Value
Total tests	1379
Total time	112.99s
Mean	0.0819s
Median	0.0010s

Top 10 slowest tests

Test	Time (s)
`test.test_bench_examples.TestBenchExamples::test_example_meta`	19.271
`test.test_over_time_save_perf::test_save_faster_without_aggregated_tab`	4.770
`test.test_hash_persistent.TestCrossProcessDeterminism::test_hash_stable_across_two_processes[ResultBool]`	4.424
`test.test_generated_examples::test_generated_example[cartesian_animation/example_cartesian_animation.py]`	3.152
`test.test_over_time_repeats.TestMaxSliderPoints::test_default_subsampling_caps_at_max`	2.989
`test.test_generated_examples::test_generated_example[regression/example_regression_tuning_drift.py]`	2.928
`test.test_generated_examples::test_generated_example[regression/example_regression_tuning_step.py]`	2.824
`test.test_generated_examples::test_generated_example[result_types/result_image/example_result_image_to_video.py]`	2.720
`test.test_generated_examples::test_generated_example[regression/example_regression_tuning_noise.py]`	2.344
`test.test_bench_runner.TestBenchRunner::test_benchrunner_unified_interface`	1.513

Full report

Updated by Performance Tracking workflow
