Plot plugin infrastructure (tier 0, additive)#932

Draft
blooop wants to merge 2 commits into main from feature/plugin-system

blooop (Owner) commented Apr 26, 2026

Summary

Lays down the plot plugin registry, contract, and discovery — purely additive. The registry exists but no existing code path queries it yet. Built-in chart types migrate onto this mechanism in subsequent PRs.

This is the first piece of a larger plan to replace the inheritance-based rendering system in bencher/results/ with a plugin registry, so third parties can ship plot backends in their own repos and users can write one-off plugins inline without touching upstream.

Design decisions (resolved before implementation)

  • Plugin granularity: hybrid — one plugin = (chart-type × backend) pair with a match rule and a render function. Backend is a namespace string for grouping and optional-dep gating.
  • Output target: single — pn.viewable.Viewable / Panel HTML report. Plugins own internal composition (linked hv.Layout, plotly.subplots, full Rerun blueprints); bencher does Panel-level outer composition only. Non-HTML outputs (full-Rerun export, future PDF) consume bench.dataset / BenchData outside the plugin system.
  • Plugin contract: (name, backend, match: PlotFilter, priority, requires: frozenset[str], render(BenchData) -> pn.viewable.Viewable).
  • BenchData: frozen value type — dataset, input/result vars, plt_cnt_cfg, run_meta, plus optional optimizer_study, baseline_runs, cache. One contract covers chart and meta-views; plugins gate on availability via PlotFilter.requires.
  • Discovery: entry-point group bencher.plot_plugins (lazy, on first lookup) + bencher.register_plugin(...) for in-script plugins.
  • Failure modes: missing optional deps → skip with logged warning. Render exceptions → substitute a visible error pane (strict=True re-raises for development).
  • Migration: gradual, no flag, six tiers (trivial Holoviews charts → 3D/Plotly → distribution → data/media → schema-stretching like Optuna/regression/Explorer → Rerun). The existing to_<type>() API surface is preserved as thin shims dispatching through the registry; zero user-visible breakage.
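The plugin contract above can be sketched with only the standard library. This is a minimal illustration, not the code in this PR: `FakeBenchData`, `PlotPluginLike`, `FunctionPlugin`, and `plot_plugin_sketch` are hypothetical names, `match` is left untyped because `PlotFilter` is not reproduced here, and a plain string stands in for the Panel viewable a real plugin would return.

```python
from dataclasses import dataclass
from typing import Any, Callable, Protocol, runtime_checkable


@dataclass(frozen=True)
class FakeBenchData:
    """Hypothetical stand-in for BenchData (the real type carries a dataset)."""
    result_vars: tuple[str, ...] = ()


@runtime_checkable
class PlotPluginLike(Protocol):
    """Mirrors the six contract fields listed in the design decisions."""
    name: str
    backend: str
    match: Any  # a PlotFilter in the real contract
    priority: int
    requires: frozenset[str]

    def render(self, data: FakeBenchData) -> Any: ...


@dataclass(frozen=True)
class FunctionPlugin:
    """Wraps a plain function so that it satisfies the protocol."""
    name: str
    backend: str
    fn: Callable[[FakeBenchData], Any]
    match: Any = None
    priority: int = 0
    requires: frozenset[str] = frozenset()

    def render(self, data: FakeBenchData) -> Any:
        return self.fn(data)


def plot_plugin_sketch(name: str, backend: str, **kwargs: Any):
    """Decorator form: turn a render function into a plugin object."""
    def wrap(fn: Callable[[FakeBenchData], Any]) -> FunctionPlugin:
        return FunctionPlugin(name=name, backend=backend, fn=fn, **kwargs)
    return wrap


@plot_plugin_sketch("line", "holoviews", priority=10)
def line_plot(data: FakeBenchData) -> str:
    # A string stands in for the Panel object a real plugin would return.
    return f"line plot of {', '.join(data.result_vars)}"
```

After decoration, `line_plot` is a plugin object rather than a bare function, so a registry can read its `name`, `backend`, and `priority` without calling it.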

What's in this PR

| File | Purpose |
| --- | --- |
| `bencher/plugins/bench_data.py` | BenchData (frozen), RunMeta, CacheHandle protocol, BenchData.fake() test constructor |
| `bencher/plugins/plugin.py` | PlotPlugin protocol, @plot_plugin decorator (function form) |
| `bencher/plugins/registry.py` | PluginRegistry: register, override-by-name, lazy entry-point discovery, select(...) with priority/include/exclude/backend/only filters, render(...) with error-pane substitution and strict=True opt-in |
| `bencher/plugins/__init__.py` | Public surface |
| `bencher/__init__.py` | Re-exports plugin API at the top level |
| `test/test_plugins.py` | 23 unit tests covering registration, override semantics, selection, capability gating, render happy/error paths, lazy entry-point loading, factory loading |
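As a rough illustration of what select(...) does, the sketch below filters by name and backend, gates on capabilities, and then orders the survivors. It rests on assumptions: `PluginStub` and `select_plugins` are hypothetical names, capabilities are modelled as a plain set of strings, higher priority is assumed to sort first, and the `only` filter is left out because its exact semantics are not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PluginStub:
    """Hypothetical plugin record with just the fields selection needs."""
    name: str
    backend: str
    priority: int = 0
    requires: frozenset = frozenset()


def select_plugins(plugins, capabilities, include=None, exclude=(), backend=None):
    """Filter by name and backend, gate on capabilities, then order."""
    chosen = []
    for plugin in plugins:
        if include is not None and plugin.name not in include:
            continue
        if plugin.name in exclude:
            continue
        if backend is not None and plugin.backend != backend:
            continue
        # Capability gating: every required capability must be available.
        if not plugin.requires <= capabilities:
            continue
        chosen.append(plugin)
    # Assumption: higher priority first, ties broken alphabetically by name.
    return tuple(sorted(chosen, key=lambda p: (-p.priority, p.name)))


plugins = [
    PluginStub("line", "holoviews", priority=5),
    PluginStub("bar", "holoviews", priority=5),
    PluginStub("surface", "plotly", priority=9),
    PluginStub("optuna", "optuna", requires=frozenset({"optimizer_study"})),
]
```

With no capabilities available, the `optuna` stub is skipped because its `requires` set is not satisfied; passing `frozenset({"optimizer_study"})` makes it eligible.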

What is not changed

  • No existing result class (LineResult, HeatmapResult, etc.) is touched.
  • The inheritance MRO in bench_result.py is unchanged.
  • No existing call sites query the registry.
  • No public API has changed shape; only new symbols are exposed.

Test plan

  • pixi run format — clean
  • pixi run ruff-lint — clean
  • pixi run pylint — 10.00/10
  • pixi run ty — no diagnostics on new files
  • pixi run pytest test/test_plugins.py — 23/23 pass
  • Full test suite (1212 tests, excluding doc-generation): all pass
  • import bencher smoke test: plugin API reachable at the top level

Next up

Tier 1 — migrate LineResult, HeatmapResult, CurveResult, BarResult, ScatterResult, BandResult, TableResult onto the registry. The existing to_<type>() methods stay as thin shims; behavior unchanged. That migration validates the contract before tackling the schema-stretching cases (Optuna, regression reports, Explorer, Rerun).
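One way to picture the thin-shim idea: the result class keeps its to_<type>() method, but the method only looks up a renderer by name and calls it. This is a sketch under assumptions, not the planned tier-1 code; `_RENDERERS`, `register_renderer`, and `LineResultSketch` are hypothetical names, and strings stand in for rendered output.

```python
from typing import Any, Callable, Dict

# Hypothetical module-level registry: plugin name -> render function.
_RENDERERS: Dict[str, Callable[[Any], Any]] = {}


def register_renderer(name: str, fn: Callable[[Any], Any]) -> None:
    """Register a renderer; reusing a name replaces the earlier entry."""
    _RENDERERS[name] = fn


class LineResultSketch:
    """Stand-in for an existing result class that keeps its to_<type>() method."""

    def __init__(self, data: Any) -> None:
        self.data = data

    def to_line(self) -> Any:
        # Thin shim: no rendering logic here, just a lookup and a call.
        return _RENDERERS["line"](self.data)


# The built-in renderer, registered under the name the shim looks up.
register_renderer("line", lambda data: f"line({data})")
```

Because the shim resolves the renderer at call time, a user who later registers a different function under "line" changes what to_line() returns without editing the result class.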

🤖 Generated with Claude Code

Summary by Sourcery

Introduce a new plot plugin infrastructure with a registry, plugin contract, and shared BenchData value type, without yet wiring it into existing plotting code paths.

New Features:

  • Add BenchData, RunMeta, and CacheHandle protocol as the stable data contract passed into plot plugins.
  • Define a PlotPlugin protocol and plot_plugin decorator for function-based plot plugins, with optional auto-registration.
  • Introduce a PluginRegistry with global accessors for registering, selecting, and rendering plot plugins, including entry-point–based discovery and error-pane substitution on failures.
  • Expose the new plugin infrastructure and plotting filter helpers via the bencher.plugins package public API.

Tests:

  • Add a comprehensive test suite for BenchData, plugin registration and overriding, selection and capability gating, rendering behavior, global registry helpers, and entry-point discovery.

Lays down the registry, contract, and discovery for the plot plugin system
without changing any existing behavior. The registry is created but no
existing code path queries it yet — built-in chart types migrate onto this
mechanism in subsequent PRs.

Public surface (re-exported at the top level):
- BenchData: frozen value type handed to plugin render() — the stable contract
- PlotPlugin: protocol all plugins satisfy (name, backend, match, priority, requires, render)
- @plot_plugin: decorator for the function form
- register_plugin / unregister_plugin / get_registry: explicit registration

Runtime model:
- Lazy entry-point discovery via group `bencher.plot_plugins`
- Skip-with-warning on plugin import failure (tolerant)
- Override-by-name (user plugins replace built-ins by sharing a name)
- Selection by match filter, with include/exclude/backend/only controls and
  capability gating via PlotFilter requires set
- Render coordinator catches exceptions per-plugin and substitutes a visible
  error pane; strict=True opt-in re-raises for development
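The override-by-name and error-substitution behavior described in the runtime model could look roughly like the following. This is a hedged sketch: `RenderRegistrySketch` is a hypothetical class, plugins are reduced to bare render functions, and a traceback string stands in for the visible error pane.

```python
import traceback


class RenderRegistrySketch:
    """Hypothetical registry showing override-by-name and tolerant rendering."""

    def __init__(self):
        self._plugins = {}  # name -> render function

    def register(self, name, render):
        # Registering an existing name replaces the earlier plugin.
        self._plugins[name] = render

    def render(self, data, strict=False):
        results = []
        for name, fn in self._plugins.items():
            try:
                results.append((name, fn(data)))
            except Exception:
                if strict:
                    raise  # development mode: surface the failure immediately
                # Stand-in for the error pane: keep the traceback visible.
                message = f"Plugin {name!r} failed:\n{traceback.format_exc()}"
                results.append((name, message))
        return tuple(results)


registry = RenderRegistrySketch()
registry.register("ok", lambda data: data * 2)
registry.register("bad", lambda data: 1 / 0)
```

In the default mode one failing plugin does not prevent the others from rendering; with `strict=True` the first exception propagates to the caller.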

sourcery-ai (Bot, Contributor) commented Apr 26, 2026

Reviewer's Guide

Introduces a new, fully additive plot plugin infrastructure: a frozen BenchData value type, a PlotPlugin protocol and decorator, a lazy-discovered PluginRegistry with selection and rendering (including error-substitution), top-level exports, and a focused test suite validating registration, discovery, selection, and error-handling behaviors.

Class diagram for new plot plugin infrastructure

classDiagram
    class BenchData {
        <<frozen dataclass>>
        +xr.Dataset dataset
        +tuple input_vars
        +tuple result_vars
        +PltCntCfg plt_cnt_cfg
        +RunMeta run_meta
        +Any optimizer_study
        +tuple~BenchData~ baseline_runs
        +CacheHandle cache
        +bool has(capability)
        +BenchData with_changes(**kwargs)
        +BenchData fake(dataset, input_vars, result_vars, plt_cnt_cfg, **overrides)
    }

    class RunMeta {
        <<frozen dataclass>>
        +str name
        +datetime timestamp
        +str sweep_hash
    }

    class CacheHandle {
        <<protocol>>
        +Any get(key)
        +None set(key, value)
    }

    class PlotPlugin {
        <<protocol>>
        +str name
        +str backend
        +PlotFilter match
        +int priority
        +frozenset~str~ requires
        +pn.viewable.Viewable render(data)
    }

    class _FunctionPlugin {
        <<dataclass>>
        +str name
        +str backend
        +PlotFilter match
        +int priority
        +frozenset~str~ requires
        -Callable fn
        +pn.viewable.Viewable render(data)
    }

    class PluginRegistry {
        -dict~str, PlotPlugin~ _plugins
        -bool _entry_points_loaded
        +register(plugin)
        +unregister(name)
        +clear()
        +mark_entry_points_loaded()
        +PlotPlugin get(name)
        +tuple~PlotPlugin~ all()
        -_ensure_entry_points_loaded()
        -_register_loaded(ep_name, obj)
        +tuple~PlotPlugin~ select(data, include, exclude, backend, only)
        +tuple~(str, pn.viewable.Viewable)~ render(data, include, exclude, backend, only, strict)
    }

    class plot_plugin {
        <<function decorator>>
        +_FunctionPlugin __call__(fn)
    }

    BenchData o-- RunMeta : run_meta
    BenchData o-- CacheHandle : cache
    BenchData o-- BenchData : baseline_runs

    _FunctionPlugin ..|> PlotPlugin
    CacheHandle <|.. CacheHandle
    PlotPlugin <|.. PlotPlugin

    PluginRegistry "*" o--> PlotPlugin : manages
    plot_plugin ..> _FunctionPlugin : creates
    plot_plugin ..> PluginRegistry : optional_register
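The `has()` and `with_changes()` members in the diagram map naturally onto a frozen dataclass. The sketch below is an assumption about how that could work, not the PR's implementation: `BenchDataSketch` is a hypothetical name, only a subset of the fields is shown, and capability names are assumed to equal field names, with `None` or an empty tuple meaning "not available".

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Any


@dataclass(frozen=True)
class BenchDataSketch:
    """Hypothetical reduced version of BenchData."""
    dataset: Any
    input_vars: tuple = ()
    result_vars: tuple = ()
    optimizer_study: Any = None
    baseline_runs: tuple = ()
    cache: Any = None

    def has(self, capability: str) -> bool:
        # Assumption: a capability is available when the field is populated.
        value = getattr(self, capability, None)
        if value is None:
            return False
        if isinstance(value, tuple):
            return len(value) > 0
        return True

    def with_changes(self, **kwargs: Any) -> "BenchDataSketch":
        # Frozen instances cannot be mutated, so return a modified copy.
        return replace(self, **kwargs)


base = BenchDataSketch(dataset={"x": [1, 2, 3]})
with_study = base.with_changes(optimizer_study=object())
```

Here `base` is left untouched and `with_study` is a new object, which is what lets plugins treat the value they receive as read-only.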

File-Level Changes

Change: Introduce BenchData value type and related metadata/cache contracts for plugins.
Details:
  • Add CacheHandle protocol as the plugin-visible caching interface.
  • Define frozen RunMeta dataclass with basic run identity fields.
  • Define frozen BenchData dataclass encapsulating dataset, vars, plotting config, optional optimizer/baseline/cache, plus a capability-based has() helper and with_changes() copier.
  • Add BenchData.fake() convenience constructor for tests with sensible defaults.
Files: bencher/plugins/bench_data.py

Change: Define the PlotPlugin protocol and a function-based plugin implementation/decorator.
Details:
  • Define PlotPlugin runtime-checkable protocol specifying name/backend/match/priority/requires and render() contract.
  • Implement _FunctionPlugin dataclass that wraps a callable into a concrete PlotPlugin.
  • Provide plot_plugin decorator to create _FunctionPlugin instances, with optional auto-registration into the global registry via register flag.
Files: bencher/plugins/plugin.py

Change: Implement PluginRegistry with lazy entry-point discovery, selection, and render orchestration plus global helpers.
Details:
  • Add _render_error_pane helper that turns exceptions into Markdown panes with tracebacks.
  • Implement PluginRegistry managing in-process plugins, including register/override, unregister, clear, and test-only mark_entry_points_loaded.
  • Implement lazy entry-point discovery for ENTRY_POINT_GROUP, supporting direct plugin instances, factories, and tolerant error handling/logging.
  • Provide select() to filter plugins by name/backend, gate by BenchData capabilities and PlotFilter.matches_result, and order by priority/name.
  • Provide render() to run selected plugins, optionally strict, substituting error panes on failure; expose a process-global registry with get_registry/register_plugin/unregister_plugin helpers.
Files: bencher/plugins/registry.py

Change: Expose plugin infrastructure as public API and verify via unit tests.
Details:
  • Create bencher.plugins package that re-exports BenchData/RunMeta/CacheHandle, PlotPlugin/plot_plugin, PluginRegistry helpers, and PlotFilter/VarRange/ENTRY_POINT_GROUP.
  • Ensure plugin API is reachable at the bencher top level via __init__ exports (per PR description, though diff is truncated).
  • Add comprehensive unit tests covering BenchData behavior, registry registration/override/unregister, selection filters and capability gating, render error handling/strict mode, global registration helpers, and entry-point discovery behaviors.
Files: bencher/plugins/__init__.py, bencher/__init__.py, test/test_plugins.py
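Lazy, tolerant entry-point discovery of the kind summarized above can be approximated with `importlib.metadata.entry_points` (the `group=` keyword needs Python 3.10 or later). This is a sketch, not the registry in this PR: `LazyRegistrySketch` and its injectable `loader` are hypothetical, and treating any callable without a `render` attribute as a zero-argument factory is an assumption.

```python
import logging
from importlib.metadata import entry_points

ENTRY_POINT_GROUP = "bencher.plot_plugins"
log = logging.getLogger(__name__)


class LazyRegistrySketch:
    """Hypothetical registry that loads entry points on first lookup."""

    def __init__(self, loader=None):
        self._plugins = {}
        self._loaded = False
        # The loader is injectable so the sketch can run without installed plugins.
        self._loader = loader or (lambda: entry_points(group=ENTRY_POINT_GROUP))

    def _ensure_loaded(self):
        if self._loaded:
            return
        self._loaded = True  # set first so discovery runs at most once
        for ep in self._loader():
            try:
                obj = ep.load()
                # Assumption: a callable without render() is a plugin factory.
                if callable(obj) and not hasattr(obj, "render"):
                    obj = obj()
                self._plugins[obj.name] = obj
            except Exception:
                # Tolerant mode: log and skip, never break the host program.
                log.warning("skipping plot plugin %r", ep.name, exc_info=True)

    def all(self):
        self._ensure_loaded()
        return tuple(self._plugins.values())
```

A third-party package would advertise its plugin under the `bencher.plot_plugins` group in its packaging metadata; nothing is imported until the first call to `all()`.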

@github-actions

Performance Report for 0ae9f6c

| Metric | Value |
| --- | --- |
| Total tests | 1379 |
| Total time | 115.41s |
| Mean | 0.0837s |
| Median | 0.0010s |

Top 10 slowest tests

| Test | Time (s) |
| --- | --- |
| test.test_bench_examples.TestBenchExamples::test_example_meta | 19.081 |
| test.test_over_time_save_perf::test_save_faster_without_aggregated_tab | 5.070 |
| test.test_hash_persistent.TestCrossProcessDeterminism::test_hash_stable_across_two_processes[ResultBool] | 4.466 |
| test.test_generated_examples::test_generated_example[cartesian_animation/example_cartesian_animation.py] | 3.175 |
| test.test_over_time_repeats.TestMaxSliderPoints::test_default_subsampling_caps_at_max | 3.167 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_drift.py] | 2.961 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_step.py] | 2.858 |
| test.test_generated_examples::test_generated_example[result_types/result_image/example_result_image_to_video.py] | 2.814 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_noise.py] | 2.355 |
| test.test_bench_runner.TestBenchRunner::test_benchrunner_unified_interface | 1.511 |

Full report

Updated by Performance Tracking workflow

Captures the design that was settled before tier-0 infrastructure landed
so contributors and agents continuing the work have the full context: the
alternatives considered at each branch of the decision tree, the tensions
surfaced, the pivotal moments where the design changed direction, and the
reasoning behind every chosen position.

Sections:
- Goal and motivation (links to the existing TODO in bench_result_base.py)
- Resolved design at a glance (single-table summary)
- Decision walkthrough — the eight resolved questions: granularity,
  composition, output target, contract, discovery + migrate-built-ins,
  meta-views, lifecycle, phasing
- Contract surfaces with signatures
- Runtime model (discovery, selection, render, override semantics)
- Migration plan (six tiers + cleanup, simplest first)
- Tier-0 status
- Open tactical questions (deferred, not forgotten)
- Handoff guide with a step-by-step recipe for tier 1

@github-actions

Performance Report for 534d33e

| Metric | Value |
| --- | --- |
| Total tests | 1379 |
| Total time | 112.99s |
| Mean | 0.0819s |
| Median | 0.0010s |

Top 10 slowest tests

| Test | Time (s) |
| --- | --- |
| test.test_bench_examples.TestBenchExamples::test_example_meta | 19.271 |
| test.test_over_time_save_perf::test_save_faster_without_aggregated_tab | 4.770 |
| test.test_hash_persistent.TestCrossProcessDeterminism::test_hash_stable_across_two_processes[ResultBool] | 4.424 |
| test.test_generated_examples::test_generated_example[cartesian_animation/example_cartesian_animation.py] | 3.152 |
| test.test_over_time_repeats.TestMaxSliderPoints::test_default_subsampling_caps_at_max | 2.989 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_drift.py] | 2.928 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_step.py] | 2.824 |
| test.test_generated_examples::test_generated_example[result_types/result_image/example_result_image_to_video.py] | 2.720 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_noise.py] | 2.344 |
| test.test_bench_runner.TestBenchRunner::test_benchrunner_unified_interface | 1.513 |

Full report

Updated by Performance Tracking workflow
