Plot plugin infrastructure (tier 0, additive) #932
Draft
blooop wants to merge 2 commits into
Lays down the registry, contract, and discovery for the plot plugin system without changing any existing behavior. The registry is created but no existing code path queries it yet; built-in chart types migrate onto this mechanism in subsequent PRs.

Public surface (re-exported at the top level):
- BenchData: frozen value type handed to plugin render(), the stable contract
- PlotPlugin: protocol all plugins satisfy (name, backend, match, priority, requires, render)
- @plot_plugin: decorator for the function form
- register_plugin / unregister_plugin / get_registry: explicit registration

Runtime model:
- Lazy entry-point discovery via group `bencher.plot_plugins`
- Skip-with-warning on plugin import failure (tolerant)
- Override-by-name (user plugins replace built-ins by sharing a name)
- Selection by match filter, with include/exclude/backend/only controls and capability gating via PlotFilter requires set
- Render coordinator catches exceptions per-plugin and substitutes a visible error pane; strict=True opt-in re-raises for development
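To make the discovery part of the runtime model concrete, here is a minimal sketch of lazy, tolerant entry-point loading combined with override-by-name. It is not the code from this PR: `SketchRegistry`, its `loader` parameter, and the rule that an explicit registration wins over a discovered entry point of the same name are all assumptions made for illustration. Only the group name `bencher.plot_plugins` comes from the commit message. The `entry_points(group=...)` call requires Python 3.10 or newer.

```python
import warnings
from importlib.metadata import entry_points


class SketchRegistry:
    """Hypothetical stand-in for the plugin registry, for illustration only."""

    def __init__(self, group="bencher.plot_plugins", loader=None):
        self._group = group
        # `loader` yields objects with `.name` and `.load()`; injectable for tests.
        self._loader = loader or (lambda: entry_points(group=self._group))
        self._plugins = {}
        self._entry_points_loaded = False

    def register(self, name, plugin):
        # Override-by-name: a later registration replaces the earlier one.
        self._plugins[name] = plugin

    def _ensure_entry_points_loaded(self):
        if self._entry_points_loaded:
            return
        # Set the flag first so a failing plugin cannot trigger a second scan.
        self._entry_points_loaded = True
        for ep in self._loader():
            try:
                plugin = ep.load()
            except Exception as exc:  # tolerant: skip the broken plugin, keep going
                warnings.warn(f"skipping plot plugin {ep.name!r}: {exc}")
                continue
            # Assumption: explicit registrations made earlier take precedence.
            self._plugins.setdefault(ep.name, plugin)

    def get(self, name):
        # Discovery happens on first lookup, not at import time.
        self._ensure_entry_points_loaded()
        return self._plugins[name]
```

Deferring the scan to the first lookup keeps `import bencher` cheap, and catching the exception per entry point means one broken third-party package cannot take down the others.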
Reviewer's Guide

Introduces a new, fully additive plot plugin infrastructure: a frozen BenchData value type, a PlotPlugin protocol and decorator, a lazy-discovered PluginRegistry with selection and rendering (including error-substitution), top-level exports, and a focused test suite validating registration, discovery, selection, and error-handling behaviors.

Class diagram for new plot plugin infrastructure

classDiagram
class BenchData {
<<frozen dataclass>>
+xr.Dataset dataset
+tuple input_vars
+tuple result_vars
+PltCntCfg plt_cnt_cfg
+RunMeta run_meta
+Any optimizer_study
+tuple~BenchData~ baseline_runs
+CacheHandle cache
+bool has(capability)
+BenchData with_changes(**kwargs)
+BenchData fake(dataset, input_vars, result_vars, plt_cnt_cfg, **overrides)
}
class RunMeta {
<<frozen dataclass>>
+str name
+datetime timestamp
+str sweep_hash
}
class CacheHandle {
<<protocol>>
+Any get(key)
+None set(key, value)
}
class PlotPlugin {
<<protocol>>
+str name
+str backend
+PlotFilter match
+int priority
+frozenset~str~ requires
+pn.viewable.Viewable render(data)
}
class _FunctionPlugin {
<<dataclass>>
+str name
+str backend
+PlotFilter match
+int priority
+frozenset~str~ requires
-Callable fn
+pn.viewable.Viewable render(data)
}
class PluginRegistry {
-dict~str, PlotPlugin~ _plugins
-bool _entry_points_loaded
+register(plugin)
+unregister(name)
+clear()
+mark_entry_points_loaded()
+PlotPlugin get(name)
+tuple~PlotPlugin~ all()
-_ensure_entry_points_loaded()
-_register_loaded(ep_name, obj)
+tuple~PlotPlugin~ select(data, include, exclude, backend, only)
+tuple~(str, pn.viewable.Viewable)~ render(data, include, exclude, backend, only, strict)
}
class plot_plugin {
<<function decorator>>
+_FunctionPlugin __call__(fn)
}
BenchData o-- RunMeta : run_meta
BenchData o-- CacheHandle : cache
BenchData o-- BenchData : baseline_runs
_FunctionPlugin ..|> PlotPlugin
CacheHandle <|.. CacheHandle
PlotPlugin <|.. PlotPlugin
PluginRegistry "*" o--> PlotPlugin : manages
plot_plugin ..> _FunctionPlugin : creates
plot_plugin ..> PluginRegistry : optional_register
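
The diagram's `BenchData.has`, `with_changes`, `requires`, `select`, and `render(..., strict)` members can be read together as one flow. The sketch below illustrates that flow under stated assumptions and is not the PR's implementation: `SketchData`, `SketchPlugin`, and the free functions `select` and `render` are hypothetical stand-ins, panes are plain strings instead of Panel viewables, "capability" is assumed to mean a non-empty field of the same name, and higher `priority` is assumed to sort first.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class SketchData:
    """Stand-in for BenchData; only a few fields are modelled."""
    dataset: Any
    optimizer_study: Optional[Any] = None
    baseline_runs: tuple = ()

    def has(self, capability: str) -> bool:
        # Assumption: a capability is available when its field is set and non-empty.
        value = getattr(self, capability, None)
        return not (value is None or (isinstance(value, tuple) and not value))

    def with_changes(self, **kwargs) -> "SketchData":
        # Frozen instances cannot be mutated, so return a modified copy.
        return replace(self, **kwargs)


@dataclass(frozen=True)
class SketchPlugin:
    name: str
    render: Callable[[SketchData], Any]
    priority: int = 0
    requires: frozenset = frozenset()


def select(plugins, data, include=None, exclude=()):
    chosen = [
        p for p in plugins
        if all(data.has(c) for c in p.requires)  # capability gating
        and (include is None or p.name in include)
        and p.name not in exclude
    ]
    # Assumption: higher priority first, name as a deterministic tie-break.
    return tuple(sorted(chosen, key=lambda p: (-p.priority, p.name)))


def render(plugins, data, strict=False, **filters):
    out = []
    for p in select(plugins, data, **filters):
        try:
            out.append((p.name, p.render(data)))
        except Exception as exc:
            if strict:
                raise  # development mode: surface the failure immediately
            # Otherwise substitute a visible error placeholder for this plugin only.
            out.append((p.name, f"[{p.name} failed: {exc!r}]"))
    return tuple(out)


data = SketchData(dataset={"x": [1, 2]})
plugins = [
    SketchPlugin("table", lambda d: "table pane"),
    SketchPlugin("optuna", lambda d: "study pane",
                 requires=frozenset({"optimizer_study"})),
    SketchPlugin("broken", lambda d: 1 / 0, priority=5),
]
print(render(plugins, data))
```

With this input the `optuna` plugin is filtered out because the data carries no optimizer study, and the `broken` plugin yields an error placeholder while `table` still renders.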
Performance Report for

| Metric | Value |
|---|---|
| Total tests | 1379 |
| Total time | 115.41s |
| Mean | 0.0837s |
| Median | 0.0010s |

Top 10 slowest tests

| Test | Time (s) |
|---|---|
| test.test_bench_examples.TestBenchExamples::test_example_meta | 19.081 |
| test.test_over_time_save_perf::test_save_faster_without_aggregated_tab | 5.070 |
| test.test_hash_persistent.TestCrossProcessDeterminism::test_hash_stable_across_two_processes[ResultBool] | 4.466 |
| test.test_generated_examples::test_generated_example[cartesian_animation/example_cartesian_animation.py] | 3.175 |
| test.test_over_time_repeats.TestMaxSliderPoints::test_default_subsampling_caps_at_max | 3.167 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_drift.py] | 2.961 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_step.py] | 2.858 |
| test.test_generated_examples::test_generated_example[result_types/result_image/example_result_image_to_video.py] | 2.814 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_noise.py] | 2.355 |
| test.test_bench_runner.TestBenchRunner::test_benchrunner_unified_interface | 1.511 |

Updated by Performance Tracking workflow
Captures the design that was settled before tier-0 infrastructure landed so contributors and agents continuing the work have the full context: the alternatives considered at each branch of the decision tree, the tensions surfaced, the pivotal moments where the design changed direction, and the reasoning behind every chosen position.

Sections:
- Goal and motivation (links to the existing TODO in bench_result_base.py)
- Resolved design at a glance (single-table summary)
- Decision walkthrough, covering the eight resolved questions: granularity, composition, output target, contract, discovery + migrate-built-ins, meta-views, lifecycle, phasing
- Contract surfaces with signatures
- Runtime model (discovery, selection, render, override semantics)
- Migration plan (six tiers + cleanup, simplest first)
- Tier-0 status
- Open tactical questions (deferred, not forgotten)
- Handoff guide with a step-by-step recipe for tier 1
Performance Report for

| Metric | Value |
|---|---|
| Total tests | 1379 |
| Total time | 112.99s |
| Mean | 0.0819s |
| Median | 0.0010s |

Top 10 slowest tests

| Test | Time (s) |
|---|---|
| test.test_bench_examples.TestBenchExamples::test_example_meta | 19.271 |
| test.test_over_time_save_perf::test_save_faster_without_aggregated_tab | 4.770 |
| test.test_hash_persistent.TestCrossProcessDeterminism::test_hash_stable_across_two_processes[ResultBool] | 4.424 |
| test.test_generated_examples::test_generated_example[cartesian_animation/example_cartesian_animation.py] | 3.152 |
| test.test_over_time_repeats.TestMaxSliderPoints::test_default_subsampling_caps_at_max | 2.989 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_drift.py] | 2.928 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_step.py] | 2.824 |
| test.test_generated_examples::test_generated_example[result_types/result_image/example_result_image_to_video.py] | 2.720 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_noise.py] | 2.344 |
| test.test_bench_runner.TestBenchRunner::test_benchrunner_unified_interface | 1.513 |

Updated by Performance Tracking workflow
Summary
Lays down the plot plugin registry, contract, and discovery — purely additive. The registry exists but no existing code path queries it yet. Built-in chart types migrate onto this mechanism in subsequent PRs.
This is the first piece of a larger plan to replace the inheritance-based rendering system in `bencher/results/` with a plugin registry, so third parties can ship plot backends in their own repos and users can write one-off plugins inline without touching upstream.

Design decisions (resolved before implementation)
- `pn.viewable.Viewable` / Panel HTML report. Plugins own internal composition (linked `hv.Layout`, `plotly.subplots`, full Rerun blueprints); bencher does Panel-level outer composition only. Non-HTML outputs (full-Rerun export, future PDF) consume `bench.dataset` / `BenchData` outside the plugin system.
- `(name, backend, match: PlotFilter, priority, requires: frozenset[str], render(BenchData) -> pn.viewable.Viewable)`.
- `BenchData`: frozen value type with dataset, input/result vars, plt_cnt_cfg, run_meta, plus optional `optimizer_study`, `baseline_runs`, `cache`. One contract covers chart and meta-views; plugins gate on availability via `PlotFilter.requires`.
- `bencher.plot_plugins` (lazy, on first lookup) + `bencher.register_plugin(...)` for in-script plugins.
- (`strict=True` re-raises for development).
- `to_<type>()` API surface is preserved as thin shims dispatching through the registry; zero user-visible breakage.

What's in this PR
| File | Contents |
|---|---|
| `bencher/plugins/bench_data.py` | `BenchData` (frozen), `RunMeta`, `CacheHandle` protocol, `BenchData.fake()` test constructor |
| `bencher/plugins/plugin.py` | `PlotPlugin` protocol, `@plot_plugin` decorator (function form) |
| `bencher/plugins/registry.py` | `PluginRegistry`: register, override-by-name, lazy entry-point discovery, `select(...)` with priority/include/exclude/backend/only filters, `render(...)` with error-pane substitution and `strict=True` opt-in |
| `bencher/plugins/__init__.py` | |
| `bencher/__init__.py` | |
| `test/test_plugins.py` | |

What is not changed
- `LineResult`, `HeatmapResult`, etc.) is touched.
- `bench_result.py` is unchanged.

Test plan
- `pixi run format`: clean
- `pixi run ruff-lint`: clean
- `pixi run pylint`: 10.00/10
- `pixi run ty`: no diagnostics on new files
- `pixi run pytest test/test_plugins.py`: 23/23 pass
- `import bencher` smoke test: plugin API reachable at the top level

Next up
Tier 1: migrate `LineResult`, `HeatmapResult`, `CurveResult`, `BarResult`, `ScatterResult`, `BandResult`, `TableResult` onto the registry. The existing `to_<type>()` methods stay as thin shims; behavior unchanged. That migration validates the contract before tackling the schema-stretching cases (Optuna, regression reports, Explorer, Rerun).

🤖 Generated with Claude Code
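
As a rough picture of what "thin shims dispatching through the registry" could mean for tier 1, the sketch below keeps `to_line()` and `to_heatmap()` as public methods while delegating the actual rendering to a name-keyed registry. Everything here is hypothetical: `_REGISTRY`, `register`, `SketchResult`, and the plugin names `"line"` and `"heatmap"` are illustrative stand-ins, and the real shims may look different.

```python
from typing import Any, Callable, Dict

# Hypothetical module-level registry: plugin name -> render callable.
_REGISTRY: Dict[str, Callable[[Any], Any]] = {}


def register(name: str, fn: Callable[[Any], Any]) -> None:
    # Override-by-name: registering an existing name replaces the old plugin.
    _REGISTRY[name] = fn


class SketchResult:
    """Stand-in for a result class that keeps its to_<type>() methods."""

    def __init__(self, data: Any) -> None:
        self._data = data

    def _render_via_registry(self, name: str) -> Any:
        # The shim holds no plotting logic; it only looks up and calls the plugin.
        return _REGISTRY[name](self._data)

    def to_line(self) -> Any:
        return self._render_via_registry("line")

    def to_heatmap(self) -> Any:
        return self._render_via_registry("heatmap")


# Built-in plugins registered under the names the shims expect.
register("line", lambda data: f"line plot of {data}")
register("heatmap", lambda data: f"heatmap of {data}")

result = SketchResult([1, 2, 3])
print(result.to_line())
```

Because the shim resolves the plugin at call time, a user who registers their own `"line"` plugin changes what `to_line()` returns without any change to the calling code.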
Summary by Sourcery
Introduce a new plot plugin infrastructure with a registry, plugin contract, and shared BenchData value type, without yet wiring it into existing plotting code paths.
New Features:
Tests: