AORTA-23 : Phase 6 : Added plots for comparison pipeline #103
prosenjitdhole wants to merge 1 commit into prosenj_cli_hq_eval_report_phase_5
Conversation
Pull request overview
Adds Phase 6 support for generating and wiring up “comparison mode” (Mode C) plots in the HWQueue report pipeline, enabling visual baseline-vs-test analysis across multiple workloads.
Changes:
- Implemented Mode C plot generation (throughput comparison, delta summary, regression heatmap, latency delta).
- Hooked comparison plot generation into the HWQueue comparison pipeline step.
- Exported `generate_comparison_plots` from the generators package.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `src/aorta/report/pipelines/hwqueue_pipeline.py` | Calls the new comparison plot generator and records generated plot outputs in the pipeline result. |
| `src/aorta/report/generators/hwqueue_plots.py` | Adds plot functions and an orchestrator for Mode C multi-workload baseline/test comparisons. |
| `src/aorta/report/generators/__init__.py` | Re-exports `generate_comparison_plots` for public generator access. |
```python
    threshold=config.threshold,
    verbose=config.verbose,
)
result.files_generated["plots"] = plot_files
```
HWQueuePipelineResult.files_generated is annotated as Dict[str, Path], but here it is assigned a List[Path] (plot_files). This also means the CLI output (which only prints entries that are isinstance(path, Path)) will silently omit the generated plots. Consider updating the result type to accept lists (e.g., Path | list[Path]) and/or store a single plots_dir Path plus individual plot paths under separate keys so downstream code can display them consistently.
```suggestion
# Record the directory containing plots and each individual plot file.
result.files_generated["plots_dir"] = plots_dir
for plot_file in plot_files:
    result.files_generated[f"plot:{plot_file.name}"] = plot_file
```
```python
_, b_tp = baseline_data[wl].get_best_throughput()
_, t_tp = test_data[wl].get_best_throughput()
if b_tp > 0:
    change_pct = (t_tp - b_tp) / b_tp * 100
else:
    change_pct = 0
changes.append((wl, change_pct))
```
When b_tp is 0 (or results are empty), this sets change_pct = 0, which can be misleading (e.g., baseline 0 → test >0 shows as 0% change). Consider using np.nan/None (and labeling as N/A), or handling the division-by-zero case explicitly (e.g., treat as +∞ / -100% depending on semantics) so the plot reflects missing/invalid baselines accurately.
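A minimal sketch of the NaN-based approach, assuming the `b_tp`/`t_tp` values from the snippet above; the helper name `throughput_change_pct` is hypothetical:

```python
import math

def throughput_change_pct(b_tp: float, t_tp: float) -> float:
    """Percent change in throughput; NaN marks an invalid baseline.

    A zero (or negative) baseline cannot anchor a percentage, so we
    return NaN and let the plotting layer render the cell as "N/A"
    instead of a misleading 0%.
    """
    if b_tp > 0:
        return (t_tp - b_tp) / b_tp * 100
    return math.nan  # baseline missing/zero: percentage undefined

# Downstream, label NaN entries explicitly rather than plotting 0%:
change_pct = throughput_change_pct(0, 50)
label = "N/A" if math.isnan(change_pct) else f"{change_pct:+.1f}%"
```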
```python
for sc in sorted_streams:
    b_val = b_by_streams.get(sc)
    t_val = t_by_streams.get(sc)
    if b_val and t_val and b_val > 0:
```
The if b_val and t_val and b_val > 0: check treats t_val == 0 as missing data and writes np.nan, which hides real regressions (e.g., throughput dropping to 0 should show as -100%). Use explicit is not None checks (and keep the b_val > 0 guard) so zero values are still plotted.
```suggestion
if b_val is not None and t_val is not None and b_val > 0:
```
```python
sorted_streams = sorted(all_streams)

# Build heatmap data (% change)
heatmap_data = []
for wl in common_workloads:
```
sorted_streams can be empty if sweep JSONs contain no results. In that case heatmap_data becomes an array with shape (len(common_workloads), 0) and ax.imshow(...) will raise due to invalid image dimensions. Add an early guard (e.g., if no stream counts or heatmap_data.size == 0) to skip this plot with a clear message or return None.
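One way to sketch the suggested guard, assuming a hypothetical `build_heatmap` helper whose `pct_change(wl, sc)` callback stands in for the per-cell computation in the real code:

```python
import numpy as np

def build_heatmap(common_workloads, sorted_streams, pct_change):
    """Return a 2-D array of % changes, or None when there is nothing to plot.

    `pct_change(wl, sc)` is a hypothetical lookup for one heatmap cell.
    Returning None lets the caller skip ax.imshow() with a clear message
    instead of raising on a (N, 0)-shaped image.
    """
    if not sorted_streams or not common_workloads:
        return None  # no stream counts or workloads: skip this plot
    heatmap = np.array([[pct_change(wl, sc) for sc in sorted_streams]
                        for wl in common_workloads])
    if heatmap.size == 0:
        return None  # degenerate shape would make imshow() raise
    return heatmap
```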
```python
if b_p99 and t_p99 and b_p99 > 0:
    # Note: For latency, lower is better, so positive change is regression
    change_pct = (t_p99 - b_p99) / b_p99 * 100
else:
    change_pct = 0
```
The if b_p99 and t_p99 and b_p99 > 0: condition treats valid zero values as missing (because 0 is falsy), which can hide large improvements/regressions (e.g., test P99 = 0 would currently fall back to change_pct = 0). Use is not None checks for b_p99/t_p99, plus the b_p99 > 0 guard, to compute percentage changes correctly.
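A hedged sketch of the `is not None` fix for the P99 comparison (variable names mirror the snippet above; using `math.nan` for the missing case is an assumption about how the plot flags invalid data):

```python
import math

def p99_change_pct(b_p99, t_p99):
    """% change in P99 latency; lower is better, so positive = regression."""
    # `is not None` keeps valid zeros: a test P99 of 0 now yields -100%
    # instead of being silently treated as "no change".
    if b_p99 is not None and t_p99 is not None and b_p99 > 0:
        return (t_p99 - b_p99) / b_p99 * 100
    return math.nan  # missing or zero baseline: change is undefined
```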