AORTA-23 : Phase 1 : JSON loader for all configurations. #98
prosenjitdhole wants to merge 12 commits into prosenj_cli_hq_eval_report from
Conversation
Pull request overview
Adds an HW Queue Eval JSON loader and related dataclasses to the report-processing layer, enabling parsing of single-run, sweep, and directory-based multi-workload result outputs.
Changes:
- Introduces `hwqueue_loader.py` with dataclasses for HW Queue Eval results and a loader API for single files, sweeps, and directories.
- Exposes the new loader and dataclasses via `aorta.report.processing` package exports; a minimal usage sketch follows below.
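For orientation, here is a minimal usage sketch of the exported API. The entry-point names (`load_auto`, `load_directory`) are taken from the review comments below, but their exact signatures and return shapes are assumptions, not confirmed by this diff.

```python
# Hypothetical usage sketch; import path and call signatures are assumptions
# based on the package exports and the loader names referenced in this PR.
from pathlib import Path

from aorta.report.processing import load_auto, load_directory

# Single file: load_auto() is expected to dispatch between the
# single-run and sweep JSON formats.
sweep = load_auto(Path("results/hetero_kernels_results.json"))

# Directory of per-workload *_results.json files.
by_workload = load_directory(Path("results/"))
```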
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| src/aorta/report/processing/hwqueue_loader.py | New dataclasses + JSON loading utilities for HW Queue Eval results (single/sweep/directory + comparison helpers). |
| src/aorta/report/processing/__init__.py | Re-exports HW Queue Eval loader/types from the processing package. |
@prosenjitdhole I've opened a new pull request, #120, to work on those changes. Once the pull request is ready, I'll request review from you.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.
… JSON (#120)

* Initial plan
* Add isinstance(data, dict) guard to _is_single_run_format and _is_sweep_format
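A rough sketch of what that guard could look like. The isinstance check is confirmed by the commit above; the "results"-list heuristic for recognising a sweep document is inferred from `SweepData.from_dict()` further down and is an assumption, not the confirmed detection logic.

```python
from typing import Any

def _is_sweep_format(data: Any) -> bool:
    # Guard from #120: non-dict payloads (lists, strings, null) can never
    # be a valid sweep document, so bail out before probing keys.
    if not isinstance(data, dict):
        return False
    # Assumed heuristic: a sweep document carries a list of per-run results.
    return isinstance(data.get("results"), list)
```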
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
@prosenjitdhole I've opened a new pull request, #123, to work on those changes. Once the pull request is ready, I'll request review from you.

@prosenjitdhole I've opened a new pull request, #124, to work on those changes. Once the pull request is ready, I'll request review from you.
* Initial plan
* Add TODO for optimizing load_comparison_data to avoid unnecessary I/O
@prosenjitdhole I've opened a new pull request, #125, to work on those changes. Once the pull request is ready, I'll request review from you.
…json` in `load_directory()` (#125)

* Initial plan
* Load environment_info.json in load_directory() for single-run wrappers
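A sketch of where that could land inside `load_directory()`; the helpers `_load_json` and `EnvironmentData.from_dict` appear elsewhere in this PR, but this exact wiring is an assumption.

```python
# Assumed fragment of the #125 change, not the verbatim implementation:
# when a directory holds bare single-run result files, pick up a sibling
# environment_info.json so the single-run wrapper still carries env data.
env_file = dir_path / "environment_info.json"
env_dict = _load_json(env_file) if env_file.exists() else None
environment = EnvironmentData.from_dict(env_dict)
```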
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
```python
def _load_json(path: Path) -> Dict[str, Any]:
    """Load JSON file with error handling."""
    if not path.exists():
        raise HWQueueLoaderError(f"File not found: {path}")
    if not path.is_file():
        raise HWQueueLoaderError(f"Not a file: {path}")

    try:
        with open(path, "r") as f:
            return json.load(f)
    except (json.JSONDecodeError, OSError, UnicodeDecodeError) as e:
        raise HWQueueLoaderError(f"Failed to load JSON from {path}: {e}") from e
```
_load_json() is annotated to return Dict[str, Any], but json.load() is typed as returning Any. With the repo's mypy settings (warn_return_any = true), this will be flagged as returning Any from a function declared to return a Dict. Consider loading into a local var, validating it's a dict (or raising HWQueueLoaderError if not), and then returning a cast(Dict[str, Any], data) (or loosen the return type to Any).
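One possible shape of that fix, as a sketch rather than final code; the isinstance check, error message, and cast placement are suggestions, and `HWQueueLoaderError` is the module's existing exception type.

```python
import json
from pathlib import Path
from typing import Any, Dict, cast

def _load_json(path: Path) -> Dict[str, Any]:
    """Load a JSON object from disk, raising HWQueueLoaderError on any failure."""
    if not path.exists():
        raise HWQueueLoaderError(f"File not found: {path}")
    if not path.is_file():
        raise HWQueueLoaderError(f"Not a file: {path}")
    try:
        with open(path, "r") as f:
            data = json.load(f)
    except (json.JSONDecodeError, OSError, UnicodeDecodeError) as e:
        raise HWQueueLoaderError(f"Failed to load JSON from {path}: {e}") from e
    # Validate the top-level shape so the declared Dict return type is honest
    # and mypy's warn_return_any does not flag the bare json.load() result.
    if not isinstance(data, dict):
        raise HWQueueLoaderError(
            f"Expected a JSON object in {path}, got {type(data).__name__}"
        )
    return cast(Dict[str, Any], data)
```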
```python
results = [SingleRunData.from_dict(r) for r in data.get("results", [])]
return cls(
    workload_name=data.get("workload", "unknown"),
    results=results,
    environment=EnvironmentData.from_dict(data.get("environment")),
    analysis=ScalingAnalysisData.from_dict(data.get("analysis")),
```
SweepData.from_dict() assumes every entry in data["results"] is a mapping; if an entry is not a dict, SingleRunData.from_dict(r) will raise (e.g., AttributeError on .get) and bypass HWQueueLoaderError handling in load_sweep() / load_auto(). Consider validating each element is a dict (and possibly required keys exist) and raising HWQueueLoaderError with a clear schema error message.
Suggested change (replacing the lines above):

```python
raw_results = data.get("results", [])
if not isinstance(raw_results, list):
    raise HWQueueLoaderError(
        "Invalid sweep JSON: 'results' must be a list of objects."
    )
results: List[SingleRunData] = []
for idx, r in enumerate(raw_results):
    if not isinstance(r, dict):
        raise HWQueueLoaderError(
            "Invalid sweep JSON: each entry in 'results' must be an object; "
            f"found {type(r).__name__} at index {idx}."
        )
    results.append(SingleRunData.from_dict(r))
env_data = data.get("environment")
if env_data is not None and not isinstance(env_data, dict):
    raise HWQueueLoaderError(
        "Invalid sweep JSON: 'environment' must be an object when present."
    )
analysis_data = data.get("analysis")
if analysis_data is not None and not isinstance(analysis_data, dict):
    raise HWQueueLoaderError(
        "Invalid sweep JSON: 'analysis' must be an object when present."
    )
return cls(
    workload_name=data.get("workload", "unknown"),
    results=results,
    environment=EnvironmentData.from_dict(env_data),
    analysis=ScalingAnalysisData.from_dict(analysis_data),
```
```python
gpu_count: int = 0
gpus: List[Dict[str, Any]] = field(default_factory=list)
```
EnvironmentData.gpus is typed as List[str], but the producer (aorta.utils.device.get_system_info() / log_environment_info()) records GPUs as a list of dicts (e.g., {index, name, memory_gb}), and save_sweep_results() persists that structure. This mismatch makes the loader's type hints misleading; consider changing gpus to List[Dict[str, Any]] (or introducing a dedicated GPU dataclass) and updating the factories accordingly.
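If the dedicated-dataclass route is taken, a minimal sketch; the field names follow the {index, name, memory_gb} structure mentioned above, while the defaults and coercions are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

@dataclass
class GPUInfo:
    """One GPU entry as recorded in the environment info (assumed field set)."""
    index: int = 0
    name: str = "unknown"
    memory_gb: float = 0.0

    @classmethod
    def from_dict(cls, data: Optional[Dict[str, Any]]) -> "GPUInfo":
        data = data or {}
        return cls(
            index=int(data.get("index", 0)),
            name=str(data.get("name", "unknown")),
            memory_gb=float(data.get("memory_gb", 0.0)),
        )
```

`EnvironmentData.gpus` would then become `List[GPUInfo]`, with its factory mapping each raw dict through `GPUInfo.from_dict`.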
```python
for json_file in json_files:
    try:
        # Extract workload name from filename (e.g., hetero_kernels_results.json -> hetero_kernels)
        workload_name = json_file.stem.replace("_results", "")
```
Workload name extraction uses json_file.stem.replace("_results", ""), which removes all occurrences of _results in the stem. For filenames like my_results_test_results.json this would produce my_test instead of my_results_test. Consider using a suffix-based removal (e.g., removesuffix("_results") with an endswith guard) to only strip the trailing marker.
Suggested change:

```python
workload_name = json_file.stem
if workload_name.endswith("_results"):
    workload_name = workload_name[: -len("_results")]
```
```python
    raise HWQueueLoaderError(f"Invalid directory: {dir_path}")
workloads = set()
for f in dir_path.glob("*_results.json"):
    workload_name = f.stem.replace("_results", "")
```
Same issue as in load_directory(): f.stem.replace("_results", "") will remove all _results occurrences, not just a trailing suffix, which can miscompute workload names for stems that legitimately contain _results earlier. Prefer a suffix-only removal (e.g., removesuffix("_results")).
Suggested change:

```python
workload_name = f.stem.removesuffix("_results")
```
Implement the JSON loader for all configurations, i.e. single-run, sweep-run, and multi-workload (directory) runs.