[CI]Add parity report scripts and workflow by ethanwee1 · Pull Request #3094 · ROCm/pytorch

ethanwee1 · 2026-03-20T19:56:30Z

Summary

- Add pytorch-unit-test-scripts/ directory with all parity scripts (download_testlogs, summarize_xml_testreports, parity.sh, and supporting utilities)
- Add parity.yml GitHub Actions workflow that can be manually triggered to download CI artifacts and generate parity CSVs
- All download_testlogs and summarize_xml_testreports.py flags are exposed as workflow inputs (SHA, PR ID, arch, exclude flags, filter, set names, etc.)
- Architectures are configurable via comma-separated input (default: mi200,mi300,mi355)
- Generated CSVs and logs are uploaded as downloadable workflow artifacts

Setup

Requires these repository secrets:

- IFU_GITHUB_TOKEN (already exists)
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
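
A pre-flight check for the three secrets listed above might look like the sketch below. It assumes the workflow exports each secret as an environment variable of the same name, which this page does not confirm; missing_secrets is a hypothetical helper, not part of the PR.

```python
import os

# Secret names taken from the Setup list above.
REQUIRED_SECRETS = ("IFU_GITHUB_TOKEN", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_secrets(env=None):
    """Return the required secret names that are unset or empty.

    Assumption: secrets reach the scripts as environment variables
    with the same names as the repository secrets.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SECRETS if not env.get(name)]


if __name__ == "__main__":
    absent = missing_secrets()
    if absent:
        raise SystemExit("missing secrets: " + ", ".join(absent))
```

Failing early keeps a misconfigured fork from spending time on downloads that cannot succeed.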

Test plan

- Trigger workflow via Actions tab or gh workflow run parity.yml --ref add-parity-scripts-dashboard
- Verify artifacts download and CSVs generate for each architecture
- Verify CSV artifacts are downloadable from the workflow run
- Example run: https://github.com/ethanwee1/pytorch/actions/runs/23413634454
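
For scripted triggering, the gh invocation above can be assembled programmatically. This is a minimal sketch: the -f key=value form is how gh workflow run passes workflow_dispatch inputs, but the input key used in the example (arch) is an assumption about how parity.yml names its inputs.

```python
import subprocess


def build_dispatch_command(workflow, ref, inputs):
    """Build the argv for `gh workflow run` with workflow_dispatch inputs."""
    cmd = ["gh", "workflow", "run", workflow, "--ref", ref]
    for key, value in inputs.items():
        # Each input is passed as a raw field: -f key=value
        cmd += ["-f", f"{key}={value}"]
    return cmd


def dispatch(workflow, ref, inputs):
    """Run the command; requires an authenticated gh CLI on PATH."""
    return subprocess.run(build_dispatch_command(workflow, ref, inputs), check=True)


if __name__ == "__main__":
    # "arch" is a hypothetical input key used only for illustration.
    print(build_dispatch_command("parity.yml", "add-parity-scripts-dashboard", {"arch": "mi355,mi300"}))
```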

Move unit test parity scripts from frameworks-internal into ROCm/pytorch. Includes download_testlogs, summarize_xml_testreports, parity.sh, and supporting utilities. Also adds a daily GitHub Actions workflow that generates parity CSVs for MI200/MI300/MI355 and deploys a skip reason dashboard to GitHub Pages.

Replace hardcoded workflow with full configurability: SHA, PR ID, architecture list, exclude flags, ignore_status, artifacts_only, no_rocm, no_cuda, set1/set2 names, and status filter. Remove dashboard/Pages -- workflow now just generates and uploads CSVs as downloadable artifacts.

rocm-repo-management-api · 2026-03-20T19:58:17Z

Jenkins build for 67120c11565d67066bccdc1232b8b3f11a6edbd6 commit finished as NOT_BUILT
Links: Pipeline Overview / Build artifacts / Test Results

rocm-repo-management-api · 2026-03-20T20:14:44Z

Jenkins build for e24415b7052760edf0e4eb16ee3fb74c975097a1 commit finished as NOT_BUILT
Links: Pipeline Overview / Build artifacts / Test Results

- New generate_summary.py produces a combined summary CSV with per-architecture columns showing per-workflow stats and overall parity metrics.
- Workflow now has a generate-summary job that runs after all per-arch CSVs are generated and uploads the summary as an artifact.
- Default arch order changed to mi355, mi300, mi200.
- Arch input now accepts both commas and spaces as delimiters.
- Fixed upload glob to capture all CSVs regardless of csv_name.
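
Accepting both commas and spaces in the arch input could be handled as in this sketch. The workflow itself may do this in shell; parse_archs is a hypothetical name, and the lowercasing and de-duplication are assumptions added for robustness.

```python
import re

# Default order stated in the commit message above.
DEFAULT_ARCHS = ["mi355", "mi300", "mi200"]


def parse_archs(raw):
    """Split an arch input on commas and/or whitespace, keeping first-seen order."""
    tokens = [t for t in re.split(r"[,\s]+", raw.strip()) if t]
    archs = []
    for token in tokens:
        token = token.lower()  # assumption: arch names are case-insensitive
        if token not in archs:  # assumption: duplicates are dropped
            archs.append(token)
    return archs or list(DEFAULT_ARCHS)


if __name__ == "__main__":
    print(parse_archs("mi355, mi300 mi200"))
```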

- generate_summary.py now outputs both .csv and .md files
- Markdown uses proper tables with section headers per workflow
- Workflow writes the .md to GITHUB_STEP_SUMMARY so it renders directly on the workflow run page
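
Rendering a table on the run page comes down to appending Markdown to the file named by the GITHUB_STEP_SUMMARY environment variable. The sketch below shows one way to do that from Python; the helper names and the column layout are illustrative, not taken from generate_summary.py.

```python
import os


def markdown_table(headers, rows):
    """Render a header row, a separator row, and one line per data row."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines) + "\n"


def append_step_summary(markdown, env=None):
    """Append Markdown to the job summary; return False outside GitHub Actions."""
    env = os.environ if env is None else env
    path = env.get("GITHUB_STEP_SUMMARY")
    if not path:
        return False
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(markdown)
    return True


if __name__ == "__main__":
    # Column names and values here are placeholders.
    append_step_summary(markdown_table(["arch", "passed"], [["mi355", 10]]))
```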


rocm-repo-management-api · 2026-03-20T20:59:21Z

Jenkins build for c93e9c137fc60256ed6dc4775218224b39e60b49 commit finished as NOT_BUILT
Links: Pipeline Overview / Build artifacts / Test Results


- Replace artifacts_only with include_logs checkbox (default: off). When checked, CI log files (.txt) are downloaded and included.
- Move download_testlogs tee log into the output folder so it is always captured in the artifact zip.
- Upload step now captures *.csv, *.log, and *.txt from the output folder, putting everything into one zip per architecture.


rocm-repo-management-api · 2026-03-20T21:58:52Z

Jenkins build for c93e9c137fc60256ed6dc4775218224b39e60b49 commit finished as FAILURE
Links: Pipeline Overview / Build artifacts / Test Results

Workflow now automatically downloads the previous week's CSV to carry forward skip_reason, assignee, comments, and existed_last_week columns:
1. Checks the parity-input GitHub Release for a user-edited CSV
2. Falls back to the last successful parity run's artifact
3. Gracefully skips if neither is available

Also adds a README documenting the weekly parity workflow process.
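
The three-step lookup above is an ordered fallback chain. A minimal sketch of that pattern follows, with each source modelled as a callable that returns a file path or None; the callables and labels are hypothetical stand-ins for the release and artifact downloads.

```python
def resolve_previous_csv(sources):
    """Try each (label, fetch) pair in order and return the first hit.

    Returns (label, path) for the first source that yields a path,
    or (None, None) when every source comes up empty or fails.
    """
    for label, fetch in sources:
        try:
            path = fetch()
        except Exception as exc:  # broad on purpose: any failure means "try the next source"
            print(f"{label}: lookup failed ({exc}), trying next source")
            continue
        if path:
            return label, path
    return None, None


if __name__ == "__main__":
    # Hypothetical stand-ins for the real download steps.
    chain = [
        ("parity-input release", lambda: None),
        ("last successful run artifact", lambda: "prev_week.csv"),
    ]
    print(resolve_previous_csv(chain))
```

Returning (None, None) rather than raising matches the "gracefully skips" behaviour: the caller can proceed without carried-forward columns.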

MI200 tests have moved from the separate rocm-mi200, periodic-rocm-mi200, and inductor-rocm-mi200 workflows into the unified trunk-rocm-sandbox workflow. All three test types (default, distributed, inductor) now use the same workflow and linux-jammy-rocm-py3.10 job prefix.

Generates a self-contained HTML dashboard from per-architecture CSVs with four tabs:
- Summary: per-workflow stats cards with AGREE/DISAGREE percentages
- Skip Reasons: breakdown by category with bar charts per workflow
- Failed Tests: table of all FAILED tests across architectures
- All Tests: filterable/searchable table of skipped, missed, failed, and new tests with pagination

Dashboard is included in the summary artifact for easy download.


rocm-repo-management-api · 2026-03-22T22:07:17Z

Jenkins build for 88759200b8abde043046841f17782e969cfd635e commit finished as NOT_BUILT
Links: Pipeline Overview / Build artifacts / Test Results


- Move baseline_sha workflow input to appear right after sha for better UX
- When both SHAs provided, prefix log files with short SHAs (e.g. 5809e41e_rocm1.txt, a4de454b_rocm1.txt)
- When no baseline, preserve original naming (rocm1.txt, baseline_rocm1.txt)


Exclude MISSED status from total counts in generate_summary.py. Tests only present in ROCm get status_cuda=MISSED in the merged CSV, and since each architecture has different ROCm-only tests, the CUDA totals were incorrectly inflated by varying amounts per architecture.
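
The effect of dropping MISSED from the totals can be seen on a toy merged CSV. This is a sketch only: status_cuda comes from the commit message, while status_rocm, the sample rows, and the helper names are assumptions made for illustration.

```python
import csv
import io
from collections import Counter

# Made-up rows: one test exists only on ROCm, so its CUDA status is MISSED.
SAMPLE = """test_name,status_rocm,status_cuda
test_a,PASSED,PASSED
test_b,SKIPPED,PASSED
test_rocm_only,PASSED,MISSED
"""


def load_rows(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def status_totals(rows, column):
    """Count statuses in one column; the total leaves out MISSED rows."""
    counts = Counter(row[column] for row in rows)
    total = sum(n for status, n in counts.items() if status != "MISSED")
    return counts, total


if __name__ == "__main__":
    rows = load_rows(SAMPLE)
    print(status_totals(rows, "status_cuda"))
    print(status_totals(rows, "status_rocm"))
```

With MISSED excluded, the CUDA total no longer depends on how many ROCm-only tests a given architecture happens to have.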


rocm-repo-management-api · 2026-03-31T18:55:26Z

Jenkins build for 2afdb21c2377908d47346f4da03f34e35bf35250 commit finished as NOT_BUILT
Links: Pipeline Overview / Build artifacts / Test Results

rocm-repo-management-api · 2026-03-31T19:08:45Z

Jenkins build for 74d0e969363a1b2cacfdf3c3ce33ccf18c199370 commit finished as NOT_BUILT
Links: Pipeline Overview / Build artifacts / Test Results

Keep pytorch-unit-test-scripts as the directory name. Remove scripts not used by the parity workflow and remove move-sprint-items.sh. Remaining files: download_testlogs, summarize_xml_testreports.py, generate_summary.py, auto_classify_skip_reasons.py, upload_test_stats.py, upload_stats_lib.py, requirements.txt.

rocm-repo-management-api · 2026-03-31T19:54:30Z

Jenkins build for 74d0e969363a1b2cacfdf3c3ce33ccf18c199370 commit finished as FAILURE
Links: Pipeline Overview / Build artifacts / Test Results

Upstream CI workflows for the same SHA can have different created_at dates when they span midnight, causing create_test_folder to create separate directories. The ls -dt folder selection then picks the wrong one (most recently modified), leading to all-zero results. Fix by adding get_or_create_test_folder() which reuses the first folder for all subsequent downloads, ensuring all artifacts land in one place. Also update mi355 job prefix from linux-jammy-rocm-py3.10 to linux-jammy-rocm-py3.10-mi355 to match current upstream CI job names.
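
The idea behind get_or_create_test_folder() can be sketched as a small cache keyed by SHA, so that a later created_at date never produces a second directory. The signature, the folder naming scheme, and the cache layout below are assumptions; only the function name comes from the commit message.

```python
import os
import tempfile

# Folders already created, keyed by (root, sha). Hypothetical cache layout.
_folders = {}


def get_or_create_test_folder(root, sha, created_at):
    """Return one folder per (root, sha), ignoring later created_at dates."""
    key = (root, sha)
    if key not in _folders:
        # Assumed naming scheme: <YYYY-MM-DD>_<short sha>; the real one may differ.
        folder = os.path.join(root, f"{created_at[:10]}_{sha[:8]}")
        os.makedirs(folder, exist_ok=True)
        _folders[key] = folder
    return _folders[key]


def demo():
    """Two workflow runs for one SHA that straddle midnight share a folder."""
    with tempfile.TemporaryDirectory() as root:
        sha = "c93e9c137fc60256ed6dc4775218224b39e60b49"
        first = get_or_create_test_folder(root, sha, "2026-03-20T23:58:00Z")
        second = get_or_create_test_folder(root, sha, "2026-03-21T00:03:00Z")
        return first == second, sorted(os.listdir(root))


if __name__ == "__main__":
    print(demo())
```

Because only one directory ever exists for the SHA, a most-recently-modified selection such as ls -dt has nothing wrong to pick.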


rocm-repo-management-api · 2026-04-09T15:29:55Z

Jenkins build for 4f726975463f1f5ed41d1299eeffbdbb67907c3d commit finished as SUCCESS
Links: Pipeline Overview / Build artifacts

rocm-repo-management-api · 2026-04-10T14:53:47Z

Jenkins build for f1481642031550c5c461b6ce347a6ac8aaced87b commit is in progress
Links: Pipeline Overview / Build artifacts

ethanwee1 added 5 commits March 20, 2026 19:48

Remove cron schedule, use workflow_dispatch only

56fb72b

Use IFU_GITHUB_TOKEN secret instead of GH_PAT

622b3e5

Rename workflow to parity.yml

a40d9ae

Always pass --ignore_status, remove it as an input

67120c1

ethanwee1 added 5 commits March 20, 2026 20:23

Set artifact retention to 1 day

a1b7438

Add csv_name input to customize output CSV filename

e24415b

Add failed tests table to parity summary

9f5bbcd

Shows test_file, test_class, test_name, workflow, and both statuses for every test marked FAILED on either set, grouped by architecture.

ethanwee1 added 7 commits March 20, 2026 21:08

Add MI355 distributed fallback to periodic-rocm-mi355

0149de6

When trunk workflow has no distributed jobs for MI355 at the given SHA (e.g. PR commits), fall back to periodic-rocm-mi355 workflow. Also switches the job prefix to linux-noble-rocm-py3.12-mi355 when the fallback is used.

Use csv_name input for artifact zip names

2b7ce95

When csv_name is provided (e.g. "march_report"), artifacts are named:
- march_report_mi355 (per-arch CSV)
- march_report_summary (summary)

Without csv_name, defaults to parity-csv-ARCH and parity-summary.
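
The naming rule reads as a two-branch function. The sketch below restates it in Python; artifact_names is a hypothetical helper, and the workflow may well compute these names in YAML expressions or shell instead.

```python
def artifact_names(arch, csv_name=""):
    """Return (per-arch artifact name, summary artifact name)."""
    if csv_name:
        return f"{csv_name}_{arch}", f"{csv_name}_summary"
    # Defaults used when no csv_name input is given.
    return f"parity-csv-{arch}", "parity-summary"


if __name__ == "__main__":
    print(artifact_names("mi355", "march_report"))
    print(artifact_names("mi300"))
```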

Default include_logs to true

22b8f7a

Rename no_rocm/no_cuda inputs to skip_rocm/skip_cuda

c93e9c1

Add artifacts table to job summary page

394d6f0

Link artifacts table to actual artifact download URLs

d85a713

Query the GitHub API after all artifacts are uploaded to resolve real artifact IDs and generate direct download links.
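
One way to resolve artifact IDs is the REST endpoint that lists artifacts for a workflow run, then building the run-page download URL from each ID. The sketch below assumes that approach; the helper names are hypothetical, and the workflow may use gh api or actions/github-script rather than Python.

```python
import json
import urllib.request

API = "https://api.github.com"


def fetch_run_artifacts(repo, run_id, token):
    """Call the 'list workflow run artifacts' endpoint and return decoded JSON."""
    request = urllib.request.Request(
        f"{API}/repos/{repo}/actions/runs/{run_id}/artifacts?per_page=100",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def artifact_links(repo, run_id, payload):
    """Map artifact name -> browser download URL on the run page."""
    return {
        artifact["name"]: f"https://github.com/{repo}/actions/runs/{run_id}/artifacts/{artifact['id']}"
        for artifact in payload.get("artifacts", [])
    }


if __name__ == "__main__":
    # Made-up payload shaped like the API response; the id is not a real artifact.
    sample = {"total_count": 1, "artifacts": [{"id": 111, "name": "parity-summary"}]}
    print(artifact_links("owner/repo", 1, sample))
```

The lookup has to run after every upload has finished, since an artifact has no ID until it exists.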

Show commit SHA as single header above architecture columns

e58b60d

ethanwee1 changed the title ~~Add parity report scripts and workflow~~ [Develop][CI]Add parity report scripts and workflow Mar 22, 2026

ethanwee1 added 5 commits March 22, 2026 21:20

Split status filter into separate set1/set2 dropdowns

9bcb5d8

Dashboard All Tests tab now has two status filters (e.g. "All rocm" and "All cuda") so you can filter by either set independently or combine both to find specific status pairs like SKIPPED+PASSED.
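
The dashboard presumably applies this filter in browser-side JavaScript; the Python sketch below only illustrates the logic of two independent status filters. The column names status_rocm and status_cuda and the "All" sentinel are assumptions.

```python
def filter_rows(rows, rocm_status="All", cuda_status="All"):
    """Keep rows matching both filters; "All" disables a filter."""

    def matches(value, wanted):
        return wanted == "All" or value == wanted

    return [
        row
        for row in rows
        if matches(row["status_rocm"], rocm_status) and matches(row["status_cuda"], cuda_status)
    ]


if __name__ == "__main__":
    rows = [
        {"test_name": "test_a", "status_rocm": "SKIPPED", "status_cuda": "PASSED"},
        {"test_name": "test_b", "status_rocm": "PASSED", "status_cuda": "PASSED"},
    ]
    # Combine both filters to find the SKIPPED+PASSED pair.
    print(filter_rows(rows, rocm_status="SKIPPED", cuda_status="PASSED"))
```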

Add dashboard download link to job summary page

2816308

Dashboard is now a separate artifact with a prominent download link at the top of the summary, above the artifacts table.

ethanwee1 and others added 10 commits March 26, 2026 22:15

Filter out rerun_disabled_tests artifacts that caused duplicate dirs, auto-skip CUDA for baseline_sha

ec6fd49

Download CI log files for baseline SHA (baseline_rocm*.txt, baseline_rocm_dist*.txt, baseline_rocm_inductor*.txt)

15e33f3

Add script to move sprint items from @current-1 to @current

54c6fcf

Moves all GitHub project items assigned to ethanwee1 from the previous sprint to the current sprint on ROCm/projects/18 using the GraphQL API.

Remove unused filter input from parity workflow

558c5d6

Skip malformed XML files instead of crashing during CSV generation

480966c
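
Skipping a malformed report usually means catching the parser's error per file and moving on. The sketch below shows that pattern with the standard library; count_testcases and sample_sources are hypothetical names, and the real summarize_xml_testreports.py may use a different parser or report structure.

```python
import io
import xml.etree.ElementTree as ET


def count_testcases(sources):
    """Map report name -> number of <testcase> elements, skipping bad XML.

    `sources` maps a label to a path or a binary file object.
    """
    counts, skipped = {}, []
    for name, source in sources.items():
        try:
            root = ET.parse(source).getroot()
        except (ET.ParseError, OSError) as exc:
            print(f"skipping malformed report {name}: {exc}")
            skipped.append(name)
            continue
        counts[name] = sum(1 for _ in root.iter("testcase"))
    return counts, skipped


def sample_sources():
    """In-memory stand-ins for report files; the second one is truncated."""
    return {
        "good.xml": io.BytesIO(b"<testsuite><testcase name='test_a'/></testsuite>"),
        "truncated.xml": io.BytesIO(b"<testsuite><testcase name='test_b'"),
    }


if __name__ == "__main__":
    print(count_testcases(sample_sources()))
```

Returning the skipped names alongside the counts lets the caller log or report which files were ignored instead of silently losing them.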

Remove "Number of tests changed from last week" from summary

27d1b4e

This metric is always 0 because --prev_week_csv is never passed in the workflow. It's also misleading since runs target arbitrary commits, not a weekly cadence.

Add triton 3.7 bump auto-classify rule

f7771af

Classifies ~6400 tests with "skipIfRocm: Fails with Triton 3.7" skip messages under the "triton 3.7 bump" category.
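
A rule of this kind can be expressed as a pattern-to-category table checked in order. The sketch below is illustrative only: the skip message and category string come from the commit message, while the rule table shape, the case-insensitive match, and the function name are assumptions rather than the contents of auto_classify_skip_reasons.py.

```python
import re

# Ordered (pattern, category) rules; the first match wins.
RULES = [
    (re.compile(r"skipIfRocm: Fails with Triton 3\.7", re.IGNORECASE), "triton 3.7 bump"),
]


def classify_skip_reason(message, default=""):
    """Return the category of the first matching rule, else `default`."""
    for pattern, category in RULES:
        if pattern.search(message or ""):
            return category
    return default


if __name__ == "__main__":
    print(classify_skip_reason("skipIfRocm: Fails with Triton 3.7"))
```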

Make XML upload opt-in to reduce artifact size ~10x

72e550a

Add include_xml checkbox (default off) to control whether raw XML test reports are included in the artifact zip. XMLs account for ~53% of the uncompressed artifact size and are only needed for debugging.
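
Combining this checkbox with the earlier include_logs one, the upload selection might behave like the sketch below. Which extensions each checkbox gates is an inference from the commit messages (*.csv and *.log always, *.txt with include_logs, *.xml with include_xml), and the helper name and file names are hypothetical.

```python
from fnmatch import fnmatch


def select_upload_files(names, include_logs=False, include_xml=False):
    """Return the file names that would go into the artifact zip."""
    patterns = ["*.csv", "*.log"]  # assumed to be always uploaded
    if include_logs:
        patterns.append("*.txt")  # CI log files
    if include_xml:
        patterns.append("*.xml")  # raw XML test reports
    return [name for name in names if any(fnmatch(name, p) for p in patterns)]


if __name__ == "__main__":
    files = ["parity_mi355.csv", "download.log", "rocm1.txt", "report.xml"]
    print(select_upload_files(files))
    print(select_upload_files(files, include_logs=True, include_xml=True))
```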

ethanwee1 force-pushed the add-parity-scripts-dashboard branch from 0758547 to 72e550a March 31, 2026 18:51

jithunnair-amd changed the title ~~[Develop][CI]Add parity report scripts and workflow~~ [CI]Add parity report scripts and workflow Mar 31, 2026

jithunnair-amd marked this pull request as ready for review March 31, 2026 18:55

Rename pytorch-unit-test-scripts to .automation-scripts

2afdb21

ethanwee1 added 6 commits April 9, 2026 15:02

Add rocm-mi355 fallback for default workflow in download_testlogs

5347891

When trunk workflow has no run for a given SHA (e.g. PR commits), fall back to rocm-mi355 workflow for mi355 default tests. Mirrors the existing periodic/distributed fallback pattern.

Update job prefix when using rocm-mi355 default fallback

917409c

The rocm-mi355 workflow uses linux-noble-rocm-py3.12-mi355 as the job prefix, not linux-jammy-rocm-py3.10-mi355 (trunk). Without this, log downloads fail with "TEST KEY DOES NOT EXIST" when the fallback is used.
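
Since the job prefix has to change together with the workflow, one option is to keep the two as pairs and pick the first workflow that has a run for the SHA. The sketch below assumes that structure; the workflow names and prefixes are the ones quoted in the commit messages, while pick_workflow and the has_run callback are hypothetical.

```python
# (workflow, job prefix) pairs for mi355 default tests, in preference order.
MI355_DEFAULT = [
    ("trunk", "linux-jammy-rocm-py3.10-mi355"),
    ("rocm-mi355", "linux-noble-rocm-py3.12-mi355"),
]


def pick_workflow(candidates, has_run):
    """Return the first (workflow, prefix) whose workflow has a run, else None.

    `has_run` is a callable taking a workflow name; in practice it would
    query CI for the target SHA.
    """
    for workflow, prefix in candidates:
        if has_run(workflow):
            return workflow, prefix
    return None


if __name__ == "__main__":
    # Simulate a PR commit where trunk has no run.
    print(pick_workflow(MI355_DEFAULT, lambda name: name == "rocm-mi355"))
```

Keeping the prefix next to the workflow name avoids the "TEST KEY DOES NOT EXIST" failure mode, because a fallback can never be selected with the wrong prefix.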

Fix mi355 inductor job prefix to match actual CI job names

d64737b

The inductor-rocm-mi355 workflow uses linux-noble-rocm-py3.12-mi355 as the job prefix, not rocm-py3.12-inductor-mi355.

Fix mi300 inductor and navi31 job prefixes to match actual CI

c6fb2d8

- mi300 inductor: rocm-py3.12-inductor-mi300 -> linux-noble-rocm-py3.12-mi300
- navi31 default: linux-jammy-rocm-py3_10 -> linux-jammy-rocm-py3.10-navi31

Add distributed and inductor support for rocm-nightly workflow

4f72697

Remove nightly from auto-exclusion list and add workflow names, job prefixes, and shard counts for distributed and inductor tests. Update workflow input descriptions to reflect the change.

jithunnair-amd added 2 commits April 10, 2026 14:45

Move into .automation_scripts to avoid conflicts with upstream PyTorch

bd31e42

Update for path change

f148164

jithunnair-amd merged commit 79e8877 into develop Apr 10, 2026
0 of 2 checks passed

jithunnair-amd deleted the add-parity-scripts-dashboard branch April 10, 2026 14:53

jithunnair-amd merged 77 commits into develop from add-parity-scripts-dashboard
