[CI] Add parity report scripts and workflow #3094
Move unit test parity scripts from frameworks-internal into ROCm/pytorch. Includes download_testlogs, summarize_xml_testreports, parity.sh, and supporting utilities. Also adds a daily GitHub Actions workflow that generates parity CSVs for MI200/MI300/MI355 and deploys a skip reason dashboard to GitHub Pages.
Replace hardcoded workflow with full configurability: SHA, PR ID, architecture list, exclude flags, ignore_status, artifacts_only, no_rocm, no_cuda, set1/set2 names, and status filter. Remove dashboard/Pages -- workflow now just generates and uploads CSVs as downloadable artifacts.
Jenkins build for 67120c11565d67066bccdc1232b8b3f11a6edbd6 commit finished as NOT_BUILT
Jenkins build for e24415b7052760edf0e4eb16ee3fb74c975097a1 commit finished as NOT_BUILT
- New generate_summary.py produces a combined summary CSV with per-architecture columns showing per-workflow stats and overall parity metrics.
- Workflow now has a generate-summary job that runs after all per-arch CSVs are generated and uploads the summary as an artifact.
- Default arch order changed to mi355, mi300, mi200.
- Arch input now accepts both commas and spaces as delimiters.
- Fixed upload glob to capture all CSVs regardless of csv_name.
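As an illustration of the delimiter handling described above, here is a minimal sketch of parsing an arch string that may use commas, spaces, or both. The function name `parse_archs` and the fallback to the default order on empty input are assumptions for illustration, not code taken from the workflow.

```python
import re

# Default order named in the commit message above.
DEFAULT_ARCHS = ["mi355", "mi300", "mi200"]


def parse_archs(raw):
    """Split an arch string on commas and/or whitespace.

    Falls back to the default order when nothing usable is given
    (the fallback is an assumption of this sketch).
    """
    archs = [a for a in re.split(r"[,\s]+", raw.strip()) if a]
    return archs or list(DEFAULT_ARCHS)


print(parse_archs("mi355, mi300 mi200"))
```

Splitting on one character class keeps mixed input such as `mi355, mi300 mi200` from producing empty entries.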
- generate_summary.py now outputs both .csv and .md files.
- Markdown uses proper tables with section headers per workflow.
- Workflow writes the .md to GITHUB_STEP_SUMMARY so it renders directly on the workflow run page.
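For context, GITHUB_STEP_SUMMARY is an environment variable that GitHub Actions sets to the path of a file, and Markdown appended to that file is rendered on the run page. The sketch below shows the idea in Python; the helper name `publish_step_summary` is hypothetical, and the actual workflow may simply append the file in a shell step.

```python
import os


def publish_step_summary(md_path):
    """Append a Markdown file to the GitHub Actions step summary.

    Returns False when GITHUB_STEP_SUMMARY is not set (e.g. a local run).
    """
    summary_path = os.environ.get("GITHUB_STEP_SUMMARY")
    if not summary_path:
        return False
    with open(md_path, encoding="utf-8") as src, \
            open(summary_path, "a", encoding="utf-8") as dst:
        dst.write(src.read())
        dst.write("\n")
    return True
```

Opening the summary file in append mode matters because other steps in the same job may already have written to it.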
Shows test_file, test_class, test_name, workflow, and both statuses for every test marked FAILED on either set, grouped by architecture.
Jenkins build for c93e9c137fc60256ed6dc4775218224b39e60b49 commit finished as NOT_BUILT
When trunk workflow has no distributed jobs for MI355 at the given SHA (e.g. PR commits), fall back to periodic-rocm-mi355 workflow. Also switches the job prefix to linux-noble-rocm-py3.12-mi355 when the fallback is used.
When csv_name is provided (e.g. "march_report"), artifacts are named:
- march_report_mi355 (per-arch CSV)
- march_report_summary (summary)

Without csv_name, defaults to parity-csv-ARCH and parity-summary.
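The naming rule above can be sketched as a small function. The name `artifact_names` and its return shape are assumptions for illustration; the workflow itself presumably builds these names in YAML expressions or shell.

```python
def artifact_names(archs, csv_name=""):
    """Return ({arch: per-arch artifact name}, summary artifact name)."""
    if csv_name:
        per_arch = {arch: f"{csv_name}_{arch}" for arch in archs}
        summary = f"{csv_name}_summary"
    else:
        # Defaults named in the commit message: parity-csv-ARCH and parity-summary.
        per_arch = {arch: f"parity-csv-{arch}" for arch in archs}
        summary = "parity-summary"
    return per_arch, summary


print(artifact_names(["mi355"], "march_report"))
```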
- Replace artifacts_only with include_logs checkbox (default: off). When checked, CI log files (.txt) are downloaded and included.
- Move download_testlogs tee log into the output folder so it is always captured in the artifact zip.
- Upload step now captures *.csv, *.log, and *.txt from the output folder, putting everything into one zip per architecture.
Query the GitHub API after all artifacts are uploaded to resolve real artifact IDs and generate direct download links.
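A rough sketch of the link-building step, assuming the payload comes from the GitHub REST endpoint that lists artifacts for a workflow run (`GET /repos/{owner}/{repo}/actions/runs/{run_id}/artifacts`), whose response carries an `artifacts` list with `id` and `name` fields. The function name, the sample payload, and the IDs are illustrative only.

```python
def artifact_links(payload, repo, run_id):
    """Map artifact name -> browser download URL for one workflow run.

    `payload` is the decoded JSON of the "list workflow run artifacts"
    response; fetching it is left out of this sketch.
    """
    base = f"https://github.com/{repo}/actions/runs/{run_id}/artifacts"
    return {a["name"]: f"{base}/{a['id']}" for a in payload.get("artifacts", [])}


# Made-up payload and IDs, for illustration only.
sample = {"total_count": 1, "artifacts": [{"id": 111, "name": "parity-summary"}]}
print(artifact_links(sample, "ROCm/pytorch", 222))
```

The lookup has to run after every upload has finished, since artifact IDs are only assigned once an upload completes.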
Jenkins build for c93e9c137fc60256ed6dc4775218224b39e60b49 commit finished as FAILURE
Workflow now automatically downloads the previous week's CSV to carry forward skip_reason, assignee, comments, and existed_last_week columns:
1. Checks the parity-input GitHub Release for a user-edited CSV
2. Falls back to the last successful parity run's artifact
3. Gracefully skips if neither is available

Also adds a README documenting the weekly parity workflow process.
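The three-step lookup above follows a "first source that succeeds wins" pattern. A minimal sketch, assuming each source is wrapped in a callable that returns a local file path or None; the names `resolve_previous_csv`, `from_release`, and `from_last_run` are hypothetical stand-ins for the real download steps.

```python
from typing import Callable, Optional, Sequence


def resolve_previous_csv(
    sources: Sequence[Callable[[], Optional[str]]],
) -> Optional[str]:
    """Try each source in order and return the first CSV path found.

    A source that raises is treated the same as one that finds nothing,
    so a missing release or artifact never fails the run.
    """
    for fetch in sources:
        try:
            path = fetch()
        except Exception:
            path = None
        if path:
            return path
    return None  # caller skips the carry-forward step


def from_release():
    # Hypothetical: no user-edited CSV on the parity-input release.
    return None


def from_last_run():
    # Hypothetical: artifact of the last successful parity run.
    return "prev/parity.csv"


print(resolve_previous_csv([from_release, from_last_run]))
```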
MI200 tests have moved from separate rocm-mi200/periodic-rocm-mi200/inductor-rocm-mi200 workflows into the unified trunk-rocm-sandbox workflow. All three test types (default, distributed, inductor) now use the same workflow and linux-jammy-rocm-py3.10 job prefix.
Generates a self-contained HTML dashboard from per-architecture CSVs with four tabs:
- Summary: per-workflow stats cards with AGREE/DISAGREE percentages
- Skip Reasons: breakdown by category with bar charts per workflow
- Failed Tests: table of all FAILED tests across architectures
- All Tests: filterable/searchable table of skipped, missed, failed, and new tests with pagination

Dashboard is included in the summary artifact for easy download.
Dashboard All Tests tab now has two status filters (e.g. "All rocm" and "All cuda") so you can filter by either set independently or combine both to find specific status pairs like SKIPPED+PASSED.
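The dashboard filters run in the generated HTML, but the logic can be sketched in Python. This assumes each row has `status_rocm` and `status_cuda` columns (the real column names depend on the configured set1/set2 names), and that `None` stands for the "All" choice of a filter.

```python
def filter_rows(rows, status_rocm=None, status_cuda=None):
    """Keep rows matching both filters; None means no filtering on that set."""
    return [
        row for row in rows
        if (status_rocm is None or row["status_rocm"] == status_rocm)
        and (status_cuda is None or row["status_cuda"] == status_cuda)
    ]


# Illustrative rows, not real test results.
rows = [
    {"test_name": "test_a", "status_rocm": "SKIPPED", "status_cuda": "PASSED"},
    {"test_name": "test_b", "status_rocm": "PASSED", "status_cuda": "PASSED"},
    {"test_name": "test_c", "status_rocm": "SKIPPED", "status_cuda": "SKIPPED"},
]
print(filter_rows(rows, status_rocm="SKIPPED", status_cuda="PASSED"))
```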
Dashboard is now a separate artifact with a prominent download link at the top of the summary, above the artifacts table.
Jenkins build for 88759200b8abde043046841f17782e969cfd635e commit finished as NOT_BUILT
… auto-skip CUDA for baseline_sha
…rocm_dist*.txt, baseline_rocm_inductor*.txt)
- Move baseline_sha workflow input to appear right after sha for better UX
- When both SHAs provided, prefix log files with short SHAs (e.g. 5809e41e_rocm1.txt, a4de454b_rocm1.txt)
- When no baseline, preserve original naming (rocm1.txt, baseline_rocm1.txt)
Moves all GitHub project items assigned to ethanwee1 from the previous sprint to the current sprint on ROCm/projects/18 using the GraphQL API.
Exclude MISSED status from total counts in generate_summary.py. Tests only present in ROCm get status_cuda=MISSED in the merged CSV, and since each architecture has different ROCm-only tests, the CUDA totals were incorrectly inflated by varying amounts per architecture.
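To make the fix concrete, here is a small sketch of counting statuses for one set while leaving MISSED rows out of the total. The function name and the sample rows are illustrative; only the `status_cuda` column name and the MISSED status come from the commit message.

```python
from collections import Counter


def count_statuses(rows, column):
    """Return (per-status counts, total excluding MISSED) for one status column."""
    counts = Counter(row[column] for row in rows)
    total = sum(n for status, n in counts.items() if status != "MISSED")
    return counts, total


# Illustrative merged rows: the last test exists only on ROCm.
merged = [
    {"status_cuda": "PASSED"},
    {"status_cuda": "SKIPPED"},
    {"status_cuda": "MISSED"},
]
counts, total = count_statuses(merged, "status_cuda")
print(counts["MISSED"], total)
```

With MISSED excluded, the CUDA total no longer depends on how many ROCm-only tests a given architecture happens to have.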
This metric is always 0 because --prev_week_csv is never passed in the workflow. It's also misleading since runs target arbitrary commits, not a weekly cadence.
Classifies ~6400 tests with "skipIfRocm: Fails with Triton 3.7" skip messages under the "triton 3.7 bump" category.
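A minimal sketch of such a rule, assuming classification is a pattern match on the skip message. The function name and the regular expression are assumptions; the actual rules live in auto_classify_skip_reasons.py and may be structured differently.

```python
import re

# Matches skip messages like "skipIfRocm: Fails with Triton 3.7".
TRITON_37 = re.compile(r"skipIfRocm:\s*Fails with Triton 3\.7", re.IGNORECASE)


def classify_skip_reason(message):
    """Return the category for a skip message, or None if no rule matches."""
    if message and TRITON_37.search(message):
        return "triton 3.7 bump"
    return None


print(classify_skip_reason("skipIfRocm: Fails with Triton 3.7"))
```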
Add include_xml checkbox (default off) to control whether raw XML test reports are included in the artifact zip. XMLs account for ~53% of the uncompressed artifact size and are only needed for debugging.
Force-pushed from 0758547 to 72e550a
Jenkins build for 2afdb21c2377908d47346f4da03f34e35bf35250 commit finished as NOT_BUILT
Jenkins build for 74d0e969363a1b2cacfdf3c3ce33ccf18c199370 commit finished as NOT_BUILT
Keep pytorch-unit-test-scripts as the directory name. Remove scripts not used by the parity workflow and remove move-sprint-items.sh. Remaining files: download_testlogs, summarize_xml_testreports.py, generate_summary.py, auto_classify_skip_reasons.py, upload_test_stats.py, upload_stats_lib.py, requirements.txt.
Jenkins build for 74d0e969363a1b2cacfdf3c3ce33ccf18c199370 commit finished as FAILURE
Upstream CI workflows for the same SHA can have different created_at dates when they span midnight, causing create_test_folder to create separate directories. The ls -dt folder selection then picks the wrong one (most recently modified), leading to all-zero results. Fix by adding get_or_create_test_folder() which reuses the first folder for all subsequent downloads, ensuring all artifacts land in one place. Also update mi355 job prefix from linux-jammy-rocm-py3.10 to linux-jammy-rocm-py3.10-mi355 to match current upstream CI job names.
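The reuse idea can be sketched as follows. Only the name get_or_create_test_folder comes from the commit message; the signature, the folder naming scheme, and the module-level cache are assumptions, and the real helper in download_testlogs may look quite different.

```python
import os

# Cache of SHA -> folder, so later downloads reuse the first folder created.
_folder_by_sha = {}


def get_or_create_test_folder(root, sha, created_at):
    """Return one folder per SHA, ignoring created_at after the first call.

    The "<date>_<short sha>" naming is hypothetical.
    """
    if sha not in _folder_by_sha:
        folder = os.path.join(root, f"{created_at[:10]}_{sha[:8]}")
        os.makedirs(folder, exist_ok=True)
        _folder_by_sha[sha] = folder
    return _folder_by_sha[sha]
```

Two workflow runs for the same SHA with created_at values on either side of midnight then resolve to the same directory, so nothing depends on `ls -dt` ordering.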
When trunk workflow has no run for a given SHA (e.g. PR commits), fall back to rocm-mi355 workflow for mi355 default tests. Mirrors the existing periodic/distributed fallback pattern.
The rocm-mi355 workflow uses linux-noble-rocm-py3.12-mi355 as the job prefix, not linux-jammy-rocm-py3.10-mi355 (trunk). Without this, log downloads fail with "TEST KEY DOES NOT EXIST" when the fallback is used.
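Because the job prefix depends on which workflow ends up being used, it helps to keep the two together. A minimal sketch for mi355 default tests, assuming the primary workflow is literally named `trunk` and that `has_run` is a caller-supplied check for whether a workflow ran at the SHA; both are assumptions of this example.

```python
# (workflow name, job prefix) pairs in fallback order, per the commit messages.
MI355_DEFAULT_CANDIDATES = [
    ("trunk", "linux-jammy-rocm-py3.10-mi355"),
    ("rocm-mi355", "linux-noble-rocm-py3.12-mi355"),
]


def pick_workflow(candidates, has_run):
    """Return the first (workflow, job_prefix) whose workflow has a run, else None."""
    for workflow, prefix in candidates:
        if has_run(workflow):
            return workflow, prefix
    return None


# Example: a PR commit where trunk never ran.
print(pick_workflow(MI355_DEFAULT_CANDIDATES, lambda wf: wf == "rocm-mi355"))
```

Returning the prefix alongside the workflow avoids the "TEST KEY DOES NOT EXIST" mismatch, since the two values cannot drift apart.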
The inductor-rocm-mi355 workflow uses linux-noble-rocm-py3.12-mi355 as the job prefix, not rocm-py3.12-inductor-mi355.
mi300 inductor: rocm-py3.12-inductor-mi300 -> linux-noble-rocm-py3.12-mi300
navi31 default: linux-jammy-rocm-py3_10 -> linux-jammy-rocm-py3.10-navi31
Remove nightly from auto-exclusion list and add workflow names, job prefixes, and shard counts for distributed and inductor tests. Update workflow input descriptions to reflect the change.
Jenkins build for 4f726975463f1f5ed41d1299eeffbdbb67907c3d commit finished as SUCCESS
Jenkins build for f1481642031550c5c461b6ce347a6ac8aaced87b commit is in progress
Summary
- pytorch-unit-test-scripts/ directory with all parity scripts (download_testlogs, summarize_xml_testreports, parity.sh, and supporting utilities)
- parity.yml GitHub Actions workflow that can be manually triggered to download CI artifacts and generate parity CSVs
- download_testlogs and summarize_xml_testreports.py flags are exposed as workflow inputs (SHA, PR ID, arch, exclude flags, filter, set names, etc.)

Setup
Requires these repository secrets:
- IFU_GITHUB_TOKEN (already exists)
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY

Test plan
gh workflow run parity.yml --ref add-parity-scripts-dashboard
https://github.com/ethanwee1/pytorch/actions/runs/23413634454