[Issue]: pytorch unit tests never finish #124

@baryluk

Problem Description

root@002d42c15b02:/var/lib/jenkins# python3 -c 'import torch; print(torch.cuda.is_available())'
True
root@002d42c15b02:/var/lib/jenkins# 
root@002d42c15b02:/var/lib/jenkins/pytorch# PYTORCH_TEST_WITH_ROCM=1 python3 test/run_test.py --verbose \
> --include test_nn test_torch test_cuda test_ops \
> test_unary_ufuncs test_binary_ufuncs test_autograd
/var/lib/jenkins/pytorch/test/run_test.py:18: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  import pkg_resources
Ignoring disabled issues:  ['']
Downloading https://ossci-metrics.s3.amazonaws.com/slow-tests.json to /var/lib/jenkins/pytorch/test/.pytorch-slow-tests.json
Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/pytorch/test/.pytorch-disabled-tests.json
Received 7 tests to prioritize
  test_nn
  test_torch
  test_cuda
  test_ops
  test_unary_ufuncs
  test_binary_ufuncs
  test_autograd
/var/lib/jenkins/pytorch/tools/testing/target_determination/heuristics/previously_failed_in_pr.py:34: UserWarning: No pytorch cache found at /var/lib/jenkins/pytorch/.pytest_cache/v/cache/lastfailed
  warn(
Heuristic PreviouslyFailedInPR identified 0 tests to prioritize (0.00%%)
Heuristic EditedByPR identified 3 tests to prioritize (42.86%%)
High Relevance tests (3):
  test_nn
  test_ops
  test_torch
Unranked Relevance tests (4):
  test_autograd
  test_binary_ufuncs
  test_cuda
  test_unary_ufuncs
Heuristic CorrelatedWithHistoricalFailures identified 7 tests to prioritize (100.00%%)
Probable Relevance tests (7):
  test_binary_ufuncs
  test_autograd
  test_unary_ufuncs
  test_cuda
  test_nn
  test_torch
  test_ops
High Relevance tests (3):
  test_nn
  test_ops
  test_torch
Probable Relevance tests (4):
  test_binary_ufuncs
  test_autograd
  test_unary_ufuncs
  test_cuda
::warning:: Gathered no stats from artifacts for build env pytorch-linux-focal-rocm6.0-py3.9 build env and None test config. Using default build env and default test config instead.
Name: high_relevance
  Parallel tests:
    test_ops 1/6
    test_ops 2/6
    test_ops 3/6
    test_ops 4/6
    test_ops 5/6
    test_ops 6/6
  Serial tests:
    test_nn 1/1
    test_torch 1/1
Name: probable_relevance
  Parallel tests:
    test_binary_ufuncs 1/1
    test_unary_ufuncs 1/1
  Serial tests:
    test_autograd 1/1
    test_cuda 1/1
Name: unranked_relevance
  Parallel tests:
  Serial tests:
Starting test batch 'high_relevance' 7.152557373046875e-07 seconds after initiating testing
With sharding, this batch will run 8 tests
Ignoring disabled issues:  ['']
Running test_ops 1/6 ... [2024-03-09 07:03:08.588933]
Executing ['/opt/conda/envs/py_3.9/bin/python3', '-bb', 'test_ops.py', '--shard-id=0', '--num-shards=6', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2024-03-09 07:03:08.589313]
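
One way to narrow down a run that never finishes is to re-run a single shard by hand under a wall-clock limit. The following is only a minimal sketch, not part of the PyTorch test tooling: `run_with_timeout` and `SHARD_CMD` are hypothetical names, the argument list is a trimmed copy of the "Executing [...]" line above, and the timeout value is an arbitrary assumption.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s, cwd=None):
    """Run cmd and return (finished, returncode).

    If the process is still running after timeout_s seconds,
    subprocess.run kills it and raises TimeoutExpired; in that case
    this helper returns (False, None).
    """
    try:
        proc = subprocess.run(cmd, cwd=cwd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return False, None
    return True, proc.returncode


# Hypothetical: shard command trimmed from the "Executing [...]" log line.
# Only flags that appear in that line are used here.
SHARD_CMD = [
    sys.executable, "-bb", "test_ops.py",
    "--shard-id=0", "--num-shards=6",
    "-v", "--use-pytest", "-x",
]
```

Calling `run_with_timeout(SHARD_CMD, 3600, cwd=...)` from the directory that contains `test_ops.py` (presumably `/var/lib/jenkins/pytorch/test`), with `PYTORCH_TEST_WITH_ROCM=1` still set in the environment, would show whether this one shard completes within an hour. With `-v`, the last test name printed before the timeout is a likely candidate for the hang.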


ROCm Version

ROCm 6.0.0
