
[CI] Determine which GPU tests, if any, can be parallelized, and a strategy to do so #8675

Description

@areusch

#8576 is enabling pytest-xdist for CPU-targeted TVM tests. This is a good first step toward parallelizing the TVM test suite, but the real benefits will come if we can find a way to do the same for tests that target GPUs. It's not yet clear whether that is feasible.

I think:

  • we can probably safely run smaller GPU tests in parallel
  • we don't have a good list of which tests those are, so we don't know how much benefit this will provide
  • we should probably also write more, smaller tests
  • we should probably build tvm.testing decorators to formally identify these tests in code (see the sketch after this list)
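A minimal sketch of what such a decorator could look like. The parallel_gpu_safe name and the gpu_parallel marker are hypothetical, invented here for illustration (the marker would also need registering in pytest.ini); only tvm.testing.requires_gpu is an existing API:

    import pytest
    import tvm.testing

    def parallel_gpu_safe(func):
        """Mark a GPU test as small enough to share the GPU with others."""
        # Keep the existing GPU gating, then add a marker so CI can
        # select just these tests for a parallel pytest-xdist run, e.g.
        #   pytest -m gpu_parallel -n 2
        func = tvm.testing.requires_gpu(func)
        return pytest.mark.gpu_parallel(func)

    @parallel_gpu_safe
    def test_small_gpu_kernel():
        # A deliberately small test body would go here.
        ...

Larger GPU tests would simply stay unmarked and keep running serially, so adoption could be incremental.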

Known unknowns that could break this approach:

  • Are there ordering dependencies between some tests?
  • I think we generate cached testdata for re-use between tests; that needs to be broken out into pytest fixtures (see the sketch after this list)
  • Other driver-level issues we haven't identified yet.
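On the cached-testdata point, a minimal sketch of the kind of conversion needed, assuming a test module today builds shared data at import time (the test_data fixture below is illustrative, not existing TVM code):

    import numpy as np
    import pytest

    # Before: data built once at import time and shared implicitly, e.g.
    #   TEST_DATA = np.random.uniform(size=(1024, 1024))
    # Under pytest-xdist each worker is a separate process, so implicit
    # sharing breaks down; a fixture makes the dependency explicit.

    @pytest.fixture(scope="module")
    def test_data():
        # Built once per module per worker; a fixed seed keeps the
        # workers consistent with one another.
        rng = np.random.default_rng(seed=0)
        return rng.uniform(size=(1024, 1024))

    def test_sum_is_finite(test_data):
        assert np.isfinite(test_data.sum())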

Feel free to create actionable child issues as a result of this. If something is not actionable and requires deliberation, it would be great to post to discuss.tvm.ai instead, since we like to keep GitHub issues limited to those with clear steps to resolve.

@mikepapadim to take a look at this
cc @Mousius @denise-k @driazati @gigiblender @jroesch @leandron
