@yelite (Contributor) commented on Nov 1, 2022

This PR adds the following features to `python/tvm/meta_schedule/testing/torchbench/run.py`:

  • Integrate with the TVM PyTorch integration to handle boolean tensors and unaligned memory.
  • Deduplicate collected tuning tasks, so that hundreds of subgraphs with similar structure do not produce thousands of tasks.
  • Add an option to cast the model to float32, which is numerically more stable than float16 and prevents inaccurate results from many models.
  • Add an option to choose the search strategy in MetaSchedule.
  • Inspect the output error if the actual output doesn't match the expected output. Also save the actual and expected outputs for further analysis if needed.
  • Save subgraphs and their example inputs for debugging purposes.
  • Print MetaSchedule profiling information at the end of execution.
  • Detach PyTorch tensors before exporting to DLPack.
  • Fix `sys.path` to avoid a conflict with the `benchmarks` package installed by a TorchBench dependency.
  • Trim all command line args passed in, to avoid breaking TorchBench models that depend on args.
  • Empty the CUDA cache before starting the actual benchmark.
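
For readers unfamiliar with two of the items above (task deduplication and argument trimming), here is a minimal, illustrative sketch in plain Python. It is not the code in this PR: the helper names `deduplicate_tasks` and `parse_and_trim_argv`, the `key` callback, and the choice to sum the weights of merged tasks are all assumptions made for the example.

```python
import argparse
import sys
from collections import OrderedDict
from typing import Callable, Hashable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def deduplicate_tasks(
    tasks: Iterable[Tuple[T, int]],
    key: Callable[[T], Hashable],
) -> List[Tuple[T, int]]:
    """Merge (task, weight) pairs that share a key, summing their weights.

    `key` is a hypothetical stand-in for whatever notion of structural
    identity the real script uses; summing weights is also an assumption.
    """
    merged: "OrderedDict[Hashable, Tuple[T, int]]" = OrderedDict()
    for task, weight in tasks:
        k = key(task)
        if k in merged:
            # Keep the first task seen for this key and accumulate its weight.
            first_task, total = merged[k]
            merged[k] = (first_task, total + weight)
        else:
            merged[k] = (task, weight)
    return list(merged.values())


def parse_and_trim_argv(parser: argparse.ArgumentParser) -> argparse.Namespace:
    """Parse sys.argv, then leave only the program name in sys.argv.

    Code that later inspects sys.argv (for example, a model with its own
    argument parser) then sees no leftover options from this script.
    """
    args = parser.parse_args()
    sys.argv = sys.argv[:1]
    return args
```

With this sketch, `deduplicate_tasks([("conv2d_3x3", 1), ("dense_768", 2), ("conv2d_3x3", 4)], key=lambda name: name)` returns `[("conv2d_3x3", 5), ("dense_768", 2)]`: the two structurally identical entries collapse into one task that is tuned once.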

cc: @junrushao @zxybazh

@tvm-bot (Collaborator) commented on Nov 1, 2022

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

@junrushao (Member) left a comment

LGTM!

@junrushao merged commit b98b9f9 into apache:main on Nov 3, 2022
xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 10, 2022
…marking (apache#13255)

xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022