Update LLM export to use PyTorch 2.7 #24549

Merged
kunal-vaishnavi merged 15 commits into main from kvaishnavi/llama-torch-2.7
Apr 26, 2025
@kunal-vaishnavi
Contributor

Description

This PR updates ONNX Runtime's LLM conversion tools to use PyTorch 2.7 and reduces memory usage during export.

Motivation and Context

Importing the whole transformers package with import transformers is slow because of the many namespaces it exposes at the top level. Importing only the required class names is more efficient. In addition, the PyTorch model benchmark includes the deep copy of the inputs in the measured latency, which is unnecessary: the deep copy can be performed before timing starts.
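The benchmarking change can be illustrated with a minimal sketch. This is not the code from this PR: `toy_model`, `time_with_copy_inside`, and `time_with_copy_outside` are hypothetical names, and the toy callable only stands in for a PyTorch forward pass so the example runs with the standard library alone. It assumes the inputs are deep-copied so that each run starts from the same unmodified inputs.

```python
import copy
import time

# The import change is analogous to preferring, for example,
#   from transformers import AutoConfig
# over a bare `import transformers` (shown as a comment so this
# sketch has no third-party dependencies).


def time_with_copy_inside(fn, inputs, runs=10):
    """Average latency where the timed region also pays for deepcopy."""
    start = time.perf_counter()
    for _ in range(runs):
        fn(copy.deepcopy(inputs))
    return (time.perf_counter() - start) / runs


def time_with_copy_outside(fn, inputs, runs=10):
    """Average latency where copies are prepared before timing starts."""
    prepared = [copy.deepcopy(inputs) for _ in range(runs)]
    start = time.perf_counter()
    for batch in prepared:
        fn(batch)
    return (time.perf_counter() - start) / runs


def toy_model(inputs):
    # Hypothetical stand-in for a model forward pass.
    return sum(inputs["input_ids"])


if __name__ == "__main__":
    example_inputs = {"input_ids": list(range(100_000))}
    print("copy inside timed region: ", time_with_copy_inside(toy_model, example_inputs))
    print("copy outside timed region:", time_with_copy_outside(toy_model, example_inputs))
```

Preparing the copies up front trades a little extra memory (one copy per run) for a latency number that reflects only the model call.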

Contributor

@github-actions github-actions bot left a comment

You can commit the suggested changes from lintrunner.

@kunal-vaishnavi kunal-vaishnavi merged commit 8ef3fe6 into main Apr 26, 2025
86 of 88 checks passed
@kunal-vaishnavi kunal-vaishnavi deleted the kvaishnavi/llama-torch-2.7 branch April 26, 2025 00:14
ankitm3k pushed a commit to intel/onnxruntime that referenced this pull request May 12, 2025