Update LLM export to use PyTorch 2.7 by kunal-vaishnavi · Pull Request #24549 · microsoft/onnxruntime

kunal-vaishnavi · 2025-04-25T18:25:20Z

Description

This PR updates ONNX Runtime's LLM conversion tools to use PyTorch 2.7 (https://pytorch.org/blog/pytorch-2-7/) and reduces memory usage during export.

Motivation and Context

Importing the whole `transformers` package with `import transformers` takes a long time because of the many namespaces it exposes at the top level. Importing only the required class names is more efficient. In addition, the PyTorch model benchmark includes the deep copy of the inputs in the measured latency even though it does not need to. The deep copy can be performed before the latency measurement starts.
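The benchmarking point can be illustrated with a minimal sketch. This is not the code from `llama_parity.py`: `time_forward` and `fake_model` are hypothetical names, and the sketch assumes the deep copy exists because the model call may modify its inputs in place. All copies are made before the timer starts, so only the model calls are measured.

```python
import copy
import time


def time_forward(model_fn, inputs, num_runs=5):
    """Return the average latency (seconds) of model_fn(**inputs).

    Hypothetical helper: the inputs are deep-copied once per run
    BEFORE timing starts, so copy cost is excluded from the result.
    """
    per_run_inputs = [copy.deepcopy(inputs) for _ in range(num_runs)]

    start = time.perf_counter()
    for run_inputs in per_run_inputs:
        model_fn(**run_inputs)
    elapsed = time.perf_counter() - start

    return elapsed / num_runs


def fake_model(input_ids, past_key_values):
    # Hypothetical stand-in for a PyTorch forward pass that
    # mutates one of its inputs in place (assumption for this sketch).
    past_key_values.append(len(input_ids))
    return sum(input_ids)


inputs = {"input_ids": [1, 2, 3], "past_key_values": []}
avg_latency = time_forward(fake_model, inputs)
```

Because each run receives its own copy, the original `inputs` dictionary is left untouched and can be reused afterwards, for example to feed the same inputs to the ONNX model in a parity check.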

onnxruntime/python/tools/transformers/models/llama/llama_parity.py

github-actions

You can commit the suggested changes from lintrunner.


kunal-vaishnavi added 10 commits April 25, 2025 11:00

Update convert_to_onnx.py

0f41851

Update llama_parity.py

f2aa6fd

Update requirements.txt

dcd6259

Update __init__.py

b5c5304

Update cache_helper.py

0930788

Update onnx_export_errors.py

b4e06a9

Update onnx_export_serialization.py

68d6652

Update patch_inputs.py

6380280

Update patch_transformers.py

ae7ecc1

Update onnx_export_errors.py

821fb3d

github-advanced-security bot found potential problems Apr 25, 2025

onnxruntime/python/tools/transformers/models/llama/llama_parity.py (fixed)

kunal-vaishnavi and others added 4 commits April 25, 2025 11:49

Update llama_parity.py

8736158

Update requirements.txt

1ce4d04

Add changes suggested by linter

9182169

Fix importing transformers version

311c0ae

github-actions bot reviewed Apr 25, 2025

onnxruntime/python/tools/transformers/models/llama/llama_parity.py (resolved)

Separate imports from transformers

e4d81a9

tianleiwu approved these changes Apr 25, 2025

kunal-vaishnavi merged commit 8ef3fe6 into main Apr 26, 2025
86 of 88 checks passed

kunal-vaishnavi deleted the kvaishnavi/llama-torch-2.7 branch April 26, 2025 00:14

kunal-vaishnavi merged 15 commits into main from kvaishnavi/llama-torch-2.7
