Add LLaMA end-to-end benchmarking #19985

Merged
kunal-vaishnavi merged 18 commits into microsoft:main from kunal-vaishnavi:kvaishnavi/llama-e2e
Mar 22, 2024
Conversation

@kunal-vaishnavi
Contributor

Description

This PR adds a benchmarking script to measure end-to-end performance and saves the results in a CSV file.

Motivation and Context

With this PR, end-to-end performance can be easily measured for many large language models such as LLaMA-2. The performance numbers for LLaMA-2 are located [here](https://github.com/microsoft/onnxruntime-inference-examples/tree/main/python/models/llama).

```python
    )
    all_csv_metrics.append(csv_metrics)

except:  # noqa: E722
```
Check notice: Code scanning / CodeQL

Except block handles 'BaseException': the except block directly handles BaseException.
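One common way to resolve this class of CodeQL notice is to catch `Exception` instead of using a bare `except:`, so that `KeyboardInterrupt` and `SystemExit` (which derive from `BaseException`, not `Exception`) still propagate. The sketch below mirrors the variable names in the snippet above, but the surrounding loop and per-item work are assumptions:

```python
all_csv_metrics = []


def collect_metrics(batches):
    """Append per-batch metrics, skipping batches that fail to process."""
    for batch in batches:
        try:
            # Placeholder for the real per-batch benchmarking work.
            csv_metrics = dict(batch)
            all_csv_metrics.append(csv_metrics)
        except Exception as e:  # narrower than a bare `except:`
            print(f"Skipping batch due to error: {e}")
    return all_csv_metrics
```

With this pattern a Ctrl-C during a long benchmark run still aborts the script, while per-batch failures are logged and skipped.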
kunal-vaishnavi added a commit to microsoft/onnxruntime-inference-examples that referenced this pull request Mar 20, 2024
### Description

This PR updates the end-to-end benchmarking numbers for LLaMA-2.

### Motivation and Context

The numbers were gathered with the end-to-end benchmarking script in [this PR](microsoft/onnxruntime#19985).
@kunal-vaishnavi kunal-vaishnavi merged commit 6238e9c into microsoft:main Mar 22, 2024
YUNQIUGUO pushed a commit that referenced this pull request Mar 25, 2024
TedThemistokleous pushed a commit to TedThemistokleous/onnxruntime that referenced this pull request May 7, 2024