[Executorch][LLM] Use caching allocator for runner #15730
Open
kimishpatel wants to merge 77 commits into main from gh/kimishpatel/213/head
+20
−3
77 commits
3f789b8 [Executorch] parallelize op_choose_qparams (kimishpatel)
08dd980 [Executorch] Add simd path for op quantize (kimishpatel)
27fc8b1 [Executorch] Add multithreading for op_quantize (kimishpatel)
ae61ab4 Reduce allocation overhead in quantized sdpa (kimishpatel)
ea16e15 [Executorch] Introduce caching cpu memory allocator (kimishpatel)
c3ed4b2 Update base for Update on "[Executorch] Introduce caching cpu memory … (kimishpatel)
08ab552 Update on "[Executorch] Introduce caching cpu memory allocator" (kimishpatel)
dbf63cc Update base for Update on "[Executorch] Introduce caching cpu memory … (kimishpatel)
f9ce984 Update on "[Executorch] Introduce caching cpu memory allocator" (kimishpatel)
86c7c4b Update base for Update on "[Executorch] Introduce caching cpu memory … (kimishpatel)
0c23c32 Update on "[Executorch] Introduce caching cpu memory allocator" (kimishpatel)
68d76d3 Update base for Update on "[Executorch] Introduce caching cpu memory … (kimishpatel)
79bb135 Update on "[Executorch] Introduce caching cpu memory allocator" (kimishpatel)
351a400 [Executorch] Use temp allocator for allocating scratch memory (kimishpatel)
b4fdc22 [Executorch] Make module constructors uniform across (kimishpatel)
00fffa1 [Executorch][LLM] Use caching allocator for runner (kimishpatel)
daca5e0 Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
5cecbfc Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
30c6fba Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
e09bcd6 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
e73b365 Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
356ec2f Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
f12869c Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
1f59722 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
7f9288a Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
2aaf193 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
3efee70 Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
e91d367 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
75900d0 Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
7784291 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
ca1757a Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
10c67dc Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
a4912c5 Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
a7be4da Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
39cd25d Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
cc6beb5 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
5bce956 Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
9b35c78 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
5df2408 Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
4db1a94 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
6a0d471 Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
ea7c837 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
0bf3b2e Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
b340181 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
d83b4a9 Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
af57723 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
a1f687f Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
e4845c5 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
2d79945 Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
1d85984 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
365be54 Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
559d0d3 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
ba27007 Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
5198114 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
20854fc Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
c2bbfbd Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
36cce27 Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
90d3d57 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
834171f Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
be88d80 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
bae4829 Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
4082b28 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
71cc532 Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
54f9381 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
230cd24 Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
494bbd5 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
997b5e2 Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
4092750 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
7590e9c Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
4e0b339 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
f06f5ba Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
d63ffbd Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
e22cb35 Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
7608f53 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
fed9aea Merge branch 'main' into gh/kimishpatel/213/head (kimishpatel)
251b270 Update base for Update on "[Executorch][LLM] Use caching allocator fo… (kimishpatel)
3cd0176 Update on "[Executorch][LLM] Use caching allocator for runner" (kimishpatel)
The hardcoded value of 10MB for the caching allocator size should be documented or made configurable. According to the PR description, this improves performance by 6% on iOS for SDPA op temp allocations, but different models or use cases may benefit from different cache sizes. Consider: