Draft
288 commits
5f657d3
formatting fix
cavusmustafa Sep 5, 2025
eafcc33
formatting fix
cavusmustafa Sep 5, 2025
1763b99
formatting fix
cavusmustafa Sep 5, 2025
4863826
formatting fix
cavusmustafa Sep 5, 2025
e24072f
formatting fix
cavusmustafa Sep 5, 2025
b9bb5f0
formatting fix
cavusmustafa Sep 5, 2025
291dcd9
formatting fix
cavusmustafa Sep 5, 2025
c8ea777
use new transformations
anzr299 Sep 6, 2025
a6b605f
add comment for manual MP allocation
anzr299 Sep 6, 2025
9614fc4
remove nncf_compression from export llama lib
anzr299 Sep 6, 2025
45007cf
change pt2e quantize flag to use openvino_4wo instead of openvino_8da…
anzr299 Sep 6, 2025
9d49414
follow up to last commit
anzr299 Sep 6, 2025
d6727cf
update quantizer lib with openvino_4wo
anzr299 Sep 6, 2025
4a0a781
split qspec function into 2 parts; 1 for WC and other for PTQ qspecs
anzr299 Sep 6, 2025
f6a1ee3
micro fix
anzr299 Sep 8, 2025
d285fcc
update mixed precision layers for higher accuracy. Change INT4 mode t…
anzr299 Sep 8, 2025
4e66df1
Apply suggestions from code review
anzr299 Sep 8, 2025
e850e41
Review changes
anzr299 Sep 8, 2025
204043f
review changes in quantizer
anzr299 Sep 8, 2025
ae6b089
revert extra args changes
anzr299 Sep 8, 2025
a6f036c
Merge branch 'openvino_llama_support' of https://github.com/anzr299/e…
anzr299 Sep 9, 2025
2de5693
precommit fixes
anzr299 Sep 9, 2025
0e10f28
revert _calculate_qparams back to calculate_qparams
anzr299 Sep 9, 2025
05f5a92
remove manual ignored nodes
anzr299 Sep 10, 2025
fbe0e21
add ratio to quantizer initialization
anzr299 Sep 10, 2025
6bff1cd
Update export_llama_lib.py
anzr299 Sep 11, 2025
d744ae9
Update quantizer_lib.py
anzr299 Sep 11, 2025
21c43fe
Merge pull request #9 from anzr299/an/ovquantizer
suryasidd Sep 11, 2025
b874204
Updated NNCF commit id
suryasidd Sep 11, 2025
08280ed
Merge branch 'main' into openvino_llama_support
suryasidd Sep 11, 2025
35f1d84
Update README.md
cavusmustafa Sep 11, 2025
41ac36a
openvino llama export configuration - initial
cavusmustafa Sep 11, 2025
4426541
Update README.md
cavusmustafa Sep 11, 2025
6b936c5
Update README.md
cavusmustafa Sep 11, 2025
08461ec
updated ov llama config file
cavusmustafa Sep 11, 2025
be85af8
Update README.md
cavusmustafa Sep 11, 2025
bba4a01
Update README.md
cavusmustafa Sep 11, 2025
1421921
Update README.md with quantization paragraph
anzr299 Sep 12, 2025
cf0e71c
Merge pull request #10 from anzr299/patch-3
cavusmustafa Sep 15, 2025
f050eea
formatting fix
cavusmustafa Sep 15, 2025
4bfdca9
Update README.md
cavusmustafa Sep 15, 2025
16aba1b
Update non_cpu_backends.md for OpenVINO instructions
cavusmustafa Sep 16, 2025
155529f
Update llama instructions link for OpenVINO backend
cavusmustafa Sep 16, 2025
5875aa8
Remove OpenVINO from non_cpu_backends.md
cavusmustafa Sep 16, 2025
2630fd6
Update llama instructions for OpenVINO backend
cavusmustafa Sep 16, 2025
6d0cbc5
Removed the comma which was added by mistake
cavusmustafa Sep 16, 2025
3fbefec
Added NPU in choices
suryasidd Sep 16, 2025
c97bd09
Merge branch 'main' into openvino_llama_support
suryasidd Sep 16, 2025
12e51c7
Fixed ref links
suryasidd Sep 16, 2025
d3d3ae0
Merge branch 'main' into openvino_llama_support
suryasidd Sep 17, 2025
72331f5
Added Remove clone ops transformation to OpenVINO backend
suryasidd Sep 17, 2025
8016165
Fixed variable names
suryasidd Sep 17, 2025
f0d9fc7
Added extended support list for openvino backend
cavusmustafa Sep 17, 2025
9b41c28
formatting fix
cavusmustafa Sep 17, 2025
e751726
formatting fix
cavusmustafa Sep 17, 2025
1736571
Merge pull request #11 from cavusmustafa/remove_clone_ops
cavusmustafa Sep 17, 2025
8106204
Added DimorderOpsRevertPass to Openvino backend
suryasidd Sep 30, 2025
04ca3f3
Merge remote-tracking branch 'cavus/main' into openvino_llama_support
suryasidd Sep 30, 2025
62f74a8
Merge branch 'main' into openvino_llama_support
suryasidd Sep 30, 2025
d95143e
refactor:(samsung backend): replace pkg_resources with importlib.reso…
onuralpszr Oct 1, 2025
eaf0e17
Fixed linter issues
suryasidd Oct 1, 2025
15f5e23
Merge branch 'main' into openvino_llama_support
suryasidd Oct 1, 2025
19be2a3
Try to get nightly wheel build work with qnn (#14633)
cccclai Oct 1, 2025
7ed9266
Move to ProxyValue instead of FakeTensor weights.
hsharma35 Oct 1, 2025
a4ac70d
Disable nxp tests (#14730)
abhinaykukkadapu Oct 1, 2025
649f92d
Arm backend: Correct type annotations in aot_arm_compiler (#14627)
martinlsm Oct 1, 2025
871fe39
Arm backend: Update full quantization annotation (#14585)
oscarandersson8218 Oct 1, 2025
0081bef
Arm backend: Add compile spec factories (#14376)
Erik-Lundell Oct 1, 2025
0cd8256
Arm backend: Add docstrings for operator_support/convolution_support.…
Sebastian-Larsson Oct 1, 2025
96dfa9c
Add pybindings for bpte and ptd file
lucylq Oct 1, 2025
b1309e7
Aoti support multi method (#14715)
larryliu0820 Oct 1, 2025
426b701
Arm backend: Backend test TOSA FP, INT and Ethos-U55/U85 (#14653)
zingo Oct 1, 2025
d4f208d
Android set different maven package names of flavors (#14674)
kirklandsign Oct 2, 2025
e608a21
[Backend Tester] Update README (#14739)
GregoryComer Oct 2, 2025
fb66fb3
NXP Backend: Add codeowner for the NXP Backend (#14723)
robert-kalmar Oct 2, 2025
baaaa86
Add transposed convolution
DrJessop Oct 2, 2025
9ab5592
support qnn mean (dim=None) (#14675)
cccclai Oct 2, 2025
f24351a
Update mul int16 test
3l1 Oct 2, 2025
499ce50
Arm backend: Add VGF tests to StableDiffusion module tests (#14655)
YufengShi-dudu Oct 2, 2025
edf6927
NXP backend: Improve Neutron targets handling (#14718)
StrycekSimon Oct 2, 2025
0145604
Arm Backend: Add tests for stack.default (#14623)
agrima1304 Oct 2, 2025
4372a14
Fix const prop pass when a const prop tensor has zero stride, make it…
abhinaykukkadapu Oct 2, 2025
3b358d5
Merge branch 'main' into openvino_llama_support
suryasidd Oct 2, 2025
0882c9b
Qualcomm AI Engine Direct - GA Static Gemma-2b-instruct (#14459)
DannyYuyang-quic Oct 2, 2025
deb42f2
update llama export DS specs to be more accurate.
laithsakka Oct 2, 2025
19258d2
update tokenizer pin (#14751)
JacobSzwejbka Oct 2, 2025
a1652f9
Fix pyproject.toml license classifier deprecation (#14592)
tmi Oct 2, 2025
53ccfd0
Fix cuda export test failures from #14715 (#14753)
larryliu0820 Oct 2, 2025
c997fe4
Remove explicit device arguments
navsud Oct 3, 2025
54bfd72
Fix Wav2Vec Replace Pass Bug
DrJessop Oct 3, 2025
822a711
Update addmm int16 for Ethos-U85
3l1 Oct 3, 2025
e652746
Use FusedMovingAvgObsFakeQuantize instead of FakeQuantize for faster QAT
navsud Oct 3, 2025
70ea661
Add Phi4 test and fix regex parsing.
shoumikhin Oct 3, 2025
05799c9
NXP backend: added aten.sub operator support (#14514)
novak-vaclav Oct 3, 2025
3557edf
Update MTK tool versions in documents (#14772)
neuropilot-captain Oct 3, 2025
c44c541
Runner support for multiple ptd files (#14758)
pytorchbot Oct 3, 2025
4d681cb
JNI support for multiple ptd files (#14769)
pytorchbot Oct 3, 2025
7116e0a
Tag mutated buffer for AOTI cuda partitioner (#14783)
larryliu0820 Oct 3, 2025
b021fd0
Support im2row
DrJessop Oct 3, 2025
7c7b729
Patch https://github.com/pytorch/executorch/pull/14754 (#14786)
lucylq Oct 3, 2025
0ee1160
Add transposed im2row
DrJessop Oct 4, 2025
0b5a4ab
Update linear -> conv2d int16 for Ethos
3l1 Oct 4, 2025
ca9fc06
[Release Only] Bugfix/fix nxp separable conv test (#14800)
pytorchbot Oct 4, 2025
3f0896a
[ET-VK] Miscellaneous fixes (#14801)
pytorchbot Oct 4, 2025
881915d
Add platforms for all operator library sub-targets.
hsharma35 Oct 4, 2025
3d8b8d1
fix test-huggingface-transformers-* tests (#14752)
cccclai Oct 4, 2025
3b16bc1
Summary: Use javaClassStatic() for class references stored in static …
psiddh Oct 6, 2025
f81e834
Add strict-flag to ExportSession (#14588)
Erik-Lundell Oct 6, 2025
75ebd05
Fix OpenVINO ci (#14784)
suryasidd Oct 6, 2025
9a7fb42
Arm backend: Fix torch.matmul() failures for 2D tensor inputs (#14624)
YufengShi-dudu Oct 6, 2025
ed3fdad
Update extension/llm/tokenizers (#14807)
shoumikhin Oct 6, 2025
815ae92
Update ReplaceSingleElementTensorArgumentsFromFullOpWithScalarPass to…
ethansfng Oct 6, 2025
8c434dd
[Windows] Enable LLM preset in CI (#14805)
GregoryComer Oct 6, 2025
563a5d2
Arm backend: Remove CheckNeedsDecomposition (#14512)
oscarandersson8218 Oct 6, 2025
8484aee
Arm backend: Backend test serializes and uses EthosUQuant on Ethos-U …
zingo Oct 6, 2025
b6bc421
Arm backend: Fix Arm tester issue for inplace ops (#14625)
mansnils Oct 6, 2025
6e7353f
Arm backend: Add 6D tensor and pixel shuffle/unshuffle support (#14626)
mansnils Oct 6, 2025
266cfd0
Arm backend: Add test for monitoring memory allocation (#14657)
perheld Oct 6, 2025
f174974
Arm backend: Remove hello_world in core_software (#14775)
perheld Oct 6, 2025
cf31475
Revert "[Windows] Enable LLM preset in CI (#14805)" (#14823)
GregoryComer Oct 6, 2025
a39866c
Fix op signature for avg_pool2d
DrJessop Oct 6, 2025
bc931e1
Update APP_PATH to point to mv3 directory (#14828)
shoumikhin Oct 6, 2025
270873f
Restructure ET documentation with 'Platform First' model (#14720)
psiddh Oct 6, 2025
d8a2126
Add Gemma 3 test.
shoumikhin Oct 6, 2025
c609f63
Fixed assumption on out_shift for quantized linear
DrJessop Oct 7, 2025
d36bf8c
Run ET-eager on message recall
derekxu Oct 7, 2025
0b748bf
oss et update to support SAR2230P
billmguo Oct 7, 2025
2c603e4
Arm backend: Move rescale ops out of node visitors (#14584)
martinlsm Oct 7, 2025
1b8d380
NXP backend: Add NXP backend tutorial page (#14850)
StrycekSimon Oct 7, 2025
d8e07bd
Add .ptd support to portable executor runner (#14833)
larryliu0820 Oct 7, 2025
0e74a17
Qualcomm AI Engine Direct - Suite Operator Test Support Part 2 (#14848)
winskuo-quic Oct 7, 2025
0bfb61e
Arm backend: Backend test call setup_path.sh (#14846)
zingo Oct 7, 2025
4ac04c5
Arm backend: Bump tosa version to remove mlplatform dependencies (#14…
ArmRyan Oct 7, 2025
8ac6300
Arm backend: Change input distribution on resnet18 test (#14815)
gggekov Oct 7, 2025
7d8da19
Arm backend: Mark test in test_bmm.py as flaky (#14748)
martinlsm Oct 7, 2025
e09abea
support argmax/argmin without dim kwargs and fix adaptive_max_pool3d …
cccclai Oct 7, 2025
351d82f
Sweep major CMake files for use of include/lib instead of CMAKE_INSTA…
swolchok Oct 7, 2025
740fe14
Back out "oss et update to support SAR2230P"
cccclai Oct 7, 2025
15a203b
Fix avg_pool2d replace ops pass
DrJessop Oct 7, 2025
5c4d214
link new vision kernel internally
zonglinpeng Oct 7, 2025
5dee222
[ez] Try to fix Samsung CI job (#14866)
SS-JIA Oct 7, 2025
fcd42bc
Update link for working with Large Language Models (#14863)
mergennachin Oct 7, 2025
697078b
[aoti-et] Add cuda delegate runtime code (#14827)
larryliu0820 Oct 7, 2025
bba9d26
Introduce public MergedDataMap
lucylq Oct 8, 2025
8efba17
Merge branch 'main' into openvino_llama_support
suryasidd Oct 8, 2025
fb87fa6
Including mixed quant Linear op in Jarvis
mgiordy Oct 8, 2025
229bbd2
Use default runner for OpenVINO backend as well
suryasidd Oct 8, 2025
400b2a5
[aoti-et] Add a voxtral runner and add CI (#14875)
larryliu0820 Oct 8, 2025
ab5fb84
Arm backend: fix meandim when dim = None (#14883)
Erik-Lundell Oct 8, 2025
45bf018
Arm backend: build with NAMED_DATA_MAP=ON for vgf (#14885)
Erik-Lundell Oct 8, 2025
9be3aaa
Arm backend: Support min/max with unset dim. (#14884)
Erik-Lundell Oct 8, 2025
7d2b8c6
Arm backend: Add correction for floor mode (#14776)
wwwind Oct 8, 2025
41b061e
NXP backend: Update user guide and docs Readme (#14852)
roman-janik-nxp Oct 8, 2025
a41cdef
refactor: ♻️ update YOLO12 example doc and code (#14771)
onuralpszr Oct 8, 2025
b88b09c
Arm backend: Add missing attribute in VisualizePass (#14847)
martinlsm Oct 8, 2025
5c25493
Arm backend: Add docstrings for tosa/partitioner.py (#14844)
Sebastian-Larsson Oct 8, 2025
bf3b66c
Arm backend: Add docstrings for operator_support/ethos_u55_support.py…
Sebastian-Larsson Oct 8, 2025
91f1769
Arm backend: Switch torch.tan to torch.max in test_multiple_delegates…
emmakujala Oct 8, 2025
5a6113f
Arm backend: Add TOSA dialect op for MATMUL (#14694)
oscarandersson8218 Oct 8, 2025
a9fe0b4
Cortex_m backend: Add script for building test runner (#14750)
AdrianLundell Oct 8, 2025
5af73eb
Qualcomm AI Engine Direct - Support floor_divide with int input in QN…
winskuo-quic Oct 8, 2025
7c148a7
Add constraints for split_copy test
ethansfng Oct 8, 2025
d677277
Enable named data map extension in CUDA build (#14898)
larryliu0820 Oct 8, 2025
ec56cfa
Gather common remove passes in one list.
eigen-k Oct 8, 2025
5246168
Group-quantized embedding op
DrJessop Oct 8, 2025
1da530d
Build pthreadpool with hidden visibility on Apple (#14838)
GregoryComer Oct 8, 2025
2672dd3
TransformerBlock: support attention skips
sxu Oct 8, 2025
c62cbfe
Arm backend: Remove out of date warning for ethos-u tutorial (#14897)
robell Oct 8, 2025
73c8d8c
Move cuda/runtime/shim/utils to cuda/runtime for better usability. (#…
pytorchbot Oct 8, 2025
0142a1a
introduce CudaGuard and cudastreamguard (#14914)
pytorchbot Oct 8, 2025
f64c864
Revert D84020397: Group-quantized embedding op (#14915)
DrJessop Oct 8, 2025
0525d9c
Merge pull request #12 from suryasidd/runner_changes
cavusmustafa Oct 8, 2025
f32e9fc
Back FreeableBuffer with int64_t
lucylq Oct 8, 2025
24f67b6
Merge branch 'main' into openvino_llama_support
suryasidd Oct 8, 2025
a26412e
Reapply "Add EXECUTORCH_THREADPOOL_SIZE options, default to u… (#1430…
GregoryComer Oct 9, 2025
09c93d4
Read max context length from the correct ModelArgs field
sxu Oct 9, 2025
38b51aa
print bfloat16 tensor data (#14889)
manuelcandales Oct 9, 2025
6520e06
Make type of logits a template parameter
sxu Oct 9, 2025
698ea79
Qualcomm AI Engine Direct - docs fix (#14881)
DannyYuyang-quic Oct 9, 2025
29b4db8
Including mixed quant Conv1D op in Jarvis
mgiordy Oct 9, 2025
f7f97f7
introduce shim layers for cudaguard and cudastreamguard (#14925)
pytorchbot Oct 9, 2025
2eb8994
Add Voxtral test. (#14918)
shoumikhin Oct 9, 2025
8fbc42c
Arm backend: Unsqueeze rank 0 tensor at vgf runtime (#14856)
ArmRyan Oct 9, 2025
418c584
Use quantizable LSTM in test when flow has quantize=True (#14893)
Erik-Lundell Oct 9, 2025
dda2705
Arm backend: Decompose sub/add with alpha!=1 (#14932)
Erik-Lundell Oct 9, 2025
29b98c3
Arm backend: add new cmake line to vgf tutorial (#14935)
Erik-Lundell Oct 9, 2025
75f968d
Make determinism of channels_last more conservative
kimishpatel Oct 9, 2025
a509431
Update extension/llm/tokenizers to d710a0cf10cfa8cb7ffda33c4e61af6311…
shoumikhin Oct 9, 2025
bdc526b
Qualcomm AI Engine Direct - change the llama tutorial to static llama…
DannyYuyang-quic Oct 9, 2025
d4129b7
Arm backend: Updated how generic evaluator is handled (#14940)
Michiel-Olieslagers Oct 9, 2025
71c8031
Fix iOS demo app package resolution on CI (#14952)
shoumikhin Oct 9, 2025
64b0fd9
Make determinism of channels_last more conservative
kimishpatel Oct 9, 2025
bf977e0
Group-quantized embedding op
DrJessop Oct 9, 2025
84d060a
XNNPACK: Assert on unsupported pass through tensor args
digantdesai Oct 9, 2025
a5d7e5c
[ET-VK] Add Fusing for Conv/Binary Ops, Clamp/Binary Ops, and Clamp/C…
alexdean08 Oct 9, 2025
b6884df
Bump cortex-m size test (#14950)
lucylq Oct 9, 2025
f443ebb
Add overload to create atensor view from TensorPtr.
shoumikhin Oct 9, 2025
fc512fa
Fix typos in docs ahead of GA (#14964)
abhinaykukkadapu Oct 9, 2025
8d51b0f
[ET-VK] Show stack trace in Exception messages via boost if boost is …
pytorchbot Oct 9, 2025
d0827e5
Use merged data map in module (#14966)
pytorchbot Oct 9, 2025
66c3dea
Add a wav loader (#14923)
larryliu0820 Oct 9, 2025
9764269
Pass which replaces torch quantized embedding byte with cadence variant
DrJessop Oct 9, 2025
d39992f
Make HQQ default PTQ quantization in ExecuTorch
metascroy Oct 10, 2025
7f31fd8
Removed support for non-per-tensor quantized relu
DrJessop Oct 10, 2025
8b67236
Enable Exynos backend Quantization (#14464)
Jiseong-oh Oct 10, 2025
cf13b9a
[docs] Fix typo in instructions (#14968)
cccclai Oct 10, 2025
4bdd3df
Allow custom sizes, dim order and strides for tensor view. (#14944)
shoumikhin Oct 10, 2025
c979158
Cortex-M backend: Add mul and linear tests (#14746)
AdrianLundell Oct 10, 2025
7395999
Arm backend: add DeiTTiny evaluator and deterministic shuffled calibr…
tirwu01 Oct 10, 2025
caa0094
backends/cuda: use async malloc/free (#14976)
swolchok Oct 10, 2025
94c892c
Arm backend: Update MLSDK dependencies to use gitlab (#14989)
ArmRyan Oct 10, 2025
3591604
NXP backend: Extend NXP backend docs page, add partitioner and quanti…
roman-janik-nxp Oct 10, 2025
896178e
Fix MSVC ambiguity in make_tensor_ptr (#14991)
shoumikhin Oct 10, 2025
3247c15
Revert "[multimodal] Allow generate and prefill to take move semantics…
larryliu0820 Oct 10, 2025
9b03c13
Add extension_named_data_map to llava (#14973)
lucylq Oct 10, 2025
21557d0
set emulate_precision_casts as true for cuda backend for better accur…
pytorchbot Oct 10, 2025
e16cf17
More fixes to docs, fix broken links and more typos (#14975)
abhinaykukkadapu Oct 10, 2025
e26670b
Make image and audio variables const references (#14999)
larryliu0820 Oct 10, 2025
c4bd450
Revert "Arm backend: Add correction for floor mode" (#14998)
ArmRyan Oct 10, 2025
fca0f38
update etrecord doc to cover new generation pipeline (#15012)
pytorchbot Oct 10, 2025
3bfd5e0
Promote pyproject beta to production/stable (#14777)
mergennachin Oct 10, 2025
7533df6
use reference link for html doc (#15029)
pytorchbot Oct 10, 2025
09eac16
[aoti-et] Enable multimodal runner for Voxtral on CUDA (#14980)
larryliu0820 Oct 11, 2025
4609cdb
Move RemovePermutesAroundElementwiseOps and RemoveSqueezeViewBeforeEl…
eigen-k Oct 11, 2025
1dc0e0e
Arm backend: Upgrade vela to 4.4.1
oscarandersson8218 Oct 11, 2025
cc6cb83
Add option to specify fake tensor mode for graph and program builders.
hsharma35 Oct 11, 2025
35d431b
Arm backend: Enable parallel building on MLSDK emulation layer (#14993)
ArmRyan Oct 11, 2025
019c8da
Qualcomm AI Engine Direct - increase index_put coverage (#14924)
haowhsu-quic Oct 11, 2025
50a10a2
Updating tests for 16A8W ops which are supported (#14945)
Ninja91 Oct 10, 2025
703d25a
Including mixed quant GRU op in Jarvis
mgiordy Oct 12, 2025
e69700b
Support for batched matmul
DrJessop Oct 12, 2025
f32cdc3
pin bump with better architecture (#15040)
pytorchbot Oct 12, 2025
afd98fe
NXP backend: Add conversion and quantization support for dim_order_op…
StrycekSimon Oct 13, 2025
d00279d
Minor update for Arm README.md (#15045)
robell Oct 13, 2025
82bc4c5
Merge branch 'main' into openvino_llama_support
suryasidd Oct 13, 2025
1a8acf6
Update top-level README.md file (#15049)
mergennachin Oct 13, 2025
f84c423
[Metal] Update aoti_common with additional AOTI functions needed by M…
manuelcandales Oct 13, 2025
626a7d1
Move RemoveCatFromSliceCopyPass to the common section. (#14972)
eigen-k Oct 13, 2025
9560800
Fix documentation link for Core ATen operators (#15050)
mergennachin Oct 13, 2025
6efddba
Support sine operator on XNNPACK (#14711)
GregoryComer Oct 13, 2025
a66ea20
msvc support 1/N (#14970)
JacobSzwejbka Oct 13, 2025
adc4889
Move tensor layout into exir (#14917)
pytorchbot Oct 13, 2025
f19882b
Handle uint types. (#15055)
shoumikhin Oct 13, 2025
b9451c9
Use new logo in ExecuTorch (#14782)
mergennachin Oct 13, 2025
23db0bc
Tensor view keeps original tensor alive. (#15056)
shoumikhin Oct 13, 2025
8876113
Ignore PRs that are empty (#15065)
mergennachin Oct 13, 2025
b9e8126
Export lora weights to sep file (#15061)
lucylq Oct 13, 2025
b18243b
Revert "[ET-VK] Add Fusing for Conv/Binary Ops, Clamp/Binary Ops, and…
JacobSzwejbka Oct 13, 2025
1428d81
Changed quantization scheme
suryasidd Oct 13, 2025
caba225
Merge branch 'main' into openvino_llama_support
suryasidd Oct 13, 2025
2 changes: 1 addition & 1 deletion .ci/docker/ci_commit_pins/optimum-executorch.txt
@@ -1 +1 @@
- bd06b54e627fbfd354a2cffa4c80fb21883209a9
+ 44d8d54e38c0258357d4e92e1fefe21e845947a3
2 changes: 1 addition & 1 deletion .ci/docker/ci_commit_pins/pytorch.txt
@@ -1 +1 @@
- 53a2908a10f414a2f85caa06703a26a40e873869
+ cf9d09490c7f6685ec68d5db3acf2e0d73c54d00
1 change: 1 addition & 0 deletions .ci/scripts/build-qnn-sdk.sh
@@ -38,6 +38,7 @@ set_up_aot() {
  -DEXECUTORCH_BUILD_EXTENSION_EXTENSION_LLM=ON \
  -DEXECUTORCH_BUILD_EXTENSION_EXTENSION_LLM_RUNNER=ON \
  -DEXECUTORCH_BUILD_EXTENSION_FLAT_TENSOR=ON \
+ -DEXECUTORCH_BUILD_EXTENSION_NAMED_DATA_MAP=ON \
  -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
  -DEXECUTORCH_ENABLE_EVENT_TRACER=ON \
  -DPYTHON_EXECUTABLE=python3
20 changes: 9 additions & 11 deletions .ci/scripts/setup-openvino.sh
@@ -10,19 +10,17 @@ set -ex
  # shellcheck source=/dev/null
  source "$(dirname "${BASH_SOURCE[0]}")/utils.sh"

- git clone https://github.com/openvinotoolkit/openvino.git
- cd openvino && git checkout releases/2025/1
- git submodule update --init --recursive
- sudo ./install_build_dependencies.sh
- mkdir build && cd build
- cmake .. -DCMAKE_BUILD_TYPE=Release -DENABLE_PYTHON=ON
- make -j$(nproc)
+ # Download and install OpenVINO from release packages
+ OPENVINO_VERSION="2025.3"
+ OPENVINO_BUILD="2025.3.0.19807.44526285f24"
+ OPENVINO_URL="https://storage.openvinotoolkit.org/repositories/openvino/packages/${OPENVINO_VERSION}/linux/openvino_toolkit_ubuntu22_${OPENVINO_BUILD}_x86_64.tgz"

- cd ..
- cmake --install build --prefix dist
+ curl -Lo /tmp/openvino_toolkit.tgz --retry 3 --fail ${OPENVINO_URL}
+ tar -xzf /tmp/openvino_toolkit.tgz
+ mv openvino_toolkit_ubuntu22_${OPENVINO_BUILD}_x86_64 openvino

- source dist/setupvars.sh
- cd ../backends/openvino
+ source openvino/setupvars.sh
+ cd backends/openvino
  pip install -r requirements.txt
  cd scripts
  ./openvino_build.sh --enable_python
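Review note: the new download step derives the archive URL from `OPENVINO_VERSION` and `OPENVINO_BUILD`. The sketch below shows that composition in isolation, assuming the URL layout used in this hunk stays the same across releases. The helper name `openvino_package_url` is hypothetical and not part of the PR; it only prints the URL and performs no download.

```shell
# Hypothetical helper (not part of the PR): compose the OpenVINO archive URL
# from a version string ($1) and a build string ($2), following the layout
# used in setup-openvino.sh.
openvino_package_url() {
  printf '%s/%s/linux/openvino_toolkit_ubuntu22_%s_x86_64.tgz\n' \
    "https://storage.openvinotoolkit.org/repositories/openvino/packages" \
    "$1" "$2"
}

# Values copied from the diff.
OPENVINO_VERSION="2025.3"
OPENVINO_BUILD="2025.3.0.19807.44526285f24"
OPENVINO_URL="$(openvino_package_url "${OPENVINO_VERSION}" "${OPENVINO_BUILD}")"

# Quoting the expansion keeps the URL a single argument when it is later
# handed to a downloader such as curl.
echo "${OPENVINO_URL}"
```

Keeping the version and build in two variables means a future bump touches only those two assignments.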
4 changes: 2 additions & 2 deletions .ci/scripts/setup-samsung-linux-deps.sh
@@ -13,7 +13,7 @@ download_ai_lite_core() {
  API_BASE="https://soc-developer.semiconductor.samsung.com/api/v1/resource/ai-litecore/download"
  API_KEY=$SAMSUNG_AI_LITECORE_KEY

- VERSION="0.5"
+ VERSION="0.7"
  OS_NAME="Ubuntu 22.04"
  OUT_FILE="/tmp/exynos-ai-litecore-v${VERSION}.tar.gz"
  TARGET_PATH="/tmp/exynos_ai_lite_core"
@@ -62,7 +62,7 @@ install_enn_backend() {
  export PYTHONPATH=${PYTHONPATH:-}:${EXECUTORCH_ROOT}/..
  }

- AI_LITE_CORE_VERSION=0.5.0
+ AI_LITE_CORE_VERSION=0.7.0

  download_ai_lite_core ${AI_LITE_CORE_VERSION}
  install_enn_backend
8 changes: 8 additions & 0 deletions .ci/scripts/test_backend.sh
@@ -1,6 +1,7 @@
  #!/usr/bin/env bash
  # Copyright (c) Meta Platforms, Inc. and affiliates.
  # All rights reserved.
+ # Copyright 2025 Arm Limited and/or its affiliates.
  #
  # This source code is licensed under the BSD-style license found in the
  # LICENSE file in the root directory of this source tree.
@@ -58,6 +59,13 @@ fi
  if [[ "$FLOW" == *arm* ]]; then
  # Setup ARM deps.
  .ci/scripts/setup-arm-baremetal-tools.sh
+ source examples/arm/ethos-u-scratch/setup_path.sh
+
+ if [[ "$FLOW" == *ethos_u* ]]; then
+ # Prepare a test runner binary that can run on the Corstone-3x0 FVPs
+ backends/arm/scripts/build_executorch.sh
+ backends/arm/test/setup_testing.sh
+ fi
  fi

  if [[ $IS_MACOS -eq 1 ]]; then
1 change: 1 addition & 0 deletions .ci/scripts/test_ios_ci.sh
@@ -36,6 +36,7 @@ say() {

  say "Cloning the Demo App"

+ git config --global http.postBuffer 524288000
  git clone --depth 1 https://github.com/meta-pytorch/executorch-examples.git

  say "Installing CoreML Backend Requirements"
1 change: 1 addition & 0 deletions .ci/scripts/test_llama_torchao_lowbit.sh
@@ -31,6 +31,7 @@ cmake -DPYTHON_EXECUTABLE=python \
  -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
  -DEXECUTORCH_BUILD_EXTENSION_FLAT_TENSOR=ON \
  -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
+ -DEXECUTORCH_BUILD_EXTENSION_NAMED_DATA_MAP=ON \
  -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
  -DEXECUTORCH_BUILD_XNNPACK=OFF \
  -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
1 change: 1 addition & 0 deletions .ci/scripts/test_llava.sh
@@ -38,6 +38,7 @@ EXECUTORCH_COMMON_CMAKE_ARGS=" \
  -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
  -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
  -DEXECUTORCH_BUILD_EXTENSION_FLAT_TENSOR=ON \
+ -DEXECUTORCH_BUILD_EXTENSION_NAMED_DATA_MAP=ON \
  -DEXECUTORCH_BUILD_EXTENSION_LLM=ON \
  -DEXECUTORCH_BUILD_EXTENSION_LLM_RUNNER=ON \
  -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
32 changes: 28 additions & 4 deletions .ci/scripts/test_model.sh
@@ -48,22 +48,33 @@ prepare_artifacts_upload() {
  fi
  }

+
  build_cmake_executor_runner() {
  local backend_string_select="${1:-}"
  echo "Building executor_runner"
  rm -rf ${CMAKE_OUTPUT_DIR}
  mkdir ${CMAKE_OUTPUT_DIR}
+ # Common options:
+ COMMON="-DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE"
  if [[ "$backend_string_select" == "XNNPACK" ]]; then
  echo "Backend $backend_string_select selected"
- (cd ${CMAKE_OUTPUT_DIR} \
- && cmake -DCMAKE_BUILD_TYPE=Release \
+ cmake -DCMAKE_BUILD_TYPE=Release \
  -DEXECUTORCH_BUILD_XNNPACK=ON \
- -DPYTHON_EXECUTABLE="$PYTHON_EXECUTABLE" ..)
+ ${COMMON} \
+ -B${CMAKE_OUTPUT_DIR} .
  cmake --build ${CMAKE_OUTPUT_DIR} -j4
+ elif [[ "$backend_string_select" == "CUDA" ]]; then
+ echo "Backend $backend_string_select selected"
+ cmake -DCMAKE_BUILD_TYPE=Release \
+ -DEXECUTORCH_BUILD_CUDA=ON \
+ -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
+ ${COMMON} \
+ -B${CMAKE_OUTPUT_DIR} .
+ cmake --build ${CMAKE_OUTPUT_DIR} -j4
  else
  cmake -DCMAKE_BUILD_TYPE=Debug \
  -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
- -DPYTHON_EXECUTABLE="$PYTHON_EXECUTABLE" \
+ ${COMMON} \
  -B${CMAKE_OUTPUT_DIR} .
  cmake --build ${CMAKE_OUTPUT_DIR} -j4 --config Debug
  fi
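Review note: after this change the three branches of `build_cmake_executor_runner` differ only in their `-D` flags, while `${COMMON}` and `-B${CMAKE_OUTPUT_DIR} .` are shared. A minimal sketch of that structure follows, assuming the flag sets shown in the hunk above. The function name `backend_cmake_flags` is hypothetical and does not exist in the PR; it only prints flags and never invokes cmake.

```shell
# Hypothetical sketch (not part of the PR): map a backend name to the CMake
# flags that build_cmake_executor_runner passes for it. The function body
# runs in a subshell, so its variables do not leak into the caller.
backend_cmake_flags() (
  backend="${1:-}"
  # Shared by every branch, mirroring COMMON in the diff.
  common="-DPYTHON_EXECUTABLE=${PYTHON_EXECUTABLE:-python3}"
  case "${backend}" in
    XNNPACK)
      echo "-DCMAKE_BUILD_TYPE=Release -DEXECUTORCH_BUILD_XNNPACK=ON ${common}"
      ;;
    CUDA)
      echo "-DCMAKE_BUILD_TYPE=Release -DEXECUTORCH_BUILD_CUDA=ON -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON ${common}"
      ;;
    *)
      # Fallback branch: debug build with optimized kernels.
      echo "-DCMAKE_BUILD_TYPE=Debug -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON ${common}"
      ;;
  esac
)

# Print what a CUDA configure step would pass; the output directory name
# cmake-out is an assumption for illustration.
echo "cmake $(backend_cmake_flags CUDA) -Bcmake-out ."
```

A `case` statement keeps each backend's flags on one line, which may make adding a further backend a one-branch change.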
@@ -320,6 +331,13 @@ test_model_with_mediatek() {
  EXPORTED_MODEL=$(find "./${EXPORT_SCRIPT}" -type f -name "*.pte" -print -quit)
  }

+ test_model_with_cuda() {
+ # Export a basic .pte and .ptd, then run the model.
+ "${PYTHON_EXECUTABLE}" -m examples.cuda.scripts.export --model_name="${MODEL_NAME}" --output_dir "./"
+ build_cmake_executor_runner "CUDA"
+ ./${CMAKE_OUTPUT_DIR}/executor_runner --model_path "./${MODEL_NAME}.pte" --data_path "./aoti_cuda_blob.ptd"
+ }
+

  if [[ "${BACKEND}" == "portable" ]]; then
  echo "Testing ${MODEL_NAME} with portable kernels..."
@@ -372,6 +390,12 @@ elif [[ "${BACKEND}" == "mediatek" ]]; then
  if [[ $? -eq 0 ]]; then
  prepare_artifacts_upload
  fi
+ elif [[ "${BACKEND}" == "cuda" ]]; then
+ echo "Testing ${MODEL_NAME} with cuda..."
+ test_model_with_cuda
+ if [[ $? -eq 0 ]]; then
+ prepare_artifacts_upload
+ fi
  else
  set +e
  if [[ "${BACKEND}" == *"quantization"* ]]; then
2 changes: 1 addition & 1 deletion .ci/scripts/test_openvino.sh
@@ -10,7 +10,7 @@ set -ex
  # shellcheck source=/dev/null
  source "$(dirname "${BASH_SOURCE[0]}")/utils.sh"

- source openvino/dist/setupvars.sh
+ source openvino/setupvars.sh
  cd backends/openvino/tests
  python test_runner.py --test_type ops
  python test_runner.py --test_type models
1 change: 1 addition & 0 deletions .ci/scripts/test_torchao_huggingface_checkpoints.sh
@@ -129,6 +129,7 @@ if [[ "$TEST_WITH_RUNNER" -eq 1 ]]; then
  -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
  -DEXECUTORCH_BUILD_EXTENSION_FLAT_TENSOR=ON \
  -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
+ -DEXECUTORCH_BUILD_EXTENSION_NAMED_DATA_MAP=ON \
  -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
  -DEXECUTORCH_BUILD_XNNPACK=ON \
  -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
4 changes: 4 additions & 0 deletions .ci/scripts/test_yolo12.sh
@@ -119,6 +119,8 @@ cmake_install_executorch_libraries() {
  -DEXECUTORCH_BUILD_XNNPACK="$XNNPACK" \
  -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
  -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
+ -DEXECUTORCH_BUILD_EXTENSION_FLAT_TENSOR=ON \
+ -DEXECUTORCH_BUILD_EXTENSION_NAMED_DATA_MAP=ON \
  -DEXECUTORCH_BUILD_EXTENSION_RUNNER_UTIL=ON \
  -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
  -B"${build_dir}"
@@ -131,6 +133,8 @@ cmake_install_executorch_libraries() {
  -DEXECUTORCH_BUILD_XNNPACK="$XNNPACK" \
  -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
  -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
+ -DEXECUTORCH_BUILD_EXTENSION_FLAT_TENSOR=ON \
+ -DEXECUTORCH_BUILD_EXTENSION_NAMED_DATA_MAP=ON \
  -DEXECUTORCH_BUILD_EXTENSION_RUNNER_UTIL=ON \
  -DEXECUTORCH_ENABLE_LOGGING=ON \
  -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
7 changes: 4 additions & 3 deletions .ci/scripts/utils.sh
@@ -125,14 +125,15 @@ build_executorch_runner_cmake() {
  clean_executorch_install_folders
  mkdir "${CMAKE_OUTPUT_DIR}"

- pushd "${CMAKE_OUTPUT_DIR}" || return
  if [[ $1 == "Debug" ]]; then
  CXXFLAGS="-fsanitize=address,undefined"
  else
  CXXFLAGS=""
  fi
- CXXFLAGS="$CXXFLAGS" retry cmake -DPYTHON_EXECUTABLE="${PYTHON_EXECUTABLE}" -DCMAKE_BUILD_TYPE="${1:-Release}" ..
- popd || return
+ CXXFLAGS="$CXXFLAGS" retry cmake \
+ -DPYTHON_EXECUTABLE="${PYTHON_EXECUTABLE}" \
+ -DCMAKE_BUILD_TYPE="${1:-Release}" \
+ -B${CMAKE_OUTPUT_DIR} .

  if [ "$(uname)" == "Darwin" ]; then
  CMAKE_JOBS=$(( $(sysctl -n hw.ncpu) - 1 ))
4 changes: 4 additions & 0 deletions .github/workflows/android-release-artifacts.yml
@@ -90,6 +90,10 @@ jobs:
  fi

  FLAVOR="${{ inputs.flavor }}"
+ if [ ! -z "$FLAVOR" ]; then
+ GRADLE_ARGS+=" -Dflavor=${FLAVOR}"
+ fi
+
  if [[ "$FLAVOR" == "vulkan" || -z "$FLAVOR" ]]; then
  curl -O https://sdk.lunarg.com/sdk/download/1.4.321.1/linux/vulkansdk-linux-x86_64-1.4.321.1.tar.xz
  tar xf vulkansdk-linux-x86_64-1.4.321.1.tar.xz -C /tmp
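Review note: the added block appends `-Dflavor=...` to `GRADLE_ARGS` only when a flavor was supplied. A minimal sketch of that conditional append follows. The helper `gradle_flavor_args` and the starting value `--no-daemon` are hypothetical and only serve the illustration; `[ -n "$x" ]` is used here as the equivalent of the `[ ! -z "$x" ]` test in the workflow.

```shell
# Hypothetical helper (not part of the workflow): emit the extra Gradle
# argument for a flavor, or nothing when the flavor is empty.
gradle_flavor_args() {
  if [ -n "${1:-}" ]; then
    printf ' -Dflavor=%s' "$1"
  fi
}

# Assumed starting value, for illustration only.
GRADLE_ARGS="--no-daemon"
GRADLE_ARGS="${GRADLE_ARGS}$(gradle_flavor_args "vulkan")"
echo "${GRADLE_ARGS}"
```

With an empty flavor the helper prints nothing, so `GRADLE_ARGS` stays unchanged, which matches the default (no flavor) path in the workflow.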