Backmerging with Msft commits #638
Merged
jatinwadhwa921 merged 63 commits into ovep-develop from Apr 4, 2025
### Description Limit the pipeline's ability to build CUDA 11. However, references to CUDA 11 are not completely removed in this PR. We will keep them in case we decide to support both CUDA 13 and CUDA 12 in the future.
### Description Move the x64 part of "Linux CPU CI pipeline" to Github Actions
…ite-default (microsoft#24167) Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 6.2.1 to 6.2.3.

Release notes (sourced from [vite's releases](https://github.com/vitejs/vite/releases)): for v6.2.3 and v6.2.2, please refer to the respective CHANGELOG.md ([v6.2.3](https://github.com/vitejs/vite/blob/v6.2.3/packages/vite/CHANGELOG.md), [v6.2.2](https://github.com/vitejs/vite/blob/v6.2.2/packages/vite/CHANGELOG.md)).

Changelog (sourced from [vite's changelog](https://github.com/vitejs/vite/blob/v6.2.3/packages/vite/CHANGELOG.md)):

**6.2.3 (2025-03-24)**

- fix: fs raw query with query separators ([#19702](https://redirect.github.com/vitejs/vite/issues/19702)) ([f234b57](https://github.com/vitejs/vite/commit/f234b5744d8b74c95535a7b82cc88ed2144263c1))

**6.2.2 (2025-03-14)**

- fix: await client buildStart on top level buildStart ([#19624](https://redirect.github.com/vitejs/vite/issues/19624)) ([b31faab](https://github.com/vitejs/vite/commit/b31faab2a81b839e4b747baeb9c7a7cbb724f8d2))
- fix(css): inline css correctly for double quote use strict ([#19590](https://redirect.github.com/vitejs/vite/issues/19590)) ([d0aa833](https://github.com/vitejs/vite/commit/d0aa833296668fc420a27a1ea88ecdbdeacdbce7))
- fix(deps): update all non-major dependencies ([#19613](https://redirect.github.com/vitejs/vite/issues/19613)) ([363d691](https://github.com/vitejs/vite/commit/363d691b4995d72f26a14eb59ed88a9483b1f931))
- fix(indexHtml): ensure correct URL when querying module graph ([#19601](https://redirect.github.com/vitejs/vite/issues/19601)) ([dc5395a](https://github.com/vitejs/vite/commit/dc5395a27e44066ef7725278c4057d9f1071a53f))
- fix(preview): use preview https config, not server ([#19633](https://redirect.github.com/vitejs/vite/issues/19633)) ([98b3160](https://github.com/vitejs/vite/commit/98b3160fa5916189e756cd7c5aae87e0d8f1978e))
- fix(ssr): use optional chaining to prevent "undefined is not an object" happening in \`ssrRewriteStac ([4309755](https://github.com/vitejs/vite/commit/43097550a1aa8ff633c39fb197b5f9ac1222119b)), closes [#19612](https://redirect.github.com/vitejs/vite/issues/19612)
- feat: show friendly error for malformed `base` ([#19616](https://redirect.github.com/vitejs/vite/issues/19616)) ([2476391](https://github.com/vitejs/vite/commit/2476391b2854aaa67d0ed317b6d0c462e68028f7))
- feat(worker): show asset filename conflict warning ([#19591](https://redirect.github.com/vitejs/vite/issues/19591)) ([367d968](https://github.com/vitejs/vite/commit/367d968fbf584e9f0e17192b816e92e8045c6217))
- chore: extend commit hash correctly when ambigious with a non-commit object ([#19600](https://redirect.github.com/vitejs/vite/issues/19600)) ([89a6287](https://github.com/vitejs/vite/commit/89a62873243805518b672212db7e317989c5c197))

Commits: the release commits [`16869d7`](https://github.com/vitejs/vite/commit/16869d7c9917eb58d9a0101e30064ab65e64fa91) (release: v6.2.3) and [`b12911e`](https://github.com/vitejs/vite/commit/b12911edba0cd9edbad170a0940d37bb1e16ef2c) (release: v6.2.2), plus the changes listed in the changelog. Additional commits are viewable in the [compare view](https://github.com/vitejs/vite/commits/v6.2.3/packages/vite).
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…rosoft#24136) Move the allocator data member declaration before the `Ort::Value` container data members that might use the allocator so that the `Ort::Value` containers will be destroyed first. `custom_allocator_` may be used as the allocator for the `Ort::Value`s in `test_inputs_` and `outputs_`. The allocator shouldn't be destroyed before `Ort::Value`s allocated with it are freed.
### Description Fix layout transformer for FusedConv. The current layout transformer will transform `FusedConv` (kMSDomain) into `FusedConv` (kMSInternalNHWCDomain) if the EP wants channels_last. However, kMSInternalNHWCDomain uses OpType `Conv` for both Conv and FusedConv, so `FusedConv` (kMSInternalNHWCDomain) is invalid (unregistered op). This PR fixes this and allows the layout transformer to change `FusedConv` (kMSDomain) into `Conv` (kMSInternalNHWCDomain).
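The op-type remapping described here can be illustrated with a minimal Python sketch. It is not ORT's layout transformer code: the domain strings are assumed values for `kMSDomain` and `kMSInternalNHWCDomain`, and `to_nhwc` and the override table are hypothetical names introduced only for illustration.

```python
# Assumed string values for the domain constants named in the description.
K_ONNX_DOMAIN = ""
K_MS_DOMAIN = "com.microsoft"
K_MS_INTERNAL_NHWC_DOMAIN = "com.ms.internal.nhwc"

# Hypothetical table: op types whose name changes when moved to the NHWC domain.
NHWC_OP_TYPE_OVERRIDES = {
    (K_MS_DOMAIN, "FusedConv"): "Conv",
}


def to_nhwc(domain: str, op_type: str) -> tuple[str, str]:
    """Return the (domain, op_type) a node would carry after layout conversion."""
    new_op_type = NHWC_OP_TYPE_OVERRIDES.get((domain, op_type), op_type)
    return (K_MS_INTERNAL_NHWC_DOMAIN, new_op_type)
```

Without the override entry, `FusedConv` would keep its op type in the NHWC domain, which is the unregistered combination the description mentions.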
…idail is disabled in MacOS and iOS packaging stage due to (microsoft#24152) (microsoft#24153) NuGet_Packaging_CPU is broken due to an issue similar to the one from microsoft#23923 ### Description Migrate [Zip-Nuget Package Pipeline](https://aiinfra.visualstudio.com/Lotus/_build?definitionId=940&_a=summary) to 1ES ### Check list - [x] Issue with onnxruntime-Win-CPU-2022 - [x] [Spot Bug](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=697830&view=logs&j=6c6a898f-bbbb-5c72-8695-82b606149fa2&t=433f102b-5ed3-5fed-87a0-6107744ce9b1&l=81)
### Description Update the minimum supported GCC version to 11.1. ### Motivation and Context To use new CPU instructions, we need newer compilers. For example, our MLAS code needs bfloat16 support for Arm, which requires GCC version >=10, and some other code requires GCC version >=11.1. Also, our CI pipelines only test the code with GCC 11, 12 and 14. Therefore this PR increases the minimum GCC version to 11.1. We will update it to 12 once we deprecate the CUDA 11 pipelines.
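As a rough illustration of what a minimum-version gate like this checks, here is a small Python sketch. It is not the project's build logic; `parse_version` and `gcc_is_supported` are hypothetical helpers, and the input is assumed to be a plain version string such as the one printed by `gcc -dumpfullversion`.

```python
import re

MIN_GCC_VERSION = (11, 1)  # minimum stated in the description above


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the first dotted version number from a string like '11.4.0'."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def gcc_is_supported(version_text: str) -> bool:
    """True when the given GCC version is at least MIN_GCC_VERSION."""
    return parse_version(version_text) >= MIN_GCC_VERSION
```

Tuples compare element by element, so `11.0.1` is rejected while `11.1.0` and `14.2.0` are accepted. A bare major version such as `11` compares lower than `(11, 1)` in this sketch, so a full version string is the safer input.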
) The --use_vcpkg option seems to be causing problems for --arm64ec Python packages (onnxruntime-qnn): session creation crashes for packages built with --use_vcpkg. The released onnxruntime-qnn 1.21.0 Python wheel for x64 (arm64ec) has this issue. We are removing --use_vcpkg while the issue is debugged in parallel, and we plan to release a 1.21.1 onnxruntime-qnn x64 Python wheel without --use_vcpkg to address the crash. microsoft#24082
Increases operator GEMM for WebGPU EP. --------- Co-authored-by: Xiaofei Han <xiaofeihan+microsoft@microsoft.com> Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
### Description There is a slight mismatch in the build flags for the Web build pipeline when using vcpkg. A [fix](microsoft#24012) is on the way, but for now we need to disable vcpkg for the next patch release.
### Description - remove x86_64/Debug build in the matrix to reduce the number of jobs - set max-parallel to 1 to avoid big backlogs (a single PR will take longer, but there is less traffic in the pipeline)
### Description currently it is triggered on every branch.
### Description upgrade QNN to latest version 2.32.0.250228
Fixes microsoft#24070 by explicitly restricting to single-threaded, sequential execution in the case where `reduction=none && hasDuplicates`.
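The description does not name the operator, so the following Python sketch only illustrates the general hazard under an assumption: a scatter-style update in which `reduction=none` means plain overwrite. With duplicate indices the result depends on write order, so only a fixed sequential order gives a deterministic "last write wins" result. `scatter_none` and `has_duplicates` are hypothetical names.

```python
def scatter_none(data, indices, updates):
    """Overwrite data[indices[i]] with updates[i], strictly in input order."""
    out = list(data)
    for idx, value in zip(indices, updates):
        out[idx] = value  # with duplicate indices, the last write wins
    return out


def has_duplicates(indices):
    """True when at least one index appears more than once."""
    return len(set(indices)) != len(indices)
```

If the two writes to the same index were issued from different threads, either value could end up in the output, which is why the duplicate case is the one that needs sequential execution.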
…24194) This is a workaround for a build error. See microsoft#24152.
### Description Since we are adopting the 1ES template, we are removing the redundant CG steps from our pipelines
…ions (microsoft#24190) ### Description 1. Move Linux ARM64 CI pipeline and Linux DNNL CI pipeline to GitHub Actions 2. Refactor .github/workflows/linux_training.yml to use a template
This fixes the missing component handling for the input and output variables in BatchNorm operator.
### Description Further reduce the workload of the Mac CI pipeline
### Description Generate unique names for fused Split nodes. ### Motivation and Context The bug manifests when the model contains more than one Slice-to-Split fusion pattern and the nodes of the graph are nameless. This addresses microsoft#24203.
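One common way to guarantee unique node names is to track the names already in use and append a numeric suffix on collision. The sketch below shows that idea in Python; it is an assumption about the general technique, not the code in this PR, and `make_unique_name` and the base name used in the example are hypothetical.

```python
def make_unique_name(base: str, used: set[str]) -> str:
    """Return a name derived from `base` that is not in `used`, and record it."""
    candidate = base
    suffix = 0
    while candidate in used:
        suffix += 1
        candidate = f"{base}_{suffix}"
    used.add(candidate)
    return candidate
```

Called repeatedly with the same base (for example `"fused_split"`), it yields `fused_split`, `fused_split_1`, `fused_split_2`, so two fusions in a graph of unnamed nodes no longer collide.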
### Description - Pin VCPKG version for Github Actions pipelines - Update NDK to 28 because cmake 4.0 dropped the support for NDK 27. - Disable vcpkg temporarily for 2 ADO pipelines.
### Description For the GroupQueryAttention op, if the input total_sequence_length is a constant, we can infer the shape of the outputs present_key/present_value as `(batch_size, kv_num_heads, present_sequence_length, head_size)`. https://github.com/microsoft/onnxruntime/blob/5ed900e9712ce2f02e40c15b945d18453d1960d8/onnxruntime/contrib_ops/cpu/bert/group_query_attention_helper.h#L185 We know from the CPU EP that `present_sequence_length = max(past_sequence_length, total_sequence_length)`, and that `batch_size, kv_num_heads, head_size` are the same as in past_key/past_value. This inference is very important for the WebNN EP, because WebNN only supports GQA for `present_sequence_length == past_sequence_length` and requires static shapes for graph compilation.
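The inference rule stated above is small enough to write out directly. This is a Python sketch of the rule, not the shape-inference code in ORT; `infer_present_kv_shape` is a hypothetical name, and it assumes past_key/past_value use the layout `(batch_size, kv_num_heads, past_sequence_length, head_size)`.

```python
def infer_present_kv_shape(past_shape, total_sequence_length):
    """Infer the present_key/present_value shape from the past shape and a constant total length."""
    batch_size, kv_num_heads, past_sequence_length, head_size = past_shape
    # Rule taken from the description: present = max(past, total).
    present_sequence_length = max(past_sequence_length, total_sequence_length)
    return (batch_size, kv_num_heads, present_sequence_length, head_size)
```

When `total_sequence_length` does not exceed the past length, the present shape equals the past shape, which is the static case WebNN is said to support.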
### Description - adds react native namespace back in to the androidmanifest.xml ### Motivation and Context - reverses [this commit](microsoft@d8ed4da) - missed [this comment](https://github.com/microsoft/onnxruntime/blob/2656671064a83564ddf5766f3449c2406259c3ef/js/react_native/android/build.gradle#L141) that explains that androidmanifest.xml is used for iOS while androidmanifestnew.xml is used for android
### Description Enable RoPE to work with fp16 data types ### Motivation and Context This is needed to improve GQA --------- Signed-off-by: liqunfu <liqun.fu@microsoft.com> Signed-off-by: Liqun Fu <liqun.fu@microsoft.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
### Description This PR migrates the Web CI into github actions.
…ft#24233) ### Description Update the README to remove QNN-specific content for the tool ep_weight_sharing_ctx_gen --------- Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Increases operator coverage for WebGPU EP.
### Description Improve the performance of file glob for pyright. This helps to improve VSCode performance (if pyright plugin is installed)
### Description Add Wasm Relaxed SIMD support. Use integer dot product instructions for QGemmU8X8. 1. Build with --enable_wasm_relaxed_simd 2. Use env.wasm.relaxedSimd to run it ### Motivation and Context microsoft#22533 --------- Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
### Description
This PR adds a shader key validation step to the WebGPU CI pipeline.

The shader key validation works in this way:
- first, run onnxruntime_test_all with verbose logging, dumping the logs into a file
- then, parse the file and find the WebGPU EP program logs. Each log contains the following information:
  - the shader cache key
  - the corresponding shader code

The script aggregates that information and makes sure that, for each cache key, the corresponding shader code is consistent.

To make the validation work, this PR also modifies a few things:
- set the locale of `std::wclog` to ".UTF-8" to support Unicode characters. Otherwise the logger will fail and no longer output further logs. A fix is submitted in PR microsoft#24237, but there is a concern that it may break some users. Setting it inside onnxruntime_test_all is pretty safe.
- re-enable the WebGPU device auto collect which was introduced in microsoft#24115. Now we have a better way to detect cache key inconsistency.

### Next Step
The newly added test is marked as `continue-on-error: true`, which means that even if it fails it does not block the CI pipeline. We should fix those failures one by one until the test passes; then we can remove the `continue-on-error: true` flag.
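The consistency check at the core of this validation can be sketched in a few lines of Python. This is not the script added by the PR: the log format is not shown here, so the sketch assumes the logs have already been parsed into `(cache_key, shader_code)` pairs, and `find_inconsistent_keys` is a hypothetical name.

```python
from collections import defaultdict


def find_inconsistent_keys(entries):
    """Return the cache keys that were logged with more than one distinct shader code.

    `entries` is an iterable of (cache_key, shader_code) pairs.
    """
    codes_by_key = defaultdict(set)
    for cache_key, shader_code in entries:
        codes_by_key[cache_key].add(shader_code)
    return sorted(key for key, codes in codes_by_key.items() if len(codes) > 1)
```

An empty result means every cache key mapped to exactly one shader source. Any key returned points at a program whose cache key does not capture everything that influences its generated code.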
…rosoft#24247) ### Description Bump version of Dawn to 4cb1f9be152a4fa6bb695c08cd707ab078a1e2fb. ### Changes to the patches to Dawn: Removed patches because they are already merged into upstream or resolved in a different way: - (public) CMake fix to support Emscripten v4.0.3+ - (private) Fix external ref count for "external" device in emwgpu C++ implementation - (private) Allow "external" buffer in emwgpu C++ implementation Keep unchanged patches: - (private) Remove hard-coded CMAKE_OSX_DEPLOYMENT_TARGET in Dawn's CMake files Rewritten patches: - (public) Fix emwgpu C++ implementation for buffer destroy ### Corresponding changes in ORT - Dawn API changes - follow changes to `wgpu::Limits` - remove the usage of `DAWN_EMSCRIPTEN_TOOLCHAIN` - use `wgpu::InstanceDescriptor` in `wgpu::Instance` creation in WASM since it is supported now.
…24248) Bumps [dsaltares/fetch-gh-release-asset](https://github.com/dsaltares/fetch-gh-release-asset) from 1.1.0 to 1.1.2.

Release notes (sourced from [dsaltares/fetch-gh-release-asset's releases](https://github.com/dsaltares/fetch-gh-release-asset/releases)):

**1.1.2**

- feat: support unauthenticated requests by `@maciekmm` in [dsaltares/fetch-gh-release-asset#59](https://redirect.github.com/dsaltares/fetch-gh-release-asset/pull/59)
- fix: 61 - upgrade to node 20 by `@dsaltares` in [dsaltares/fetch-gh-release-asset#63](https://redirect.github.com/dsaltares/fetch-gh-release-asset/pull/63)
- New contributor: `@maciekmm` made their first contribution in [dsaltares/fetch-gh-release-asset#59](https://redirect.github.com/dsaltares/fetch-gh-release-asset/pull/59)
- Full changelog: https://github.com/dsaltares/fetch-gh-release-asset/compare/1.1.1...1.1.2

**1.1.1**

- fix: 50 - actually default version to latest by `@dsaltares` in [dsaltares/fetch-gh-release-asset#56](https://redirect.github.com/dsaltares/fetch-gh-release-asset/pull/56)
- Bump json5 from 1.0.1 to 1.0.2 by `@dependabot` in [dsaltares/fetch-gh-release-asset#55](https://redirect.github.com/dsaltares/fetch-gh-release-asset/pull/55)
- New contributor: `@dependabot` made their first contribution in [dsaltares/fetch-gh-release-asset#55](https://redirect.github.com/dsaltares/fetch-gh-release-asset/pull/55)
- Full changelog: https://github.com/dsaltares/fetch-gh-release-asset/compare/1.1.0...1.1.1

Commits:

- [`aa2ab12`](https://github.com/dsaltares/fetch-gh-release-asset/commit/aa2ab1243d6e0d5b405b973c89fa4d06a2d0fff7) fix: 61 - upgrade to node 20 ([#63](https://redirect.github.com/dsaltares/fetch-gh-release-asset/issues/63))
- [`cdaf216`](https://github.com/dsaltares/fetch-gh-release-asset/commit/cdaf216b2a5baa0f20eecbf460912cc9947f2577) feat: support unauthenticated requests ([#59](https://redirect.github.com/dsaltares/fetch-gh-release-asset/issues/59))
- [`5d24fa7`](https://github.com/dsaltares/fetch-gh-release-asset/commit/5d24fa77c1ae2e1e1dea54677d267f127d5de53a) chore: remove support notice
- [`a40c8b4`](https://github.com/dsaltares/fetch-gh-release-asset/commit/a40c8b4a0471f9ab81bdf73a010f74cc51476ad4) Bump json5 from 1.0.1 to 1.0.2 ([#55](https://redirect.github.com/dsaltares/fetch-gh-release-asset/issues/55))
- [`5a71312`](https://github.com/dsaltares/fetch-gh-release-asset/commit/5a71312bcb7a436e89a7dd26123cdbdd7b3df709) fix: 50 - actually default version to latest ([#56](https://redirect.github.com/dsaltares/fetch-gh-release-asset/issues/56))
- See full diff in [compare view](https://github.com/dsaltares/fetch-gh-release-asset/compare/1.1.0...1.1.2)

Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ite-default (microsoft#24255) Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 6.2.3 to 6.2.4.

Release notes (sourced from [vite's releases](https://github.com/vitejs/vite/releases)): for v6.2.4, please refer to [CHANGELOG.md](https://github.com/vitejs/vite/blob/v6.2.4/packages/vite/CHANGELOG.md).

Changelog (sourced from [vite's changelog](https://github.com/vitejs/vite/blob/v6.2.4/packages/vite/CHANGELOG.md)):

**6.2.4 (2025-03-31)**

- fix: fs check in transform middleware ([#19761](https://redirect.github.com/vitejs/vite/issues/19761)) ([7a4faba](https://github.com/vitejs/vite/commit/7a4fabab6a3aa24c89144e15a13d78f92b52e588))

Commits: the release commit [`037f801`](https://github.com/vitejs/vite/commit/037f801075ec35bb6e52145d659f71a23813c48f) (release: v6.2.4), plus the fix listed in the changelog. See the full diff in the [compare view](https://github.com/vitejs/vite/commits/v6.2.4/packages/vite).
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Fixes the following errors:

```
[ONNXRuntimeError] : 1 : FAIL : WebGPU validation failed. Error while parsing WGSL: :48:1 error: unexpected token
}
^
 - While validating [ShaderModuleDescriptor]
 - While calling [Device].CreateShaderModule([ShaderModuleDescriptor]).
```

```
[E:onnxruntime:sam, sequential_executor.cc:572 onnxruntime::ExecuteKernel] Non-zero status code returned while running Split node. Name:'/Split_1' Status Message: WebGPU validation failed. Error while parsing WGSL: :62:14 error: cannot index type 'u32'
index -= uniforms.sizes_in_split_axis[output_number - 1u];
```
…9c32e445af4250f098 to f3d90afe522476c858909e0de2be0b12bc890068 (microsoft#24249) Bumps [microsoft/onnxruntime-github-actions](https://github.com/microsoft/onnxruntime-github-actions) from 35f8bd42417991aa46577e9c32e445af4250f098 to f3d90afe522476c858909e0de2be0b12bc890068.

Commits:

- [`f3d90af`](https://github.com/microsoft/onnxruntime-github-actions/commit/f3d90afe522476c858909e0de2be0b12bc890068) update
- [`fe4bffd`](https://github.com/microsoft/onnxruntime-github-actions/commit/fe4bffdebbaf16477883ba661ecfeeeb5703c85a) update
- [`2cf46f4`](https://github.com/microsoft/onnxruntime-github-actions/commit/2cf46f409099e5a27977bbece1c89f3d6dca6a1b) update
- [`bb6b16e`](https://github.com/microsoft/onnxruntime-github-actions/commit/bb6b16e409684ffd0f46b35e8e217ce6ed72097c) update
- [`0c8c2ab`](https://github.com/microsoft/onnxruntime-github-actions/commit/0c8c2ab4b6ca3be3c8287a0e0549038ecca38d7d) update
- [`f861fd3`](https://github.com/microsoft/onnxruntime-github-actions/commit/f861fd3c0d13dedcf2fae39ad7023acaad97532d) update
- See full diff in [compare view](https://github.com/microsoft/onnxruntime-github-actions/compare/35f8bd42417991aa46577e9c32e445af4250f098...f3d90afe522476c858909e0de2be0b12bc890068)
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
### Description Update xcode and iphoneSimulatorVersion after MacOS-14 ### Motivation and Context iOS packaging pipeline and Github Action were still using the old xcode version after microsoft#23293
microsoft#24258) …nance ### Description Exclude the onnxruntime-inference-examples directory from Component Governance ### Motivation and Context onnxruntime-inference-examples is an external repo.
### Description <!-- Describe your changes. --> Include mp11, as it is used by provider-related headers. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> VitisAI failed to build with the latest g++ version.
### Description The ADO Web CI is migrated to Github Actions now. This PR makes the corresponding changes to the `npm run pull:wasm` command to use the new Github Action. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->
… thread (microsoft#24192) Running a CUDA kernel on the wrong GPU device fails with the CUDA error `invalid resource handle`. Both the CUDA EP and the TRT EP have this issue when ExecutionMode::ORT_PARALLEL is enabled. Repro code:

````python
import threading

import onnxruntime as ort

provider = [
    [
        ('TensorrtExecutionProvider', {
            'device_id': 0,
        }),
    ],
    [
        ('TensorrtExecutionProvider', {
            'device_id': 1,
        }),
    ]
]

class ThreadObj():
    def __init__(self, model_path: str, iterations: int, idx: int):
        ...
        sess_opt = ort.SessionOptions()
        sess_opt.execution_mode = ort.ExecutionMode.ORT_PARALLEL
        # Alternate sessions between device 0 and device 1
        self.inference_session = ort.InferenceSession(model_path, sess_opt, provider[idx % 2])

    def warmup(self):
        self.inference_session.run(None, self.input)

    def run(self, thread_times, threads_complete):
        for iter in range(self.iterations):
            self.inference_session.run(None, self.input)

def thread_target(obj, thread_times, threads_complete):
    obj.run(thread_times, threads_complete)

...

iterations = 500
num_threads = 13
t_obj_list = []
thread_list = []

for tidx in range(num_threads):
    obj = ThreadObj(model_path, iterations, tidx)
    t_obj_list.append(obj)
    obj.warmup()

for t_obj in t_obj_list:
    thread = threading.Thread(target=thread_target, daemon=True, args=(t_obj,thread_times,threads_complete,))
    thread.start()
    thread_list.append(thread)

...
````

The cause: an inference session can be bound to a device > 0 when it is initialized, but at inference time RunSince can be invoked by a new thread, and new threads default to device 0, so kernels are launched on the wrong GPU device. This PR provides a general fix for both the CUDA EP and the TRT EP by calling cudaSetDevice in RunSince.
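The failure mode described above comes from the current CUDA device being per-thread state. The following is a minimal pure-Python model of that behavior, not ONNX Runtime code and with no CUDA calls: `get_device`, `set_device`, `FakeSession`, and `run_in_new_thread` are hypothetical names, and a `threading.local` object stands in for the per-thread current device.

```python
import threading

# Stand-in for CUDA's per-thread "current device" (an assumption for illustration).
_thread_state = threading.local()


def get_device():
    # A thread that never selected a device sees device 0.
    return getattr(_thread_state, "device", 0)


def set_device(device_id):
    # Models selecting a device for the calling thread only.
    _thread_state.device = device_id


class FakeSession:
    """Hypothetical session bound to one device at construction time."""

    def __init__(self, device_id):
        self.device_id = device_id

    def run(self, fix_enabled):
        # With the fix, the running thread re-selects the session's device first.
        if fix_enabled:
            set_device(self.device_id)
        # True means the "kernel" would launch on the device the session expects.
        return get_device() == self.device_id


def run_in_new_thread(session, fix_enabled):
    result = []
    worker = threading.Thread(target=lambda: result.append(session.run(fix_enabled)))
    worker.start()
    worker.join()
    return result[0]


session = FakeSession(device_id=1)
print(run_in_new_thread(session, fix_enabled=False))
print(run_in_new_thread(session, fix_enabled=True))
```

In this model, a session bound to device 1 sees a mismatch when run from a fresh thread unless the device is set again inside `run`, which mirrors the reasoning behind setting the device in RunSince.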
Ceil mode is required for rtdetr model. The actual ceil mode calculation is already implemented in the PoolAttributes::ComputeOutputSize() method from pool_attributes.h under CPU EP.
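To illustrate what ceil mode changes, here is a minimal Python sketch of a pooling output-size calculation along one spatial axis. It follows the output-shape formula from the ONNX pooling operator specification rather than the actual C++ in pool_attributes.h. The function name `pool_output_size` and its parameters are hypothetical, and the final adjustment (dropping a window that would start beyond the input plus head padding) is an assumption modeled on common framework behavior.

```python
def pool_output_size(in_size, kernel, stride=1, pad_head=0, pad_tail=0,
                     dilation=1, ceil_mode=False):
    """Output length of a pooling op along one spatial axis (illustrative only)."""
    # Extent covered by one dilated kernel window.
    effective_kernel = dilation * (kernel - 1) + 1
    numerator = in_size + pad_head + pad_tail - effective_kernel
    if ceil_mode:
        # Integer ceiling division, avoiding floating point.
        out = -(-numerator // stride) + 1
        # Assumed adjustment: the last window must start inside the input
        # or the head padding, otherwise it is dropped.
        if (out - 1) * stride >= in_size + pad_head:
            out -= 1
    else:
        out = numerator // stride + 1
    return out


# A length-5 input with a 2-wide kernel and stride 2 leaves one trailing element.
print(pool_output_size(5, kernel=2, stride=2))                  # 2
print(pool_output_size(5, kernel=2, stride=2, ceil_mode=True))  # 3
```

Floor mode discards the trailing element, while ceil mode adds a partial window for it, so the two modes disagree whenever the stride does not evenly divide the padded input.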
### Description Pin vcpkg version. Yesterday vcpkg-tool made a new release that broke all our Linux pipelines. --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
### Description TensorRT V3 plugins cannot be loaded in the TensorRT EP. This change replaces `getPluginCreatorList` with `getAllCreators` so that both V1 and V3 plugin creators are loaded. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Support loading TensorRT V3 plugins. Reference: https://github.com/NVIDIA/TensorRT/blob/8c6d69ddec0b2feff12f55472dc5d55cb6861d53/python/src/infer/pyPlugin.cpp#L2971C1-L2995C6
### Description <!-- Describe your changes. --> Expose TRT preview features as EP option. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Add support to turn on TensorRT preview features. For example, > If the IPluginV3OneBuildV2 build capability is used, the plugin can also communicate to TensorRT that certain input-output pairs are aliased (share the same data buffer). TensorRT will query IPluginV3OneBuildV2::getAliasedInput to determine any such aliasing behavior. To use this feature, **PreviewFeature::kALIASED_PLUGIN_IO_10_03** must be enabled. --------- Co-authored-by: Vcpkg Builder <builder@vcpkg>
This uses dummy override shapes to bypass the 'components' check.
…asFeature (microsoft#24281) ### Description This PR is one of a series of changes that optimize Dawn API usage. Currently, the WebGPU EP has some suboptimal code paths that result in unnecessary Dawn API calls. Reducing the number of calls to those APIs will help improve the performance of the WebGPU EP, especially on WebAssembly. This PR optimizes the usage of `wgpuDeviceHasFeature`.
…/nextjs-default (microsoft#24283) Bumps [next](https://github.com/vercel/next.js) from 15.2.3 to 15.2.4. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/vercel/next.js/releases">next's releases</a>.</em></p> <blockquote> <h2>v15.2.4</h2> <blockquote> <p>[!NOTE]<br /> This release is backporting bug fixes. It does <strong>not</strong> include all pending features/changes on canary.</p> </blockquote> <h3>Core Changes</h3> <ul> <li>Match subrequest handling for edge and node (<a href="https://redirect.github.com/vercel/next.js/issues/77474">#77474</a>)</li> <li>exclude images and static media from dev origin check (<a href="https://redirect.github.com/vercel/next.js/issues/77417">#77417</a>)</li> <li>ensure /__next middleware URLs are included in the origin check (<a href="https://redirect.github.com/vercel/next.js/issues/77416">#77416</a>)</li> <li>remove direct ip/port bypass in dev origin check (<a href="https://redirect.github.com/vercel/next.js/issues/77414">#77414</a>)</li> <li>switch development origin verification to be opt-in rather than opt-out (<a href="https://redirect.github.com/vercel/next.js/issues/77395">#77395</a>)</li> </ul> <h3>Credits</h3> <p>Huge thanks to <a href="https://github.com/ijjk"><code>@ijjk</code></a> and <a href="https://github.com/ztanner"><code>@ztanner</code></a> for helping!</p> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/vercel/next.js/commit/804aa35c71cc65cf3ddc29cdadcd29f06b368285"><code>804aa35</code></a> v15.2.4</li> <li><a href="https://github.com/vercel/next.js/commit/ecb72ee9ead86aaa1e3992b427bfb43b046aa08d"><code>ecb72ee</code></a> Match subrequest handling for edge and node (<a href="https://redirect.github.com/vercel/next.js/issues/77474">#77474</a>)</li> <li><a href="https://github.com/vercel/next.js/commit/25f810b596cdb6875d1f068ae8d203f1a5df7a46"><code>25f810b</code></a> exclude images and static media from dev origin 
check (<a href="https://redirect.github.com/vercel/next.js/issues/77417">#77417</a>)</li> <li><a href="https://github.com/vercel/next.js/commit/d9bcb833dd2a8dd5c13f30775d688f7015cd75b1"><code>d9bcb83</code></a> ensure /__next middleware URLs are included in the origin check (<a href="https://redirect.github.com/vercel/next.js/issues/77416">#77416</a>)</li> <li><a href="https://github.com/vercel/next.js/commit/cfeaa86fa718f1fecce9fb5f5fad3c310117fc53"><code>cfeaa86</code></a> remove direct ip/port bypass in dev origin check (<a href="https://redirect.github.com/vercel/next.js/issues/77414">#77414</a>)</li> <li><a href="https://github.com/vercel/next.js/commit/f84730266087817b39c9b87c42ccf1c3bb7de0c5"><code>f847302</code></a> switch development origin verification to be opt-in rather than opt-out (<a href="https://redirect.github.com/vercel/next.js/issues/77395">#77395</a>)</li> <li>See full diff in <a href="https://github.com/vercel/next.js/compare/v15.2.3...v15.2.4">compare view</a></li> </ul> </details> <br /> [](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
[//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/microsoft/onnxruntime/network/alerts). </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…t#24278) Bumps [image-size](https://github.com/image-size/image-size) from 1.1.1 to 1.2.1. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/image-size/image-size/releases">image-size's releases</a>.</em></p> <blockquote> <h2>v1.2.1</h2> <h2>Fixes</h2> <ul> <li>fix potential Denial of Service via specially crafted payloads in <a href="https://github.com/image-size/image-size/commit/640a67d9e821baee4cb596def8db00627f649dfc">https://github.com/image-size/image-size/commit/640a67d9e821baee4cb596def8db00627f649dfc</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/image-size/image-size/compare/v1.2.0...v1.2.1">https://github.com/image-size/image-size/compare/v1.2.0...v1.2.1</a></p> <h2>v1.2.0</h2> <p>This release adds support for JPEG-XL ( <a href="https://redirect.github.com/image-size/image-size/issues/409">#409</a> )</p> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/image-size/image-size/commit/a4178fbb334ddb22d94cb4228ed597c24fd02e10"><code>a4178fb</code></a> 1.2.1</li> <li><a href="https://github.com/image-size/image-size/commit/640a67d9e821baee4cb596def8db00627f649dfc"><code>640a67d</code></a> fix potential Denial of Service via specially crafted payloads</li> <li><a href="https://github.com/image-size/image-size/commit/9d41448d7843405d1ff2c59352ec17a9bca3f358"><code>9d41448</code></a> 1.2.0</li> <li><a href="https://github.com/image-size/image-size/commit/405a244dae9d8576528869b89229cae539f7e901"><code>405a244</code></a> fixups</li> <li><a href="https://github.com/image-size/image-size/commit/76c5c9a8aa9b38e8c703136e5a4f8c5cadc74dff"><code>76c5c9a</code></a> mention jpeg-xl in the readme</li> <li><a href="https://github.com/image-size/image-size/commit/a10262c7c32e40ac269e3434afa07895c11a1274"><code>a10262c</code></a> Add support for JPEG XL (<a href="https://redirect.github.com/image-size/image-size/issues/409">#409</a>)</li> <li><a 
href="https://github.com/image-size/image-size/commit/a7a24a3fc4ce750cec253618d33967b3b9d331d7"><code>a7a24a3</code></a> (app): Fix typo in comments (<a href="https://redirect.github.com/image-size/image-size/issues/411">#411</a>)</li> <li><a href="https://github.com/image-size/image-size/commit/9f482134b358dd83f58501ccc3b18df2305c9793"><code>9f48213</code></a> update dependencies, and reformat code with eslint 9</li> <li><a href="https://github.com/image-size/image-size/commit/64dda84cca1551e219a47b1ab1e3c51adc8db0e4"><code>64dda84</code></a> refactor formats that use a ISO-BMFF container</li> <li><a href="https://github.com/image-size/image-size/commit/e3ea53801dc3ca9d7548c063bfc39c2d8e159419"><code>e3ea538</code></a> no need to create hex strings in j2c</li> <li>Additional commits viewable in <a href="https://github.com/image-size/image-size/compare/v1.1.1...v1.2.1">compare view</a></li> </ul> </details> <br /> [](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/microsoft/onnxruntime/network/alerts). </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ft#24180) ### Description - Add support for the Softmax operator with opset < 13 to the op builder in QNN-EP. ### Motivation and Context - Enhance QNN-EP support for Softmax with opset < 13.
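For context on why opset < 13 needs separate handling, here is a minimal pure-Python sketch of the semantic difference in the ONNX Softmax definition. It does not show how the QNN-EP op builder implements the operator. Before opset 13, the input is coerced to 2D at `axis` and softmax runs over each flattened row; from opset 13, softmax runs along `axis` only. The helper names and the flat-list-plus-shape tensor representation are assumptions made for illustration.

```python
import math


def _softmax(values):
    # Numerically stable softmax over a flat list.
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_legacy(data, shape, axis=1):
    """Opset < 13: view data as [prod(shape[:axis]), prod(shape[axis:])], softmax per row."""
    if axis < 0:
        axis += len(shape)
    cols = math.prod(shape[axis:])
    out = []
    for start in range(0, len(data), cols):
        out.extend(_softmax(data[start:start + cols]))
    return out


def softmax_opset13(data, shape, axis=-1):
    """Opset >= 13: softmax along the single dimension `axis`."""
    if axis < 0:
        axis += len(shape)
    outer = math.prod(shape[:axis])
    dim = shape[axis]
    inner = math.prod(shape[axis + 1:])
    out = [0.0] * len(data)
    for o in range(outer):
        for i in range(inner):
            # Row-major flat indices of the elements that vary along `axis`.
            idx = [o * dim * inner + d * inner + i for d in range(dim)]
            for j, p in zip(idx, _softmax([data[k] for k in idx])):
                out[j] = p
    return out


data = [float(v) for v in range(8)]
shape = [2, 2, 2]
print(sum(softmax_legacy(data, shape, axis=1)))   # two normalized rows of 4 elements
print(sum(softmax_opset13(data, shape, axis=1)))  # four normalized groups of 2 elements
```

The two definitions coincide when `axis` is the last dimension, which is one reason a backend that only supports last-axis softmax can handle that case without extra work.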
### Description Since we no longer support CUDA 11, update `publish-nuget.yml` to use the correct feed. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->
…crosoft#23908) ### Description This commit improves MatMulNBits f16 Block32 prefill performance by increasing the tiling size and enhancing memory efficiency. It achieves a +2x performance boost on Intel iGPUs for the Phi-3.5-mini f16 model. ### Motivation and Context See above.
ankitm3k
approved these changes
Apr 4, 2025