Add index check for embedding kernel #11375
Merged
facebook-github-bot merged 1 commit into main on Jun 5, 2025
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/11375
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 8dda88e with merge base 3550824.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D75982682
Gasoonjia
approved these changes
Jun 4, 2025
@larryliu0820 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Summary:
Every index must be smaller than `weight.size(0)`. This adds that check in `op_embedding`.
It avoids the following wild-addr-read error:
```
AddressSanitizer:DEADLYSIGNAL
=================================================================
==3544359==ERROR: AddressSanitizer: SEGV on unknown address 0x7fce2364bc00 (pc 0x000002d225a0 bp 0x7ffffc792a40 sp 0x7ffffc792990 T0)
==3544359==The signal is caused by a READ memory access.
SCARINESS: 20 (wild-addr-read)
#0 0x2d225a0 in void torch::executor::native::(anonymous namespace)::embedding_byte_per_channel<signed char, c10::Half, float>(executorch::runtime::etensor::Tensor const&, executorch::runtime::etensor::Tensor const&, std::optional<executorch::runtime::etensor::Tensor> const&, executorch::runtime::etensor::Tensor const&, executorch::runtime::etensor::Tensor&) xplat/executorch/kernels/quantized/cpu/op_embedding.cpp:175
#1 0x2d22367 in torch::executor::native::quantized_embedding_byte_dtype_out(executorch::runtime::etensor::Tensor const&, executorch::runtime::etensor::Tensor const&, std::optional<executorch::runtime::etensor::Tensor> const&, long, long, executorch::runtime::etensor::Tensor const&, std::optional<executorch::runtime::etensor::ScalarType>, executorch::runtime::etensor::Tensor&)::$_0::operator()() const::'lambda0'()::operator()() const::'lambda'()::operator()() const::'lambda0'()::operator()() const::'lambda'()::operator()() const::'lambda'()::operator()() const xplat/executorch/kernels/quantized/cpu/op_embedding.cpp:303
#2 0x2d2223d in torch::executor::native::quantized_embedding_byte_dtype_out(executorch::runtime::etensor::Tensor const&, executorch::runtime::etensor::Tensor const&, std::optional<executorch::runtime::etensor::Tensor> const&, long, long, executorch::runtime::etensor::Tensor const&, std::optional<executorch::runtime::etensor::ScalarType>, executorch::runtime::etensor::Tensor&)::$_0::operator()() const::'lambda0'()::operator()() const::'lambda'()::operator()() const::'lambda0'()::operator()() const::'lambda'()::operator()() const xplat/executorch/kernels/quantized/cpu/op_embedding.cpp:303
#3 0x2d21d37 in torch::executor::native::quantized_embedding_byte_dtype_out(executorch::runtime::etensor::Tensor const&, executorch::runtime::etensor::Tensor const&, std::optional<executorch::runtime::etensor::Tensor> const&, long, long, executorch::runtime::etensor::Tensor const&, std::optional<executorch::runtime::etensor::ScalarType>, executorch::runtime::etensor::Tensor&)::$_0::operator()() const::'lambda0'()::operator()() const::'lambda'()::operator()() const::'lambda0'()::operator()() const xplat/executorch/kernels/quantized/cpu/op_embedding.cpp:303
#4 0x2d21bca in torch::executor::native::quantized_embedding_byte_dtype_out(executorch::runtime::etensor::Tensor const&, executorch::runtime::etensor::Tensor const&, std::optional<executorch::runtime::etensor::Tensor> const&, long, long, executorch::runtime::etensor::Tensor const&, std::optional<executorch::runtime::etensor::ScalarType>, executorch::runtime::etensor::Tensor&)::$_0::operator()() const::'lambda0'()::operator()() const::'lambda'()::operator()() const xplat/executorch/kernels/quantized/cpu/op_embedding.cpp:303
#5 0x2d20f8f in torch::executor::native::quantized_embedding_byte_dtype_out(executorch::runtime::etensor::Tensor const&, executorch::runtime::etensor::Tensor const&, std::optional<executorch::runtime::etensor::Tensor> const&, long, long, executorch::runtime::etensor::Tensor const&, std::optional<executorch::runtime::etensor::ScalarType>, executorch::runtime::etensor::Tensor&)::$_0::operator()() const::'lambda0'()::operator()() const xplat/executorch/kernels/quantized/cpu/op_embedding.cpp:303
#6 0x2d20e13 in torch::executor::native::quantized_embedding_byte_dtype_out(executorch::runtime::etensor::Tensor const&, executorch::runtime::etensor::Tensor const&, std::optional<executorch::runtime::etensor::Tensor> const&, long, long, executorch::runtime::etensor::Tensor const&, std::optional<executorch::runtime::etensor::ScalarType>, executorch::runtime::etensor::Tensor&)::$_0::operator()() const xplat/executorch/kernels/quantized/cpu/op_embedding.cpp:303
#7 0x2d20d06 in torch::executor::native::quantized_embedding_byte_dtype_out(executorch::runtime::etensor::Tensor const&, executorch::runtime::etensor::Tensor const&, std::optional<executorch::runtime::etensor::Tensor> const&, long, long, executorch::runtime::etensor::Tensor const&, std::optional<executorch::runtime::etensor::ScalarType>, executorch::runtime::etensor::Tensor&) xplat/executorch/kernels/quantized/cpu/op_embedding.cpp:303
#8 0x2d226b7 in torch::executor::native::quantized_embedding_byte_dtype_out(executorch::runtime::KernelRuntimeContext&, executorch::runtime::etensor::Tensor const&, executorch::runtime::etensor::Tensor const&, std::optional<executorch::runtime::etensor::Tensor> const&, long, long, executorch::runtime::etensor::Tensor const&, std::optional<executorch::runtime::etensor::ScalarType>, executorch::runtime::etensor::Tensor&) xplat/executorch/kernels/quantized/cpu/op_embedding.cpp:329
#9 0x2d09bef in torch::executor::function::(anonymous namespace)::$_7::operator()(executorch::runtime::KernelRuntimeContext&, executorch::runtime::EValue**) const buck-out/v2/gen/fbsource/ff19a7e6cb17a7b1/xplat/executorch/kernels/quantized/__generated_lib_combined__/out/RegisterCodegenUnboxedKernelsEverything.cpp:322
#10 0x2d09a70 in torch::executor::function::(anonymous namespace)::$_7::__invoke(executorch::runtime::KernelRuntimeContext&, executorch::runtime::EValue**) buck-out/v2/gen/fbsource/ff19a7e6cb17a7b1/xplat/executorch/kernels/quantized/__generated_lib_combined__/out/RegisterCodegenUnboxedKernelsEverything.cpp:297
#11 0x27d769b in executorch::runtime::Method::execute_instruction() xplat/executorch/runtime/executor/method.cpp:1306
#12 0x27d8c55 in executorch::runtime::Method::execute() xplat/executorch/runtime/executor/method.cpp:1550
#13 0x27b1e25 in executorch::extension::Module::execute(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, std::vector<executorch::runtime::EValue, std::allocator<executorch::runtime::EValue>> const&) xplat/executorch/extension/module/module.cpp:261
#14 0x27afe43 in executorch::extension::Module::forward(std::vector<executorch::runtime::EValue, std::allocator<executorch::runtime::EValue>> const&) xplat/executorch/extension/module/module.h:340
#15 0x27e0519 in executorch::extension::llm::LlmBackboneRunner::run(std::shared_ptr<executorch::runtime::etensor::Tensor> const&, std::shared_ptr<executorch::runtime::etensor::Tensor> const&, std::shared_ptr<executorch::runtime::etensor::Tensor> const&) xplat/executorch/examples/models/fb/llama4/runner/llm_backbone_runner.cpp:58
#16 0x27a35c9 in executorch::extension::llm::Llama4Runner::prefill_tokens(std::shared_ptr<executorch::runtime::etensor::Tensor> const&, std::shared_ptr<executorch::runtime::etensor::Tensor> const&, std::shared_ptr<executorch::runtime::etensor::Tensor> const&) xplat/executorch/examples/models/fb/llama4/runner/llama4_runner.cpp:133
#17 0x885774 in main (/data/users/larryliu/fbsource/buck-out/v2/gen/fbsource/ff19a7e6cb17a7b1/xplat/cria/benchmark/llama4/__generation_main__/generation_main+0x885774)
#18 0x7fce2122c656 in __libc_start_call_main /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#19 0x7fce2122c717 in __libc_start_main@GLIBC_2.2.5 /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../csu/libc-start.c:409:3
#20 0x884c20 in _start /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/x86_64/start.S:116
AddressSanitizer can not provide additional info.
AddressSanitizer: SEGV xplat/executorch/kernels/quantized/cpu/op_embedding.cpp:175 in void torch::executor::native::(anonymous namespace)::embedding_byte_per_channel<signed char, c10::Half, float>(executorch::runtime::etensor::Tensor const&, executorch::runtime::etensor::Tensor const&, std::optional<executorch::runtime::etensor::Tensor> const&, executorch::runtime::etensor::Tensor const&, executorch::runtime::etensor::Tensor&)
==3544359==ABORTING
```
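For illustration only, the kind of bounds check described in the summary can be sketched as below. This is a minimal standalone sketch, not the code from this diff: `indices_in_bounds` and `embedding_lookup` are hypothetical helpers, the table is a plain row-major `std::vector<float>` rather than a quantized ExecuTorch tensor, and rejecting negative indices is an added assumption that the summary does not mention.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the ExecuTorch API): true only if every index
// addresses a valid row of a table with `num_rows` rows.
// Assumption: negative indices are rejected as well as idx >= num_rows.
bool indices_in_bounds(const std::vector<int64_t>& indices, int64_t num_rows) {
  for (int64_t idx : indices) {
    if (idx < 0 || idx >= num_rows) {
      return false;
    }
  }
  return true;
}

// Hypothetical row gather over a row-major table of num_rows x dim floats.
// Returns false and leaves `out` untouched if the table shape is
// inconsistent or any index is out of range, so no row is ever read
// outside `weight`.
bool embedding_lookup(
    const std::vector<float>& weight,
    int64_t num_rows,
    int64_t dim,
    const std::vector<int64_t>& indices,
    std::vector<float>& out) {
  if (num_rows < 0 || dim < 0) {
    return false;
  }
  const std::size_t rows = static_cast<std::size_t>(num_rows);
  const std::size_t d = static_cast<std::size_t>(dim);
  if (weight.size() != rows * d) {
    return false;
  }
  // Validate all indices before touching memory.
  if (!indices_in_bounds(indices, num_rows)) {
    return false;
  }
  out.resize(indices.size() * d);
  for (std::size_t i = 0; i < indices.size(); ++i) {
    // Debug-only invariant; the runtime check above already guarantees it.
    assert(indices[i] >= 0 && indices[i] < num_rows);
    const float* src = weight.data() + static_cast<std::size_t>(indices[i]) * d;
    std::copy(src, src + d, out.begin() + static_cast<std::ptrdiff_t>(i * d));
  }
  return true;
}
```

With a 3 x 2 table, a call such as `embedding_lookup(w, 3, 2, {3}, out)` returns `false` instead of reading past the end of the table, which is the class of wild read that the log above reports.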
Test Plan:
Imported from GitHub, without a `Test Plan:` line.
Rollback Plan:
Reviewed By: Gasoonjia
Differential Revision: D75982682
Pulled By: larryliu0820
c061bd5 to 8dda88e