feat(core): Introduce kBool data type for Qnn ElewiseEqual Op #618
chenghuaWang merged 4 commits into UbiquitousLearning:main from
Conversation
… for improved model accuracy in AOT run.
…sor shapes and values before exiting. This aids in troubleshooting during AOT runs.
…ons for improved type handling. Modify tensor output type in EqualOp to use kBool for better logical operations.
📝 Walkthrough
This PR introduces comprehensive support for a new boolean data type (kBool) throughout the mllm framework. Changes include defining the kBool type in the core DataTypes system, implementing QNN backend support for boolean mapping, updating the equality comparison operator to output boolean results, and adjusting related Qwen3 model configurations.
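For readers skimming the diff, a minimal sketch of what the QNN boolean mapping could look like; the Qnn_DataType_t values come from the QNN SDK's QnnTypes.h, but the mllm-side enum and function name below are illustrative assumptions, not the PR's actual code:

```cpp
// Hedged sketch of a core-dtype -> QNN dtype mapping with the new kBool entry.
// Qnn_DataType_t and the QNN_DATATYPE_* values come from the QNN SDK's
// QnnTypes.h; the enum and function below are illustrative, not mllm's code.
#include <QnnTypes.h>

enum class DataTypes { kFp32, kInt32, kBool /* newly introduced */ };

Qnn_DataType_t toQnnDataType(DataTypes t) {
  switch (t) {
    case DataTypes::kFp32: return QNN_DATATYPE_FLOAT_32;
    case DataTypes::kInt32: return QNN_DATATYPE_INT_32;
    case DataTypes::kBool: return QNN_DATATYPE_BOOL_8;  // QNN stores bools as 8-bit
  }
  return QNN_DATATYPE_UNDEFINED;
}
```

QNN represents booleans as 8-bit values (hence QNN_DATATYPE_BOOL_8), so a dedicated kBool entry keeps the EqualOp output type explicit rather than overloading an integer type.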
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@mllm/backends/qnn/aot_rt/PromptProcessor.cpp`:
- Around lines 151-153: the unconditional exit(0) inside the prefill path aborts generation. Remove the exit(0) and guard the debug prints (the two print(...) calls referencing output_tensors_) behind a runtime debug/log flag, or route them through the existing logging facility, so they no longer terminate execution. For example, wrap the calls in if (debug_enabled) { print(...); } or replace them with a debug log call, and ensure prefill continues normally so logits/sampling are not skipped.
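A minimal sketch of that fix, assuming an environment-variable flag (MLLM_DEBUG is a hypothetical name) and a stand-in tensor type; the real change would route through mllm's own logging facility:

```cpp
// Minimal sketch of the guarded-debug-print fix, assuming an env-var flag
// (MLLM_DEBUG is a hypothetical name) and a stand-in tensor type; the real
// change would use mllm's logging facility instead of printf.
#include <cstdio>
#include <cstdlib>
#include <vector>

static bool debug_enabled() {
  static const bool on = std::getenv("MLLM_DEBUG") != nullptr;
  return on;
}

struct TensorStub { int rows, cols; };  // placeholder for mllm's Tensor

void dump_prefill_outputs(const std::vector<TensorStub>& output_tensors) {
  if (!debug_enabled()) return;  // guarded: never terminates the prefill path
  for (const auto& t : output_tensors) {
    std::printf("output tensor shape: [%d, %d]\n", t.rows, t.cols);
  }
  // Note: no exit(0) here, so control returns and logits/sampling still run.
}
```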
In `@pymllm/mobile/README.md`:
- Line 1: the README sentence is awkward and is missing a hyphen. Reword it to be tighter and hyphenate "torch-based": e.g., change "We should refactor current pymllm's src to mobile directory. And provide more functionalities for torch based VLA." to a single clear sentence such as "Refactor pymllm's src into the mobile directory and provide additional functionality for torch-based VLA." Update README.md to use this tightened, hyphenated phrasing.
@@ -0,0 +1 @@
+We should refactor current pymllm's src to mobile directory. And provide more functionalities for torch based VLA.
Polish wording and hyphenation.
Consider tightening the sentence and hyphenating “torch-based.”
✏️ Suggested edit
-We should refactor current pymllm's src to mobile directory. And provide more functionalities for torch based VLA.
+We should refactor the current pymllm src into the mobile directory and provide more functionality for torch-based VLA.
🧰 Tools
🪛 LanguageTool
[grammar] ~1-~1: Use a hyphen to join words.
Context: ...d provide more functionalities for torch based VLA.
(QB_NEW_EN_HYPHEN)
…by removing unnecessary slicing, improving efficiency in parameter preparation.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
examples/qwen3_qnn_aot/modeling_qwen_qnn_aot_sha.hpp (1)
763-804: ⚠️ Potential issue | 🟠 Major
Handle both shared and concatenated q_norm/k_norm weight layouts to avoid shape mismatches.
Lines 767 and 803 now copy the full orig_weight into every per-head RMSNorm without validating its size. If the checkpoint still stores concatenated weights (num_heads × head_dim or num_kv_heads × head_dim), each per-head RMSNorm will receive an oversized vector and likely fail or normalize incorrectly. Guard the layout and slice only when needed.
🧩 Suggested fix (guard + conditional slicing)
```diff
@@
-  for (int h = 0; h < num_heads; ++h) {
+  const int q_weight_numel = orig_weight.numel();
+  MLLM_RT_ASSERT(q_weight_numel == head_dim || q_weight_numel == num_heads * head_dim);
+  for (int h = 0; h < num_heads; ++h) {
     std::string h_str = std::to_string(h);
     // Weight: slice per head
     std::string new_weight_name = layer_prefix + "q_norm." + h_str + ".weight";  // NOLINT
-    params->push(new_weight_name, orig_weight.contiguous().setMemType(kParamsNormal).setName(new_weight_name));
+    Tensor weight_h = orig_weight;
+    if (q_weight_numel != head_dim) {
+      int start_idx = h * head_dim;
+      int end_idx = (h + 1) * head_dim;
+      weight_h = orig_weight.slice({{start_idx, end_idx}}, false);
+    }
+    params->push(new_weight_name, weight_h.contiguous().setMemType(kParamsNormal).setName(new_weight_name));
@@
-  for (int h = 0; h < num_kv_heads; ++h) {
+  const int k_weight_numel = orig_weight.numel();
+  MLLM_RT_ASSERT(k_weight_numel == head_dim || k_weight_numel == num_kv_heads * head_dim);
+  for (int h = 0; h < num_kv_heads; ++h) {
     std::string h_str = std::to_string(h);
     // Weight: slice per head
     std::string new_weight_name = layer_prefix + "k_norm." + h_str + ".weight";  // NOLINT
-    params->push(new_weight_name, orig_weight.contiguous().setMemType(kParamsNormal).setName(new_weight_name));
+    Tensor weight_h = orig_weight;
+    if (k_weight_numel != head_dim) {
+      int start_idx = h * head_dim;
+      int end_idx = (h + 1) * head_dim;
+      weight_h = orig_weight.slice({{start_idx, end_idx}}, false);
+    }
+    params->push(new_weight_name, weight_h.contiguous().setMemType(kParamsNormal).setName(new_weight_name));
```
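To make the layout guard concrete outside the mllm codebase, here is a self-contained sketch (all names hypothetical) of how the flattened weight length distinguishes a shared weight from a concatenated one and yields the per-head slice range:

```cpp
// Self-contained sketch (all names hypothetical) of the layout guard above:
// the flattened weight length tells us whether every head shares one
// head_dim-sized vector or owns a slice of a concatenated layout.
#include <cassert>
#include <cstdio>
#include <utility>

std::pair<int, int> per_head_range(int weight_numel, int head_dim, int h) {
  assert(weight_numel == head_dim || weight_numel % head_dim == 0);
  if (weight_numel == head_dim) return {0, head_dim};  // shared: reuse whole vector
  return {h * head_dim, (h + 1) * head_dim};           // concatenated: slice head h
}

int main() {
  auto shared = per_head_range(128, 128, 3);      // shared layout, any head -> [0, 128)
  auto sliced = per_head_range(8 * 128, 128, 3);  // 8 heads x 128, head 3 -> [384, 512)
  std::printf("shared: [%d, %d)  sliced: [%d, %d)\n",
              shared.first, shared.second, sliced.first, sliced.second);
  return 0;
}
```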
Summary by CodeRabbit
New Features
- Added a kBool boolean data type to the core DataTypes system, with a QNN backend mapping for boolean tensors.
Improvements
- The equality comparison operator (EqualOp) now outputs kBool results; related Qwen3 QNN AOT configurations were adjusted.
Documentation
- Added a README outlining the planned pymllm mobile refactor.