feat(Qnn AOT): AOT and AOT Runtime. Qwen3 AOT Mode. #567

chenghuaWang merged 11 commits into UbiquitousLearning:main
Conversation
- Implemented CPUEqualOp and CPUWhereOp classes for element-wise comparison and conditional selection.
- Added corresponding factory classes for both operations.
- Integrated Equal and Where operations into the CPUBackend.
- Updated Tensor class to include methods for equal comparisons with both Tensor and float inputs.
- Enhanced the aops layer with EqualOp and WhereOp definitions, including forward and reshape methods.
- Registered new operations in the IR and updated RTTI generation scripts.
- Modified Functional API to expose where operation for user convenience.
- Adjusted build configuration to use RelWithDebInfo for better debugging support.
…ementations
- Introduced new data types for per-tensor and per-channel quantization, including int8 and int16 types.
- Updated the LinearOp class to handle new linear implementation types: kQNN_LPBQ_w4a16o16_G32 and kQNN_LPBQ_w4a16o16_G64.
- Enhanced the DataTypes utility functions to accommodate the new quantization types.
- Modified Tensor operations to support addition and multiplication with int16 per-tensor symmetric types.
- Updated the Module class to handle vector arguments for tensors and AnyValue types.
Walkthrough

Adds Qwen3 QNN AOT example and model implementation, introduces QNN AOT backend infrastructure (parsing, env/wrappers, AOT passes, pipeline, visitors), registers CPU Equal/Where ops with CPU factories and Tensor::equal API, extends quantization data types and LinearImpl variants, and updates build/config files.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant CLI as compile.cpp
    participant QwenModel as Qwen3ForCausalLM
    participant Trace as IR Trace
    participant QnnEnv as QnnAOTEnv
    participant PassMgr as PassManager
    participant Passes as MarkQnnGraphPass / OpNamingPass
    participant Output as MIR Writer
    User->>CLI: run with model/config/AOT-config
    CLI->>CLI: parse args, validate AOT config
    CLI->>QwenModel: load model, params, gen RoPE, setup KV-cache
    CLI->>Trace: call model.trace()
    Trace-->>CLI: IR graph
    CLI->>QnnEnv: init from target machine JSON
    CLI->>PassMgr: create with createQnnAOTLoweringPipeline(env, cfg)
    PassMgr->>Passes: run MarkQnnGraphPass (mark using_qnn)
    PassMgr->>Passes: run OpNamingPass (assign deterministic names)
    PassMgr-->>CLI: transformed IR
    CLI->>Output: write final MIR
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 17
🧹 Nitpick comments (24)
mllm/backends/cpu/kernels/common/ggml/quantize/quantize.hpp (1)
129-135: LGTM! Include guards improve safety.

The addition of `#ifndef` guards around the macro definitions prevents potential redefinition errors and is a good defensive programming practice. The `#endif` at line 135 correctly closes the `#else` block from line 106.

Optional: Consider consistency across all branches
The ARM_NEON (lines 76-77) and MSC_VER (lines 103-104) branches define the same macros without include guards. If macro redefinition is a concern for the `#else` branch, consider whether the other branches should also be guarded for consistency:

```cpp
// ARM_NEON branch
#ifndef MLLM_FP16_TO_FP32
#define MLLM_FP16_TO_FP32(x) ((float)(x))
#endif
#ifndef MLLM_FP32_TO_FP16
#define MLLM_FP32_TO_FP16(x) ((mllm_fp16_t)x)
#endif
// Similar for MSC_VER branch
```

Alternatively, if guards are only needed in the `#else` branch due to specific platform requirements, this is fine as-is.

mllm/nn/Module.hpp (1)
#elsebranch due to specific platform requirements, this is fine as-is.mllm/nn/Module.hpp (1)
155-160: Consider preserving lvalue vector arguments.The current implementation unconditionally moves from vector elements, which will leave lvalue arguments in a moved-from state. When a user passes a vector by lvalue reference (e.g.,
std::vector<Tensor> v; module(v, ...)), they might not expectv's elements to be consumed.Consider using conditional forwarding based on value category to copy from lvalues and move from rvalues, or document that this function consumes vector arguments.
🔎 Proposed refinement using conditional move/copy
One approach is to check if the argument is an rvalue reference and conditionally move:
```diff
 } else if constexpr (std::is_same_v<CleanType, std::vector<Tensor>>) {
   tensors.reserve(tensors.size() + args.size());
-  for (auto& tensor : args) { tensors.push_back(std::move(tensor)); }
+  if constexpr (std::is_rvalue_reference_v<decltype(args)>) {
+    for (auto& tensor : args) { tensors.push_back(std::move(tensor)); }
+  } else {
+    for (const auto& tensor : args) { tensors.push_back(tensor); }
+  }
 } else if constexpr (std::is_same_v<CleanType, std::vector<AnyValue>>) {
   others.reserve(others.size() + args.size());
-  for (auto& value : args) { others.push_back(std::move(value)); }
+  if constexpr (std::is_rvalue_reference_v<decltype(args)>) {
+    for (auto& value : args) { others.push_back(std::move(value)); }
+  } else {
+    for (const auto& value : args) { others.push_back(value); }
+  }
 }
```

Alternatively, document that the function consumes vector arguments and update the function documentation to clarify move semantics.
mllm/backends/qnn/aot/QnnTargetMachineParser.cpp (2)
14-67: Consider using maps for improved scalability.

The current if-chain approach works but could be refactored to use `std::unordered_map<std::string, EnumType>` for O(1) lookups instead of O(n). This would improve maintainability and scalability, especially for `parseChipset` with 18 variants.

🔎 Example refactor for parseHTPArch
```diff
 QcomHTPArch parseHTPArch(const std::string& arch_str) {
-  if (arch_str == "NONE") return QcomHTPArch::NONE;
-  if (arch_str == "V68") return QcomHTPArch::V68;
-  if (arch_str == "V69") return QcomHTPArch::V69;
-  if (arch_str == "V73") return QcomHTPArch::V73;
-  if (arch_str == "V75") return QcomHTPArch::V75;
-  if (arch_str == "V79") return QcomHTPArch::V79;
-  if (arch_str == "V81") return QcomHTPArch::V81;
-  throw std::invalid_argument("Unknown HTP architecture: " + arch_str);
+  static const std::unordered_map<std::string, QcomHTPArch> arch_map = {
+      {"NONE", QcomHTPArch::NONE}, {"V68", QcomHTPArch::V68}, {"V69", QcomHTPArch::V69},
+      {"V73", QcomHTPArch::V73}, {"V75", QcomHTPArch::V75}, {"V79", QcomHTPArch::V79},
+      {"V81", QcomHTPArch::V81}};
+  auto it = arch_map.find(arch_str);
+  if (it != arch_map.end()) return it->second;
+  throw std::invalid_argument("Unknown HTP architecture: " + arch_str);
 }
```
69-86: Improve error messages with field existence checks.

While the try-catch handles missing fields, it doesn't provide specific information about which field is missing. Adding explicit checks would improve debuggability.
🔎 Proposed improvement
```diff
 QcomTargetMachine parseQcomTargetMachineFromJSON(const std::string& json_str) {
   try {
     auto json = nlohmann::json::parse(json_str);
     if (!json.contains("target_machine")) {
       throw std::invalid_argument("JSON must contain 'target_machine' field");
     }
     auto tm = json["target_machine"];
+
+    // Validate all required fields are present
+    const std::vector<std::string> required_fields = {
+        "htp_arch", "htp_chipset", "htp_try_best_performance",
+        "htp_security_pd_session", "htp_vtcm_capability_in_mb"};
+    for (const auto& field : required_fields) {
+      if (!tm.contains(field)) {
+        throw std::invalid_argument("target_machine missing required field: " + field);
+      }
+    }
     QcomTargetMachine result;
     result.soc_htp_arch = parseHTPArch(tm["htp_arch"].get<std::string>());
     result.soc_htp_chipset = parseChipset(tm["htp_chipset"].get<std::string>());
     result.soc_htp_performance = parsePerformance(tm["htp_try_best_performance"].get<std::string>());
     result.soc_htp_security_pd_session = parseSecurityPDSession(tm["htp_security_pd_session"].get<std::string>());
     result.soc_htp_vtcm_total_memory_size = tm["htp_vtcm_capability_in_mb"].get<uint32_t>();
     return result;
   } catch (const nlohmann::json::exception& e) {
     throw std::invalid_argument(std::string("JSON parsing error: ") + e.what());
   }
 }
```

mllm/core/Tensor.hpp (1)
420-426: Rename parameter `v` to a more descriptive name.

Static analysis flags `v` as too short (minimum 3 characters required). Consider renaming to `other` or `value` for consistency with the project's naming conventions.

🔎 Proposed fix
```diff
-  Tensor equal(Tensor v);
+  Tensor equal(Tensor other);
   /**
    * @brief equal
    * @return uint8 mask
    */
-  Tensor equal(float v);
+  Tensor equal(float value);
```

mllm/backends/qnn/aot/passes/AOTPipeline.hpp (1)
6-9: Add missing `<string>` include.

The function signature uses `std::string` but the header is not explicitly included. While it may work via transitive includes, explicit includes improve maintainability.

🔎 Proposed fix
```diff
 #include <vector>
+#include <string>
 #include "mllm/backends/qnn/aot/QnnWrappersAPI.hpp"
 #include "mllm/compile/passes/Pass.hpp"
```

mllm/backends/qnn/aot/passes/AOTCompileContext.hpp (2)
19-21: Consider completing the Rule of 5.

Static analysis flags that copy constructor and assignment are defined without destructor or move operations. For a singleton, explicitly deleting move operations and defaulting the destructor clarifies intent.
🔎 Proposed fix
```diff
 // Delete copy constructor and assignment operator
 AOTCompileContext(const AOTCompileContext&) = delete;
 AOTCompileContext& operator=(const AOTCompileContext&) = delete;
+AOTCompileContext(AOTCompileContext&&) = delete;
+AOTCompileContext& operator=(AOTCompileContext&&) = delete;
+~AOTCompileContext() = default;
```
28-28: Rename parameter `fp` to a more descriptive name.

Static analysis flags `fp` as too short (minimum 3 characters). Consider renaming to `file_path` or `config_path` for clarity.

🔎 Proposed fix
```diff
-void setConfig(const std::string& fp);
+void setConfig(const std::string& config_path);
```

mllm/backends/cpu/ops/WhereOp.cpp (1)
4-4: Unused include.

The `<cstring>` header is included but not used in this file.

🔎 Proposed fix

```diff
-#include <cstring>
```

mllm/backends/qnn/aot/visitor/Elewise.cpp (1)
29-31: Incomplete TODO needs description.

The TODO comment lacks a description of what needs to be implemented. Per coding guidelines, TODO comments should follow the format `TODO: <description>`. Currently `addNode` validates casts but performs no actual node addition logic.

Would you like me to help outline the expected implementation for adding the QNN node, or should I open an issue to track this task?
🔎 Proposed fix for TODO format
```diff
-  // TODO
+  // TODO: Implement QNN graph node addition for AddOp
```

mllm/backends/qnn/aot/passes/AOTPipeline.cpp (2)
7-11: Consider null-checking the `env` parameter.

The `env` pointer is passed directly to `setEnv()` without validation. If a null pointer is passed, it could cause issues downstream when the environment is accessed.

🔎 Proposed fix
```diff
 std::vector<std::shared_ptr<ir::Pass>> createQnnAOTLoweringPipeline(QnnAOTEnv* env, const std::string& config_path) {
   std::vector<ir::Pass::ptr_t> ret;
+  if (!env) {
+    throw std::invalid_argument("QnnAOTEnv pointer cannot be null");
+  }
+
   AOTCompileContext::getInstance().setEnv(env);
   AOTCompileContext::getInstance().setConfig(config_path);
```
1-5: Missing license header.

For consistency with other new files in this PR, consider adding the standard license header.
🔎 Proposed fix
```diff
+// Copyright (c) MLLM Team.
+// Licensed under the MIT License.
+
 #include "mllm/backends/qnn/aot/passes/AOTPipeline.hpp"
 #include "mllm/backends/qnn/aot/passes/AOTCompileContext.hpp"
```

examples/qwen3_qnn_aot/CMakeLists.txt (1)
5-8: Consider validating the `QAIRT_SDK_ROOT` environment variable.

If `QAIRT_SDK_ROOT` is not set, the include paths will be empty, potentially causing cryptic compilation errors. Consider adding validation:

🔎 Proposed fix
```diff
+if(NOT DEFINED ENV{QAIRT_SDK_ROOT})
+  message(FATAL_ERROR "QAIRT_SDK_ROOT environment variable is not set. Please set it to your QNN SDK installation path.")
+endif()
+
 target_include_directories(mllm-qwen3-aot-c PRIVATE
   $ENV{QAIRT_SDK_ROOT}/include     # QNN SDK include
   $ENV{QAIRT_SDK_ROOT}/include/QNN # QNN SDK include
 )
```

mllm/backends/cpu/ops/WhereOp.hpp (1)
18-23: Inconsistent `final` specifier on factory class.

`CPUWhereOpFactory` is missing the `final` specifier, while `CPUEqualOpFactory` in `CmpOp.hpp` is marked `final`. For consistency, both factory classes should follow the same pattern.

🔎 Proposed fix
```diff
-class CPUWhereOpFactory : public TypedOpFactory<OpTypes::kWhere, aops::WhereOpOptions> {
+class CPUWhereOpFactory final : public TypedOpFactory<OpTypes::kWhere, aops::WhereOpOptions> {
```

mllm/backends/qnn/aot/passes/OpNamingPass.cpp (1)
74-80: Unused parameter in lambda.

The `writer` parameter in the lambda is captured but never used. Consider using `[[maybe_unused]]` or simply omitting the parameter name.

🔎 Proposed fix
```diff
 writer.walk<ir::graph::CallGraphOp>(
-    [&](ir::IRWriter& /*writer*/, const ir::graph::CallGraphOp::ptr_t& call_op) -> ir::IRWriter::WalkResult {
+    [&](ir::IRWriter& /*unused*/, const ir::graph::CallGraphOp::ptr_t& call_op) -> ir::IRWriter::WalkResult {
```

Or if the signature allows, use `auto`:

```diff
 writer.walk<ir::graph::CallGraphOp>(
-    [&](ir::IRWriter& /*writer*/, const ir::graph::CallGraphOp::ptr_t& call_op) -> ir::IRWriter::WalkResult {
+    [&](auto&&, const ir::graph::CallGraphOp::ptr_t& call_op) -> ir::IRWriter::WalkResult {
```

examples/qwen3_qnn_aot/compile.cpp (1)
23-24: Consider making `N` and `CL` configurable.

These constants control sequence length and cache length. For flexibility across different model configurations, consider exposing them via CLI arguments or deriving them from the model config.
mllm/backends/qnn/aot/passes/MarkQnnGraphPass.cpp (1)
35-35: Unnecessary trailing semicolon after function definition.

Minor style issue: the semicolon after the closing brace of `recursiveVisitGraph` is not needed.

🔎 Proposed fix
```diff
-  });
-};
+  });
+}
```

mllm/core/aops/WhereOp.hpp (1)
27-27: Inconsistent `options()` return type compared to `EqualOp`.

`WhereOp::options()` returns a mutable reference `WhereOpOptions&`, while the related `EqualOp::options()` in `CmpOp.hpp` returns `const EqualOpOptions&`. For consistency and to prevent unintended modifications, consider using a const return type unless mutability is explicitly required.

🔎 Proposed fix for consistency
```diff
-  inline WhereOpOptions& options() { return options_; }
+  inline const WhereOpOptions& options() const { return options_; }
```

mllm/core/DataTypes.hpp (1)
294-296: Non-ASCII characters in code comment.

The comment contains Chinese text ("拼接" meaning "concatenate"). While valid UTF-8, using English in code comments improves accessibility for all contributors.
🔎 Proposed fix
```diff
 // These are placeholder types for per-tensor and per-channel quantization.
-// They cannot actually run; at runtime, they require three tensors (weight, scale, zp) to be拼接.
+// They cannot actually run; at runtime, they require three tensors (weight, scale, zp) to be concatenated.
```

examples/qwen3_qnn_aot/modeling_qwen_qnn_aot.hpp (3)
191-193: Magic number for attention masking.

The value `-20` used for masked attention should be a named constant with documentation explaining why this specific value was chosen. This improves readability and maintainability.

🔎 Proposed fix
```diff
+  // Large negative value to effectively zero out masked attention scores after softmax
+  static constexpr float kMaskedAttnPenalty = -20.0f;
+
   // Masked Softmax
   auto attn_min = attn.min(-1, true);
-  float minus_value = -20;
-  attn = nn::functional::where(causal_mask.equal(0.f), attn, attn_min + minus_value);
+  attn = nn::functional::where(causal_mask.equal(0.f), attn, attn_min + kMaskedAttnPenalty);
```
202-203: Public member variable `layer_idx_` breaks encapsulation.

`layer_idx_` is declared public but is only set externally (line 253). Consider making it private with a setter, or initializing it via the constructor.

🔎 Proposed fix
```diff
+ public:
+  void setLayerIdx(int idx) { layer_idx_ = idx; }
+
+ private:
-  int layer_idx_;
+  int layer_idx_ = 0;
 };
```

And update the usage in line 253:

```cpp
for (auto [idx, b] : enumerate(decode_blocks_.list())) { b.self_attn_.setLayerIdx(idx); }
```
325-343: Duplicate error handling logic for KV caches.

The same pattern for checking and throwing on missing KV cache is repeated for both keys (lines 321-331) and values (lines 333-344). Consider extracting a helper to reduce duplication.
🔎 Proposed fix
```diff
+  auto getKVCache = [&](const std::string& name) -> Tensor {
+    if (input.count(name)) { return input.at(name); }
+    throw std::runtime_error("Missing KV cache: " + name);
+  };
+
   // Append Key
   for (int i = 0; i < cfg.num_hidden_layers; ++i) {
-    auto past_key_name = "past_key_" + std::to_string(i);
-    if (input.count(past_key_name)) {
-      kv_caches.push_back(input.at(past_key_name));
-    } else {
-      throw std::runtime_error("Missing KV cache for layer " + std::to_string(i));
-    }
+    kv_caches.push_back(getKVCache("past_key_" + std::to_string(i)));
   }

   // Append Value
   for (int i = 0; i < cfg.num_hidden_layers; ++i) {
-    auto past_value_name = "past_value_" + std::to_string(i);
-    if (input.count(past_value_name)) {
-      kv_caches.push_back(input.at(past_value_name));
-    } else {
-      throw std::runtime_error("Missing KV cache for layer " + std::to_string(i));
-    }
+    kv_caches.push_back(getKVCache("past_value_" + std::to_string(i)));
   }
```

mllm/backends/qnn/aot/QnnWrappersAPI.cpp (1)
55-56: Consider using `sizeof` instead of magic number for int64 size.

For clarity and maintainability, use `sizeof(int64_t)` instead of the literal `8`.

🔎 Proposed fix
```diff
     case QNN_DATATYPE_INT_64:
-    case QNN_DATATYPE_UINT_64: return 8;
+    case QNN_DATATYPE_UINT_64: return sizeof(int64_t);
```

mllm/backends/qnn/aot/QnnWrappersAPI.hpp (1)
28-28: Reserved identifier uses double underscore prefix.

The function `__mllmLoggerCallback4QnnLogger` uses a double underscore prefix, which is reserved for compiler/standard library implementations. Consider renaming to avoid potential conflicts.

🔎 Proposed fix
```diff
-void __mllmLoggerCallback4QnnLogger(const char* fmt, QnnLog_Level_t level, uint64_t times_tamp, va_list argp);
+void mllmLoggerCallback4QnnLogger(const char* fmt, QnnLog_Level_t level, uint64_t times_tamp, va_list argp);
```

Note: Also update the implementation in `QnnWrappersAPI.cpp` and any call sites.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (74)
examples/CMakeLists.txt, examples/qwen3_qnn_aot/CMakeLists.txt, examples/qwen3_qnn_aot/compile.cpp, examples/qwen3_qnn_aot/config_1.7B.json, examples/qwen3_qnn_aot/modeling_qwen_qnn_aot.hpp, examples/qwen3_qnn_aot/qnn_aot_cfg.json, examples/qwen3_qnn_aot/qwen3_qnn_aot.mir, mllm/backends/cpu/CPUBackend.cpp, mllm/backends/cpu/kernels/common/ggml/quantize/quantize.hpp, mllm/backends/cpu/ops/CmpOp.cpp, mllm/backends/cpu/ops/CmpOp.hpp, mllm/backends/cpu/ops/LinearOp.cpp, mllm/backends/cpu/ops/WhereOp.cpp, mllm/backends/cpu/ops/WhereOp.hpp, mllm/backends/qnn/QNNUtils.hpp, mllm/backends/qnn/Register.cpp, mllm/backends/qnn/aot/QnnTargetMachineParser.cpp, mllm/backends/qnn/aot/QnnTargetMachineParser.hpp, mllm/backends/qnn/aot/QnnWrappersAPI.cpp, mllm/backends/qnn/aot/QnnWrappersAPI.hpp, mllm/backends/qnn/aot/passes/AOTCompileContext.cpp, mllm/backends/qnn/aot/passes/AOTCompileContext.hpp, mllm/backends/qnn/aot/passes/AOTPipeline.cpp, mllm/backends/qnn/aot/passes/AOTPipeline.hpp, mllm/backends/qnn/aot/passes/AutoInsertQuanAndDeQuantPass.cpp, mllm/backends/qnn/aot/passes/AutoInsertQuanAndDeQuantPass.hpp, mllm/backends/qnn/aot/passes/MarkQnnGraphPass.cpp, mllm/backends/qnn/aot/passes/MarkQnnGraphPass.hpp, mllm/backends/qnn/aot/passes/OpNamingPass.cpp, mllm/backends/qnn/aot/passes/OpNamingPass.hpp, mllm/backends/qnn/aot/passes/TensorRenamingPass.cpp, mllm/backends/qnn/aot/passes/TensorRenamingPass.hpp, mllm/backends/qnn/aot/visitor/Base.hpp, mllm/backends/qnn/aot/visitor/Concat.cpp, mllm/backends/qnn/aot/visitor/Concat.hpp, mllm/backends/qnn/aot/visitor/Elewise.cpp, mllm/backends/qnn/aot/visitor/Elewise.hpp, mllm/backends/qnn/aot/visitor/Equal.cpp, mllm/backends/qnn/aot/visitor/Equal.hpp, mllm/backends/qnn/aot/visitor/Index.cpp, mllm/backends/qnn/aot/visitor/Index.hpp, mllm/backends/qnn/aot/visitor/Reduce.cpp, mllm/backends/qnn/aot/visitor/Reduce.hpp, mllm/backends/qnn/aot/visitor/Repeat.cpp, mllm/backends/qnn/aot/visitor/Repeat.hpp, mllm/backends/qnn/aot/visitor/Slice.cpp, mllm/backends/qnn/aot/visitor/Slice.hpp, mllm/backends/qnn/aot/visitor/Transpose.cpp, mllm/backends/qnn/aot/visitor/Transpose.hpp, mllm/backends/qnn/aot/visitor/Where.cpp, mllm/backends/qnn/aot/visitor/Where.hpp, mllm/compile/ir/GeneratedRTTIKind.hpp, mllm/compile/ir/Node.cpp, mllm/compile/ir/Node.hpp, mllm/compile/ir/NodeRTTIClassOfImpl.hpp, mllm/compile/ir/graph/Op.cpp, mllm/compile/ir/linalg/Op.cpp, mllm/compile/ir/linalg/Op.hpp, mllm/compile/ir/rtti_kind_gen.py, mllm/compile/ir/tensor/Op.cpp, mllm/core/DataTypes.cpp, mllm/core/DataTypes.hpp, mllm/core/OpTypes.hpp, mllm/core/Tensor.cpp, mllm/core/Tensor.hpp, mllm/core/aops/CmpOp.cpp, mllm/core/aops/CmpOp.hpp, mllm/core/aops/LinearOp.hpp, mllm/core/aops/WhereOp.cpp, mllm/core/aops/WhereOp.hpp, mllm/nn/Functional.cpp, mllm/nn/Functional.hpp, mllm/nn/Module.hpp, tasks/build_x86_qnn_aot.yaml
💤 Files with no reviewable changes (1)
- mllm/backends/qnn/Register.cpp
🧰 Additional context used
📓 Path-based instructions (5)
{mllm,mllm-cli,pymllm}/**/*
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*: Files must not contain C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, or DEL 0x7F. Horizontal tab (0x09) and line feed (0x0A) are explicitly allowed.
All files must be encoded in UTF-8 without BOM.
Any violation of character set (Rule 1) or encoding (Rule 2) requirements must cause the review to fail.
No line may end with trailing whitespace.
Use Unix line endings (LF).
File and directory names must consist only of printable Unicode characters, excluding C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, and DEL 0x7F.
Only use acceptable file extensions: .c, .cc, .cpp, .cxx, .h, .hh, .hpp, .py, .pyi, .sh, .txt, .md, .yml, .yaml, .json, .toml.
Optional license headers, if present, must comply with character set rules (no C0/C1 control codes except tab and line feed).
Files:
mllm/core/Tensor.hppmllm/nn/Functional.hppmllm/compile/ir/GeneratedRTTIKind.hppmllm/backends/qnn/aot/passes/AOTPipeline.hppmllm/backends/qnn/QNNUtils.hppmllm/core/OpTypes.hppmllm/backends/qnn/aot/passes/AOTCompileContext.cppmllm/backends/cpu/ops/CmpOp.hppmllm/core/aops/CmpOp.hppmllm/backends/qnn/aot/visitor/Elewise.cppmllm/backends/qnn/aot/passes/OpNamingPass.hppmllm/backends/qnn/aot/passes/AOTCompileContext.hppmllm/nn/Module.hppmllm/compile/ir/rtti_kind_gen.pymllm/core/aops/WhereOp.hppmllm/core/DataTypes.cppmllm/core/aops/LinearOp.hppmllm/backends/cpu/ops/WhereOp.hppmllm/backends/cpu/ops/CmpOp.cppmllm/backends/qnn/aot/visitor/Elewise.hppmllm/backends/qnn/aot/QnnTargetMachineParser.hppmllm/backends/cpu/ops/WhereOp.cppmllm/nn/Functional.cppmllm/backends/qnn/aot/passes/OpNamingPass.cppmllm/backends/qnn/aot/visitor/Base.hppmllm/core/aops/WhereOp.cppmllm/compile/ir/NodeRTTIClassOfImpl.hppmllm/backends/qnn/aot/passes/MarkQnnGraphPass.hppmllm/backends/qnn/aot/passes/AOTPipeline.cppmllm/backends/cpu/kernels/common/ggml/quantize/quantize.hppmllm/compile/ir/Node.hppmllm/compile/ir/Node.cppmllm/core/DataTypes.hppmllm/compile/ir/graph/Op.cppmllm/compile/ir/linalg/Op.hppmllm/backends/cpu/CPUBackend.cppmllm/core/Tensor.cppmllm/backends/cpu/ops/LinearOp.cppmllm/backends/qnn/aot/QnnTargetMachineParser.cppmllm/compile/ir/tensor/Op.cppmllm/compile/ir/linalg/Op.cppmllm/backends/qnn/aot/passes/MarkQnnGraphPass.cppmllm/backends/qnn/aot/QnnWrappersAPI.cppmllm/core/aops/CmpOp.cppmllm/backends/qnn/aot/QnnWrappersAPI.hpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}: TODO and FIXME comments must be written as 'TODO:' or 'FIXME:' followed by UTF-8 text that adheres to character set rules.
Encourage consistent coding style and patterns with the existing codebase.
Ensure code is portable across supported platforms (e.g., Linux, Windows) unless explicitly platform-specific.
Files:
mllm/core/Tensor.hppmllm/nn/Functional.hppmllm/compile/ir/GeneratedRTTIKind.hppmllm/backends/qnn/aot/passes/AOTPipeline.hppmllm/backends/qnn/QNNUtils.hppmllm/core/OpTypes.hppmllm/backends/qnn/aot/passes/AOTCompileContext.cppmllm/backends/cpu/ops/CmpOp.hppmllm/core/aops/CmpOp.hppmllm/backends/qnn/aot/visitor/Elewise.cppmllm/backends/qnn/aot/passes/OpNamingPass.hppmllm/backends/qnn/aot/passes/AOTCompileContext.hppmllm/nn/Module.hppmllm/compile/ir/rtti_kind_gen.pymllm/core/aops/WhereOp.hppmllm/core/DataTypes.cppmllm/core/aops/LinearOp.hppmllm/backends/cpu/ops/WhereOp.hppmllm/backends/cpu/ops/CmpOp.cppmllm/backends/qnn/aot/visitor/Elewise.hppmllm/backends/qnn/aot/QnnTargetMachineParser.hppmllm/backends/cpu/ops/WhereOp.cppmllm/nn/Functional.cppmllm/backends/qnn/aot/passes/OpNamingPass.cppmllm/backends/qnn/aot/visitor/Base.hppmllm/core/aops/WhereOp.cppmllm/compile/ir/NodeRTTIClassOfImpl.hppmllm/backends/qnn/aot/passes/MarkQnnGraphPass.hppmllm/backends/qnn/aot/passes/AOTPipeline.cppmllm/backends/cpu/kernels/common/ggml/quantize/quantize.hppmllm/compile/ir/Node.hppmllm/compile/ir/Node.cppmllm/core/DataTypes.hppmllm/compile/ir/graph/Op.cppmllm/compile/ir/linalg/Op.hppmllm/backends/cpu/CPUBackend.cppmllm/core/Tensor.cppmllm/backends/cpu/ops/LinearOp.cppmllm/backends/qnn/aot/QnnTargetMachineParser.cppmllm/compile/ir/tensor/Op.cppmllm/compile/ir/linalg/Op.cppmllm/backends/qnn/aot/passes/MarkQnnGraphPass.cppmllm/backends/qnn/aot/QnnWrappersAPI.cppmllm/core/aops/CmpOp.cppmllm/backends/qnn/aot/QnnWrappersAPI.hpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}: Ensure public APIs, classes, and functions have clear docstrings or comments explaining purpose, parameters, returns, and errors.
Adhere to language-specific best practices and idioms (e.g., PEP 8 for Python, Google C++ Style Guide for C++).
Files:
mllm/core/Tensor.hppmllm/nn/Functional.hppmllm/compile/ir/GeneratedRTTIKind.hppmllm/backends/qnn/aot/passes/AOTPipeline.hppmllm/backends/qnn/QNNUtils.hppmllm/core/OpTypes.hppmllm/backends/qnn/aot/passes/AOTCompileContext.cppmllm/backends/cpu/ops/CmpOp.hppmllm/core/aops/CmpOp.hppmllm/backends/qnn/aot/visitor/Elewise.cppmllm/backends/qnn/aot/passes/OpNamingPass.hppmllm/backends/qnn/aot/passes/AOTCompileContext.hppmllm/nn/Module.hppmllm/compile/ir/rtti_kind_gen.pymllm/core/aops/WhereOp.hppmllm/core/DataTypes.cppmllm/core/aops/LinearOp.hppmllm/backends/cpu/ops/WhereOp.hppmllm/backends/cpu/ops/CmpOp.cppmllm/backends/qnn/aot/visitor/Elewise.hppmllm/backends/qnn/aot/QnnTargetMachineParser.hppmllm/backends/cpu/ops/WhereOp.cppmllm/nn/Functional.cppmllm/backends/qnn/aot/passes/OpNamingPass.cppmllm/backends/qnn/aot/visitor/Base.hppmllm/core/aops/WhereOp.cppmllm/compile/ir/NodeRTTIClassOfImpl.hppmllm/backends/qnn/aot/passes/MarkQnnGraphPass.hppmllm/backends/qnn/aot/passes/AOTPipeline.cppmllm/backends/cpu/kernels/common/ggml/quantize/quantize.hppmllm/compile/ir/Node.hppmllm/compile/ir/Node.cppmllm/core/DataTypes.hppmllm/compile/ir/graph/Op.cppmllm/compile/ir/linalg/Op.hppmllm/backends/cpu/CPUBackend.cppmllm/core/Tensor.cppmllm/backends/cpu/ops/LinearOp.cppmllm/backends/qnn/aot/QnnTargetMachineParser.cppmllm/compile/ir/tensor/Op.cppmllm/compile/ir/linalg/Op.cppmllm/backends/qnn/aot/passes/MarkQnnGraphPass.cppmllm/backends/qnn/aot/QnnWrappersAPI.cppmllm/core/aops/CmpOp.cppmllm/backends/qnn/aot/QnnWrappersAPI.hpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}: Prioritize production-ready code quality by evaluating time and space complexity of algorithms and data structures, and suggest more efficient alternatives for operations with high complexity (e.g., O(n^2) or worse) when feasible.
Avoid unnecessary object creation in loops or hot paths.
Check for proper error handling and resource cleanup (e.g., using try-finally, context managers, or RAII).
Ensure functions that can fail return appropriate error codes or raise exceptions.
Validate inputs for public APIs and critical internal functions.
Add comments for complex algorithms or non-obvious logic.
Identify potential security issues (e.g., buffer overflows, injection risks, insecure temporary files) and recommend using secure alternatives (e.g., parameterized queries, secure random generators).
Suggest adding unit tests for untested complex logic or edge cases.
Ensure code is testable by avoiding global state and using dependency injection.
Flag overly complex functions (e.g., high cyclomatic complexity) and suggest breaking them down.
Use named constants instead of magic numbers.
Add appropriate logging (e.g., debug, info, warning, error) for significant events and errors, avoiding sensitive data exposure.
Files:
mllm/backends/qnn/aot/passes/AOTCompileContext.cppmllm/backends/qnn/aot/visitor/Elewise.cppmllm/compile/ir/rtti_kind_gen.pymllm/core/DataTypes.cppmllm/backends/cpu/ops/CmpOp.cppmllm/backends/cpu/ops/WhereOp.cppmllm/nn/Functional.cppmllm/backends/qnn/aot/passes/OpNamingPass.cppmllm/core/aops/WhereOp.cppmllm/backends/qnn/aot/passes/AOTPipeline.cppmllm/compile/ir/Node.cppmllm/compile/ir/graph/Op.cppmllm/backends/cpu/CPUBackend.cppmllm/core/Tensor.cppmllm/backends/cpu/ops/LinearOp.cppmllm/backends/qnn/aot/QnnTargetMachineParser.cppmllm/compile/ir/tensor/Op.cppmllm/compile/ir/linalg/Op.cppmllm/backends/qnn/aot/passes/MarkQnnGraphPass.cppmllm/backends/qnn/aot/QnnWrappersAPI.cppmllm/core/aops/CmpOp.cpp
{mllm,mllm-cli,pymllm}/**/*.{sh,py}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
If a file starts with a shebang, it must be the first line and contain only the interpreter path and optional single argument, encoded in UTF-8.
Files:
mllm/compile/ir/rtti_kind_gen.py
🧬 Code graph analysis (27)
mllm/backends/qnn/aot/passes/AOTPipeline.hpp (1)
- mllm/backends/qnn/aot/passes/AOTCompileContext.hpp (2): env(26-26) env(26-26)

mllm/backends/qnn/aot/passes/AOTCompileContext.cpp (1)
- mllm/backends/qnn/aot/passes/AOTCompileContext.hpp (1): fp(28-28)

mllm/backends/cpu/ops/CmpOp.hpp (3)
- mllm/backends/cpu/ops/WhereOp.hpp (3): options(20-22) options(20-20) inputs(15-15)
- mllm/core/aops/CmpOp.hpp (3): inputs(21-21) inputs(23-23) inputs(25-25)
- mllm/core/aops/WhereOp.hpp (3): inputs(21-21) inputs(23-23) inputs(25-25)

mllm/core/aops/CmpOp.hpp (5)
- mllm/core/BaseOp.hpp (5): ploader(159-159) trace_context(161-161) inputs(163-163) inputs(163-163) inputs(165-165)
- mllm/core/aops/WhereOp.hpp (6): ploader(17-17) trace_context(19-19) inputs(21-21) inputs(23-23) inputs(25-25) options_(27-27)
- mllm/core/aops/PadOp.hpp (3): ploader(28-28) trace_context(30-30) options_(38-38)
- mllm/core/aops/MatMulOp.hpp (3): ploader(54-54) trace_context(56-56) options_(64-64)
- mllm/core/aops/IndexOp.hpp (3): ploader(20-20) trace_context(22-22) options_(30-30)

mllm/backends/qnn/aot/visitor/Elewise.cpp (4)
- mllm/backends/qnn/aot/passes/OpNamingPass.hpp (1): op(17-17)
- mllm/backends/qnn/aot/visitor/Base.hpp (3): op(17-17) op(17-17) g_name(21-23)
- mllm/backends/qnn/aot/visitor/Elewise.hpp (2): op(15-15) g_name(17-18)
- mllm/backends/qnn/aot/passes/MarkQnnGraphPass.hpp (1): op(17-17)

examples/qwen3_qnn_aot/compile.cpp (6)
- mllm/utils/Argparse.hpp (1): Argparse(135-135)
- examples/qwen3_qnn_aot/modeling_qwen_qnn_aot.hpp (6): Qwen3ForCausalLM(297-309) Qwen3ForCausalLM(297-297) makeRoPEInvFreq(17-22) makeRoPEInvFreq(17-17) makeRotaryPosEmbedding(24-71) makeRotaryPosEmbedding(24-25)
- mllm/backends/qnn/aot/QnnWrappersAPI.hpp (2): QnnAOTEnv(232-232) QnnAOTEnv(234-234)
- mllm/backends/qnn/aot/QnnWrappersAPI.cpp (2): QnnAOTEnv(250-250) QnnAOTEnv(252-254)
- mllm/backends/qnn/aot/QnnTargetMachineParser.cpp (2): parseQcomTargetMachineFromJSONFile(88-96) parseQcomTargetMachineFromJSONFile(88-88)
- mllm/backends/qnn/aot/passes/AOTPipeline.cpp (2): createQnnAOTLoweringPipeline(7-17) createQnnAOTLoweringPipeline(7-7)

mllm/nn/Module.hpp (1)
- mllm/utils/AnyValue.hpp (2): value(90-95) value(90-91)

mllm/core/aops/WhereOp.hpp (2)
- mllm/backends/cpu/ops/WhereOp.hpp (3): options(20-22) options(20-20) inputs(15-15)
- mllm/core/aops/CmpOp.hpp (6): ploader(17-17) trace_context(19-19) inputs(21-21) inputs(23-23) inputs(25-25) options_(27-27)

mllm/backends/cpu/ops/CmpOp.cpp (2)
- mllm/core/aops/CmpOp.hpp (3): inputs(21-21) inputs(23-23) inputs(25-25)
- mllm/core/Tensor.cpp (2): dtype(402-402) dtype(402-402)

examples/qwen3_qnn_aot/modeling_qwen_qnn_aot.hpp (2)
- mllm/models/qwen_npu/modeling_qwen_npu_cpu.hpp (2): self_attn_(214-214) embedding_(277-277)
- mllm/models/qwen_npu/modeling_qwen_npu.hpp (1): block(442-444)

mllm/backends/qnn/aot/visitor/Elewise.hpp (2)
- mllm/backends/qnn/aot/visitor/Base.hpp (3): op(17-17) op(17-17) g_name(21-23)
- mllm/backends/qnn/aot/passes/MarkQnnGraphPass.hpp (1): op(17-17)

mllm/backends/qnn/aot/QnnTargetMachineParser.hpp (1)
- mllm/engine/ConfigFile.hpp (1): json_str(39-39)

mllm/backends/cpu/ops/WhereOp.cpp (3)
- mllm/backends/cpu/ops/WhereOp.hpp (4): CPUWhereOp(13-13) options(20-22) options(20-20) inputs(15-15)
- mllm/core/aops/WhereOp.cpp (2): forward(23-25) forward(23-23)
- mllm/core/aops/WhereOp.hpp (3): inputs(21-21) inputs(23-23) inputs(25-25)

mllm/nn/Functional.cpp (3)
- mllm/core/Tensor.hpp (4): v(26-29) v(26-26) v(32-35) v(32-32)
- mllm/compile/ir/linalg/Op.hpp (1): ctx(155-155)
- mllm/core/Tensor.cpp (1): instance(324-324)

mllm/backends/qnn/aot/passes/OpNamingPass.cpp (1)
- mllm/backends/qnn/aot/passes/OpNamingPass.hpp (1): op(17-17)

mllm/backends/qnn/aot/visitor/Base.hpp (1)
- mllm/backends/qnn/aot/visitor/Elewise.hpp (2): op(15-15) g_name(17-18)

mllm/core/aops/WhereOp.cpp (2)
- mllm/core/aops/WhereOp.hpp (6): WhereOp(15-15) ploader(17-17) trace_context(19-19) inputs(21-21) inputs(23-23) inputs(25-25)
- mllm/backends/cpu/ops/WhereOp.hpp (3): options(20-22) options(20-20) inputs(15-15)

mllm/backends/qnn/aot/passes/MarkQnnGraphPass.hpp (2)
- mllm/backends/qnn/aot/visitor/Elewise.hpp (1): op(15-15)
- mllm/compile/ir/Node.hpp (2): op(468-468) op(470-470)

mllm/backends/qnn/aot/passes/AOTPipeline.cpp (5)
- mllm/backends/qnn/aot/passes/AOTCompileContext.hpp (2): env(26-26) env(26-26)
- mllm/backends/qnn/aot/passes/MarkQnnGraphPass.cpp (2): createMarkQnnGraphPass(67-67) createMarkQnnGraphPass(67-67)
- mllm/backends/qnn/aot/passes/MarkQnnGraphPass.hpp (1): createMarkQnnGraphPass(20-20)
- mllm/backends/qnn/aot/passes/OpNamingPass.cpp (2): createOpNamingPass(88-88) createOpNamingPass(88-88)
- mllm/backends/qnn/aot/passes/OpNamingPass.hpp (1): createOpNamingPass(20-20)

mllm/compile/ir/graph/Op.cpp (2)
- mllm/compile/ir/Node.cpp (2): dumpAttributes(17-30) dumpAttributes(17-17)
- mllm/compile/ir/Node.hpp (7): p(56-56) p(56-56) p(58-58) p(131-131) p(163-163) p(187-187) p(198-198)

mllm/compile/ir/linalg/Op.hpp (4)
- mllm/core/aops/CmpOp.cpp (1): EqualOp(36-36)
- mllm/core/aops/CmpOp.hpp (1): EqualOp(15-15)
- mllm/core/aops/WhereOp.cpp (1): WhereOp(12-12)
- mllm/core/aops/WhereOp.hpp (1): WhereOp(15-15)

mllm/backends/qnn/aot/QnnTargetMachineParser.cpp (3)
- pymllm/ffi/__init__.py (25): NONE(341-342) V68(345-346) V69(349-350) V73(353-354) V75(357-358) V79(361-362) V81(365-366) UNKNOWN_SM(374-375) SA8295(378-379) SM8350(382-383) SM8450(386-387) SM8475(390-391) SM8550(394-395) SM8650(398-399) SM8750(402-403) SM8850(406-407) SSG2115P(410-411) SSG2125P(414-415) SXR1230P(418-419) SXR2230P(422-423) SXR2330P(426-427) QCS9100(430-431) SAR2230P(434-435) SA8255(438-439) SW6100(442-443)
- mllm/engine/ConfigFile.hpp (1): json_str(39-39)
- mllm/backends/qnn/aot/QnnTargetMachineParser.hpp (1): parseQcomTargetMachineFromJSON(23-23)

mllm/compile/ir/linalg/Op.cpp (4)
- mllm/core/aops/CmpOp.cpp (1): EqualOp(36-36)
- mllm/core/aops/CmpOp.hpp (1): EqualOp(15-15)
- mllm/core/aops/WhereOp.cpp (1): WhereOp(12-12)
- mllm/core/aops/WhereOp.hpp (1): WhereOp(15-15)

mllm/backends/qnn/aot/passes/MarkQnnGraphPass.cpp (4)
- mllm/compile/ir/linalg/Op.hpp (1): ctx(155-155)
- mllm/compile/ir/Node.cpp (3): IRWriter(290-291) name(202-202) name(202-202)
- mllm/compile/ir/Node.hpp (6): IRWriter(354-354) IRWriter(355-355) op(468-468) op(470-470) name(302-302) name(306-306)
- mllm/backends/qnn/aot/passes/MarkQnnGraphPass.hpp (1): op(17-17)

mllm/backends/qnn/aot/QnnWrappersAPI.cpp (1)
- mllm/backends/qnn/aot/QnnWrappersAPI.hpp (12): QnnAOTParamScalar(51-51) QnnAOTParamScalar(53-53) QnnAOTParamScalar(55-55) name(236-236) name(238-238) name(240-240) v(102-102) v(102-102) param(133-133) param(137-137) QnnAOTEnv(232-232) QnnAOTEnv(234-234)

mllm/core/aops/CmpOp.cpp (2)
- mllm/core/aops/CmpOp.hpp (7): EqualOp(15-15) ploader(17-17) trace_context(19-19) inputs(21-21) inputs(23-23) inputs(25-25) options_(27-27)
- mllm/backends/cpu/ops/CmpOp.hpp (3): options(20-22) options(20-20) inputs(15-15)

mllm/backends/qnn/aot/QnnWrappersAPI.hpp (2)
- mllm/backends/qnn/QNNUtils.hpp (2): name(191-192) name(195-196)
- mllm/backends/qnn/aot/QnnWrappersAPI.cpp (2): QnnAOTParamTensor(91-112) QnnAOTParamTensor(114-117)
🪛 Clang (14.0.6)
mllm/core/Tensor.hpp
[error] 420-420: parameter name 'v' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 426-426: parameter name 'v' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/core/aops/CmpOp.hpp
[error] 27-27: function 'options' should be marked [[nodiscard]]
(modernize-use-nodiscard,-warnings-as-errors)
[error] 30-30: member variable 'options_' has protected visibility
(cppcoreguidelines-non-private-member-variables-in-classes,-warnings-as-errors)
mllm/backends/qnn/aot/passes/OpNamingPass.hpp
[error] 11-11: class 'OpNamingPass' defines a default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
mllm/backends/qnn/aot/passes/AOTCompileContext.hpp
[error] 11-11: constructor does not initialize these fields: config_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 11-11: class 'AOTCompileContext' defines a copy constructor and a copy assignment operator but does not define a destructor, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
[error] 28-28: parameter name 'fp' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 34-34: constructor does not initialize these fields: config_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
examples/qwen3_qnn_aot/compile.cpp
[error] 15-15: variable 'MLLM_MAIN' is non-const and globally accessible, consider making it const
(cppcoreguidelines-avoid-non-const-global-variables,-warnings-as-errors)
[error] 15-15: invalid case style for variable 'MLLM_MAIN'
(readability-identifier-naming,-warnings-as-errors)
mllm/core/aops/WhereOp.hpp
[error] 30-30: member variable 'options_' has protected visibility
(cppcoreguidelines-non-private-member-variables-in-classes,-warnings-as-errors)
examples/qwen3_qnn_aot/modeling_qwen_qnn_aot.hpp
[error] 17-17: 2 adjacent parameters of 'makeRoPEInvFreq' of convertible types are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 24-24: 2 adjacent parameters of 'makeRotaryPosEmbedding' of convertible types are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 73-73: constructor does not initialize these fields: gate_proj_, up_proj_, down_proj_, silu_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 81-81: constructor does not initialize these fields: gate_proj_, up_proj_, down_proj_, silu_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 81-81: 2 adjacent parameters of 'Qwen3MLP' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 88-88: 2 adjacent parameters of 'forward' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 89-89: variable name 'x' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 91-91: variable name 'y' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 98-98: constructor does not initialize these fields: q_proj_, k_proj_, v_proj_, o_proj_, rms_norm_q_, rms_norm_k_, q_rope_, k_rope_, mask_, softmax_, hidden_size_, head_dim_, num_attention_heads_, num_key_value_heads_, num_key_value_groups_, scale_, layer_idx_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 120-120: constructor does not initialize these fields: q_proj_, k_proj_, v_proj_, o_proj_, rms_norm_q_, rms_norm_k_, q_rope_, k_rope_, mask_, softmax_, hidden_size_, head_dim_, num_attention_heads_, num_key_value_heads_, layer_idx_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 120-120: 2 adjacent parameters of 'Qwen3Attention' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 125-125: 'num_key_value_groups_' should be initialized in a member initializer of the constructor
(cppcoreguidelines-prefer-member-initializer,-warnings-as-errors)
[error] 126-126: 'scale_' should be initialized in a member initializer of the constructor
(cppcoreguidelines-prefer-member-initializer,-warnings-as-errors)
[error] 126-126: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 145-145: 2 adjacent parameters of 'forward' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 146-146: variable name 'x' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 179-179: variable name 'kh' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 180-180: variable name 'vh' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 192-192: 20 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 195-195: variable name 'y' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 202-202: member variable 'layer_idx_' has public visibility
(cppcoreguidelines-non-private-member-variables-in-classes,-warnings-as-errors)
[error] 205-205: constructor does not initialize these fields: input_layer_norm_, post_attention_layer_norm_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 214-214: constructor does not initialize these fields: input_layer_norm_, post_attention_layer_norm_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 214-214: 2 adjacent parameters of 'Qwen3Decoder' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 221-221: 2 adjacent parameters of 'forward' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 228-228: variable name 'x' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 229-229: variable name '_' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 239-239: constructor does not initialize these fields: decode_blocks_, norm_, embedding_, rope_sin_, rope_cos_, num_hidden_layers_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 250-250: constructor does not initialize these fields: decode_blocks_, norm_, embedding_, rope_sin_, rope_cos_, num_hidden_layers_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 250-250: 2 adjacent parameters of 'Qwen3Text' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 260-260: 2 adjacent parameters of 'forward' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 264-264: variable name 'x' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 297-297: constructor does not initialize these fields: lm_head_, tie_word_embeddings_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 311-311: 2 adjacent parameters of 'trace' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 313-313: variable 'llm_ir' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 350-350: variable 'position_ids' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 389-389: 2 adjacent parameters of 'forward' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
mllm/backends/qnn/aot/visitor/Elewise.hpp
[error] 15-15: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 17-17: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/nn/Functional.cpp
[error] 203-203: 3 adjacent parameters of 'where' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
mllm/backends/qnn/aot/passes/OpNamingPass.cpp
[error] 21-21: 2 adjacent parameters of 'visitSubGraph' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 49-49: 2 adjacent parameters of 'visitCallGraph' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
mllm/backends/qnn/aot/visitor/Base.hpp
[error] 17-17: method 'isMatch' can be made static
(readability-convert-member-functions-to-static,-warnings-as-errors)
[error] 17-17: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 19-19: method 'rewrite' can be made static
(readability-convert-member-functions-to-static,-warnings-as-errors)
[error] 21-21: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/compile/ir/NodeRTTIClassOfImpl.hpp
[error] 223-223: function-like macro 'RTTI_RK_OP_LINALGIROP_EQUALOP_IMPL' used; consider a 'constexpr' template function
(cppcoreguidelines-macro-usage,-warnings-as-errors)
[error] 226-226: function-like macro 'RTTI_RK_OP_LINALGIROP_WHEREOP_IMPL' used; consider a 'constexpr' template function
(cppcoreguidelines-macro-usage,-warnings-as-errors)
mllm/backends/qnn/aot/passes/MarkQnnGraphPass.hpp
[error] 11-11: class 'MarkQnnGraphPass' defines a default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
mllm/backends/cpu/kernels/common/ggml/quantize/quantize.hpp
[error] 130-130: function-like macro 'MLLM_FP16_TO_FP32' used; consider a 'constexpr' template function
(cppcoreguidelines-macro-usage,-warnings-as-errors)
[error] 133-133: function-like macro 'MLLM_FP32_TO_FP16' used; consider a 'constexpr' template function
(cppcoreguidelines-macro-usage,-warnings-as-errors)
mllm/compile/ir/linalg/Op.hpp
[error] 247-247: parameter name 'p' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 247-247: class 'EqualOp' defines a non-default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
[error] 248-248: parameter name 'p' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 248-248: class 'WhereOp' defines a non-default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
mllm/backends/qnn/aot/QnnTargetMachineParser.cpp
[error] 75-75: variable name 'tm' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 92-92: variable 'buffer' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
mllm/backends/qnn/aot/passes/MarkQnnGraphPass.cpp
[error] 24-24: 2 adjacent parameters of 'recursiveVisitGraph' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
mllm/backends/qnn/aot/QnnWrappersAPI.cpp
[error] 56-56: 8 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
mllm/core/aops/CmpOp.cpp
[error] 10-10: all parameters should be named in a function
(readability-named-parameter,-warnings-as-errors)
mllm/backends/qnn/aot/QnnWrappersAPI.hpp
[error] 28-28: declaration uses identifier '__mllmLoggerCallback4QnnLogger', which is a reserved identifier
(bugprone-reserved-identifier,-warnings-as-errors)
[error] 64-64: class 'QnnAOTParamTensor' defines a non-default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
[error] 68-68: 4 adjacent parameters of 'create' of similar type are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 75-75: 4 adjacent parameters of 'create' of similar type are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 104-104: parameter name 'v' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 119-119: constructor does not initialize these fields: op_name_, param_scalar, param_tensor, inputs, outputs
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 152-152: constructor does not initialize these fields: op_node_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-macos
- GitHub Check: build-x86
- GitHub Check: build-android
Actionable comments posted: 3
♻️ Duplicate comments (2)
mllm/backends/qnn/aot/QnnWrappersAPI.cpp (2)
114-118: Previous issue not fully resolved: destructor still crashes if `alloc()` was never called. The assertion on line 116 will still fail if `data` is `nullptr` (when `alloc()` was never called), which was the issue flagged in the previous review. The `if (data)` check on line 117 is redundant because it comes after the assertion.

🔎 Proposed fix

```diff
 QnnAOTParamTensor::~QnnAOTParamTensor() {
   auto data = QNN_TENSOR_GET_CLIENT_BUF(qnn_param_.tensorParam).data;
-  MLLM_RT_ASSERT(data != nullptr);
-  if (data) { free(data); }
+  if (data != nullptr) {
+    free(data);
+  }
 }
```

Based on past review comments, the destructor should gracefully handle unallocated tensors.
140-154: Previous issue not resolved: TODO stubs still return without values for non-void functions. These methods have non-void return types but contain no implementation, which causes undefined behavior if they are called. The issue flagged in the previous review remains unaddressed.

🔎 Proposed fix - add explicit error handling

```diff
 Qnn_TensorType_t QnnAOTNodeTensor::parseQnnTensorTypeFromIR() {
-  // TODO
+  // TODO: Implement tensor type parsing from IR
+  MLLM_ERROR("parseQnnTensorTypeFromIR not implemented");
+  MLLM_RT_ASSERT(false && "parseQnnTensorTypeFromIR not implemented");
+  return QNN_TENSOR_TYPE_UNDEFINED;
 }

 Qnn_DataType_t QnnAOTNodeTensor::parseQnnDataTypeFromIR() {
-  // TODO
+  // TODO: Implement data type parsing from IR
+  MLLM_ERROR("parseQnnDataTypeFromIR not implemented");
+  MLLM_RT_ASSERT(false && "parseQnnDataTypeFromIR not implemented");
+  return QNN_DATATYPE_UNDEFINED;
 }

 std::string QnnAOTNodeTensor::parseQnnTensorNameFromIR() {
-  // TODO
+  // TODO: Implement tensor name parsing from IR
+  MLLM_ERROR("parseQnnTensorNameFromIR not implemented");
+  MLLM_RT_ASSERT(false && "parseQnnTensorNameFromIR not implemented");
+  return "";
 }

 Qnn_QuantizeParams_t QnnAOTNodeTensor::parseQnnQuantizeParamFromIR() {
-  // TODO
+  // TODO: Implement quantize param parsing from IR
+  MLLM_ERROR("parseQnnQuantizeParamFromIR not implemented");
+  MLLM_RT_ASSERT(false && "parseQnnQuantizeParamFromIR not implemented");
+  return Qnn_QuantizeParams_t{};
 }
```

Based on past review comments, these stubs should have explicit error handling to avoid undefined behavior.
🧹 Nitpick comments (2)
mllm/compile/ir/linalg/Op.cpp (1)
119-119: Verify no trailing whitespace on the empty line. Empty lines should not contain trailing whitespace per the coding guidelines.
Run the following script to check for trailing whitespace:
```shell
#!/bin/bash
# Check for trailing whitespace on line 119
sed -n '119p' mllm/compile/ir/linalg/Op.cpp | grep -E '\s+$' \
  && echo "Trailing whitespace found on line 119" \
  || echo "No trailing whitespace on line 119"
```

As per coding guidelines, rule 4 states: "No line may end with trailing whitespace."
mllm/backends/qnn/aot/passes/MarkQnnGraphPass.cpp (1)
43-43: Consider passing strings by const reference. The range-based for loops copy each string by value. For better efficiency, especially if the configuration lists are large, consider using `const auto&` or `const std::string&`.

🔎 Suggested optimization

Line 43:

```diff
- for (std::string item : config["graph_on_qnn"]) {
+ for (const auto& item : config["graph_on_qnn"]) {
```

Line 50:

```diff
- for (std::string item : config["op_on_qnn"]) { op_visited.insert({item, false}); }
+ for (const auto& item : config["op_on_qnn"]) { op_visited.insert({item, false}); }
```

Also applies to: 50-50
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- mllm/backends/qnn/aot/QnnWrappersAPI.cpp
- mllm/backends/qnn/aot/passes/MarkQnnGraphPass.cpp
- mllm/backends/qnn/aot/visitor/Elewise.cpp
- mllm/compile/ir/linalg/Op.cpp
🧰 Additional context used
📓 Path-based instructions (4)
{mllm,mllm-cli,pymllm}/**/*
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*: Files must not contain C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, or DEL 0x7F. Horizontal tab (0x09) and line feed (0x0A) are explicitly allowed.
All files must be encoded in UTF-8 without BOM.
Any violation of character set (Rule 1) or encoding (Rule 2) requirements must cause the review to fail.
No line may end with trailing whitespace.
Use Unix line endings (LF).
File and directory names must consist only of printable Unicode characters, excluding C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, and DEL 0x7F.
Only use acceptable file extensions: .c, .cc, .cpp, .cxx, .h, .hh, .hpp, .py, .pyi, .sh, .txt, .md, .yml, .yaml, .json, .toml.
Optional license headers, if present, must comply with character set rules (no C0/C1 control codes except tab and line feed).
Files:
- mllm/backends/qnn/aot/passes/MarkQnnGraphPass.cpp
- mllm/compile/ir/linalg/Op.cpp
- mllm/backends/qnn/aot/visitor/Elewise.cpp
- mllm/backends/qnn/aot/QnnWrappersAPI.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}: TODO and FIXME comments must be written as 'TODO:' or 'FIXME:' followed by UTF-8 text that adheres to character set rules.
Encourage consistent coding style and patterns with the existing codebase.
Ensure code is portable across supported platforms (e.g., Linux, Windows) unless explicitly platform-specific.
Files:
- mllm/backends/qnn/aot/passes/MarkQnnGraphPass.cpp
- mllm/compile/ir/linalg/Op.cpp
- mllm/backends/qnn/aot/visitor/Elewise.cpp
- mllm/backends/qnn/aot/QnnWrappersAPI.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}: Prioritize production-ready code quality by evaluating time and space complexity of algorithms and data structures, and suggest more efficient alternatives for operations with high complexity (e.g., O(n^2) or worse) when feasible.
Avoid unnecessary object creation in loops or hot paths.
Check for proper error handling and resource cleanup (e.g., using try-finally, context managers, or RAII).
Ensure functions that can fail return appropriate error codes or raise exceptions.
Validate inputs for public APIs and critical internal functions.
Add comments for complex algorithms or non-obvious logic.
Identify potential security issues (e.g., buffer overflows, injection risks, insecure temporary files) and recommend using secure alternatives (e.g., parameterized queries, secure random generators).
Suggest adding unit tests for untested complex logic or edge cases.
Ensure code is testable by avoiding global state and using dependency injection.
Flag overly complex functions (e.g., high cyclomatic complexity) and suggest breaking them down.
Use named constants instead of magic numbers.
Add appropriate logging (e.g., debug, info, warning, error) for significant events and errors, avoiding sensitive data exposure.
Files:
- mllm/backends/qnn/aot/passes/MarkQnnGraphPass.cpp
- mllm/compile/ir/linalg/Op.cpp
- mllm/backends/qnn/aot/visitor/Elewise.cpp
- mllm/backends/qnn/aot/QnnWrappersAPI.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}: Ensure public APIs, classes, and functions have clear docstrings or comments explaining purpose, parameters, returns, and errors.
Adhere to language-specific best practices and idioms (e.g., PEP 8 for Python, Google C++ Style Guide for C++).
Files:
- mllm/backends/qnn/aot/passes/MarkQnnGraphPass.cpp
- mllm/compile/ir/linalg/Op.cpp
- mllm/backends/qnn/aot/visitor/Elewise.cpp
- mllm/backends/qnn/aot/QnnWrappersAPI.cpp
🧬 Code graph analysis (4)
mllm/backends/qnn/aot/passes/MarkQnnGraphPass.cpp (4)
- mllm/compile/ir/linalg/Op.hpp (1): ctx(155-155)
- mllm/compile/ir/Node.hpp (6): IRWriter(354-354) IRWriter(355-355) op(468-468) op(470-470) name(302-302) name(306-306)
- mllm/compile/ir/Node.cpp (3): IRWriter(290-291) name(202-202) name(202-202)
- mllm/backends/qnn/aot/passes/MarkQnnGraphPass.hpp (1): op(17-17)

mllm/compile/ir/linalg/Op.cpp (4)
- mllm/core/aops/CmpOp.cpp (1): EqualOp(36-36)
- mllm/core/aops/CmpOp.hpp (1): EqualOp(15-15)
- mllm/core/aops/WhereOp.hpp (1): WhereOp(15-15)
- mllm/core/aops/WhereOp.cpp (1): WhereOp(12-12)

mllm/backends/qnn/aot/visitor/Elewise.cpp (4)
- mllm/backends/qnn/aot/passes/OpNamingPass.hpp (1): op(17-17)
- mllm/backends/qnn/aot/visitor/Base.hpp (3): op(17-17) op(17-17) g_name(21-23)
- mllm/backends/qnn/aot/visitor/Elewise.hpp (2): op(15-15) g_name(17-18)
- mllm/backends/qnn/aot/passes/MarkQnnGraphPass.hpp (1): op(17-17)

mllm/backends/qnn/aot/QnnWrappersAPI.cpp (1)
- mllm/backends/qnn/aot/QnnWrappersAPI.hpp (28): QnnAOTParamScalar(51-51) QnnAOTParamScalar(53-53) QnnAOTParamScalar(55-55) name(236-236) name(238-238) name(240-240) QnnAOTParamTensor(80-81) QnnAOTParamTensor(83-83) param_name(68-73) param_name(68-69) param_name(75-78) param_name(75-76) QnnAOTDataTypeSize(30-30) QnnAOTNodeTensor(104-104) v(102-102) v(102-102) ins(123-123) ous(125-125) input(127-127) output(129-129) params(131-131) params(135-135) param(133-133) param(137-137) op_name(139-139) package_name(141-141) QnnAOTEnv(232-232) QnnAOTEnv(234-234)
🪛 Clang (14.0.6)
mllm/backends/qnn/aot/passes/MarkQnnGraphPass.cpp
[error] 24-24: 2 adjacent parameters of 'recursiveVisitGraph' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
mllm/backends/qnn/aot/QnnWrappersAPI.cpp
[error] 56-56: 8 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-x86
- GitHub Check: build-macos
- GitHub Check: build-android
🔇 Additional comments (11)
mllm/compile/ir/linalg/Op.cpp (1)
117-118: LGTM! Operator registrations are correct. The mappings correctly associate `OpTypes::kEqual` with `EqualOp` and `OpTypes::kWhere` with `WhereOp`, matching their respective constructor implementations. The previous issue where `WhereOp` was incorrectly mapped to `OpTypes::kArgsort` has been resolved.

mllm/backends/qnn/aot/passes/MarkQnnGraphPass.cpp (2)

24-24: Static analysis warning is a false positive. The Clang warning about "easily swappable parameters" is incorrect. The two parameters have distinct types: `const ir::IRContext::ptr_t&` (shared pointer to IRContext) and `const ir::graph::GraphIROp::ptr_t&` (shared pointer to GraphIROp), so they cannot be accidentally swapped.

58-60: Past issue has been correctly addressed. The `op_visited` map is now properly updated at line 60 with `op_visited[name] = true;`, ensuring that each matching op name is marked only on its first occurrence. This resolves the concern raised in the previous review.

mllm/backends/qnn/aot/visitor/Elewise.cpp (3)

10-12: LGTM! The `isMatch` implementation correctly identifies AddOp nodes with the "using_qnn" attribute for QNN AOT pattern matching.

17-21: LGTM! The cast to `linalg::AddOp` includes appropriate error handling and logging.

23-27: LGTM! The error message has been corrected to accurately reference `aops::AddOp` instead of the previous incorrect message. The cast logic and error handling are appropriate.

mllm/backends/qnn/aot/QnnWrappersAPI.cpp (5)

65-89: LGTM! The scalar parameter constructors correctly initialize the QNN parameter structure for each type (bool, uint32_t, float), and the name lifetime is properly managed via the member variable.

91-112: LGTM! The tensor parameter constructor properly initializes all QNN tensor fields and correctly manages string/vector member lifetimes.

157-205: LGTM! The builder pattern methods correctly implement the fluent interface, allowing chainable operation configuration. The distinction between `add*` (for vectors) and `emplace*` (for single items) is clear and consistent.

251-255: LGTM! The constructor signature update to accept `const QcomTargetMachine&` is appropriate and correctly applied to both overloads.

378-390: LGTM! The conditional allocation of the context config array with null-termination is correctly implemented. The size check prevents unnecessary allocation, and memory is properly tracked via `unreachable_handle_`.
```cpp
      default:
        MLLM_ERROR("QnnAOTDataTypeSize: unsupported Qnn_DataType_t {}", static_cast<int>(dtype));
        MLLM_RT_ASSERT(false);
        return 0;
    }
```
Guard against assertion-disabled builds in error path.
If assertions are compiled out, the function returns 0 for unsupported types, which would cause incorrect memory calculations (potentially leading to zero-sized allocations or buffer underflows). Consider using a fatal error mechanism that cannot be disabled, or throw an exception.
🔎 Proposed fix

```diff
   default:
     MLLM_ERROR("QnnAOTDataTypeSize: unsupported Qnn_DataType_t {}", static_cast<int>(dtype));
-    MLLM_RT_ASSERT(false);
-    return 0;
+    std::abort(); // or throw std::runtime_error("Unsupported Qnn_DataType_t");
 }
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In mllm/backends/qnn/aot/QnnWrappersAPI.cpp around lines 58-62, the
default/error path currently relies on MLLM_RT_ASSERT(false) followed by
returning 0 which can silently return 0 when assertions are disabled; replace
that with an unconditional fatal behavior that cannot be compiled out — e.g.,
throw a std::runtime_error with a descriptive message that includes the
unsupported dtype (static_cast<int>(dtype)), or if exceptions are not allowed in
this module call a guaranteed-terminate function (abort/std::terminate or your
project’s MLLM_FATAL equivalent) with the same descriptive message; ensure the
function does not return a value after that call so callers cannot receive 0 for
an unsupported type.
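To make the failure mode concrete, here is a minimal self-contained sketch of the suggested pattern. The function `dtype_size_or_throw` and its dtype tags are hypothetical stand-ins for `QnnAOTDataTypeSize` and `Qnn_DataType_t`, not the actual QNN API; the point is only that the error path fires in every build mode.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for QnnAOTDataTypeSize. Unsupported tags throw
// instead of returning 0: assert(false) / MLLM_RT_ASSERT(false) compiles out
// under NDEBUG, but a throw (or std::abort) cannot be disabled.
std::size_t dtype_size_or_throw(int dtype_tag) {
  switch (dtype_tag) {
    case 0: return 1;  // e.g. an 8-bit element type
    case 1: return 2;  // e.g. a 16-bit element type
    case 2: return 4;  // e.g. a 32-bit element type
    default:
      throw std::runtime_error("unsupported dtype tag");  // unconditional
  }
}
```

Callers then either handle the exception or crash loudly, but can never silently receive 0 and compute a zero-sized allocation.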
```cpp
void* QnnAOTParamTensor::alloc() {
  uint32_t data_size = QnnAOTDataTypeSize(QNN_TENSOR_GET_DATA_TYPE(qnn_param_.tensorParam));
  for (int i = 0; i < QNN_TENSOR_GET_RANK(qnn_param_.tensorParam); i++) {
    data_size *= qnn_param_.tensorParam.v2.dimensions[i];
  }
  Qnn_ClientBuffer_t clientBuffer = {.data = malloc(data_size), .dataSize = data_size};
  QNN_TENSOR_SET_CLIENT_BUF(qnn_param_.tensorParam, clientBuffer);
  MLLM_RT_ASSERT(QNN_TENSOR_GET_CLIENT_BUF(qnn_param_.tensorParam).data != nullptr);
  return QNN_TENSOR_GET_CLIENT_BUF(qnn_param_.tensorParam).data;
}
```
Add overflow protection for size calculation.
The `data_size` calculation multiplies dimensions without overflow checking. With sufficiently large dimension values, the product could overflow and result in a small allocation, leading to buffer overflows when the tensor is populated.
🔎 Proposed fix using checked multiplication

```diff
 void* QnnAOTParamTensor::alloc() {
-  uint32_t data_size = QnnAOTDataTypeSize(QNN_TENSOR_GET_DATA_TYPE(qnn_param_.tensorParam));
+  size_t data_size = QnnAOTDataTypeSize(QNN_TENSOR_GET_DATA_TYPE(qnn_param_.tensorParam));
   for (int i = 0; i < QNN_TENSOR_GET_RANK(qnn_param_.tensorParam); i++) {
-    data_size *= qnn_param_.tensorParam.v2.dimensions[i];
+    size_t dimension = qnn_param_.tensorParam.v2.dimensions[i];
+    if (data_size > 0 && dimension > SIZE_MAX / data_size) {
+      MLLM_ERROR("QnnAOTParamTensor::alloc: size overflow detected");
+      MLLM_RT_ASSERT(false);
+      return nullptr;
+    }
+    data_size *= dimension;
   }
   Qnn_ClientBuffer_t clientBuffer = {.data = malloc(data_size), .dataSize = data_size};
```

🤖 Prompt for AI Agents
In mllm/backends/qnn/aot/QnnWrappersAPI.cpp around lines 120 to 129, the
data_size calculation multiplies element size by each dimension without overflow
checks; change data_size to size_t, compute iteratively and before each
multiplication verify that data_size <= SIZE_MAX / dimension (and that dimension
> 0), and if the check fails handle it by logging/erroring/returning nullptr or
asserting instead of proceeding to malloc; also verify QnnAOTDataTypeSize result
fits into size_t before starting and ensure malloc is checked for null before
using the buffer.
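The checked multiplication can also be factored into a standalone helper, sketched below. `checked_tensor_bytes` is a hypothetical name for illustration, not part of the QNN API or this PR; it shows the same guard (`total > SIZE_MAX / d`) in isolation.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper sketching the checked multiplication suggested above:
// multiply the element size by each dimension, returning std::nullopt as soon
// as the running product would exceed SIZE_MAX instead of wrapping around.
std::optional<std::size_t> checked_tensor_bytes(std::size_t elem_size,
                                                const std::vector<std::uint32_t>& dims) {
  std::size_t total = elem_size;
  for (std::uint32_t d : dims) {
    if (d == 0) return std::size_t{0};              // empty tensor: zero bytes
    if (total > SIZE_MAX / d) return std::nullopt;  // multiplication would overflow
    total *= d;
  }
  return total;
}
```

A caller would treat `std::nullopt` the same way the proposed fix treats the overflow branch: log, assert, and refuse to call `malloc`.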
```cpp
  // TODO

  return true;
```
TODO comment format and incomplete implementation.

The TODO comment on line 29 doesn't follow the coding guidelines format. Per the guidelines, TODO comments must be written as `TODO:` followed by a description.

Additionally, the function currently performs validation but doesn't implement the actual node-addition logic. The parameters `g_name`, `inputs`, and `outputs` remain unused, and the function returns `true` despite not adding any QNN node to the graph.
🔎 Suggested TODO format

```diff
-  // TODO
+  // TODO: Implement QNN graph node addition for AddOp
```

Would you like me to help implement the QNN node addition logic for this pattern, or open a new issue to track this task?
Based on coding guidelines, TODO comments must follow the format 'TODO:' followed by UTF-8 text.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```cpp
  // TODO: Implement QNN graph node addition for AddOp

  return true;
```
Summary by CodeRabbit
New Features
Improvements