
hexagon: various Op fixes#17135

Merged
max-krasnyansky merged 4 commits into ggml-org:master from qualcomm:hexagon-op-fixes
Nov 11, 2025
Conversation

@max-krasnyansky
Member

Introduce fastdiv and fix test-backend-ops failures for ADD/SUB/MUL
Thanks @chraac!
Subsumes #17042

Fixed inference with Qwen3-VL models that generate graphs with ne[1] == 0

@chraac
Contributor

chraac commented Nov 10, 2025

Great work! I've tested the fix on my device and it works well.

One small thing: init_fastdiv_values is called for each binary operation during each calculation.
However, this should have minimal performance impact since it only contains constant shifts and mpys, so I think it's acceptable.

Looking ahead, have you considered maintaining a similar structure to store these values on the CPU? That could be optimal since the weight dimensions remain fixed throughout all stages.
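For context on what `init_fastdiv_values` precomputes, here is a minimal, self-contained sketch of the magic-multiply division technique the comment refers to (constant shifts and multiplies in place of a hardware divide). The names `fastdiv_vals`, `init_fastdiv_vals`, and `fastdiv` are illustrative, not the actual hexagon backend code:

```cpp
#include <cassert>
#include <cstdint>

// Precomputed parameters for dividing by a fixed divisor d:
// a 32-bit magic multiplier mp and a shift l = ceil(log2(d)).
struct fastdiv_vals {
    uint32_t mp;
    uint32_t l;
};

static fastdiv_vals init_fastdiv_vals(uint32_t d) {
    uint32_t l = 0;
    while ((uint64_t{1} << l) < d) {
        ++l;              // l = ceil(log2(d))
    }
    fastdiv_vals v;
    v.l  = l;
    v.mp = (uint32_t)(((uint64_t{1} << 32) * ((uint64_t{1} << l) - d)) / d + 1);
    return v;
}

static uint32_t fastdiv(uint32_t n, fastdiv_vals v) {
    // n / d computed as a multiply-high, an add, and a shift.
    uint32_t hi = (uint32_t)(((uint64_t)n * v.mp) >> 32);
    return (hi + n) >> v.l;
}
```

The precomputation is cheap (a short loop plus one 64-bit divide per divisor), which is why calling it once per binary op is tolerable, while the per-element hot path needs no divide at all.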

Contributor

@chraac chraac left a comment


lgtm!

max-krasnyansky and others added 4 commits November 10, 2025 15:03
llm_graph_context::build_inp_out_ids() can generate tensors with zero nrows.
Somehow, other backends seem to handle this without obvious explicit checks.
In the hexagon case we need to check explicitly and skip them.
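The guard described above reduces to a shape check: a tensor with a 0 anywhere in its `ne` dimensions has zero rows/elements, so the op is a NOP. A minimal sketch, with illustrative names rather than the actual backend code:

```cpp
#include <cassert>
#include <cstdint>

// A tensor shape is empty (zero nrows/elements) if any dimension is 0,
// e.g. the ne[1] == 0 tensors produced by build_inp_out_ids().
constexpr int MAX_DIMS = 4;

struct tensor_shape {
    int64_t ne[MAX_DIMS];
};

static bool shape_is_empty(const tensor_shape & t) {
    for (int i = 0; i < MAX_DIMS; ++i) {
        if (t.ne[i] == 0) {
            return true;  // no elements: the op can be skipped
        }
    }
    return false;
}
```

In the actual fix this kind of check is what `ggml_is_empty` / `ggml_op_is_empty` provide, per the commit messages below.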
@max-krasnyansky max-krasnyansky marked this pull request as ready for review November 10, 2025 23:45
@max-krasnyansky
Member Author

> Great work! I've tested the fix on my device and it works well.
>
> One small thing: init_fastdiv_values is called for each binary operation during each calculation. However, this should have minimal performance impact since it only contains constant shifts and mpys, so I think it's acceptable.
>
> Looking ahead, have you considered maintaining a similar structure to store these values on the CPU? That could be optimal since the weight dimensions remain fixed throughout all stages.

Yeah, I did lots of profiling runs and the overall perf is the same as before, but all test-backend-ops for ADD/SUB/MUL/ADD_ID are passing now. Definitely acceptable :)

And yes, let's think about caching those. I was thinking host at first as we discussed in the other PR. But maybe it makes sense to allocate a little bit of vtcm and cache them there.
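The caching idea floated here (host-side at first, possibly VTCM later) could be sketched as a memoization table keyed by divisor: since weight dimensions stay fixed across graph evaluations, each divisor's parameters need computing only once. This is a hypothetical illustration of the design being discussed, not backend code:

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Illustrative fastdiv parameters: magic multiplier + shift.
struct fastdiv_params {
    uint32_t mp;
    uint32_t shift;
};

static fastdiv_params compute_fastdiv_params(uint32_t d) {
    uint32_t l = 0;
    while ((uint64_t{1} << l) < d) {
        ++l;
    }
    return { (uint32_t)(((uint64_t{1} << 32) * ((uint64_t{1} << l) - d)) / d + 1), l };
}

// Host-side cache: compute once per divisor, reuse on every evaluation.
struct fastdiv_cache {
    std::unordered_map<uint32_t, fastdiv_params> by_divisor;

    const fastdiv_params & get(uint32_t d) {
        auto it = by_divisor.find(d);
        if (it == by_divisor.end()) {
            it = by_divisor.emplace(d, compute_fastdiv_params(d)).first;
        }
        return it->second;
    }
};
```

A VTCM-resident variant would trade a little of that scarce scratch memory for keeping the parameters next to the DSP compute, as the comment suggests.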

@max-krasnyansky
Member Author

@lhez please take a look

Contributor

@lhez lhez left a comment


Looks good!

@max-krasnyansky max-krasnyansky merged commit c273d75 into ggml-org:master Nov 11, 2025
70 of 71 checks passed
Anico2 added a commit to Anico2/llama.cpp that referenced this pull request Jan 15, 2026
* hexagon: explicitly check for ops with zero nrows

llm_graph_context::build_inp_out_ids() can generate tensors with zero nrows.
Somehow, other backends seem to handle this without obvious explicit checks.
In the hexagon case we need to check explicitly and skip them.

* hexagon: introduce fastdiv, fix test-backend-ops for ADD/SUB/MUL

Co-authored-by: chraac <chraac@gmail.com>

* hexagon: use fastdiv in ADD_ID

* hexagon: use ggml_op_is_empty and ggml_is_empty to check for NOPs

---------

Co-authored-by: chraac <chraac@gmail.com>
@max-krasnyansky max-krasnyansky deleted the hexagon-op-fixes branch January 25, 2026 23:01
blime4 referenced this pull request in blime4/llama.cpp Feb 5, 2026
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026

Labels

ggml changes relating to the ggml tensor library for machine learning


3 participants