CANN: add optional support for ACL Graph execution#15065
Merged
hipudding merged 4 commits into ggml-org:master on Aug 6, 2025
Collaborator
Author
Model: Qwen2.5-0.5B
With ACL Graph Test:
user
Building a website can be done in 10 steps:
assistant
Certainly! Here are 10 steps to help you build a website:
1. **Define Your Purpose and Goals**
- Identify what you want your website to do.
- Determine your audience and what they want.
2. **Choose the Right Platform**
- Select a web development platform that suits your needs.
- Consider factors like stability, security, and scalability.
3. **Set Up a Domain Name**
- Choose a domain name that is easy to remember and reflects your website.
- Ensure it is available and easy to register.
4. **Choose a Domain Hosting Provider**
- Select a reliable domain hosting service that meets your needs.
- Consider features like speed, security, and scalability.
5. **Choose a Content Management System (CMS)**
- Select a CMS that fits your needs and the complexity of your website.
- Choose based on features, ease of use, and community support.
6. **Design Your Website**
- Choose a design that is visually appealing and easy to navigate.
- Consider the style, color scheme, and layout.
7. **Choose a Content Writer**
- Hire a skilled content writer to create the content.
- Consider the writer's skills, experience, and style.
8. **Choose a Website Builder**
- Select a website builder that suits your needs.
- Choose based on features, ease of use, and scalability.
9. **Implement Your Website**
- Use your chosen content management system and website builder to create your website.
- Test the website thoroughly for errors and usability.
10. **Launch and Promote Your Website**
- Launch your website and start promoting it.
- Engage with your audience through social media, email marketing, and other channels.
11. **Monitor and Update**
- Monitor the performance of your website.
- Keep your website updated with new content and features.
12. **Refine and Improve**
- Regularly review and refine your website content and design.
- Stay updated with the latest trends and best practices in web development.
These steps should help you build a website that meets your needs and helps drive traffic to your site.
>
llama_perf_sampler_print: sampling time = 117.75 ms / 469 runs ( 0.25 ms per token, 3982.91 tokens per second)
llama_perf_context_print: load time = 7527.43 ms
llama_perf_context_print: prompt eval time = 40.20 ms / 20 tokens ( 2.01 ms per token, 497.49 tokens per second)
llama_perf_context_print: eval time = 3754.24 ms / 448 runs ( 8.38 ms per token, 119.33 tokens per second)
llama_perf_context_print: total time = 4803.95 ms / 468 tokens
llama_perf_context_print: graphs reused = 433
hipudding
reviewed
Aug 5, 2025
Contributor
hipudding
left a comment
Thank you for your contribution. Enabling acl_graph support greatly reduces NPU idle cycles, with particularly noticeable performance gains on models with smaller parameter sizes.
The code is basically fine; only a few minor changes are needed.
Contributor
I think use_cann_graph is not necessary. You only need to set cann_graph_update_required = true, whether acl_graph is on or off.
Collaborator
Author
I think use_cann_graph is still needed. Even when cann_graph is enabled, we may fall back to eager mode if LLAMA_SET_ROWS is not set. So we need a way to track that cann_graph is not being used in this case.
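To illustrate the point above, here is a minimal sketch of how a separate use_cann_graph value could be derived. It is not the code from this PR: the helper names `set_rows_allows_graph` and `decide_use_cann_graph` are hypothetical, and the rule that only a non-zero integer counts as a valid LLAMA_SET_ROWS value is an assumption made for the example.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical helper, not the PR's code: returns true when the raw value of
// LLAMA_SET_ROWS (nullptr if unset) permits graph mode.
// Assumption: only a value that parses fully as a non-zero integer is valid.
static bool set_rows_allows_graph(const char * raw) {
    if (raw == nullptr || *raw == '\0') {
        std::fprintf(stderr, "LLAMA_SET_ROWS is unset; falling back to eager mode\n");
        return false;
    }
    char * end = nullptr;
    const long value = std::strtol(raw, &end, 10);
    if (*end != '\0' || value == 0) {
        std::fprintf(stderr, "LLAMA_SET_ROWS=%s does not enable graph mode; falling back to eager mode\n", raw);
        return false;
    }
    return true;
}

// Hypothetical wrapper: graph mode is used only if it was compiled in AND the
// runtime check passes. The result plays the role of use_cann_graph.
static bool decide_use_cann_graph(bool compiled_with_acl_graph, const char * raw) {
    return compiled_with_acl_graph && set_rows_allows_graph(raw);
}
```

A caller would pass `std::getenv("LLAMA_SET_ROWS")` as the second argument. Keeping the result in its own flag lets the backend distinguish "graph mode compiled in but not usable for this run" from "graph needs to be re-captured", which is the distinction argued for in the reply above.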
This commit adds support for executing ggml computational graphs using
Huawei's ACL graph mode via the USE_CANN_GRAPH flag. The support can be
enabled at compile time using the CMake option:
-DUSE_CANN_GRAPH=ON
By default, ACL graph execution is **disabled**, and the fallback path
uses node-by-node execution.
Key additions:
- CMake option to toggle graph mode
- Graph capture and execution logic using
- Tensor property matching to determine whether graph update is required
- Safe fallback and logging if the environment variable LLAMA_SET_ROWS
is unset or invalid
This prepares the backend for performance improvements in repetitive graph
execution scenarios on Ascend devices.
Signed-off-by: noemotiovon <757486878@qq.com>
hipudding
approved these changes
Aug 6, 2025
wangweixuan
pushed a commit
to wangweixuan/llama.cpp
that referenced
this pull request
Dec 4, 2025
* feat(cann): add optional support for ACL Graph execution
This commit adds support for executing ggml computational graphs using
Huawei's ACL graph mode via the USE_CANN_GRAPH flag. The support can be
enabled at compile time using the CMake option:
-DUSE_CANN_GRAPH=ON
By default, ACL graph execution is **disabled**, and the fallback path
uses node-by-node execution.
Key additions:
- CMake option to toggle graph mode
- Graph capture and execution logic using
- Tensor property matching to determine whether graph update is required
- Safe fallback and logging if the environment variable LLAMA_SET_ROWS
is unset or invalid
This prepares the backend for performance improvements in repetitive graph
execution scenarios on Ascend devices.
Signed-off-by: noemotiovon <757486878@qq.com>
* Fix review comments
Signed-off-by: noemotiovon <757486878@qq.com>
* rename USE_CANN_GRAPH to USE_ACL_GRAPH
Signed-off-by: noemotiovon <757486878@qq.com>
* fix typo
Signed-off-by: noemotiovon <757486878@qq.com>
---------
Signed-off-by: noemotiovon <757486878@qq.com>
blime4
referenced
this pull request
in blime4/llama.cpp
Feb 5, 2026
Seunghhon
pushed a commit
to Seunghhon/llama.cpp
that referenced
this pull request
Apr 26, 2026
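This commit adds support for executing ggml computational graphs using Huawei's ACL graph mode via the USE_ACL_GRAPH flag. The support can be enabled at compile time using the CMake option:
-DUSE_ACL_GRAPH=ON
By default, ACL graph execution is disabled, and the fallback path uses node-by-node execution.
Key additions:
- CMake option to toggle graph mode
- Graph capture and execution logic using
- Tensor property matching to determine whether graph update is required
- Safe fallback and logging if the environment variable LLAMA_SET_ROWS is unset or invalid
This prepares the backend for performance improvements in repetitive graph execution scenarios on Ascend devices.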
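The tensor property matching mentioned in the description can be sketched roughly as follows. This is an illustrative sketch only, not the backend's implementation: `node_props`, `graph_cache`, and `graph_update_required` are hypothetical names, and the choice of properties to compare (data pointer, op code, shape, strides) is an assumption about what would need to match before a captured graph can be replayed.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical snapshot of the properties of one graph node.
struct node_props {
    const void *           data;  // address of the node's output buffer
    int                    op;    // operation code
    std::array<int64_t, 4> ne;    // number of elements per dimension
    std::array<size_t, 4>  nb;    // stride in bytes per dimension

    bool operator==(const node_props & other) const {
        return data == other.data && op == other.op &&
               ne == other.ne && nb == other.nb;
    }
};

// Hypothetical cache holding the properties seen at the last capture.
struct graph_cache {
    std::vector<node_props> props;
    bool                    captured = false;
};

// Returns true when the graph must be (re)captured: either nothing has been
// captured yet, or a node count or node property differs from the cached one.
// On a mismatch the cache is refreshed with the current properties.
bool graph_update_required(graph_cache & cache, const std::vector<node_props> & current) {
    const bool required = !cache.captured || !(cache.props == current);
    if (required) {
        cache.props    = current;
        cache.captured = true;
    }
    return required;
}
```

When the function returns false, the previously captured graph could be replayed as-is, which is the case the "graphs reused" counter in the test log above appears to count. When it returns true, the backend would capture again before executing.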