Improve graph build time #2329
@JohannesGaessler this may explain the performance uplift that you observed in #2230 (comment).
Replaced the |
I think this is as good as it is going to get. Ultimately it is a trade-off between maintaining backwards compatibility and memory usage. I am inclined to prefer the hash table version, but it increases the size of a struct that is already very large. That won't be an issue once we change graphs to be allocated on the heap/context. Some performance numbers:
Some recent changes in master reduced the graph build times as well; I have no idea why.
ggerganov
left a comment
Nice, we should implement ggml-org/ggml#299 soon.
* improve graph build time
* ggml_tensor : use 1 bit per flag
* use a hash table instead
Instead of checking the list of current nodes in ggml_visit_parents() to determine if a node has already been visited, a visited flag is added to ggml_tensor.

Unfortunately this requires resetting the visited flag to reuse the tensors in a different graph, and because in some cases graphs are built in multiple steps, it is not easy to do this automatically. Currently, this is done automatically when calling ggml_graph_plan(), but there may be some cases where it can cause current ggml code to fail silently.
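
For illustration, here is a minimal sketch of the hash table alternative mentioned in the conversation and commit messages. It is not the code from this PR: `struct node`, `struct graph`, `visited_insert()`, `visit_parents()`, and the sizes `MAX_NODES` and `HASH_SIZE` are all hypothetical stand-ins. The idea it shows is that the visited set lives in the graph rather than in the tensor, so the same tensors can be added to a second graph without resetting any per-tensor flag, at the cost of a larger graph struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 64   /* hypothetical capacity, not ggml's */
#define HASH_SIZE 127  /* prime, larger than MAX_NODES so the set never fills */

/* Hypothetical stand-in for a tensor with up to two parents. */
struct node {
    struct node * src[2];
};

/* Hypothetical graph: ordered node list plus a per-graph visited set. */
struct graph {
    int           n_nodes;
    struct node * nodes[MAX_NODES];
    const void  * visited[HASH_SIZE]; /* open addressing, keyed by pointer */
};

/* Insert p into the set. Returns true if p was already present. */
static bool visited_insert(const void * table[], const void * p) {
    size_t h = (size_t) ((uintptr_t) p % HASH_SIZE);
    for (size_t i = 0; i < HASH_SIZE; i++) {
        size_t slot = (h + i) % HASH_SIZE; /* linear probing */
        if (table[slot] == NULL) {
            table[slot] = p;
            return false;
        }
        if (table[slot] == p) {
            return true;
        }
    }
    assert(!"visited set is full");
    return false;
}

/* Post-order traversal: parents are appended before the node using them.
 * The membership test is an expected O(1) lookup instead of a scan over
 * the nodes already in the graph. */
static void visit_parents(struct graph * g, struct node * n) {
    if (n == NULL || visited_insert(g->visited, n)) {
        return;
    }
    visit_parents(g, n->src[0]);
    visit_parents(g, n->src[1]);
    assert(g->n_nodes < MAX_NODES);
    g->nodes[g->n_nodes++] = n;
}
```

A zero-initialized `struct graph` starts with an empty visited set, so building a second graph from the same nodes needs no reset step. The trade-off is the one noted in the discussion: the table adds `HASH_SIZE` pointers to every graph.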