spec : save the dynamic/static ngram cache file #22055
petersid2022 wants to merge 1 commit into ggml-org:master
      };

    - struct common_params_speculative_ngram_cache {
    + struct common_params_speculative_ngram_cache : common_params_speculative_ngram_map {
This is probably the wrong way of going about this, but I am curious whether the same concept of m-gram speculative tokens can be applied in the ngram-cache implementation.
The new parameters are never populated. Did you test this change? What is the goal of this PR?
First of all, thanks for taking the time to review my PR!

TBH, my initial scope for this PR (after coming across the TODO on line 930 of common/speculative.cpp) was to move the

The way I went about testing my changes was using the below command:

P.S.: In the
* fix todo on providing n_draft, save_static and save_dynamic from common/common.h
* implement the functionality by saving the cache at the common_speculative_state_ngram_cache destruction
Overview
When we select the `COMMON_SPECULATIVE_TYPE_NGRAM_CACHE` speculative implementation, we create a new `common_speculative_state_ngram_cache` state using `create_state_ngram_cache`, where we instantiate the new state by hardcoding various parameters (e.g., `n_draft`, `save_static` and `save_dynamic`).

Instead, we extend `common_params_speculative` to include those options as well.

An attempt was also made to implement the `save_static`/`save_dynamic` behavior by calling `common_ngram_cache_save` on object destruction.

Additional information
Add self‑speculative decoding (no draft model required) #18471
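
For readers unfamiliar with the pattern, the save-on-destruction idea from the Overview can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: every `toy_*` name, the on-disk layout, and the default values are assumptions made up for the example, and it does not call `common_ngram_cache_save` or any other llama.cpp function. The options are passed in through a params struct instead of being hardcoded, and the state object writes its caches when it goes out of scope.

```cpp
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an n-gram cache: n-gram (token ids) -> hit count.
using toy_ngram_cache = std::map<std::vector<int32_t>, int32_t>;

// Hypothetical options named after the ones mentioned in the PR description.
struct toy_ngram_cache_params {
    int         n_draft = 8;   // assumed default, for illustration only
    std::string save_static;   // empty means "do not save"
    std::string save_dynamic;  // empty means "do not save"
};

// Write each entry as: n-gram length, n-gram tokens, count (toy format).
static bool toy_cache_save(const toy_ngram_cache & cache, const std::string & path) {
    std::ofstream out(path, std::ios::binary);
    if (!out) {
        return false;
    }
    for (const auto & kv : cache) {
        const uint32_t n = (uint32_t) kv.first.size();
        out.write(reinterpret_cast<const char *>(&n), sizeof(n));
        out.write(reinterpret_cast<const char *>(kv.first.data()), n * sizeof(int32_t));
        out.write(reinterpret_cast<const char *>(&kv.second), sizeof(kv.second));
    }
    return out.good();
}

// Read back the toy format written by toy_cache_save.
static toy_ngram_cache toy_cache_load(const std::string & path) {
    toy_ngram_cache cache;
    std::ifstream in(path, std::ios::binary);
    uint32_t n = 0;
    while (in.read(reinterpret_cast<char *>(&n), sizeof(n))) {
        std::vector<int32_t> ngram(n);
        int32_t count = 0;
        in.read(reinterpret_cast<char *>(ngram.data()), n * sizeof(int32_t));
        in.read(reinterpret_cast<char *>(&count), sizeof(count));
        if (!in) {
            break; // truncated file: stop and keep what was read so far
        }
        cache[ngram] = count;
    }
    return cache;
}

// Hypothetical state object: options come from params, caches are saved on destruction.
struct toy_state_ngram_cache {
    toy_ngram_cache_params params;
    toy_ngram_cache        cache_static;
    toy_ngram_cache        cache_dynamic;

    explicit toy_state_ngram_cache(toy_ngram_cache_params p) : params(std::move(p)) {}

    // Non-copyable, so a copy cannot write the same file a second time.
    toy_state_ngram_cache(const toy_state_ngram_cache &) = delete;
    toy_state_ngram_cache & operator=(const toy_state_ngram_cache &) = delete;

    ~toy_state_ngram_cache() {
        // A destructor cannot return an error, so failures are only logged.
        if (!params.save_static.empty() && !toy_cache_save(cache_static, params.save_static)) {
            std::fprintf(stderr, "failed to save static cache to %s\n", params.save_static.c_str());
        }
        if (!params.save_dynamic.empty() && !toy_cache_save(cache_dynamic, params.save_dynamic)) {
            std::fprintf(stderr, "failed to save dynamic cache to %s\n", params.save_dynamic.c_str());
        }
    }
};

// Round trip: fill the dynamic cache, let the state go out of scope, reload the file.
static bool demo_roundtrip(const std::string & path) {
    const std::vector<int32_t> key = {1, 2, 3};
    {
        toy_ngram_cache_params params;
        params.save_dynamic = path;
        toy_state_ngram_cache state(params);
        state.cache_dynamic[key] = 5;
    } // destructor runs here and writes the file
    const toy_ngram_cache loaded = toy_cache_load(path);
    std::remove(path.c_str());
    return loaded.size() == 1 && loaded.count(key) == 1 && loaded.at(key) == 5;
}
```

Calling `demo_roundtrip` with a writable path exercises the whole flow. One trade-off of this design is that a failed write can only be logged, since a destructor has no way to report an error to the caller; an explicit save call would avoid that at the cost of an extra step.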
Requirements