Skip to content

add missing kv clear in llama_beam_search#6664

Merged
compilade merged 1 commit intoggml-org:masterfrom
dwrensha:fix-beam-search
Apr 14, 2024
Merged

add missing kv clear in llama_beam_search#6664
compilade merged 1 commit intoggml-org:masterfrom
dwrensha:fix-beam-search

Conversation

@dwrensha
Copy link
Copy Markdown
Contributor

Adds a call to llama_kv_cache_seq_rm() in llama_beam_search_data::fill_next_beams_by_top_probabilities().

This seems to be necessary for the subsequent llama_decode() calls -- which can act on the same sequence at the same position with different test tokens -- to return reasonable results.

Before this change:

$ bin/beam-search ~/models/mistral-7b-v0.1.Q8_0.gguf 3 "fibonacci: 1, 1, 2, 3, 5, 8, 13,"
...
 21, 341, 610, 987, 1597, 233

After this change:

$ bin/beam-search ~/models/mistral-7b-v0.1.Q8_0.gguf 3 "fibonacci: 1, 1, 2, 3, 5, 8, 13,"
...
 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946, 17711, 28657, 46368, 75025, 121393, 196418, 317811, 514229, 832040, 1346269, 2178309, 3524578, 5702887, 9227465, 14930352, 24157817, 39088169, 63245986, 102334155, 165580141, 26791429

@dwrensha
Copy link
Copy Markdown
Contributor Author

dwrensha commented Apr 13, 2024

cc @mattpulver, who added beam search in #2267.

Copy link
Copy Markdown
Collaborator

@compilade compilade left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

Throwing an idea here for later: I think the beam search should eventually be adapted to use one seq_id per beam to facilitate the logic for sharing KV cells between beams, and to allow parallel beam search.

@dwrensha
Copy link
Copy Markdown
Contributor Author

I think the beam search should eventually be adapted to use one seq_id per beam to facilitate the logic for sharing KV cells between beams, and to allow parallel beam search.

I agree!

@compilade compilade merged commit 1958f7e into ggml-org:master Apr 14, 2024
@dwrensha dwrensha deleted the fix-beam-search branch April 14, 2024 19:26
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
phuongncn pushed a commit to phuongncn/llama.cpp-gx10-dgx-sparks-deepseekv4 that referenced this pull request Apr 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants