llama : infill sampling handle very long tokens by ggerganov · Pull Request #9924 · ggml-org/llama.cpp

ggerganov · 2024-10-17T13:05:01Z

The infill sampler now correctly handles tokens with very long text (e.g. line indentation). The token-merging logic should also be clearer.

API changes

Remove the recently added llama_token_is_prefix. It was technically incorrect for very long tokens and I don't want to make it allocate too much stack memory as we have no upper bound for the token string length.
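To illustrate the concern above, here is a minimal sketch of a prefix check between two token texts that does not rely on a fixed-size stack buffer. It is not the code from this PR and not the removed function: `toy_vocab`, `toy_token_to_piece`, `token_text` and `token_is_prefix` are hypothetical names, and the "return the negated required size when the buffer is too small" convention is an assumption made for this example.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a vocabulary: token id -> token text.
struct toy_vocab {
    std::vector<std::string> pieces;
};

// Hypothetical stand-in for a "token to text" call that writes into a
// caller-provided buffer. Assumed convention: returns the number of bytes
// written, or the negated required size when the buffer is too small.
static int toy_token_to_piece(const toy_vocab & vocab, int token, char * buf, int buf_size) {
    const std::string & piece = vocab.pieces.at(token);
    const int n = (int) piece.size();
    if (n > buf_size) {
        return -n;
    }
    std::memcpy(buf, piece.data(), n);
    return n;
}

// Fetch the full text of a token into a heap buffer, growing it on demand,
// so that no upper bound on the token length has to be assumed.
static std::string token_text(const toy_vocab & vocab, int token) {
    std::string buf(16, '\0');
    int n = toy_token_to_piece(vocab, token, buf.data(), (int) buf.size());
    if (n < 0) {
        buf.resize(-n); // retry with exactly the required size
        n = toy_token_to_piece(vocab, token, buf.data(), (int) buf.size());
        assert(n >= 0);
    }
    buf.resize(n);
    return buf;
}

// True when the text of `prefix` is a prefix of the text of `token`.
static bool token_is_prefix(const toy_vocab & vocab, int prefix, int token) {
    const std::string p = token_text(vocab, prefix);
    const std::string t = token_text(vocab, token);
    return t.size() >= p.size() && t.compare(0, p.size(), p) == 0;
}
```

With a fixed stack buffer, a token longer than the buffer would be truncated or rejected, and the comparison could give the wrong answer. The sketch trades that for a heap allocation per call, which is one reason to keep such logic inside the sampler, where buffers can be reused, rather than in a public API function.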

ggml-ci

* llama : infill sampling handle very long tokens ggml-ci * cont : better indices ggml-ci

ggerganov added 2 commits October 17, 2024 16:00

llama : infill sampling handle very long tokens

99c4a39

ggml-ci

cont : better indices

7899c67

ggml-ci

ggerganov mentioned this pull request Oct 17, 2024

server : add n_indent parameter for line indentation requirement #9929

Merged

ggerganov merged commit 99bd4ac into master Oct 17, 2024

ggerganov deleted the gg/infill-4 branch October 17, 2024 19:32

drollings pushed a commit to drollings/llama.cpp that referenced this pull request Oct 18, 2024

llama : infill sampling handle very long tokens (ggml-org#9924)

b440d03

* llama : infill sampling handle very long tokens ggml-ci * cont : better indices ggml-ci

dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request Oct 29, 2024

llama : infill sampling handle very long tokens (ggml-org#9924)

d0c3418

* llama : infill sampling handle very long tokens ggml-ci * cont : better indices ggml-ci

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024

llama : infill sampling handle very long tokens (ggml-org#9924)

d066fa7

* llama : infill sampling handle very long tokens ggml-ci * cont : better indices ggml-ci

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024

llama : infill sampling handle very long tokens (ggml-org#9924)

be7c08c

* llama : infill sampling handle very long tokens ggml-ci * cont : better indices ggml-ci

Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026

llama : infill sampling handle very long tokens (ggml-org#9924)

e790309

* llama : infill sampling handle very long tokens ggml-ci * cont : better indices ggml-ci

ggerganov merged 2 commits into master from gg/infill-4
