Check for reverse prompt by characters instead of tokens (#292)#330

Merged

ggerganov merged 5 commits intoggml-org:masterfrom

tjohnman:bugfix-292

Mar 21, 2023

Contributor

tjohnman commented Mar 20, 2023 •

edited

Loading

This fixes bug #292 as suggested here.

Johnman and others added 3 commits

March 20, 2023 16:06


          Check for reverse prompt by characters instead of tokens (ggml-org#292)

b646ffa


          Update main.cpp

e9f7747

Wording.


          Cleanup.

6242d1c

gjmulder added the bug label

Collaborator

Green-Sky commented Mar 20, 2023 •

edited

Loading

Not sure we need the string stream here.

Contributor Author

tjohnman commented Mar 20, 2023 •

edited

Loading

Not sure we need the string stream here.

@Green-Sky Should we use std::string::append() or the sum operator instead? What do you suggest as the most efficient alternative?

EDIT: I used addition. Please let me know if there are any other issues.

Johnman and others added 2 commits

March 20, 2023 16:31


          Remove unnecessary use of std::stringstream.


          Merge branch 'master' into bugfix-292

c8e940e

ggerganov approved these changes

View reviewed changes

ggerganov merged commit 6a61295 into ggml-org:master

glinscott pushed a commit to glinscott/llama.cpp that referenced this pull request


          Check for reverse prompt by characters instead of tokens (ggml-org#292)…

d5f56a5

… (ggml-org#330)

* Check for reverse prompt by characters instead of tokens (ggml-org#292)

* Update main.cpp

Wording.

* Cleanup.

* Remove unnecessary use of std::stringstream.

---------

Co-authored-by: Johnman <tjohnman@github>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

tjohnman deleted the bugfix-292 branch

March 21, 2023 16:26

rabidcopy mentioned this pull request

Replace EOS with newline to prevent context/memory being flushed by EOS in interactive mode #333

Merged

ggerganov added a commit that referenced this pull request


          Replace EOS with newline to prevent context/memory being flushed by E…

2e17dfd

…OS in interactive mode (#333)

* Improve interactive mode's coherence after EOS

Aims to improve coherence and ability to resume the interactive session when the user is given input back after an end of text token is reached.
Not sure what token 13 is or why it seems to help. See conversation for examples.

* Make newline token a constant

* dynamically determine newline token

* relocate previous newline token const

* cleanup whitespace

* print a new line on end of text in interactive

this may need to be looked into further when not using a reverse prompt

* only print manual newline with reverse prompt

fix formatting of reverse prompts so they don't end up at the end of the current line while not introducing unnecessary new lines otherwise

* alternate approach to replace end of text tokens

* Inject the reverse prompt again after eos in interactive mode

* tokenize reverse prompt when needed

makes this PR compatible with #330

* tokenize and inject only first reverse prompt

thanks to tjohnman

* tokenize first reverse prompt once

* add newline token

* add newline token

* tokenize/inject reverse prompt for refactor

this doesn't seem right though

* tokenize nothing for antiprompt if no reverse

* Update main.cpp

* Update main.cpp

* tokenize and inject reverse prompt as needed

this doesn't seem to work if the reverse prompt is tokenized outside earlier on

* not needed

* remove newline token

* remove newline token

* tokenize newline token

* add space to comment

* Update main.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Slaren <2141330+slaren@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

joshmackwilliams mentioned this pull request

feat: add "stop" keywords as alternative to eot token #769

Closed

aroidzap pushed a commit to aroidzap/llama.cpp that referenced this pull request


          Check for reverse prompt by characters instead of tokens (ggml-org#292)…

eecfb26

… (ggml-org#330)

* Check for reverse prompt by characters instead of tokens (ggml-org#292)

* Update main.cpp

Wording.

* Cleanup.

* Remove unnecessary use of std::stringstream.

---------

Co-authored-by: Johnman <tjohnman@github>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

Deadsg pushed a commit to Deadsg/llama.cpp that referenced this pull request


          Fix resize issue. Closes ggml-org#330

8b4968e

YuMJie pushed a commit to YuMJie/powerinfer that referenced this pull request


          Replace EOS with newline to prevent context/memory being flushed by E…

8faa6c7

…OS in interactive mode (#333)

* Improve interactive mode's coherence after EOS

Aims to improve coherence and ability to resume the interactive session when the user is given input back after an end of text token is reached.
Not sure what token 13 is or why it seems to help. See conversation for examples.

* Make newline token a constant

* dynamically determine newline token

* relocate previous newline token const

* cleanup whitespace

* print a new line on end of text in interactive

this may need to be looked into further when not using a reverse prompt

* only print manual newline with reverse prompt

fix formatting of reverse prompts so they don't end up at the end of the current line while not introducing unnecessary new lines otherwise

* alternate approach to replace end of text tokens

* Inject the reverse prompt again after eos in interactive mode

* tokenize reverse prompt when needed

makes this PR compatible with ggml-org/llama.cpp#330

* tokenize and inject only first reverse prompt

thanks to tjohnman

* tokenize first reverse prompt once

* add newline token

* add newline token

* tokenize/inject reverse prompt for refactor

this doesn't seem right though

* tokenize nothing for antiprompt if no reverse

* Update main.cpp

* Update main.cpp

* tokenize and inject reverse prompt as needed

this doesn't seem to work if the reverse prompt is tokenized outside earlier on

* not needed

* remove newline token

* remove newline token

* tokenize newline token

* add space to comment

* Update main.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Slaren <2141330+slaren@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

Bearsaerker mentioned this pull request

Eval bug: Gemma 3 extremly slow prompt processing when using quantized kv cache. #12352

Closed

Vithulep pushed a commit to Vithulep/llama.cpp that referenced this pull request


          Check for reverse prompt by characters instead of tokens (ggml-org#292)…

90c9ca6

… (ggml-org#330)

* Check for reverse prompt by characters instead of tokens (ggml-org#292)

* Update main.cpp

Wording.

* Cleanup.

* Remove unnecessary use of std::stringstream.

---------

Co-authored-by: Johnman <tjohnman@github>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

ganminghao pushed a commit to ganminghao/PowerInfer that referenced this pull request


          Replace EOS with newline to prevent context/memory being flushed by E…

f31fc15

…OS in interactive mode (#333)

* Improve interactive mode's coherence after EOS

Aims to improve coherence and ability to resume the interactive session when the user is given input back after an end of text token is reached.
Not sure what token 13 is or why it seems to help. See conversation for examples.

* Make newline token a constant

* dynamically determine newline token

* relocate previous newline token const

* cleanup whitespace

* print a new line on end of text in interactive

this may need to be looked into further when not using a reverse prompt

* only print manual newline with reverse prompt

fix formatting of reverse prompts so they don't end up at the end of the current line while not introducing unnecessary new lines otherwise

* alternate approach to replace end of text tokens

* Inject the reverse prompt again after eos in interactive mode

* tokenize reverse prompt when needed

makes this PR compatible with ggml-org/llama.cpp#330

* tokenize and inject only first reverse prompt

thanks to tjohnman

* tokenize first reverse prompt once

* add newline token

* add newline token

* tokenize/inject reverse prompt for refactor

this doesn't seem right though

* tokenize nothing for antiprompt if no reverse

* Update main.cpp

* Update main.cpp

* tokenize and inject reverse prompt as needed

this doesn't seem to work if the reverse prompt is tokenized outside earlier on

* not needed

* remove newline token

* remove newline token

* tokenize newline token

* add space to comment

* Update main.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Slaren <2141330+slaren@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

leozoro-cloud added a commit to leozoro-cloud/llama.cpp that referenced this pull request


          Replace EOS with newline to prevent context/memory being flushed by E…

3d97fb5

…OS in interactive mode (#333)

* Improve interactive mode's coherence after EOS

Aims to improve coherence and ability to resume the interactive session when the user is given input back after an end of text token is reached.
Not sure what token 13 is or why it seems to help. See conversation for examples.

* Make newline token a constant

* dynamically determine newline token

* relocate previous newline token const

* cleanup whitespace

* print a new line on end of text in interactive

this may need to be looked into further when not using a reverse prompt

* only print manual newline with reverse prompt

fix formatting of reverse prompts so they don't end up at the end of the current line while not introducing unnecessary new lines otherwise

* alternate approach to replace end of text tokens

* Inject the reverse prompt again after eos in interactive mode

* tokenize reverse prompt when needed

makes this PR compatible with ggml-org/llama.cpp#330

* tokenize and inject only first reverse prompt

thanks to tjohnman

* tokenize first reverse prompt once

* add newline token

* add newline token

* tokenize/inject reverse prompt for refactor

this doesn't seem right though

* tokenize nothing for antiprompt if no reverse

* Update main.cpp

* Update main.cpp

* tokenize and inject reverse prompt as needed

this doesn't seem to work if the reverse prompt is tokenized outside earlier on

* not needed

* remove newline token

* remove newline token

* tokenize newline token

* add space to comment

* Update main.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Slaren <2141330+slaren@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request


          Check for reverse prompt by characters instead of tokens (ggml-org#292)…

f571f4b

… (ggml-org#330)

* Check for reverse prompt by characters instead of tokens (ggml-org#292)

* Update main.cpp

Wording.

* Cleanup.

* Remove unnecessary use of std::stringstream.

---------

Co-authored-by: Johnman <tjohnman@github>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request


          Replace EOS with newline to prevent context/memory being flushed by E…

25a886c

…OS in interactive mode (ggml-org#333)

* Improve interactive mode's coherence after EOS

Aims to improve coherence and ability to resume the interactive session when the user is given input back after an end of text token is reached.
Not sure what token 13 is or why it seems to help. See conversation for examples.

* Make newline token a constant

* dynamically determine newline token

* relocate previous newline token const

* cleanup whitespace

* print a new line on end of text in interactive

this may need to be looked into further when not using a reverse prompt

* only print manual newline with reverse prompt

fix formatting of reverse prompts so they don't end up at the end of the current line while not introducing unnecessary new lines otherwise

* alternate approach to replace end of text tokens

* Inject the reverse prompt again after eos in interactive mode

* tokenize reverse prompt when needed

makes this PR compatible with ggml-org#330

* tokenize and inject only first reverse prompt

thanks to tjohnman

* tokenize first reverse prompt once

* add newline token

* add newline token

* tokenize/inject reverse prompt for refactor

this doesn't seem right though

* tokenize nothing for antiprompt if no reverse

* Update main.cpp

* Update main.cpp

* tokenize and inject reverse prompt as needed

this doesn't seem to work if the reverse prompt is tokenized outside earlier on

* not needed

* remove newline token

* remove newline token

* tokenize newline token

* add space to comment

* Update main.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Slaren <2141330+slaren@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

phuongncn pushed a commit to phuongncn/llama.cpp-gx10-dgx-sparks-deepseekv4 that referenced this pull request


          Check for reverse prompt by characters instead of tokens (ggml-org#292)…

2a44311

… (ggml-org#330)

* Check for reverse prompt by characters instead of tokens (ggml-org#292)

* Update main.cpp

Wording.

* Cleanup.

* Remove unnecessary use of std::stringstream.

---------

Co-authored-by: Johnman <tjohnman@github>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

phuongncn pushed a commit to phuongncn/llama.cpp-gx10-dgx-sparks-deepseekv4 that referenced this pull request


          Allow q8_0 KV cache for head size 256 (ggml-org#330)

1a78685

* Allow q8_0 KV cache for head size 256

* We need also these

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request


          Check for reverse prompt by characters instead of tokens (ggml-org#292)…

b17b173

… (ggml-org#330)

* Check for reverse prompt by characters instead of tokens (ggml-org#292)

* Update main.cpp

Wording.

* Cleanup.

* Remove unnecessary use of std::stringstream.

---------

Co-authored-by: Johnman <tjohnman@github>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request


          Check for reverse prompt by characters instead of tokens (ggml-org#292)…

ca21d0e

… (ggml-org#330)

* Check for reverse prompt by characters instead of tokens (ggml-org#292)

* Update main.cpp

Wording.

* Cleanup.

* Remove unnecessary use of std::stringstream.

---------

Co-authored-by: Johnman <tjohnman@github>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug