Skip to content

feat: add "stop" keywords as alternative to eot token#769

Closed
longregen wants to merge 2 commits intoggml-org:masterfrom
longregen:stop-keywords
Closed

feat: add "stop" keywords as alternative to eot token#769
longregen wants to merge 2 commits intoggml-org:masterfrom
longregen:stop-keywords

Conversation

@longregen
Copy link
Copy Markdown
Contributor

Rewrite of #365 (addresses #57) by @joshmackwilliams, updated to the new folder/file structure as the PR might have been abandoned.

From the original author:

Implements #57.
Stop keywords can be specified using the "--stop" parameter. Upon seeing one of these keywords in the generated output, the model will terminate generation immediately. Like reverse prompts, multiple stop keywords can be specified by specifying the --stop argument multiple times.
The implementation is heavily based on the reverse prompt implementation to keep things simple. Tested using 7B (quantized) in both interactive and non-interactive modes.

Comment thread examples/main/main.cpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should check the token id instead of the string for stop.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For context, the precedent set by #330 is to check for the string (in reverse prompts). I think checking for tokens caused a bug, or at least unintuitive behavior (#292).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, this should work in the same way as "antiprompts" to provide a better UX to users. Users should be able to add to the CLI parameters --stop "### Assistant" (for example, in the spirit of the trained vicuna model) or --stop PAUSE (for example, to implement Simon Willinson's ReAct Python example), even though these are multi-token markers.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we know why? It will be a good learning for me to understand why token will not work? Because one stop string can be generated by the different tokens??

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, see #292 (comment)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also noticed that the stop words weren't always consistently formatted. Sometimes it would do STOP instead of stop. I'd appreciate a normalize function to down case before comparing.

@joshmackwilliams
Copy link
Copy Markdown

Yes, I got busy and no longer have time to maintain that PR. Thanks for rewriting!

@joshmackwilliams joshmackwilliams mentioned this pull request Apr 10, 2023
Copy link
Copy Markdown
Member

@ggerganov ggerganov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't looked in details.
Merge it if you have confirmed to be working as expected

@longregen
Copy link
Copy Markdown
Contributor Author

longregen commented Apr 14, 2023

While #863 appears to be more polished, I'd merge these changes since it's uncertain when that one will be reviewed and integrated. Plus, it's likely that any significant use of this code would involve utilizing the library rather than relying on this example binary.

@ggerganov
Copy link
Copy Markdown
Member

@longregen
#863 is now ready to merge. Is there anything that this PR adds that is not covered by #863 ?
If yes, maybe we can create a separate PR after the merge?

@longregen longregen closed this Apr 14, 2023
@longregen
Copy link
Copy Markdown
Contributor Author

longregen commented Apr 14, 2023

@longregen #863 is now ready to merge. Is there anything that this PR adds that is not covered by #863 ? If yes, maybe we can create a separate PR after the merge?

#863 is a superset of this PR. Closed this one as that other got merged. Thank you for your awesome work!

@longregen longregen deleted the stop-keywords branch April 14, 2023 15:40
@KevinColemanInc
Copy link
Copy Markdown

@longregen The PR you mentioned has been reverted (c85e03d). Is the plan to wait for that to be tested more or should we reopen this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants