
update AutoGuess template #1627

Closed
tsite wants to merge 2 commits into LostRuins:concedo from tsite:test

Conversation

@tsite

@tsite tsite commented Jun 29, 2025

fix rwkv bug
add vicuna, dots llm

@LostRuins
Owner

Did you test the vicuna one? I have not found a single model that matches it. If it's not a real template in use, we should not add it.

@tsite
Author

tsite commented Jun 30, 2025

wizardlm-2 (https://huggingface.co/alpindale/WizardLM-2-8x22B) uses a variant of Vicuna; I updated the AutoGuess template to match.
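For context, the WizardLM-2 model card describes a Vicuna-style prompt. A minimal sketch of assembling it; the system prompt and `</s>` turn separator here are assumptions taken from that model card, not from this PR's autoguess.json changes:

```python
# Vicuna-style prompt assembly as described on the WizardLM-2 model card.
# The exact system prompt and separators are assumptions from that card.

SYSTEM = ("A chat between a curious user and an artificial intelligence "
          "assistant. The assistant gives helpful, detailed, and polite "
          "answers to the user's questions.")

def build_prompt(turns):
    """turns: list of (user_msg, assistant_msg_or_None) pairs.
    A None assistant message leaves the prompt open for generation."""
    out = SYSTEM + " "
    for user_msg, assistant_msg in turns:
        out += "USER: " + user_msg + " ASSISTANT:"
        if assistant_msg is not None:
            out += " " + assistant_msg + "</s>"
    return out

prompt = build_prompt([("Who are you?", None)])
```

The prompt ends with `ASSISTANT:` so the model continues from there.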

@LostRuins
Owner

LostRuins commented Jun 30, 2025

Also, what's wrong with the rwkv template? That was from @henk717.

Generally, for these weird and uncommon formats, I feel it's not ideal to clutter the autoguess list with something that literally nobody uses except for one tiny case. Vicuna is basically unused in the wild, and when necessary, Alpaca outperforms it in all cases.

@henk717
Collaborator

henk717 commented Jun 30, 2025

Rwkv should remain as is. It was given to me directly by one of the rwkv developers, so only they should have the say on it. Just like with Mistral, I like to keep the official templates official.

@LostRuins LostRuins added the invalid This doesn't seem right label Jun 30, 2025
@tsite
Author

tsite commented Jun 30, 2025

The rwkv template in autoguess.json does not match the one in https://github.com/BlinkDL/ChatRWKV/blob/main/API_DEMO_CHAT.py. I will close this PR out, as the added templates are for "uncommon" models, but I recommend fixing the rwkv template in a separate PR.

@tsite tsite closed this Jun 30, 2025
@tsite tsite deleted the test branch June 30, 2025 21:52
@kallewoof

kallewoof commented Jul 18, 2025

+1 on fixing the RWKV chat templates. This is what the ChatRWKV demo code looks like:

        out = run_rnn("User: " + msg + "\n\nAssistant:")

where run_rnn simply tokenizes the first argument and passes it to the model. So, at least in this case, the format is "User: " for the user prefix and "Assistant:" (no space after the colon) for the assistant. If @henk717 could verify that, it would be a plus.
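The turn format quoted above can be sketched as a small formatter; this is only a restatement of the demo line from API_DEMO_CHAT.py, not the actual autoguess.json entry:

```python
# RWKV chat turn format as quoted from ChatRWKV's API_DEMO_CHAT.py:
# "User: " before the message, a blank line, then "Assistant:" with
# no trailing space, so the model's reply starts right after the colon.

def rwkv_turn(msg: str) -> str:
    return "User: " + msg + "\n\nAssistant:"

print(repr(rwkv_turn("Hello")))
```

Note the asymmetry: the user prefix carries a trailing space while the assistant prefix does not, which is exactly the detail a template entry would need to preserve.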

@LostRuins LostRuins mentioned this pull request Jul 25, 2025

Labels

invalid This doesn't seem right

4 participants