I think we may need to generalize that BOS check. I found another place where it fails:
```python
message = tokenizer.apply_chat_template(
    [user_message],
    tokenize=False,
    add_generation_prompt=True,
    add_special_tokens=False,
)
user_message["token_ids"] = tokenizer(message, return_tensors="pt")["input_ids"][0]
```
In the DeepScaler base, when you try to eval, it'll add the double BOS, but the chat template doesn't have any obvious indicators because it's a more complicated Jinja template:
https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B/blob/main/tokenizer_config.json#L34
I think we may need some kind of `apply_safe_chat_template()` or something that we reuse throughout the repo and that remembers the decision about how to handle this BOS token. Do you think you can look into a general fix?
(For reference: the `apply_chat_template` snippet above is from `RL/nemo_rl/data/processors.py`, lines 61 to 67 at commit `b6269f7`.)
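To make the proposal concrete, here is a hypothetical sketch of what `apply_safe_chat_template()` could look like. It is not code from the repo: the idea is to decide whether the chat template already emits BOS by inspecting the *rendered* string rather than the Jinja source, which also covers complicated templates like the DeepSeek-R1-Distill-Qwen one. The `FakeTokenizer` below is a stand-in purely for demonstration; its token ids are made up.

```python
def apply_safe_chat_template(tokenizer, messages):
    """Render the chat template, then tokenize without duplicating BOS.

    Hypothetical helper: checks the rendered prompt, not the template
    source, so it works even when the Jinja template has no obvious
    bos_token literal in it.
    """
    rendered = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    bos = getattr(tokenizer, "bos_token", None)
    template_adds_bos = bos is not None and rendered.startswith(bos)
    # If the rendered prompt already starts with BOS, tell the tokenizer
    # not to prepend another one.
    ids = tokenizer(rendered, add_special_tokens=not template_adds_bos)["input_ids"]
    return rendered, ids


class FakeTokenizer:
    """Minimal stand-in whose template emits BOS itself, like the
    DeepSeek distill template does. Demonstration only."""

    bos_token = "<bos>"
    bos_id = 0

    def apply_chat_template(self, messages, tokenize=False,
                            add_generation_prompt=True):
        # The template itself prepends BOS.
        return self.bos_token + "".join(m["content"] for m in messages)

    def __call__(self, text, add_special_tokens=True):
        ids = []
        if text.startswith(self.bos_token):      # literal BOS in the string
            ids.append(self.bos_id)
            text = text[len(self.bos_token):]
        ids += [100 + ord(c) for c in text]      # dummy ids for the rest
        if add_special_tokens:
            ids = [self.bos_id] + ids            # tokenizer-added BOS
        return {"input_ids": ids}


tok = FakeTokenizer()
msgs = [{"role": "user", "content": "hi"}]

# Naive path (what the processors.py snippet effectively does when the
# second tokenizer call adds special tokens): double BOS.
naive = tok(tok.apply_chat_template(msgs, tokenize=False))["input_ids"]

# Safe path: exactly one BOS.
_, safe = apply_safe_chat_template(tok, msgs)
```

To "remember the decision" as suggested, the `startswith` check could be computed once per tokenizer and cached (e.g. with `functools.lru_cache`) instead of re-checking on every call.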