@corbt corbt commented Jan 21, 2026

Summary

  • Avoid dropping tool calls from assistant messages in context by only inserting sentinels for trainable assistant messages.
  • Preserve tool_calls in the chat template for trainable assistants and fail fast if a tool-call dict would be tokenized via content-only splicing.
  • Proposal: drop support for allow_training_without_logprobs (importance sampling needs logprobs; this mode complicates paths and enables subtle tool-call bugs).
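
The first two points can be illustrated with a minimal sketch. It is not the code from this PR: `SENTINEL`, `to_template_message`, and `content_to_splice` are hypothetical names, and the messages are assumed to be plain OpenAI-style dicts with optional `tool_calls`. The idea shown is that context messages pass through unchanged, a trainable assistant message gets only its content replaced by a sentinel, and splicing raises early if it would lose tool calls.

```python
from typing import Any

# Hypothetical placeholder string; the project's real sentinel is not shown in this PR.
SENTINEL = "<<TRAINABLE_SPAN>>"

Message = dict[str, Any]


def to_template_message(message: Message, trainable: bool) -> Message:
    """Build the message that is handed to the chat template."""
    if message.get("role") != "assistant" or not trainable:
        # Context messages pass through untouched, tool_calls included.
        return dict(message)
    template_message = dict(message)
    # Only the content is swapped for a sentinel; tool_calls stay in the template.
    template_message["content"] = SENTINEL
    return template_message


def content_to_splice(message: Message, template_message: Message) -> str:
    """Return the text to splice in at the sentinel, refusing to drop tool calls."""
    if message.get("tool_calls") and not template_message.get("tool_calls"):
        # Fail fast: content-only splicing would silently lose the tool calls.
        raise ValueError("tool_calls would be dropped by content-only splicing")
    return message.get("content") or ""
```

In this sketch the check lives in the splice helper so that any path that bypasses `to_template_message` still fails loudly instead of producing a token sequence without the tool calls.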

Test plan

  • Not run (tokenization-only change)

Only splice trainable assistant spans and keep tool_calls
in the template; error if tool_calls would be dropped.
- Use .get() instead of direct [] access for tool_calls to handle
  message types that don't have this key
- Cast message to dict[str, Any] when appending to token_template_messages
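
As a rough illustration of the two follow-up fixes above (a sketch under assumptions, not the merged code): `UserMessage`, `AssistantMessage`, and `collect_template_messages` are hypothetical names, and only `token_template_messages` comes from the commit message. The sketch shows why `.get()` is safer than `[]` for message types without a `tool_calls` key, and where a `cast` to `dict[str, Any]` satisfies the type checker without changing runtime behavior.

```python
from typing import Any, TypedDict, Union, cast


class UserMessage(TypedDict):
    # Hypothetical message type with no tool_calls key at all.
    role: str
    content: str


class AssistantMessage(TypedDict, total=False):
    # Hypothetical assistant type; tool_calls is optional.
    role: str
    content: str
    tool_calls: list[dict[str, Any]]


def collect_template_messages(
    messages: list[Union[UserMessage, AssistantMessage]],
) -> tuple[list[dict[str, Any]], int]:
    """Copy messages into a plain-dict list and count those carrying tool calls."""
    token_template_messages: list[dict[str, Any]] = []
    with_tool_calls = 0
    for message in messages:
        # message["tool_calls"] would raise KeyError for a UserMessage;
        # .get() returns None when the key is absent.
        if message.get("tool_calls"):
            with_tool_calls += 1
        # cast() is a no-op at runtime; it only tells the type checker
        # to treat the TypedDict as dict[str, Any].
        token_template_messages.append(cast(dict[str, Any], message))
    return token_template_messages, with_tool_calls
```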
@corbt corbt merged commit b220140 into main Jan 21, 2026
2 checks passed