common : add common_chat_split_by_role #21885

Draft
aldehir wants to merge 2 commits into ggml-org:master from aldehir:split-prompt-by-role

Contributor

@aldehir aldehir commented Apr 14, 2026

Overview

Implement common_chat_split_by_role to split a prompt into chunks by role.

struct common_chat_msg_span {
    std::string role;
    std::size_t pos = 0;
    std::size_t len = 0;
};

std::vector<common_chat_msg_span> common_chat_split_by_role(const std::string & prompt, const std::vector<common_chat_msg_delimiter> & delims);

// Example, populated in init_params
data.message_spans = common_chat_split_by_role(prompt, {
    { "assistant", "<|start|>assistant" },
    { "user",      "<|start|>user"      },
    { "system",    "<|start|>developer" },
    { "system",    "<|start|>system"    },
    { "tool",      "<|start|>functions" },
});

Each chunk should end at a clean boundary, so the chunks can be tokenized individually.
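
For readers following along, here is a minimal standalone sketch of the idea, not the code in this PR. The names msg_delimiter_sketch, msg_span_sketch and split_by_role_sketch are hypothetical, and so are the delimiter fields (role, marker), since the description above does not show the definition of common_chat_msg_delimiter. The sketch assumes a span starts at a delimiter and runs to the next delimiter (or to the end of the prompt), that text before the first delimiter is not covered by any span, and that a tie at the same position goes to the longer marker.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in: the real common_chat_msg_delimiter may differ.
struct msg_delimiter_sketch {
    std::string role;    // role reported for spans opened by this marker
    std::string marker;  // literal text that opens a message of that role
};

// Mirrors the span struct from the description: role plus byte offsets.
struct msg_span_sketch {
    std::string role;
    std::size_t pos = 0;
    std::size_t len = 0;
};

// Assumption: each span runs from its marker up to the next marker,
// and the last span runs to the end of the prompt.
std::vector<msg_span_sketch> split_by_role_sketch(
        const std::string & prompt,
        const std::vector<msg_delimiter_sketch> & delims) {
    std::vector<msg_span_sketch> spans;
    std::size_t cursor = 0;

    while (true) {
        // Find the earliest marker at or after the cursor.
        std::size_t best_pos = std::string::npos;
        const msg_delimiter_sketch * best = nullptr;
        for (const auto & d : delims) {
            if (d.marker.empty()) {
                continue;  // an empty marker would match everywhere
            }
            const std::size_t p = prompt.find(d.marker, cursor);
            if (p == std::string::npos) {
                continue;
            }
            const bool earlier    = p < best_pos;
            const bool longer_tie = best != nullptr && p == best_pos &&
                                    d.marker.size() > best->marker.size();
            if (earlier || longer_tie) {
                best_pos = p;
                best     = &d;
            }
        }
        if (best == nullptr) {
            break;  // no more markers in the remaining text
        }

        // The new marker closes the previous span, if there is one.
        if (!spans.empty()) {
            spans.back().len = best_pos - spans.back().pos;
        }

        msg_span_sketch s;
        s.role = best->role;
        s.pos  = best_pos;
        spans.push_back(s);

        // Continue searching after the marker that was just consumed.
        cursor = best_pos + best->marker.size();
    }

    // The final span extends to the end of the prompt.
    if (!spans.empty()) {
        spans.back().len = prompt.size() - spans.back().pos;
    }
    return spans;
}
```

Under these assumptions, prompt.substr(span.pos, span.len) always begins with the marker that opened the span, which is what would let each chunk be tokenized on its own.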

ref: #20424 (review)

@ggerganov is something like this what you had in mind?

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: YES, I had it write the function as I wanted it.

@github-actions (bot) added the "testing" label (Everything test related) on Apr 14, 2026