
[Mirror] server : refactor oai_parser_opt, move it to server_chat_params#83

Open
ngxson wants to merge 4 commits into master from xsn/server_chat_params

Conversation


@ngxson ngxson commented Jan 19, 2026

Mirror from upstream PR: ggml-org#18937

Note: @coderabbitai use my 'Mirror PR' preset for reviewing this.

Summary by CodeRabbit

  • Refactor
    • Consolidated chat parameter handling and template management throughout the system.
    • Updated chat template API signatures to use modern data types.
    • Reorganized CLI input processing for improved server-side handling.
    • Refined server metadata structure for clearer API responses.

✏️ Tip: You can customize this high-level summary in your review settings.

coderabbitai bot commented Jan 19, 2026

📝 Walkthrough

The PR refactors chat template handling by converting common_chat_templates_source to return std::string instead of const char*, consolidating chat configuration into a server_chat_params struct (renamed from oaicompat_parser_options), adding a format_chat() method to CLI context, and replacing JSON-based CLI input with structured cli flag and cli_prompt fields in server tasks.
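The string-handling half of this refactor can be illustrated with a minimal sketch. Note this is a hypothetical stand-in, not the real API: the actual common_chat_templates_source also takes the templates object as an argument; only the "empty string instead of nullptr" behavior described above is modeled here.

```cpp
#include <string>

// Hypothetical reduction of the refactored helper. The old API returned
// const char* and used nullptr for "no variant"; the new API returns
// std::string and treats an empty variant as "no variant".
std::string chat_templates_source(const std::string & variant = "") {
    if (variant.empty()) {
        return ""; // empty string where the old API returned nullptr
    }
    return "template:" + variant; // placeholder payload for illustration
}
```

Callers no longer need a null check before constructing a std::string from the result, which is the main ergonomic win of the change.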

Changes

Cohort / File(s) — Summary

  • Chat Template API (common/chat.h, common/chat.cpp): Updated the common_chat_templates_source signature: it now returns std::string instead of const char*; the parameter changed from const char* to const std::string& with an empty-string default; the function treats an empty variant as no variant and returns an empty string instead of nullptr.
  • Server Chat Parameters (tools/server/server-common.h, tools/server/server-common.cpp): Renamed struct oaicompat_parser_options to server_chat_params; replaced the raw pointer tmpls with the smart pointer common_chat_templates_ptr; updated the oaicompat_chat_params_parse signature to use the new struct type.
  • CLI Chat Formatting (tools/cli/cli.cpp): Added a public format_chat() method to cli_context that constructs and applies chat templates; integrated formatting into the generate_completion flow, storing the result in task.cli_prompt and setting the task.cli flag.
  • Server Context Integration (tools/server/server-context.h, tools/server/server-context.cpp): Consolidated chat configuration: replaced the separate chat_templates and oai_parser_opt members with a single chat_params field; updated metadata exposure, template source queries, and thinking-enablement logic to use chat_params.tmpls.
  • Server Task Structure (tools/server/server-task.h): Replaced the json cli_input field with a bool cli flag and a std::string cli_prompt for explicit CLI context handling on the server side.
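The server-task change can be sketched as follows. This is a simplified, hypothetical model of the described fields, not the actual server_task definition, and make_cli_task stands in for the CLI-side integration with format_chat():

```cpp
#include <string>

// Hypothetical reduction of the server_task change: the JSON cli_input
// field is replaced by an explicit flag plus a pre-formatted prompt.
struct server_task_sketch {
    bool        cli = false; // set when the request originates from the CLI
    std::string cli_prompt;  // prompt already formatted via chat templates
};

// Mimics what the CLI does after format_chat() has produced the prompt.
server_task_sketch make_cli_task(const std::string & formatted_prompt) {
    server_task_sketch task;
    task.cli        = true;
    task.cli_prompt = formatted_prompt;
    return task;
}
```

Moving from an opaque json field to typed fields makes the CLI path explicit at the type level, so the server no longer needs to probe a JSON object to decide how to handle the input.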

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Templates now flow with strings so bright,
Chat params consolidated—what a sight!
Format and apply with CLI's delight,
Refactored with care, the code takes flight! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 29.41%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Description check — ❓ Inconclusive: The PR description is minimal and primarily references an upstream PR with a reviewer preset note, lacking substantive details about the changes, motivation, testing, or impact on the codebase. Resolution: expand the description to explain the refactoring motivation, key changes across files, breaking API changes, testing approach, and any migration guidance for users or downstream code.
✅ Passed checks (1 passed)
  • Title check — ✅ Passed: The title accurately reflects the main objective of refactoring oai_parser_opt and moving it to server_chat_params, which is clearly demonstrated across multiple files in the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@tools/server/server-common.h`:
- Around line 277-286: The struct server_chat_params currently relies on
external aggregate initialization; add in-class value defaults to make it
resilient by initializing POD members and containers directly in the struct
definition: set booleans like use_jinja, prefill_assistant, allow_image,
allow_audio to false, initialize enable_thinking = true (preserve current),
initialize media_path to an empty string, default-construct chat_template_kwargs
(std::map) and tmpls (common_chat_templates_ptr) and give a sensible default or
zero-init for reasoning_format; update the declaration of server_chat_params so
these defaults exist inline to ensure deterministic values even if init() is not
called.
🧹 Nitpick comments (2)
tools/server/server-context.h (1)

23-25: Consider making chat_params a const reference.
server_context_meta is described as read-only; using a non-const reference permits accidental mutation through meta. A const reference would enforce intent.

🔧 Suggested change
-    server_chat_params & chat_params;
+    const server_chat_params & chat_params;
tools/server/server-context.cpp (1)

3184-3186: Unusual but intentional variable shadowing pattern.

Declaring a local bool ctx_server that shadows the member variable is an unconventional way to prevent accidental use of the server context during the sleeping state. The comment "do NOT delete this line" makes the intent clear.

While this works, it is somewhat fragile: the local shadow only blocks unqualified references, and this->ctx_server still bypasses the protection.

Consider a more robust approach, such as extracting these sleep-safe handlers into a separate class or using a static helper, but this is low priority given the clear warning comment.
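The shadowing trick under discussion can be reproduced in isolation. The names below are hypothetical (the real code lives in server-context.cpp and shadows a context object, not a string); the sketch only demonstrates the mechanism: the local bool hides the member, so an unqualified use of the name gets the wrong type and fails to compile.

```cpp
#include <string>

// Illustrative stand-in for a handler that must not touch the server
// context while the server is sleeping. All names are invented.
struct server_routes_sketch {
    std::string ctx_server = "real context"; // member that must stay unused

    std::string sleeping_handler() {
        bool ctx_server = false; // do NOT delete: intentionally shadows the member
        (void) ctx_server;       // silence the unused-variable warning
        // Any unqualified `ctx_server` here is the bool; using it as a
        // context object would be a compile error. Note that
        // `this->ctx_server` would still reach the member.
        return "handled without touching the context";
    }
};
```

Compilers with -Wshadow will flag this pattern, which is one more reason the warning comment matters: the shadow is deliberate, not an accident.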

Comment on lines +277 to 286

 struct server_chat_params {
     bool use_jinja;
     bool prefill_assistant;
     common_reasoning_format reasoning_format;
-    std::map<std::string,std::string> chat_template_kwargs;
-    common_chat_templates * tmpls;
+    std::map<std::string, std::string> chat_template_kwargs; // mapping key --> json value
+    common_chat_templates_ptr tmpls;
     bool allow_image;
     bool allow_audio;
     bool enable_thinking = true;
     std::string media_path;

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Scripts executed (repository: ngxson/llama.cpp):

rg -n -C3 '\bserver_chat_params\b' --type-list | head -20
rg -n -C5 '\bserver_chat_params\b' -t cpp -t h
rg -n -C10 'server_chat_params chat_params' tools/server/server-context.cpp
rg -n 'chat_params\s*=' -A2 -B2 tools/server/server-context.cpp | head -40
sed -n '895,925p' tools/server/server-context.cpp
rg -n 'chat_params' tools/server/server-context.cpp | head -30
sed -n '880,920p' tools/server/server-context.cpp
rg -n 'struct server_context_impl' tools/server/server-context.cpp -A30 | head -50
sed -n '850,880p' tools/server/server-context.cpp
rg -n 'void.*init\|bool.*init' tools/server/server-context.cpp | head -15
sed -n '820,860p' tools/server/server-context.cpp
grep -n "populate chat template params" tools/server/server-context.cpp -B30 | head -40
sed -n '847,950p' tools/server/server-context.cpp

Consider in-class value-initialization for server_chat_params fields.
While the struct is properly initialized via aggregate initialization in init() before any use, relying on this pattern is fragile. Adding in-class defaults (e.g., bool use_jinja = false;) would make the struct more resilient and ensure deterministic values even if initialization paths change. Currently, POD types have indeterminate values from line 538 until the assignment at line 900, and the code depends on init() always being called.
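The recommendation above can be sketched as follows. Field names follow the diff, but the example is self-contained and hypothetical: common_reasoning_format is replaced by an int placeholder and the common_chat_templates_ptr member is omitted, since those types are not reproducible here.

```cpp
#include <map>
#include <string>

// Sketch of server_chat_params with in-class value initialization, as
// suggested: every POD member gets a deterministic default even if
// init() is never called.
struct server_chat_params_sketch {
    bool use_jinja         = false;
    bool prefill_assistant = false;
    int  reasoning_format  = 0;    // placeholder for common_reasoning_format
    std::map<std::string, std::string> chat_template_kwargs; // default-constructed
    bool allow_image     = false;
    bool allow_audio     = false;
    bool enable_thinking = true;   // preserves the current default
    std::string media_path;        // empty by default
};
```

With these defaults in place, a default-constructed instance is fully deterministic, so the aggregate initialization in init() becomes a convenience rather than a correctness requirement.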


