server/webui: cleanup dual representation approach, simplify to openai-compat by pwilkin · Pull Request #21090 · ggml-org/llama.cpp

pwilkin · 2026-03-28T00:06:58Z

Overview

Removes the dual representation from the WebUI and bases everything on the OpenAI-compat message history.

Additional information

Since the current dual representation approach seems to run into a lot of problems (see e.g. #21087) and will probably run into more when we add the MCP proxy server, I reckoned I'd try to clean it up by removing the dual representation and making everything derive from the plain chat history. It seems to be working out pretty well in my tests (I tested both normal reasoning and reasoning with tool calling).

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: Yes, no way I rewrite all of that by hand 😛

@allozaur @ServeurpersoCom could you take a look and see if it makes sense to you?

ServeurpersoCom · 2026-03-28T00:36:53Z

This will be a major refactor with many edge cases!

pwilkin · 2026-03-28T00:40:20Z

@ServeurpersoCom I thought so too, but it went surprisingly smoothly, all tests pass and nothing seems broken at least with cursory testing, so maybe it's not as bad as it looks?

ServeurpersoCom · 2026-03-28T00:45:58Z

Technically it only touches the storage/rendering side. I'll pull the branch tomorrow and give it a proper test run, because honestly this is how it was originally supposed to be done! This is really the final piece of the MCP client integration, which started as a simple OAI-compat proxy outside the WebUI.

That said, there are a lot of subtle edge cases we've been polishing over the past weeks (streaming interruptions mid-tool-call, partial marker handling, continue generation with open reasoning blocks, tool result truncation, etc.), so I want to make sure nothing regresses before giving the thumbs up!

ServeurpersoCom · 2026-03-28T10:52:40Z

I'm currently testing it: it's encouraging, it just works. This layer of abstraction was indeed the final blemish on our implementation. And it was much less disruptive than I thought because we naturally converged towards the same structure.

Some obsolete Settings options to remove:

showRawOutputSwitch
alwaysShowAgenticTurns
maxToolPreviewLines

I'm keeping your commit on my everyday server; after a few days of use, I'll be able to spot any edge cases I might have missed.

allozaur · 2026-03-28T10:57:54Z

I don't think showRawOutputSwitch is obsolete; we might still want to use it to debug Markdown rendering.

ServeurpersoCom · 2026-03-28T11:00:28Z

> I don't think showRawOutputSwitch is obsolete, we still might want to use it to debug Markdown rendering

Ah yes, it needs to be simplified/reworked (broken on the commit for now). We need it for markdown.

ServeurpersoCom

Honestly, we can merge quickly and fix what we find, we'll add commits on top and we'll make progress! It works.

allozaur · 2026-03-28T13:13:13Z

> Honestly, we can merge quickly and fix what we find, we'll add commits on top and we'll make progress! It works.

Not exactly 😅 Let's first address any regressions that this PR introduces and then let's merge.

ServeurpersoCom · 2026-03-28T14:01:08Z

> Not exactly 😅 Let's first address any regressions that this PR introduces and then let's merge.

We're going to work together to eradicate them 😅

pwilkin · 2026-03-28T18:30:42Z

Can you tell me which regressions you found?

allozaur · 2026-03-29T19:30:56Z

> Can you tell me what the regressions found are?

I have yet to finish reviewing and testing this PR. I'll do that tomorrow and let you know what I find on my side.

allozaur · 2026-03-30T16:34:39Z

@pwilkin I've fixed the agentic turns UI regression and added a few more tests. This is an intermediate step towards what we discussed internally: keeping a separate structure for chat message data that is 100% OpenAI compatible, while on the UI side showing the assistant + tool messages as one joint agentic chain of messages.
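To illustrate the split described above (OpenAI-compatible data, grouped only at render time), here is a minimal sketch. It is not the code from this PR: the `ChatMessage` and `DisplayGroup` types and the `groupForDisplay` function are hypothetical names, and the message shape is a reduced subset of the OpenAI chat format.

```typescript
// Hypothetical, reduced subset of the OpenAI chat message shape.
type ToolCall = {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
};

type ChatMessage =
  | { role: "system" | "user"; content: string }
  | { role: "assistant"; content: string | null; tool_calls?: ToolCall[] }
  | { role: "tool"; tool_call_id: string; content: string };

// Hypothetical view model: the stored history stays flat, grouping is display-only.
type DisplayGroup =
  | { kind: "single"; message: ChatMessage }
  | { kind: "agentic"; messages: ChatMessage[] };

function groupForDisplay(history: ChatMessage[]): DisplayGroup[] {
  const groups: DisplayGroup[] = [];
  let chain: ChatMessage[] = [];

  // Emit the pending run of assistant/tool messages as one display unit.
  const flush = (): void => {
    if (chain.length === 0) return;
    groups.push(
      chain.length === 1
        ? { kind: "single", message: chain[0] }
        : { kind: "agentic", messages: chain }
    );
    chain = [];
  };

  for (const msg of history) {
    if (msg.role === "assistant" || msg.role === "tool") {
      chain.push(msg); // consecutive assistant/tool messages form one chain
    } else {
      flush();
      groups.push({ kind: "single", message: msg });
    }
  }
  flush();
  return groups;
}

// Example history: one tool-calling turn, then a plain question and answer.
const exampleHistory: ChatMessage[] = [
  { role: "user", content: "What is the weather in Paris?" },
  {
    role: "assistant",
    content: null,
    tool_calls: [
      { id: "call_1", type: "function", function: { name: "get_weather", arguments: '{"city":"Paris"}' } },
    ],
  },
  { role: "tool", tool_call_id: "call_1", content: '{"temp_c":18}' },
  { role: "assistant", content: "It is 18 °C in Paris." },
  { role: "user", content: "Thanks!" },
  { role: "assistant", content: "You're welcome." },
];
```

With this shape the stored history can be sent to an OpenAI-compatible endpoint unchanged, and only the renderer needs to know about chains.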


ServeurpersoCom · 2026-03-30T17:07:13Z

Cool, it's definitely more compact this way while retaining the advantages of clean, structured data, good thinking Alek!

pwilkin · 2026-03-30T18:48:38Z

Looking good!

ServeurpersoCom · 2026-03-30T18:54:01Z

All that remains is to correct the agent turn display options (visually useful only if there are multiple CoT + Toolcall pairs), and fix the raw Markdown display switch. I've rebased the pull request on my server to try more:)

allozaur · 2026-03-30T21:50:44Z

> All that remains is to correct the agent turn display options (visually useful only if there are multiple CoT + Toolcall pairs), and fix the raw Markdown display switch. I've rebased the pull request on my server to try more:)

These should already be fixed by now.

aldehir · 2026-04-01T06:35:22Z

Good work on this PR. The conversation history looks like it's a proper alternating sequence of assistant/tool roles.

One slight issue: the reasoning content is still not sent back on assistant messages, so I opened #21249 to address it.
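As a rough illustration of that issue, the sketch below builds request messages from stored ones and optionally attaches the reasoning text to assistant messages. Everything here is an assumption for illustration: the `StoredMessage` shape, the `toApiMessages` helper, and the use of a `reasoning_content` field on outgoing messages are not taken from this PR or from #21249.

```typescript
// Hypothetical stored shape: reasoning kept alongside the visible content.
type StoredMessage =
  | { role: "user" | "system"; content: string }
  | { role: "assistant"; content: string; reasoning?: string };

// Hypothetical request shape; `reasoning_content` is an assumed field name.
type ApiMessage = { role: string; content: string; reasoning_content?: string };

function toApiMessages(stored: StoredMessage[], includeReasoning: boolean): ApiMessage[] {
  return stored.map((m) => {
    const out: ApiMessage = { role: m.role, content: m.content };
    // Only assistant messages carry reasoning, and only when the caller asks for it.
    if (includeReasoning && m.role === "assistant" && m.reasoning) {
      out.reasoning_content = m.reasoning;
    }
    return out;
  });
}

const storedExample: StoredMessage[] = [
  { role: "user", content: "Is 17 prime?" },
  { role: "assistant", content: "Yes.", reasoning: "17 has no divisors other than 1 and itself." },
];
```

Calling `toApiMessages(storedExample, true)` keeps the reasoning on the assistant message, while passing `false` drops it, which corresponds to the behavior reported above.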

…i-compat (ggml-org#21090)

* server/webui: cleanup dual representation approach, simplify to openai-compat
* feat: Fix regression for Agentic Loop UI
* chore: update webui build output
* refactor: Post-review code improvements
* chore: update webui build output
* refactor: Cleanup
* chore: update webui build output

---------

Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>

pwilkin requested a review from a team as a code owner March 28, 2026 00:06

github-actions Bot added examples server labels Mar 28, 2026

loci-dev mentioned this pull request Mar 28, 2026

UPSTREAM PR #21090: server/webui: cleanup dual representation approach, simplify to openai-compat auroralabs-loci/llama.cpp#1306

Open

ServeurpersoCom approved these changes Mar 28, 2026

allozaur force-pushed the cleanup-webui branch from 7b73e73 to 863641c Compare March 30, 2026 16:32

pwilkin and others added 3 commits March 30, 2026 18:54

server/webui: cleanup dual representation approach, simplify to opena…

ae9bc9d

…i-compat

feat: Fix regression for Agentic Loop UI

13da99c

chore: update webui build output

17f4ed4

allozaur added 2 commits March 30, 2026 19:28

refactor: Post-review code improvements

963f354

chore: update webui build output

36128db

allozaur force-pushed the cleanup-webui branch from 863641c to 36128db Compare March 30, 2026 17:29

allozaur added 2 commits March 30, 2026 21:24

refactor: Cleanup

446c6d4

chore: update webui build output

bde9f85

allozaur approved these changes Mar 30, 2026

Comment thread tools/server/webui/src/lib/stores/conversations.svelte.ts

Comment thread tools/server/webui/src/lib/types/chat.d.ts Outdated

Comment thread tools/server/webui/src/lib/utils/legacy-migration.ts

allozaur requested a review from ggerganov March 30, 2026 19:54

ServeurpersoCom approved these changes Mar 30, 2026

allozaur merged commit 4453e77 into ggml-org:master Mar 31, 2026
6 checks passed

4 participants