fix(widgets): include change magnitude in patch/render status text by nsyring · Pull Request #32 · agent0ai/space-agent

nsyring · 2026-04-26T21:11:41Z

fix(widgets): include change magnitude in patch/render status text

Summary

patchWidget, renderWidget, and upsertWidget currently return the same one-line success status regardless of how big the actual edit was: a one-line tweak reports Widget "X" patched, rendered ok, loaded to TRANSIENT. and a 600-line full-renderer rewrite reports the same wording. This PR exposes pre-write and post-write renderer line counts on the tool result envelope as structured fields and renders them as a magnitude fragment in the existing status string, so both the agent and a reviewer reading the chat see the size of the change at a glance.

After:

Small patch: Widget "X" patched, 353 renderer lines (was 351, +2), rendered ok, loaded to TRANSIENT.
Full rewrite: Widget "X" saved, 610 renderer lines (was 50, +560), rendered ok, loaded to TRANSIENT.
New widget: Widget "X" saved, 50 renderer lines, rendered ok, loaded to TRANSIENT.
Unchanged length: Widget "X" patched, 351 renderer lines, rendered ok, loaded to TRANSIENT.

Why

Without the magnitude, the tool result is ambiguous. The agent gets the same positive signal whether it patched two characters or rewrote the whole widget, and during local development I reproduced this scope-creep loop repeatedly:

1. User asks for a small UI tweak, e.g. "hide the API-key input field, we don't need it"
2. Agent patches the relevant lines correctly and the runtime confirms Widget "X" patched, rendered ok, loaded to TRANSIENT.
3. Agent then continues "cleaning up" (drops localStorage persistence, the setAuth(...) helper, Authorization header logic, etc.) and emits renderWidget(...) with a 296-line body to replace the existing 351-line renderer
4. Runtime confirms the rewrite with the same wording: Widget "X" saved, rendered ok, loaded to TRANSIENT.
5. Nothing in the result told the agent (or the reviewer reading the chat) that step 3 replaced the entire renderer for a request that only needed step 2

The point is not to forbid renderWidget — sometimes a full rewrite is the right call (broken renderer, explicit user request to start over). The point is to give the same observability that a Senior Dev applies in code review: the size of the change relative to the asked-for change. With magnitude visible in the status string, the agent's next-turn reasoning has the data it needs to spot scope creep, and a human reviewing the chat transcript can spot it at a glance.

This is the soft-nudge approach. No new skill rule, no validator veto, no API surface change for callers — just more information in an existing string the agent already reads every turn, plus structured numeric fields on the tool result envelope so eval harnesses, skill rules, and future tooling can read the change magnitude without parsing the status string.

What changed

`app/L0/_all/mod/_core/spaces/storage.js` (+24 / -3)

getWidgetRendererReadLines(widgetRecord) is unchanged in behavior; its dedent + LF-normalize + split body is extracted into a new getRendererSourceReadLines(rendererSource) helper so a raw renderer-source string can produce the same line array. A second helper countRendererSourceReadLines(rendererSource) returns the line count via the same path.
buildWidgetWriteResult(spaceRecord, widgetId, priorRendererSource = null) accepts an optional pre-write renderer source. It computes the line count via countRendererSourceReadLines(...) for both the prior source and the post-write widgetRecord.rendererSource, exposing them as priorRendererLineCount and nextRendererLineCount on the result envelope. The full source strings are not exposed.
patchWidget(...) and upsertWidget(...) capture the existing widget's rendererSource before they overwrite it and pass it through. For an upsertWidget call against a brand-new widget id, priorRendererSource is null, so only nextRendererLineCount is set; the magnitude formatter degrades cleanly.

The line counts use the same dedent + LF-normalize + split path that formatWidgetRecordForRead(widgetRecord) produces in widgetText, so the count printed in the status string always matches the highest line index the agent counts in the numbered renderer readback.
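The dedent + LF-normalize + split path can be sketched roughly as follows. This is a minimal illustration, not the code in storage.js: the function names are taken from the description above, but the dedent strategy (strip the smallest shared indentation), the handling of an empty string as zero lines, and the null return for a missing source are all assumptions.

```javascript
// Hypothetical sketch; the real helpers in spaces/storage.js may differ.
function getRendererSourceReadLines(rendererSource) {
  // Assumption: a missing or empty source yields no lines at all.
  if (typeof rendererSource !== "string" || rendererSource.length === 0) {
    return [];
  }
  // Normalize CRLF and lone CR to LF so the count is platform-independent.
  const lines = rendererSource.replace(/\r\n?/g, "\n").split("\n");
  const leadingWidth = (line) => line.match(/^[ \t]*/)[0].length;
  // Dedent: find the smallest indentation shared by all non-blank lines.
  const indents = lines
    .filter((line) => line.trim().length > 0)
    .map(leadingWidth);
  const shared = indents.length > 0 ? Math.min(...indents) : 0;
  // Never slice past a line's own leading whitespace (blank lines may be shorter).
  return lines.map((line) => line.slice(Math.min(shared, leadingWidth(line))));
}

function countRendererSourceReadLines(rendererSource) {
  // Assumption: null signals "no source", which is distinct from an empty source.
  if (typeof rendererSource !== "string") {
    return null;
  }
  return getRendererSourceReadLines(rendererSource).length;
}
```

Because the count is derived from the same array that a numbered readback would be built from, the two cannot drift apart as long as both go through the one helper.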

`app/L0/_all/mod/_core/spaces/store.js` (+19 / -3)

Adds one pure helper formatWidgetOperationChangeMagnitude({priorLineCount, nextLineCount}):

Number.isFinite guards against missing counts (e.g. on reloadWidget and other callers that don't carry a source mutation through buildWidgetWriteResult); returns "" so the existing status string is unchanged
Branches cleanly across new-widget ("50 renderer lines"), full-delete ("0 renderer lines (was 351)"), unchanged-length ("351 renderer lines"), growth ("353 renderer lines (was 351, +2)"), and shrink ("296 renderer lines (was 351, -55)")
ASCII - and + only — no Unicode minus sign, keeps status text 7-bit clean
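A formatter that produces the fragments listed above might look like the sketch below. The function name and option names come from this description; the body is an assumption reconstructed from the documented outputs, and returning "" for the prior-only case is a guess, since the description does not spell out that result.

```javascript
// Sketch only; reconstructed from the documented fragments, not copied from store.js.
function formatWidgetOperationChangeMagnitude({ priorLineCount, nextLineCount } = {}) {
  // No post-write count means no source mutation was carried through.
  if (!Number.isFinite(nextLineCount)) {
    return "";
  }
  const base = `${nextLineCount} renderer lines`;
  // New widget (no prior) or unchanged length: report the total only.
  if (!Number.isFinite(priorLineCount) || priorLineCount === nextLineCount) {
    return base;
  }
  // Full delete: the documented fragment omits the delta.
  if (nextLineCount === 0) {
    return `${base} (was ${priorLineCount})`;
  }
  const delta = nextLineCount - priorLineCount;
  // ASCII plus and hyphen-minus only, never U+2212.
  const sign = delta > 0 ? "+" : "-";
  return `${base} (was ${priorLineCount}, ${sign}${Math.abs(delta)})`;
}
```

Keeping the helper pure means the whole behavior matrix can be checked with a table of inputs and expected strings, without touching storage or rendering.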

formatWidgetOperationStatusText(...) accepts new priorLineCount and nextLineCount options; the magnitude fragment goes between the verb (patched/saved/...) and the render-status fragment (rendered ok/render failed/not live-tested). The substring , rendered ok, is preserved, just relocated within the sentence; consumers that grep on the verb or on the render-status fragment continue to work.

buildWidgetToolResult(...) reads priorRendererLineCount and nextRendererLineCount off the result envelope and passes them through. Storage results that did not carry a source mutation (e.g. reloadWidget) end up with both counts null, the magnitude fragment is empty, and the status string is byte-identical to today.
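The placement of the fragment, and why an empty fragment leaves the string byte-identical, can be illustrated with a simplified composer. The helper name composeStatusText and its parameters are hypothetical; the real formatWidgetOperationStatusText takes priorLineCount and nextLineCount rather than a pre-formatted magnitude string, and its other options are not shown in this description.

```javascript
// Hypothetical helper for illustration; not the signature used in store.js.
function composeStatusText({ widgetName, verb, magnitude = "", renderStatus, loadStatus }) {
  // Empty fragments are dropped, so a missing magnitude adds no stray ", ".
  const fragments = [verb, magnitude, renderStatus, loadStatus].filter(
    (fragment) => typeof fragment === "string" && fragment.length > 0
  );
  return `Widget "${widgetName}" ${fragments.join(", ")}.`;
}

// With a magnitude, the fragment sits between the verb and the render status.
const patched = composeStatusText({
  widgetName: "X",
  verb: "patched",
  magnitude: "353 renderer lines (was 351, +2)",
  renderStatus: "rendered ok",
  loadStatus: "loaded to TRANSIENT",
});

// Without one (e.g. a reload), the output matches the pre-change wording.
const reloaded = composeStatusText({
  widgetName: "X",
  verb: "reloaded",
  renderStatus: "rendered ok",
  loadStatus: "loaded to TRANSIENT",
});
```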

Behavior matrix

| Case | Status before | Status after |
| --- | --- | --- |
| `patchWidget` 1-line replace on existing | `patched, rendered ok, ...` | `patched, 353 renderer lines (was 351, +2), rendered ok, ...` |
| `renderWidget` on existing id | `saved, rendered ok, ...` | `saved, 296 renderer lines (was 351, -55), rendered ok, ...` |
| `renderWidget` for new id | `saved, rendered ok, ...` | `saved, 50 renderer lines, rendered ok, ...` |
| Patch with no length change | `patched, rendered ok, ...` | `patched, 351 renderer lines, rendered ok, ...` |
| Full delete (rare, replace-with-empty edge case) | `patched, rendered ok, ...` | `patched, 0 renderer lines (was 351), rendered ok, ...` |
| `reloadWidget` (no source mutation) | `reloaded, rendered ok, ...` | unchanged (no prior/next on the result) |
| `removeWidget` and other no-source-write paths | unchanged | unchanged (no prior/next on the result) |

No callers of formatWidgetOperationStatusText or buildWidgetWriteResult need to change. The new line-count info is purely additive to buildWidgetWriteResult's return value, and the new options on the format function default to null/undefined and skip the fragment.

Structured access for tooling

Skill rules, eval harnesses, and downstream tools can read the change magnitude via the structured fields rather than regex-extracting from the status string:

const result = await space.current.patchWidget(...);
// result is the existing object plus:
//   priorRendererLineCount: 351,  // when present
//   nextRendererLineCount:  353,  // when present

This is the upgrade path mentioned by the reviewer: a future skill rule that reacts to large rewrites can read these numeric fields directly instead of parsing strings.
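As a rough idea of what a consumer could do with those fields, the sketch below classifies a tool result without touching the status string. The field names come from this PR; the function describeRendererChange and the shape of its return value are hypothetical and exist only for illustration.

```javascript
// Hypothetical consumer; only the two field names are taken from the PR.
function describeRendererChange(result) {
  const prior = result?.priorRendererLineCount;
  const next = result?.nextRendererLineCount;
  // No post-write count: the operation did not carry a source mutation.
  if (!Number.isFinite(next)) {
    return { kind: "unknown" };
  }
  // A post-write count without a prior one indicates a brand-new widget.
  if (!Number.isFinite(prior)) {
    return { kind: "created", next };
  }
  return { kind: "modified", prior, next, delta: next - prior };
}
```

An eval harness could, for example, log the delta for every modified result and flag sessions where a small request was followed by a large negative or positive delta.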

Test plan

node --check on both modified files passes
Pure-helper sanity table verified: new widget, small patch, full rewrite, full delete, unchanged length, growth, shrink, missing-data and prior-only cases all return the documented status fragments. ASCII - confirmed; no U+2212.
Existing tests/agent_llm_performance fixtures preserved as-is. The fixtures encode historic conversation snapshots used as model-input contexts; the test harness does not assert on the exact framework status string (grep -c "patched, rendered ok" tests/agent_llm_performance/test.mjs → 0). New conversations will reach the model with the magnitude fragment included; the existing fixtures continue to drive the eval baseline they were captured against.
Manual verification across two providers in npm run desktop:pack builds:
- gpt-5.4 via OpenAI Codex provider — renderWidget rewrites now surface their magnitude (e.g. saved, 296 renderer lines (was 351, -55), rendered ok, ...); patch operations on the same widget show +2-ish magnitudes. The contrast is immediately legible in the chat transcript.
- Local qwen3-coder model — same status format. The smaller model's tool-result rendering reads the magnitude fragment cleanly, no breakage for any caller.

Out of scope (possible follow-ups)

A skill rule that reacts to magnitude (e.g. "if a renderWidget would change >40% of lines, prefer patchWidget unless explicit"). The current PR keeps SKILL.md untouched and lets the agent decide; the magnitude is just data, not policy. With priorRendererLineCount/nextRendererLineCount exposed on the result envelope, such a rule has structured input.
Counting changed lines instead of total-line delta (a +1 status can hide a 200-line shuffle that produces a net +1). Diff-aware metrics would need a real line-diff implementation; the current line-count delta is cheap, deterministic, and good enough to flag the common scope-creep cases reproduced above.
Updating the tests/agent_llm_performance fixtures with the new format. The fixtures are captured snapshots of historic sessions used as eval baselines; updating them would invalidate prior eval comparisons. A new fixture set on the post-PR format would be the right shape.
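To make the net-delta limitation concrete, the sketch below counts added and removed lines using a longest-common-subsequence length. It is not part of this PR and not a proposal for the follow-up's design; countChangedLines is a hypothetical name, and the quadratic dynamic program is chosen for brevity rather than performance.

```javascript
// Illustration only: an LCS-based changed-line count, not code from this PR.
function countChangedLines(priorLines, nextLines) {
  // Rolling-row dynamic program for the LCS length of the two line arrays.
  let previous = new Array(nextLines.length + 1).fill(0);
  for (let i = 1; i <= priorLines.length; i += 1) {
    const current = [0];
    for (let j = 1; j <= nextLines.length; j += 1) {
      current[j] =
        priorLines[i - 1] === nextLines[j - 1]
          ? previous[j - 1] + 1
          : Math.max(previous[j], current[j - 1]);
    }
    previous = current;
  }
  const common = previous[nextLines.length];
  // Lines outside the common subsequence were either removed or added.
  return {
    removed: priorLines.length - common,
    added: nextLines.length - common,
  };
}

// A reorder plus one new line: the net delta is +1, but most lines moved.
const shuffled = countChangedLines(["a", "b", "c", "d"], ["d", "c", "b", "a", "e"]);
```

In this example the line-count delta is +1 while the diff-aware count reports three removals and four additions, which is the kind of case the total-line delta cannot distinguish from a genuine one-line edit.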

🤖 Generated with Claude Code

Tool results from patchWidget, renderWidget, and upsertWidget previously returned the same one-line success status regardless of whether the call replaced two characters or rewrote the entire renderer. The agent could not tell from the status whether its action was proportional to the request, and a reviewer reading the chat transcript could not spot scope creep at a glance.

This change captures the renderer source size before the write, exposes prior and next renderer line counts as structured numeric fields on the write result envelope, and renders them as a magnitude fragment in the existing status string between the verb and the render-status fragment:

- "patched, 353 renderer lines (was 351, +2), rendered ok, ..."
- "saved, 296 renderer lines (was 351, -55), rendered ok, ..."
- "saved, 50 renderer lines, rendered ok, ..." (new widget, no prior)
- "patched, 351 renderer lines, rendered ok, ..." (no length change)
- "reloaded, rendered ok, ..." (unchanged)

Implementation:

- spaces/storage.js extracts the dedent + LF-normalize + split body of getWidgetRendererReadLines into shared helpers (getRendererSourceReadLines, countRendererSourceReadLines) so a raw renderer source string can produce the same line count the agent derives from the numbered renderer readback. The line counts in the status are guaranteed to match the highest line index visible in widgetText.
- buildWidgetWriteResult(spaceRecord, widgetId, priorRendererSource) accepts the captured pre-write source and exposes priorRendererLineCount and nextRendererLineCount on the result envelope as structured numbers. Full source strings are not exposed.
- patchWidget and upsertWidget thread the existing source through.
- spaces/store.js adds formatWidgetOperationChangeMagnitude that renders the fragment with ASCII +/- only, returns empty string when counts are missing so callers without source mutation (reloadWidget, removeWidget) keep their existing status text byte-identical.
- formatWidgetOperationStatusText accepts priorLineCount and nextLineCount options and inserts the magnitude fragment between the verb and the render-status fragment, preserving the existing substrings consumers grep on.

The magnitude is observability, not policy: existing skill rules continue to own when patchWidget vs renderWidget is appropriate. The fragment just gives the agent and reviewer the same data a Senior Dev applies in code review. Skill rules and eval harnesses that want to react programmatically can read the structured priorRendererLineCount / nextRendererLineCount fields without parsing the status string.

nsyring wants to merge 1 commit into agent0ai:main from nsyring:fix/widget-status-change-magnitude
