Skip to content

feat(agent): expose agent version to UI for capability gating#2031

Merged
VojtechBartos merged 1 commit intomainfrom
posthog-code/agent-version-gating
May 5, 2026
Merged

feat(agent): expose agent version to UI for capability gating#2031
VojtechBartos merged 1 commit intomainfrom
posthog-code/agent-version-gating

Conversation

@VojtechBartos
Copy link
Copy Markdown
Member

@VojtechBartos VojtechBartos commented May 5, 2026

Problem

Cloud sandbox agents can run a different (typically older) version than the desktop UI expects. We've had multiple incidents where the UI assumed a notification, field, or behaviour that the cloud agent was too old to emit, and the only fallback today is racing a cloud agent rollout to match desktop expectations.

There's currently no way for the renderer to know which agent version it's talking to — the agent only logs its version locally and stamps it onto outbound user-agent strings.

Solution

Plumb the agent's semver from the agent process to the renderer and provide a small range-based gating helper.

  • Agent emits agentVersion on the _posthog/run_started notification (and on the SSE connected event for symmetry). Putting it on run_started is deliberate — that notification is persisted to the session log and replayed on warm sandbox reconnects, so the renderer re-learns the version after the sandbox restarts.
  • Renderer captures the version into the session store, exposes it via useAgentVersion(taskId), and adds useIsAgentVersion(taskId, range) / isAgentVersion(version, range) for gating. Backed by the semver package so the full range grammar works:
    const canDoNewThing = useIsAgentVersion(taskId, ">=0.40.1");
    const canDoNewThing = useIsAgentVersion(taskId, ">=0.40.0 <1.0.0");
    Fails closed when the agent version is unknown, so an unknown agent never accidentally enables a newer code path.

Local runs

This works for local sessions too — the version-capture branch is unconditional, only the cloud-specific status: "connected" flip stays gated on isCloud. So the same useIsAgentVersion(taskId, …) call works regardless of where the session runs, which gives us consistent gating behaviour across local/cloud (useful for dev builds where someone might be running a stale local agent).


Created with PostHog Code

@VojtechBartos VojtechBartos force-pushed the posthog-code/agent-version-gating branch 2 times, most recently from 6353658 to aa442c3 Compare May 5, 2026 13:36
@VojtechBartos VojtechBartos self-assigned this May 5, 2026
@VojtechBartos VojtechBartos requested a review from a team May 5, 2026 13:47
@VojtechBartos VojtechBartos marked this pull request as ready for review May 5, 2026 13:47
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 5, 2026

Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
apps/code/src/renderer/utils/agentVersion.test.ts:5-35
These test blocks contain multiple inline assertions for clearly data-driven inputs, which the team's style guide flags as a preference for parameterised tests (`it.each`). If the first assertion fails, subsequent ones never run, making debugging harder. Consider refactoring into `it.each` tables.

```suggestion
  it.each([
    ["0.40.1", ">=0.40.1", true],
    ["0.41.0", ">=0.40.1", true],
    ["1.0.0", ">=0.40.1", true],
    ["0.40.0", ">=0.40.1", false],
    ["0.39.99", ">=0.40.1", false],
    ["1.0.1", ">1.0.0", true],
    ["1.0.0", ">1.0.0", false],
    ["0.9.9", "<1.0.0", true],
    ["1.0.0", "<1.0.0", false],
    ["1.0.0", "<=1.0.0", true],
    ["1.2.5", "^1.2.0", true],
    ["2.0.0", "^1.2.0", false],
    ["1.2.5", "~1.2.0", true],
    ["1.3.0", "~1.2.0", false],
    ["0.50.0", ">=0.40.0 <1.0.0", true],
    ["1.0.0", ">=0.40.0 <1.0.0", false],
    ["0.39.0", ">=0.40.0 <1.0.0", false],
  ])(
    "isAgentVersion(%s, %s) === %s",
    (version, range, expected) => {
      expect(isAgentVersion(version, range)).toBe(expected);
    },
  );
```

### Issue 2 of 2
apps/code/src/renderer/features/sessions/service/service.test.ts:1534-1587
**Missing local-session coverage for `agentVersion` capture**

The PR description explicitly states that version capture is unconditional — it applies to both cloud and local sessions. The new test only exercises the cloud path (`isCloud: true`). A complementary test for a local (non-cloud) session verifying that `agentVersion` is stored — and that `status` is NOT flipped to `"connected"` — would confirm the unconditional behaviour described in the PR and guard against an accidental `isCloud` gate being added in future.

Reviews (1): Last reviewed commit: "feat(agent): expose agent version to UI ..." | Re-trigger Greptile

Comment on lines +5 to +35
it("returns true when actual satisfies a >= range", () => {
expect(isAgentVersion("0.40.1", ">=0.40.1")).toBe(true);
expect(isAgentVersion("0.41.0", ">=0.40.1")).toBe(true);
expect(isAgentVersion("1.0.0", ">=0.40.1")).toBe(true);
});

it("returns false when actual is below a >= range", () => {
expect(isAgentVersion("0.40.0", ">=0.40.1")).toBe(false);
expect(isAgentVersion("0.39.99", ">=0.40.1")).toBe(false);
});

it("supports strict >, <, and <= comparators", () => {
expect(isAgentVersion("1.0.1", ">1.0.0")).toBe(true);
expect(isAgentVersion("1.0.0", ">1.0.0")).toBe(false);
expect(isAgentVersion("0.9.9", "<1.0.0")).toBe(true);
expect(isAgentVersion("1.0.0", "<1.0.0")).toBe(false);
expect(isAgentVersion("1.0.0", "<=1.0.0")).toBe(true);
});

it("supports caret and tilde ranges", () => {
expect(isAgentVersion("1.2.5", "^1.2.0")).toBe(true);
expect(isAgentVersion("2.0.0", "^1.2.0")).toBe(false);
expect(isAgentVersion("1.2.5", "~1.2.0")).toBe(true);
expect(isAgentVersion("1.3.0", "~1.2.0")).toBe(false);
});

it("supports compound ranges", () => {
expect(isAgentVersion("0.50.0", ">=0.40.0 <1.0.0")).toBe(true);
expect(isAgentVersion("1.0.0", ">=0.40.0 <1.0.0")).toBe(false);
expect(isAgentVersion("0.39.0", ">=0.40.0 <1.0.0")).toBe(false);
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 These test blocks contain multiple inline assertions for clearly data-driven inputs, which the team's style guide flags as a preference for parameterised tests (it.each). If the first assertion fails, subsequent ones never run, making debugging harder. Consider refactoring into it.each tables.

Suggested change
it("returns true when actual satisfies a >= range", () => {
expect(isAgentVersion("0.40.1", ">=0.40.1")).toBe(true);
expect(isAgentVersion("0.41.0", ">=0.40.1")).toBe(true);
expect(isAgentVersion("1.0.0", ">=0.40.1")).toBe(true);
});
it("returns false when actual is below a >= range", () => {
expect(isAgentVersion("0.40.0", ">=0.40.1")).toBe(false);
expect(isAgentVersion("0.39.99", ">=0.40.1")).toBe(false);
});
it("supports strict >, <, and <= comparators", () => {
expect(isAgentVersion("1.0.1", ">1.0.0")).toBe(true);
expect(isAgentVersion("1.0.0", ">1.0.0")).toBe(false);
expect(isAgentVersion("0.9.9", "<1.0.0")).toBe(true);
expect(isAgentVersion("1.0.0", "<1.0.0")).toBe(false);
expect(isAgentVersion("1.0.0", "<=1.0.0")).toBe(true);
});
it("supports caret and tilde ranges", () => {
expect(isAgentVersion("1.2.5", "^1.2.0")).toBe(true);
expect(isAgentVersion("2.0.0", "^1.2.0")).toBe(false);
expect(isAgentVersion("1.2.5", "~1.2.0")).toBe(true);
expect(isAgentVersion("1.3.0", "~1.2.0")).toBe(false);
});
it("supports compound ranges", () => {
expect(isAgentVersion("0.50.0", ">=0.40.0 <1.0.0")).toBe(true);
expect(isAgentVersion("1.0.0", ">=0.40.0 <1.0.0")).toBe(false);
expect(isAgentVersion("0.39.0", ">=0.40.0 <1.0.0")).toBe(false);
});
it.each([
["0.40.1", ">=0.40.1", true],
["0.41.0", ">=0.40.1", true],
["1.0.0", ">=0.40.1", true],
["0.40.0", ">=0.40.1", false],
["0.39.99", ">=0.40.1", false],
["1.0.1", ">1.0.0", true],
["1.0.0", ">1.0.0", false],
["0.9.9", "<1.0.0", true],
["1.0.0", "<1.0.0", false],
["1.0.0", "<=1.0.0", true],
["1.2.5", "^1.2.0", true],
["2.0.0", "^1.2.0", false],
["1.2.5", "~1.2.0", true],
["1.3.0", "~1.2.0", false],
["0.50.0", ">=0.40.0 <1.0.0", true],
["1.0.0", ">=0.40.0 <1.0.0", false],
["0.39.0", ">=0.40.0 <1.0.0", false],
])(
"isAgentVersion(%s, %s) === %s",
(version, range, expected) => {
expect(isAgentVersion(version, range)).toBe(expected);
},
);
Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/code/src/renderer/utils/agentVersion.test.ts
Line: 5-35

Comment:
These test blocks contain multiple inline assertions for clearly data-driven inputs, which the team's style guide flags as a preference for parameterised tests (`it.each`). If the first assertion fails, subsequent ones never run, making debugging harder. Consider refactoring into `it.each` tables.

```suggestion
  it.each([
    ["0.40.1", ">=0.40.1", true],
    ["0.41.0", ">=0.40.1", true],
    ["1.0.0", ">=0.40.1", true],
    ["0.40.0", ">=0.40.1", false],
    ["0.39.99", ">=0.40.1", false],
    ["1.0.1", ">1.0.0", true],
    ["1.0.0", ">1.0.0", false],
    ["0.9.9", "<1.0.0", true],
    ["1.0.0", "<1.0.0", false],
    ["1.0.0", "<=1.0.0", true],
    ["1.2.5", "^1.2.0", true],
    ["2.0.0", "^1.2.0", false],
    ["1.2.5", "~1.2.0", true],
    ["1.3.0", "~1.2.0", false],
    ["0.50.0", ">=0.40.0 <1.0.0", true],
    ["1.0.0", ">=0.40.0 <1.0.0", false],
    ["0.39.0", ">=0.40.0 <1.0.0", false],
  ])(
    "isAgentVersion(%s, %s) === %s",
    (version, range, expected) => {
      expect(isAgentVersion(version, range)).toBe(expected);
    },
  );
```

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment on lines +1534 to +1587
it("captures agentVersion from run_started params onto the session", async () => {
const service = getSessionService();
const hydratedSession = createMockSession({
taskRunId: "run-123",
taskId: "task-123",
status: "disconnected",
isCloud: true,
events: [],
});
mockSessionStoreSetters.getSessionByTaskId.mockReturnValue(
hydratedSession,
);
mockSessionStoreSetters.getSessions.mockReturnValue({
"run-123": hydratedSession,
});
mockTrpcLogs.readLocalLogs.query.mockResolvedValue("");
mockTrpcLogs.fetchS3Logs.query.mockResolvedValue("{}");
mockTrpcLogs.writeLocalLogs.mutate.mockResolvedValue(undefined);

const runStartedEvent = {
type: "acp_message" as const,
ts: 1700000000,
message: {
jsonrpc: "2.0" as const,
method: "_posthog/run_started",
params: {
sessionId: "acp-session",
runId: "run-123",
taskId: "task-123",
agentVersion: "0.42.3",
},
},
};
mockConvertStoredEntriesToEvents.mockReturnValueOnce([runStartedEvent]);

service.watchCloudTask(
"task-123",
"run-123",
"https://api.anthropic.com",
123,
undefined,
"https://logs.example.com/run-123",
);

await vi.waitFor(() => {
expect(mockSessionStoreSetters.updateSession).toHaveBeenCalledWith(
"run-123",
expect.objectContaining({
agentVersion: "0.42.3",
status: "connected",
}),
);
});
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Missing local-session coverage for agentVersion capture

The PR description explicitly states that version capture is unconditional — it applies to both cloud and local sessions. The new test only exercises the cloud path (isCloud: true). A complementary test for a local (non-cloud) session verifying that agentVersion is stored — and that status is NOT flipped to "connected" — would confirm the unconditional behaviour described in the PR and guard against an accidental isCloud gate being added in future.

Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/code/src/renderer/features/sessions/service/service.test.ts
Line: 1534-1587

Comment:
**Missing local-session coverage for `agentVersion` capture**

The PR description explicitly states that version capture is unconditional — it applies to both cloud and local sessions. The new test only exercises the cloud path (`isCloud: true`). A complementary test for a local (non-cloud) session verifying that `agentVersion` is stored — and that `status` is NOT flipped to `"connected"` — would confirm the unconditional behaviour described in the PR and guard against an accidental `isCloud` gate being added in future.

How can I resolve this? If you propose a fix, please make it concise.

Comment thread pnpm-lock.yaml Outdated
Copy link
Copy Markdown
Contributor

@tatoalo tatoalo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Agent emits agentVersion on the _posthog/run_started notification (and on the SSE connected event for symmetry)"

Can we remove it? no need to have symmetry if that's just dead code

Cloud sandbox agents can run a different (older) version than the
desktop UI expects, which has caused multiple incidents where the UI
assumed a notification or behaviour the agent doesn't yet emit.

The agent now reports its semver on the `_posthog/run_started`
notification, the renderer caches it on the session, and feature
code can call `isAgentVersion(version, ">=0.40.1")` (full semver
range grammar via the `semver` package) to gate UI behaviour.

Generated-By: PostHog Code
Task-Id: 9329a81e-85d2-4a7e-84cf-c2a9d0b89237
@VojtechBartos VojtechBartos force-pushed the posthog-code/agent-version-gating branch from aa442c3 to 930b5ff Compare May 5, 2026 14:25
@VojtechBartos
Copy link
Copy Markdown
Member Author

@tatoalo both things addressed, thanks!

@VojtechBartos VojtechBartos enabled auto-merge (squash) May 5, 2026 14:30
@VojtechBartos VojtechBartos merged commit 375cf28 into main May 5, 2026
15 checks passed
@VojtechBartos VojtechBartos deleted the posthog-code/agent-version-gating branch May 5, 2026 14:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants