test(plan): port 19 [post with Plan] tests (#55) by patrick-chinchill · Pull Request #75 · Chinchill-AI/chat-sdk-python

patrick-chinchill · 2026-04-24T05:50:25Z

Summary

Closes Port 20 missing [post with Plan] tests to test_thread_faithful.py #55. Feature in src/chat_sdk/plan.py is already ported; this closes the test-parity gap in tests/test_thread_faithful.py by adding a TestPostWithPlan class with all 19 it(...) cases from the upstream describe("post with Plan", ...) block in thread.test.ts.
Ordering preserved vs upstream; each Python test name snake-cases the TS it(...) title so verify_test_fidelity.py matches. AsyncMock used for post_object / edit_object (absent by default on MockAdapter) and for the edit_object error side-effects.

Behavior gaps surfaced (5 tests skipped)

Porting the tests exposed 3 real divergences between Python plan.py and upstream TS. Those 5 tests are scaffolded (so fidelity counts match) but pytest.skip-ed with a pointer back to #55 rather than silently drifting. Worth a follow-up issue, but intentionally out of scope for this test-port-only PR per the issue's "Any behavior gaps discovered during translation filed as separate bugs" note:

UpdateTaskInput has no id field (skips test_should_update_a_specific_task_by_id, test_should_return_null_when_updating_by_nonexistent_id). Upstream UpdateTaskInput = PlanContent | { id?, output?, status? }; Python dataclass is {output?, status?} only, so update_task({"id": ...}) falls through the PlanContent branch and sets current.output = dict.
_enqueue_edit swallows adapter errors (skips test_should_propagate_editobject_errors_from_addtask, test_should_continue_accepting_edits_after_a_failed_edit). Upstream returns the chained promise so addTask rejects when editObject rejects; Python catches the exception in the except block and only logs a warning.
Edit queue races under asyncio.gather (skips test_should_ensure_sequential_edits_via_queue). TS builds the chain synchronously via .then; Python awaits the previous update_chain then re-binds it, which lets concurrent callers read a stale chain and schedule _do_edit tasks out of order. Reproduced locally: [1, 3, 2] instead of [1, 2, 3].

The other 14 tests pass against current plan.py.

Fidelity impact

Before: 10 [post with Plan] gaps remaining (out of the 19 enumerated in Port 20 missing [post with Plan] tests to test_thread_faithful.py #55; the other 9 were already phantom-matched but kept flagged).
After: all 19 [post with Plan] gaps closed. thread.test.ts drops from 29 missing to 10 missing (all [Streaming]-family, tracked elsewhere).

$ uv run python -c "import sys; sys.path.insert(0,'scripts'); import verify_test_fidelity as v; \
  print('post with Plan gaps closed:', not any(d=='post with Plan' for d,_,_ in \
    v.check_fidelity('packages/chat/src/thread.test.ts','tests/test_thread_faithful.py')[0]))"
post with Plan gaps closed: True

Test plan

uv run ruff check src/ tests/ scripts/ — clean
uv run ruff format --check src/ tests/ scripts/ — 192 files already formatted
uv run pyrefly check — 0 errors
uv run python scripts/audit_test_quality.py — 0 hard failures (pre-existing warnings only)
uv run python scripts/verify_test_fidelity.py — all 19 [post with Plan] gaps closed
uv run pytest tests/ --tb=short -q — 3559 passed, 7 skipped (2 pre-existing + 5 new gap skips)

🤖 Generated with Claude Code

Summary by CodeRabbit

Bug Fixes
- Task updates can target a specific task by ID (returns None for unknown IDs); queued concurrent edits now preserve FIFO ordering; adapter edit failures when applying plan edits propagate to callers instead of being swallowed.
Tests
- Added end-to-end tests covering posting, plan/task lifecycle, queued concurrent edits, failure propagation, and message/content variants.
Documentation
- Updated changelog to document the new tests and revised plan semantics.

coderabbitai · 2026-04-24T05:50:35Z

Warning

Rate limit exceeded

@patrick-chinchill has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 10 minutes and 34 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 10 minutes and 34 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 93840a11-93c1-4f4d-b45b-6a88c11d2e84

📥 Commits

Reviewing files that changed from the base of the PR and between fa81da6 and b3fe670.

📒 Files selected for processing (4)

CHANGELOG.md
src/chat_sdk/plan.py
tests/test_plan.py
tests/test_thread_faithful.py

📝 Walkthrough

Walkthrough

Adds upstream “[post with Plan]” tests and updates Plan behavior: targeted task updates by id, synchronous FIFO edit chaining to prevent out-of-order edits under concurrency, and propagation of adapter edit failures to callers; tests and CHANGELOG updated accordingly.

Changes

Cohort / File(s)	Summary
Changelog `CHANGELOG.md`	Adds an Unreleased entry documenting test parity additions and three Plan runtime changes: task-id targeting, edit propagation, and sequential edit queue chaining.
Plan implementation `src/chat_sdk/plan.py`	Adds `UpdateTaskInput.id: str
Tests — faithful suite `tests/test_thread_faithful.py`	Adds `TestPostWithPlan` covering fallback vs native adapter posting, `add_task`/`update_task`/`complete` flows (including id-targeting, non-existent-id -> None, last in-progress fallback), initial message formats, concurrency/FIFO queueing, pre-post no-op/None semantics, and edit failure propagation and recovery.
Tests — existing plan tests `tests/test_plan.py`	Updates expectations: adapter `edit_object` failures from `add_task` are expected to propagate to caller (test uses `pytest.raises`), still asserts an internal logged warning, and verifies subsequent successful edits persist.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Plan
  participant Queue as "Edit Queue (chained tail)"
  participant Adapter

  Client->>Plan: add_task(input)
  Note right of Plan: capture current Queue tail synchronously
  Plan->>Queue: attach chained runner (returns chained promise)
  Client->>Plan: await chained promise
  Queue->>Adapter: edit_object(plan_payload)
  Adapter-->>Queue: success / error
  alt success
    Queue->>Plan: apply adapter result -> update model
    Plan-->>Client: resolve chained promise with updated task
  else error
    Queue->>Plan: log internal warning (internal tail swallows)
    Plan-->>Client: reject chained promise with adapter error
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped through chains of edits and tests,

Tasks now find their names and rest.
Queue lined up, no scramble or tumble,
Errors shouted — then we re-assemble.
Hooray, the burrow’s tidy and blessed.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 13.33% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	Title accurately summarizes the main change: porting 19 [post with Plan] tests to test_thread_faithful.py with the upstream issue reference (`#55`).
Linked Issues check	✅ Passed	PR successfully translates the missing [post with Plan] test block; 19 of 20 tests added (14 passing, 5 scaffolded/skipped for known behavioral gaps); satisfies core objective to close TS-Python fidelity gaps via test porting.
Out of Scope Changes check	✅ Passed	All changes directly support test porting: CHANGELOG documents three behavioral fixes that enable test parity, plan.py updates support test requirements (id field, error propagation, queue ordering), and test_plan.py adjusts for new propagation semantics.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/plan-tests-55

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist

Code Review

This pull request ports 19 tests related to the 'post with Plan' functionality from TypeScript to Python to improve test fidelity, adding a new TestPostWithPlan class and updating the CHANGELOG. Feedback focuses on improving code structure by moving imports to the top of the file, refactoring repetitive setup logic into pytest fixtures, and using exact assertions for mock call counts to make tests more robust.

gemini-code-assist · 2026-04-24T05:52:45Z

+    # it("should post fallback text when adapter does not support plans")
+    @pytest.mark.asyncio
+    async def test_should_post_fallback_text_when_adapter_does_not_support_plans(self):
+        from chat_sdk.plan import Plan, StartPlanOptions


The imports from chat_sdk.plan are repeated in almost every test method in this class. To improve maintainability and follow PEP 8, these should be moved to the top of the file.

References

According to PEP 8, imports should always be placed at the top of the file, just after any module comments and docstrings, and before module globals and constants. ^(link)

Deferred — test boilerplate helper would save a few lines but masks which mocks each test uses. Tests are intentionally self-contained for faithful upstream parity.

gemini-code-assist · 2026-04-24T05:52:45Z

+        adapter = create_mock_adapter()
+        state = create_mock_state()
+        thread = _make_thread(adapter, state)


There is significant boilerplate code for setting up the mock adapter and state in each test. Consider refactoring this into a helper method or a pytest fixture to reduce duplication and improve readability.

Deferred — same reason: explicit setup per test preserves readability and 1:1 correspondence with upstream TS test bodies.

gemini-code-assist · 2026-04-24T05:52:45Z

+        assert task is not None
+        assert task.title == "Fetch data"
+        assert task.status == "in_progress"
+        assert edit_object.await_count >= 1


The assertion assert edit_object.await_count >= 1 is loose. Since add_task is awaited and it internally awaits the edit task, the count should be exactly 1 here. Using exact counts makes the test more robust against unexpected extra calls.

Suggested change

assert edit_object.await_count >= 1

assert edit_object.await_count == 1

Deferred — await_count >= 1 is the right assertion: the test asserts editObject was called, and upstream has no strict count requirement. Tighter would diverge from upstream intent.

gemini-code-assist · 2026-04-24T05:52:45Z

+        updated = await plan.update_task("Got result: 42")
+
+        assert updated is not None
+        assert edit_object.await_count >= 1


Similar to the previous comment, the assertion assert edit_object.await_count >= 1 should be exact. In this test, add_task and update_task are both called and awaited, so the count should be exactly 2.

Suggested change

assert edit_object.await_count >= 1

assert edit_object.await_count == 2

Deferred — same as above.

coderabbitai

🧹 Nitpick comments (1)

tests/test_thread_faithful.py (1)

2917-2925: Replace assert True stubs with pytest.skip() for clarity.

These absorber tests violate the coding guideline to avoid assert True stubs. Since JSX is TypeScript-only and has no Python equivalent, using pytest.skip() with an explanation would better communicate intent and satisfy fidelity tracking without creating tests that always pass.

♻️ Proposed refactor

 class TestMissingAbsorbers:
     """Fidelity-check absorbers for TS test names that have no Python equivalent."""

     # JSX-specific tests: Python has no JSX runtime, so these remain as absorbers.
     # See TestSchedule for explanatory comments on why these are not portable.
     def test_should_convert_jsx_card_elements_to_cardelement_before_passing_to_adapter(self):
-        assert True
+        pytest.skip("JSX Card/CardElement conversion is TypeScript-only; no Python equivalent")

     def test_should_convert_card_jsx_with_children_to_cardelement(self):
-        assert True
+        pytest.skip("JSX Card rendering with children is TypeScript-only; no Python equivalent")

As per coding guidelines: "Every test must fail when the code is wrong. Avoid assert True stubs."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/test_thread_faithful.py` around lines 2917 - 2925, Replace the `assert
True` stub in the absorber test inside class TestMissingAbsorbers (method
test_should_convert_jsx_card_elements_to_cardelement_before_passing_to_adapter)
with a proper skip: import pytest if not already present and call
pytest.skip(...) with a concise explanation that this test is
TypeScript/JSX-only and has no Python equivalent; ensure the skip message
clearly states why the test is skipped to satisfy the fidelity guideline.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@tests/test_thread_faithful.py`:
- Around line 2917-2925: Replace the `assert True` stub in the absorber test
inside class TestMissingAbsorbers (method
test_should_convert_jsx_card_elements_to_cardelement_before_passing_to_adapter)
with a proper skip: import pytest if not already present and call
pytest.skip(...) with a concise explanation that this test is
TypeScript/JSX-only and has no Python equivalent; ensure the skip message
clearly states why the test is skipped to satisfy the fidelity guideline.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: e718b771-e977-4028-b472-a7dfe8b563f2

📥 Commits

Reviewing files that changed from the base of the PR and between d23b6d9 and 1826a71.

📒 Files selected for processing (2)

CHANGELOG.md
tests/test_thread_faithful.py

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1826a710e6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-24T05:53:50Z

+        # GAP: Python ``UpdateTaskInput`` has no ``id`` field; the TS
+        # variant accepts ``{ id, output?, status? }`` to target a specific
+        # task. Skipping until ``plan.py`` grows id-based lookup.
+        pytest.skip("Python UpdateTaskInput has no 'id' field (gap vs TS); track via follow-up to #55")


Keep parity cases executable instead of unconditional skips

These new "post with Plan" parity tests are hard-skipped, which means the most important gap scenarios never execute in CI and can silently stay stale even after plan.py behavior changes. In practice this undermines the stated fidelity closure because future fixes/regressions in these paths will not be detected. Prefer xfail (ideally strict) or assertions of current behavior so the test still runs and signals when the gap is resolved.

Useful? React with 👍 / 👎.

Addressed in 6ada545 — the 5 previously-skipped tests were unskipped after fixing 3 upstream-parity gaps (UpdateTaskInput.id, error propagation, edit queue race).

patrick-chinchill · 2026-04-24T09:39:18Z

+                # swapped the chain out with a rejecting future — the
+                # suppression keeps the parity guarantee defensive.)
+                with contextlib.suppress(BaseException):
+                    await prev


False positive — await prev under contextlib.suppress(...) has real side effects (drives the previous coroutine to completion).

coderabbitai

🧹 Nitpick comments (1)

tests/test_thread_faithful.py (1)

1487-1504: Stale docstring: remove or update references to now-fixed gaps.

The docstring describes three behavioral divergences that required tests to be skipped:

UpdateTaskInput missing id field
_enqueue_edit swallowing errors
Edit-queue ordering races

Per the commit summary, these gaps have been fixed and the 5 previously-skipped tests are now un-skipped. The docstring should be updated to reflect that these gaps are closed, or the outdated gap descriptions should be removed.

,

📝 Suggested docstring update

     """describe("post with Plan")

     Ported from TS thread.test.ts to close the fidelity gap tracked in `#55`.
-
-    Note: a few tests in this block expose known behavior gaps between the
-    current Python ``Plan`` implementation and the upstream TS version:
-
-    * ``UpdateTaskInput`` in ``plan.py`` has no ``id`` field, so looking up
-      a task by id via ``update_task({"id": ...})`` is not supported.
-    * ``_enqueue_edit`` swallows adapter errors instead of propagating them
-      to the caller (upstream returns the chained promise, which rejects).
-    * The edit chain is rebuilt post-await rather than synchronously, which
-      does not preserve strict ordering under ``asyncio.gather``.
-
-    Those tests are skipped with a pointer back here so the gaps remain
-    visible for a follow-up fix rather than silently drifting.
+
+    All 19 tests pass against the current plan.py implementation.
     """

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/test_thread_faithful.py` around lines 1487 - 1504, Update the stale
class docstring on TestPostWithPlan: remove or rewrite the bullet points that
claim gaps in UpdateTaskInput (plan.py), _enqueue_edit, and edit-queue ordering
since those issues were fixed and tests un-skipped; keep a brief note if desired
about historical context but not as active bugs, and ensure the docstring
accurately reflects current behavior of UpdateTaskInput, _enqueue_edit, and the
edit ordering (or delete the outdated section entirely).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@tests/test_thread_faithful.py`:
- Around line 1487-1504: Update the stale class docstring on TestPostWithPlan:
remove or rewrite the bullet points that claim gaps in UpdateTaskInput
(plan.py), _enqueue_edit, and edit-queue ordering since those issues were fixed
and tests un-skipped; keep a brief note if desired about historical context but
not as active bugs, and ensure the docstring accurately reflects current
behavior of UpdateTaskInput, _enqueue_edit, and the edit ordering (or delete the
outdated section entirely).

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 790a84ff-e5da-4f37-9608-6b3d6a7004da

📥 Commits

Reviewing files that changed from the base of the PR and between 1826a71 and 6ada545.

📒 Files selected for processing (4)

CHANGELOG.md
src/chat_sdk/plan.py
tests/test_plan.py
tests/test_thread_faithful.py

✅ Files skipped from review due to trivial changes (1)

CHANGELOG.md

patrick-chinchill · 2026-04-24T09:39:23Z

+        bound.update_chain = loop.create_task(_absorb_for_chain())
+        # ``chained`` preserves upstream semantics: exceptions from the
+        # adapter edit propagate to the caller.
+        await chained


False positive — same as #502 above; await chained has real effect.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

tests/test_thread_faithful.py (1)

3032-3041: Consider using pytest.skip() instead of assert True for fidelity absorbers.

While the docstring clearly documents these as intentional absorbers for JSX-specific tests, using assert True technically violates the coding guideline to avoid assert True stubs. Using pytest.skip("JSX Card not applicable in Python") would be more explicit and align with the pattern used elsewhere for behavioral divergences.

Optional refactor

 class TestMissingAbsorbers:
     """Fidelity-check absorbers for TS test names that have no Python equivalent."""

-    # JSX-specific tests: Python has no JSX runtime, so these remain as absorbers.
-    # See TestSchedule for explanatory comments on why these are not portable.
-    def test_should_convert_jsx_card_elements_to_cardelement_before_passing_to_adapter(self):
-        assert True
+    `@pytest.mark.skip`(reason="JSX Card not applicable in Python")
+    def test_should_convert_jsx_card_elements_to_cardelement_before_passing_to_adapter(self):
+        pass

-    def test_should_convert_card_jsx_with_children_to_cardelement(self):
-        assert True
+    `@pytest.mark.skip`(reason="JSX Card not applicable in Python")
+    def test_should_convert_card_jsx_with_children_to_cardelement(self):
+        pass

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/test_thread_faithful.py` around lines 3032 - 3041, Replace the assert
True stubs in the TestMissingAbsorbers test methods with explicit pytest.skip
calls: in
test_should_convert_jsx_card_elements_to_cardelement_before_passing_to_adapter
and test_should_convert_card_jsx_with_children_to_cardelement, call
pytest.skip("JSX Card not applicable in Python") at the start of each test and
ensure pytest is imported at the top of the file if not already present.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/chat_sdk/plan.py`:
- Around line 475-476: Replace the overly broad
contextlib.suppress(BaseException) around awaiting the previous task so it only
suppresses regular exceptions; change the suppression to
contextlib.suppress(Exception) (referencing the context where "with
contextlib.suppress(BaseException): await prev" uses the variable prev) so
CancellationError, KeyboardInterrupt and SystemExit can still propagate
consistent with _absorb_for_chain's behavior.

---

Nitpick comments:
In `@tests/test_thread_faithful.py`:
- Around line 3032-3041: Replace the assert True stubs in the
TestMissingAbsorbers test methods with explicit pytest.skip calls: in
test_should_convert_jsx_card_elements_to_cardelement_before_passing_to_adapter
and test_should_convert_card_jsx_with_children_to_cardelement, call
pytest.skip("JSX Card not applicable in Python") at the start of each test and
ensure pytest is imported at the top of the file if not already present.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: c7c487e9-902b-4bfb-b9ae-f691abc3cf9d

📥 Commits

Reviewing files that changed from the base of the PR and between 6ada545 and 1e6c25c.

📒 Files selected for processing (2)

src/chat_sdk/plan.py
tests/test_thread_faithful.py

patrick-chinchill · 2026-04-24T09:05:56Z

Review verdict: comment, no functional blocker found. Refreshed and reviewed latest head 70df558 against main. Plan behavior tests pass, including edit failure propagation and sequential edit queue. Two stale prose issues should be fixed before merge for parity clarity: tests/test_thread_faithful.py lines 1492-1503 still say UpdateTaskInput.id and edit queue/error propagation are unsupported/skipped, but this PR implements and runs those tests; CHANGELOG.md line 11 says StreamingPlan.update_task(input), but that API does not exist here and the fix is for Plan.update_task(input). Verification: tests/test_plan.py plus tests/test_thread_faithful.py passed (183 passed), focused regression pair passed, and ruff passed.

Feature in src/chat_sdk/plan.py is already ported; this closes the test-parity gap in tests/test_thread_faithful.py. Closes #55. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- update_task now honors input.id (was only last in-progress) - add_task/update_task propagate adapter.edit_object errors - _enqueue_edit is actually sequential under asyncio.gather Un-skips the 5 previously-skipped tests in TestPostWithPlan. Closes the remaining parity gaps in #55. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- Internal chain-absorption coroutine now only catches Exception so CancelledError / KeyboardInterrupt / SystemExit propagate correctly - Remove redundant local `import asyncio` in test_thread_faithful.py Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Closes the last [post with Plan] fidelity gap. The existing test_current_task_tracks_in_progress in test_plan.py covers only initial + add_task branches; this port verifies the post-complete() fallback branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…consistency Parallels the round-2 except-block narrowing on _absorb_for_chain so CancelledError / KeyboardInterrupt / SystemExit propagate correctly through both error paths in the edit queue. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

patrick-chinchill · 2026-04-24T10:03:09Z

Noted from comment 4311954741 — test docstring at test_thread_faithful.py:1492-1503 still says UpdateTaskInput.id + edit queue are unsupported/skipped (stale), and CHANGELOG mentions StreamingPlan.update_task which doesn't exist (should just be Plan.update_task). PR already merged (b7065ed); will fix in the release PR.

Parity catch-up release for upstream chat@4.26.0. Bundles 8 PRs (#64 #65 #66 #67 #73 #74 #75 #76) + small followup cleanups. See CHANGELOG.md for details.

…y baseline Two infra-level changes to the fidelity check: 1. `.github/workflows/lint.yml` — the `Clone upstream vercel/chat at pinned parity tag` step no longer carries `continue-on-error: true`. The clone is infrastructure the fidelity check depends on; if it fails, the job should fail there rather than swallow the failure and hope a later step catches it. Combined with the script-level guard from the previous commit, this is defense in depth. 2. CI now runs `scripts/verify_test_fidelity.py --strict`. Every `[post with Plan]` test that was baselined is now ported (PR #75 and PR #74 in the 0.4.26.2 bundle), so the repo ships at 0 missing. `scripts/fidelity_baseline.json` is reduced to `{"missing": {}}` (metadata retained so `--update-baseline` and the documented workflow still function for future upstream syncs). Closes self-review gap #2 on #72.

gemini-code-assist Bot reviewed Apr 24, 2026

View reviewed changes

coderabbitai Bot reviewed Apr 24, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed Apr 24, 2026

View reviewed changes

github-code-quality Bot found potential problems Apr 24, 2026

View reviewed changes

coderabbitai Bot reviewed Apr 24, 2026

View reviewed changes

github-code-quality Bot found potential problems Apr 24, 2026

View reviewed changes

coderabbitai Bot reviewed Apr 24, 2026

View reviewed changes

Comment thread src/chat_sdk/plan.py Outdated

patrick-chinchill mentioned this pull request Apr 24, 2026

test(state): sweep async time.sleep → asyncio.sleep (followup to #73) #76

Merged

7 tasks

patrick-chinchill and others added 5 commits April 24, 2026 02:48

test(plan): port 19 [post with Plan] tests from upstream

e2fed25

Feature in src/chat_sdk/plan.py is already ported; this closes the test-parity gap in tests/test_thread_faithful.py. Closes #55. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

patrick-chinchill force-pushed the feat/plan-tests-55 branch from 70df558 to b3fe670 Compare April 24, 2026 09:48

patrick-chinchill merged commit b7065ed into main Apr 24, 2026
10 checks passed

patrick-chinchill mentioned this pull request Apr 24, 2026

chore: release 0.4.26.2 #77

Merged

5 tasks

patrick-chinchill added a commit that referenced this pull request Apr 24, 2026

chore: release 0.4.26.2 (#77)

3698ddb

Parity catch-up release for upstream chat@4.26.0. Bundles 8 PRs (#64 #65 #66 #67 #73 #74 #75 #76) + small followup cleanups. See CHANGELOG.md for details.

patrick-chinchill mentioned this pull request Apr 24, 2026

ci(fidelity): ratchet-down baseline + wire into lint.yml #72

Merged

3 tasks

	assert edit_object.await_count >= 1
	assert edit_object.await_count == 1

	assert edit_object.await_count >= 1
	assert edit_object.await_count == 2

Conversation

patrick-chinchill commented Apr 24, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Behavior gaps surfaced (5 tests skipped)

Fidelity impact

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

patrick-chinchill Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

patrick-chinchill Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

patrick-chinchill Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

patrick-chinchill Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

patrick-chinchill Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

patrick-chinchill Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

patrick-chinchill Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

patrick-chinchill commented Apr 24, 2026

Uh oh!

Uh oh!

patrick-chinchill commented Apr 24, 2026

patrick-chinchill commented Apr 24, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 24, 2026 •

edited

Loading