Chinchill-AI · patrick-chinchill · May 8, 2026 · May 9, 2026 · May 9, 2026 · May 9, 2026
@@ -493,7 +493,10 @@ stay explicit instead of being rediscovered in code review.
 | Teams native streaming final-send when first chunk's `id` was empty (DMs) | `_close_stream_session` sends the final `message` activity whenever `text` is non-empty, even if `stream_id` is `None` (Bot Framework REST response returned `{"id": ""}` on the first chunk). The final activity omits `streamId` from `channelData` rather than serializing `None`. | Upstream's `streamViaEmit` awaits the `chunk` event for the first activity's `id`; if Teams returns an empty id, `messageId` becomes `""` and the SDK's auto-close emits the final activity through `IStreamer` regardless | Without the final `message` activity, the Teams client's streaming UI keeps spinning until the platform times the session out — a stuck-loading-state UX failure with no user workaround. We mirror upstream's looser check (text non-empty → ship the final) so the streaming indicator clears even when the Bot Framework REST response surface returns an empty `id`. |
 | `RawMessage.text` override field — **transitional** | New optional `text: str \| None = None` field on `RawMessage` (`src/chat_sdk/types.py`). When set by an adapter, `Thread._handle_stream` MUST prefer it over its own local accumulator when constructing the recorded `SentMessage` body / message-history entry. `None` falls back to the local accumulator (backward-compatible default for adapters that don't need the override). Set by `_stream_via_emit` so the recorded message matches what Teams actually shipped, even when chunks were buffered into the throttle window and cancellation skipped the flush emit. | Upstream's `RawMessage` (`packages/chat/src/types.ts`) is `{ id; raw; threadId }` only. Cancellation-text reconciliation lives inside `@microsoft/teams.apps`'s `IStreamer.emit` (the npm SDK owns the buffer and never surfaces a buffered-but-unsent suffix to `chat`). | Direct consequence of the hand-rolled Teams native streaming row above. Without the override, the SDK's local accumulator (which captures every chunk yielded to the adapter, including chunks coalesced into the throttle window) would diverge from what Teams actually accepted whenever a session is canceled with buffered text pending — recording text the user never saw. Disappears alongside the hand-rolled wire format once we migrate to `microsoft-teams-apps` (Python). Regression coverage: `tests/test_thread_faithful.py::test_should_prefer_raw_message_text_override_over_local_accumulator` (would fail if someone "fixes" Thread.stream back to upstream's local-accumulator-only behavior) and `tests/test_teams_native_streaming.py::test_canceled_stream_sets_raw_message_text_override`. |
 | Teams divider rendering | `card_to_adaptive_card` hoists `separator: True` onto the next sibling (or emits a non-empty Container for a trailing divider) | `convertDividerToElement` emits an empty `Container` with `separator: True` | Upstream shares the same bug: Microsoft Teams renders an empty Container at zero height, so the separator line is effectively invisible. Python port fixes locally (issue #45) rather than blocking on upstream. |
-| `SlackAdapter.current_token` / `current_client` | Public `@property` accessors that return the request-context-bound token and a preconfigured `AsyncWebClient` | Not exposed (`getToken()` is private on the TS `SlackAdapter`) | Python-only addition (issue #47). Downstream code that calls Slack Web APIs from inside a handler — email resolution, user profile fetches, reaction bookkeeping — otherwise depends on underscore-prefixed helpers. |
+| `SlackAdapter.current_token` / `current_token_async` / `current_client` | Public accessors that return the request-context-bound token and a preconfigured `AsyncWebClient`. `current_token` (sync `@property`) reads the cache; `current_token_async` (async method) invokes the resolver on demand for callable `bot_token` configs used outside `handle_webhook`. | Not exposed (`getToken()` is private on the TS `SlackAdapter`) | Python-only addition (issue #47). Downstream code that calls Slack Web APIs from inside a handler — email resolution, user profile fetches, reaction bookkeeping — otherwise depends on underscore-prefixed helpers. The async variant is required because the sync `current_token` cannot drive an async resolver (see `bot_token` resolver invocation site row). |
+| `SlackAdapterConfig.webhook_verifier` | Optional `Callable[[request, body], bool \| str \| None \| Awaitable[...]]` that fully replaces signing-secret HMAC verification. Lets callers integrate platform-managed verification (e.g. Slack Enterprise Grid edge proxies, KMS-signed payloads, test harness escape hatches). Mutually exclusive with `signing_secret` — explicit `signing_secret` wins when both are passed. | Not exposed — TS only honors `SLACK_SIGNING_SECRET` HMAC verification (`signRequest` / `verifyRequestSignature`) | Python-only extension. The contract is documented as a SECURITY surface in `slack/types.py` (`SlackWebhookVerifier`): returning truthy passes the request, falsy/None rejects 401, and a `str` substitutes the request body before dispatch. Closes the same gap that drives `bot_token` resolvers (custom secret-manager integration) for the signing-secret half of the auth model. |
+| Slack `bot_token` resolver invocation site | Resolved once at `handle_webhook` entry into a per-request `ContextVar`; sync `_get_token` reads it for the rest of the request. Public adapter methods (`post_message`, `add_reaction`, `upload_files`, etc.) DON'T re-resolve — calling them outside `handle_webhook` (cron jobs, background tasks) with a callable `bot_token` raises `AuthenticationError` until the caller awaits `current_token_async()` first | TS `getToken` is async and resolves on EVERY API call site, so cron/background usage just works | Python keeps `_get_token` sync to preserve the existing pre-resolver public API and to avoid threading `await` through every adapter call site. The trade-off is that callable-`bot_token` usage outside the webhook flow needs an explicit `await adapter.current_token_async()` (or `await adapter._resolve_default_token()`) before the first sync-token-consuming call. Static-string `bot_token` is unaffected (cache primed at construction). |
+| Slack `bot_token` resolver caching scope | Single resolution per request, cached in `_resolved_default_token` `ContextVar` for the rest of that request | Provider invoked on every API call within a single request | Within-request caching enables the sync `_get_token` path. Functionally equivalent for rotation (TTL >> request lifetime); diverges only if the resolver is itself sensitive to per-call freshness (rare). |
 | `ConcurrencyConfig.max_concurrent` | Enforced via `asyncio.Semaphore` in the `"concurrent"` strategy path; rejects non-integer or `<= 0` values, and rejects any non-`None` `max_concurrent` paired with a non-`"concurrent"` strategy | Accepted into the config type with docstring "Default: Infinity" but never read (3 writes, 0 reads) | Silent correctness bug upstream — consumers setting `max_concurrent=N` with `strategy="concurrent"` reasonably expect an N-way bound on in-flight handlers. We honor the documented contract via a semaphore and fail-fast on misconfiguration so it's never silent. `max_concurrent=None` stays compatible with every strategy (unbounded default). |
 | `ConcurrencyConfig.max_concurrent` slot scope | **Single global `asyncio.Semaphore`** — caps total in-flight handlers across all threads to `max_concurrent` | **Per-thread slot map** — `acquireConcurrentSlot(threadId, maxConcurrent)` keys the in-flight counter by `threadId`, so each thread has its own N-way bound | When upstream caught up (vercel/chat#419) it implemented per-thread slots; the Python port shipped earlier with a global semaphore and the slot-scope distinction wasn't visible in the original divergence row. Result: a deployment with `max_concurrent=2` and 100 active threads serializes everything globally on Python (peak in-flight = 2 across all threads) but allows 200 concurrent handlers on TS (2 per thread × 100). The `chat.test.ts > should track slots per thread independently` fidelity entry is `pytest.mark.skip`-ped in `tests/test_chat_faithful.py` until the implementation is restructured to a `dict[thread_id, asyncio.Semaphore]` (with cleanup-on-empty to avoid unbounded growth). Tracked as a follow-up. |
 | Redis lock token format | `{token_prefix}_{ms}_{secrets.token_hex(16)}` — always 32 hex chars, CSPRNG-sourced | `ioredis_${Date.now()}_${Math.random().toString(36).substring(2, 15)}` — base36, ≤13 chars, **not** CSPRNG | Interop via `IoRedisStateAdapter(token_prefix="ioredis")` still works for lock-release (release/extend compare by full-string equality, and each runtime only releases what it issued), but the token byte-shape diverges. Intentional — CSPRNG should not be regressed to `Math.random()` for cosmetic byte-for-byte compatibility. |