docs: Add comprehensive project documentation#11
Conversation
Add five documentation files covering architecture, upstream sync procedures, security model, testing strategy, and design decisions. Update README.md with a Documentation section linking to all docs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request introduces comprehensive internal documentation for the chat-sdk-python project, covering architecture, design decisions, security protocols, testing strategies, and upstream synchronization with the Vercel Chat TS SDK. Feedback focused on improving technical accuracy regarding the purpose of from __future__ import annotations, clarifying PostgreSQL locking mechanisms in the architecture guide, and refining the regular expression used for camelCase key validation in the testing documentation.
| |---------|---------------|-----------|-----------------| | ||
| | Memory | In-process dict with expiry | Single-process only | No (dev/test) | | ||
| | Redis | `SET NX PX` + Lua scripts | Atomic via Lua | Yes | | ||
| | PostgreSQL | `INSERT ... ON CONFLICT` + row locks | Atomic via transactions | Yes | |
There was a problem hiding this comment.
The description for the PostgreSQL lock mechanism in the table is a bit ambiguous. It mentions INSERT ... ON CONFLICT, which is generally used for upserts (like implementing set_if_not_exists), alongside "row locks". The explanatory text on line 282 correctly clarifies that SELECT FOR UPDATE SKIP LOCKED is used for lock acquisition. To improve clarity and focus on the locking aspect, consider updating the table to reflect the lock acquisition strategy more directly.
| | PostgreSQL | `INSERT ... ON CONFLICT` + row locks | Atomic via transactions | Yes | | |
| | PostgreSQL | Row-level locks (`SELECT FOR UPDATE`) | Atomic via transactions | Yes | |
| 1. **PEP 604 syntax on Python 3.10**: The SDK uses `X | Y` union syntax (e.g., `str | None`) and `list[str]` lowercase generics. Without the future import, these require Python 3.10+. With it, they work on Python 3.7+. Since the SDK targets Python 3.11+, this is belt-and-suspenders. | ||
|
|
||
| 2. **Forward reference resolution**: Annotations are stored as strings and resolved lazily. This eliminates circular reference issues in type hints (e.g., `Thread` referencing `Channel` and vice versa). |
There was a problem hiding this comment.
The rationale in point 1 is slightly inaccurate. from __future__ import annotations does not backport X | Y syntax to older Python versions. Its main benefit is lazy evaluation of annotations, which solves the forward reference problem mentioned in point 2. To avoid confusion, I suggest merging these points to focus on the forward reference resolution benefit.
| 1. **PEP 604 syntax on Python 3.10**: The SDK uses `X | Y` union syntax (e.g., `str | None`) and `list[str]` lowercase generics. Without the future import, these require Python 3.10+. With it, they work on Python 3.7+. Since the SDK targets Python 3.11+, this is belt-and-suspenders. | |
| 2. **Forward reference resolution**: Annotations are stored as strings and resolved lazily. This eliminates circular reference issues in type hints (e.g., `Thread` referencing `Channel` and vice versa). | |
| 1. **Forward reference resolution**: The primary benefit of this import is that type annotations are stored as strings and evaluated lazily. This completely eliminates circular reference issues in type hints (e.g., `Thread` referencing `Channel` and vice versa), which is a common problem in a project with interconnected types. While the SDK targets Python 3.11+ and can use modern syntax like `X | Y` natively, this import is still crucial for managing complex type dependencies. |
| ### Helper Function | ||
|
|
||
| ```python | ||
| CAMEL_RE = re.compile(r"^[a-z]+[A-Z]") |
There was a problem hiding this comment.
The documented regular expression for detecting camelCase keys, ^[a-z]+[A-Z], is a bit too simple and could lead to both false negatives and false positives. For example, it would not match a key like aLongerCamelCaseKey, but it would incorrectly match the start of aKey_with_snake_case. A more robust pattern like ^[a-z]+[A-Z][a-zA-Z0-9]*$ combined with re.fullmatch in the test implementation would be more accurate. Consider updating the documentation to reflect a more robust approach.
| CAMEL_RE = re.compile(r"^[a-z]+[A-Z]") | |
| CAMEL_RE = re.compile(r"^[a-z]+[A-Z][a-zA-Z0-9]*$") |
|
|
||
| - `shared/errors.py`: Typed adapter error hierarchy (`AdapterRateLimitError`, `AuthenticationError`, `ValidationError`, `NetworkError`, `ResourceNotFoundError`, `PermissionError`). TS throws plain `Error` objects. | ||
| - `testing/__init__.py` + `shared/mock_adapter.py`: Test utilities with `MockAdapter`, `MockStateAdapter`, `create_test_message()`. | ||
| - `from __future__ import annotations` everywhere: Enables PEP 604 union syntax (`X | Y`) on Python 3.10. |
There was a problem hiding this comment.
This description is slightly misleading. from __future__ import annotations doesn't "enable" PEP 604 union syntax on Python 3.10 (it's already available there), nor does it backport it to older versions. Its main purpose is to make all annotations behave as forward references by storing them as strings, which is what solves circular import problems with type hints.
| - `from __future__ import annotations` everywhere: Enables PEP 604 union syntax (`X | Y`) on Python 3.10. | |
| - `from __future__ import annotations` everywhere: Solves circular dependency issues in type hints by treating them as forward references. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a94deec2f5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| - **Fallback**: If `use_application_default_credentials` is enabled, the adapter trusts Google Cloud's internal authentication. | ||
|
|
There was a problem hiding this comment.
Correct Google Chat webhook verification guidance
This section says ADC mode is a safe fallback for webhook auth, but GoogleChatAdapter.handle_webhook only verifies incoming JWTs when google_chat_project_number/pubsub_audience is configured; otherwise it logs a warning and continues processing (src/chat_sdk/adapters/google_chat/adapter.py, around lines 747-770). Documenting ADC as an authentication fallback can lead operators to deploy an unauthenticated Google Chat webhook endpoint.
Useful? React with 👍 / 👎.
|
|
||
| 2. **Thread ID format**: `{adapter}:{platform_id}` (e.g., `slack:C123:1234567890.123456`). State keys depend on this format. Changing it would break cross-language state sharing in deployments that mix TS and Python bots. | ||
|
|
||
| 3. **Lock token format**: `{backend}_{timestamp}_{random}`. The token must be a cryptographically random string for security. The format itself is not critical for interop, but the lock key format (`dedupe:{adapter}:{message_id}`) must match. |
There was a problem hiding this comment.
Align lock token format claim with actual backend behavior
The guide marks {backend}_{timestamp}_{random} as an architecture decision that must stay 1:1, but the Postgres backend currently generates lock tokens as pg_{uuid4} (src/chat_sdk/state/postgres.py::_generate_token). Keeping this incorrect “must match” claim in sync docs can misdirect maintainers into unnecessary token-format rewrites and parity work that does not reflect the current implementation.
Useful? React with 👍 / 👎.
Hardens the socket-mode port against the seven review findings on PR #86: * Add ``connect_timeout_s`` config (default 30s) and wrap the initial socket-mode handshake in ``asyncio.wait_for`` so a hung ``SocketModeClient.connect()`` can't make ``initialize()`` block forever (hazard #11). On timeout we tear the loop down before raising. * Reject forwarded socket events whose ``timestamp`` field is outside the same 5-minute window ``_verify_signature`` enforces, so a captured forwarded payload can't be replayed indefinitely (hazard #12). * Narrow the ``contextlib.suppress`` in ``stop_socket_mode`` to ``CancelledError`` only — surprising loop crashes are no longer silently swallowed during shutdown. * Replace ``asyncio.get_event_loop().create_task`` in the socket-mode ``wrap_async`` helper with ``get_running_loop`` (Python 3.12+ compatibility, hazard #5). * Have the socket-mode interactive branch ack with ``response_action: errors`` instead of an empty ack when dispatch raises — an empty ack on ``view_submission`` silently closes the modal so the user sees no signal anything went wrong. * Drop ``is_ext_shared_channel`` from the synthesized ``event_callback`` payload in the socket-mode events_api branch so socket and webhook paths feed identical shapes into ``_process_event_payload`` (hazard #7). * Add a regression test firing two concurrent ``_route_socket_event`` events_api dispatches for different teams via ``asyncio.gather`` to pin down the existing ``copy_context()`` isolation against future drift (hazard #6). Also fixes a pre-existing B010 lint warning in the test stub. https://claude.ai/code/session_01FyMxQn2BEAzmwKS1GZczKj
Summary
docs/ARCHITECTURE.mddocumenting module dependency graph, adapter protocol, card system, concurrency strategies (drop/queue/debounce/concurrent), state backends (memory/redis/postgres), markdown pipeline (parse/AST/walk/stringify), and streaming pipeline (chunks/StreamingMarkdownRenderer/safe commit points)docs/UPSTREAM_SYNC.mddocumenting how to keep the Python port in sync with the Vercel Chat TS SDK, including module mapping, what to port vs adapt, architecture decisions that must stay 1:1, and a comprehensive list of 9 known TS-to-Python footguns with before/after code examplesdocs/SECURITY.mddocumenting webhook verification per platform (8 adapters), SSRF protections, AES-256-GCM crypto for Slack token encryption, lock token generation, known limitations (Telegram silent skip, JWKS rotation window), and a production deployment audit checklistdocs/TESTING.mddocumenting test categories (unit/integration/replay/dispatch validation), how to run tests, how to add tests for a new adapter, known coverage gaps (4 modules under 60%), and the dispatch key validation pattern that catches camelCase key bugsdocs/DECISIONS.mddocumenting rationale for hand-rolled markdown parser, no Bot Framework SDK for Teams, PascalCase builders, global Chat singleton, zero runtime dependencies,from __future__ import annotations, and BaseAdapter with ChatNotImplementedError defaultsREADME.mdwith a Documentation section linking to all docsTest plan
uv run pytest tests/)🤖 Generated with Claude Code