fix(markdown): chat-scoped completeness — escaped chars, task lists, math/escape in `_remend` by patrick-chinchill · Pull Request #101 · Chinchill-AI/chat-sdk-python

patrick-chinchill · 2026-05-22T05:29:54Z

Summary

Chat-scoped completeness fixes for issue #69, targeting the gap subset that actually surfaces in LLM/agent output rather than chasing full CommonMark spec compliance.

Sequenced after #99 (depends on _is_excluded_asterisk from that PR). Targets main for the 4.27 cycle — same release, separate PR for review clarity.

Scope (chat-realistic gaps from the issue #69 follow-up)

Parser side (`markdown_parser.py`)

Escaped characters — inline regexes get (?<!\\) lookbehinds so \*, \[, \_, \~~ stay literal; new _unescape_punct resolves the escape at text-leaf emission. Image alt text also unescapes. Inline-code contents deliberately preserve backslashes (CommonMark: code spans don't unescape).
GFM task list checkboxes — - [ ] / - [x] / - [X] parse to listItem with checked: bool. Ordered lists with 1. [x] content are NOT coerced into tasks (task lists are bullet-only per GFM).

Streaming side (`streaming_markdown.py::_remend`)

Math regions — new _strip_math_regions drops $...$ (inline) and $$...$$ (display) before the inline-marker counters, so $a^* + b^*$ no longer accumulates phantom italic closers.
Escape-aware counters — new _strip_escape_sequences removes \X pairs from the counting buffer so the tilde/backtick/bracket counters in _remend no longer fire on escaped delimiters. The bracket walk's existing \-skip is preserved.

What's deliberately NOT here

Word-internal asterisks (5*3=15) — prototyped, reverted. Excluding word-internal * also breaks text*foo* (legitimate paired italic whose *s are also word-flanked). Distinguishing the two needs a CommonMark delimiter-stack algorithm. Documented as the next Option A item in _is_excluded_asterisk's docstring.
Setext headings, footnotes, raw HTML, indented code blocks, definition lists — analyzed in the parser-swap spike (PR spike(parser): bake-off harness for issue #69 Option B #100). All silently dropped by the baseline and by every candidate without plugins. Not surfaced by LLM/chat output in practice (analysis).
Multi-backtick code spans — deferred. Less common than the items above and would require a non-trivial inline-parser change.

Test plan

26 new tests across 3 classes:

TestEscapedCharacters (8 tests): asterisk, underscore, bracket, tilde, backslash literal, mixed real/escaped emphasis, image-alt unescape, inline-code-preserves-backslash.
TestTaskListItems (7 tests): unchecked, checked, uppercase X, plain item with no attr, mixed list state, marker stripped from content, ordered list not task-coerced.
TestRemendChatCompleteness (11 tests): inline + display math regions, math doesn't break legitimate emphasis outside, math drops strike markers and bracket walk, escape-aware tilde/bracket/backtick counters, mixed escaped+real, escape neutrality.

Validation:

uv run pytest tests/ — 3,621 pass, 1 pre-existing failure (test_github_webhook.py::TestGitHubAdapterConstructor::test_throws_when_no_auth, unrelated — same as sync: upstream v4.27.0 (alpha 1) #83/fix(streaming-markdown): list-marker awareness + table chunk-boundary (issue #69) #99/spike(parser): bake-off harness for issue #69 Option B #100)
uv run ruff check src/ tests/ scripts/ — passes
uv run ruff format --check src/ tests/ scripts/ — 193 files already formatted
uv run python scripts/audit_test_quality.py — 0 hard failures
No existing tests modified

Scope notes

Dependency on fix(streaming-markdown): list-marker awareness + table chunk-boundary (issue #69) #99: this branch builds on claude/wizardly-heisenberg-VLInV and extends _is_excluded_asterisk added there. Should merge after fix(streaming-markdown): list-marker awareness + table chunk-boundary (issue #69) #99.
No CHANGELOG entry — same convention as the in-flight 4.27 port PRs (feat(externalselect): add initialOption + option_groups (vercel/chat#410, #397) #84-feat(chat): add chat.get_user() for cross-platform user lookups (vercel/chat#391) #90 and fix(streaming-markdown): list-marker awareness + table chunk-boundary (issue #69) #99); the consolidating sync PR (sync: upstream v4.27.0 (alpha 1) #83) owns release notes.
Doesn't close Revisit hand-rolled markdown parser: close setext / footnotes / HTML / indented-code gaps #69: the broader parser-strategy question is now informed by PR spike(parser): bake-off harness for issue #69 Option B #100's spike data plus this surgical fix. The remaining gaps (delimiter-stack for word-internal *, multi-backtick spans) can land in a follow-up if they ever materialize as user-visible bugs.

Related: #99 (the original three streaming bugs), #100 (the parser-swap spike that reframed the strategy decision).

Generated by Claude Code

Summary by CodeRabbit

New Features
- Support for GitHub-flavored task lists (checkbox state preserved, including empty items)
Bug Fixes
- Better handling of backslash-escaped punctuation and inline math so formatting is preserved
- Improved nested-list and progressive table rendering to avoid misplaced markers and premature header emission
- More faithful parse/stringify roundtrips to retain original markdown formatting

Three production Slack streaming bugs from issue #69 comment: 1. `_remend("* item one\n")` appended a stray `*`. `_close_emphasis` counted the line-leading bullet as an italic opener. Now skips single-`*` runs preceded by line-leading whitespace and followed by a space/tab (list-marker shape), matching upstream `remend`'s awareness of list markers. 2. `StreamingMarkdownRenderer.finish()` on an odd-count bullet list produced `...item three\n*` — same root cause as #1; the fix covers both. 3. `_get_committable_prefix` released a confirmed table the moment a separator arrived, even with zero body rows. Slack `chat.appendStream` saw header+separator alone as broken syntax and then bare body rows in subsequent appends, never reconnecting them as a single table. Now the header+separator block is held until at least one body row arrives, then released atomically. The original tests pinned the buggy "commit on separator" behavior; updated to require a body row before committing the table block, and added a TestIssue69Regressions class with the exact reproducers from the issue comment plus invariants for the new contract. Closes part of #69 (the streaming bugs). The broader hand-rolled parser question (Options A/B/C in the issue) remains open.

The previous `_is_list_marker` skipped only line-leading single `*` followed by horizontal whitespace. Tighten to match upstream remend's `shouldSkipAsterisk` (packages/remend/src/emphasis-handlers.ts): exclude any single `*` flanked by whitespace (or text boundary) on both sides, which is what CommonMark says isn't a valid emphasis delimiter anyway. The line-leading bullet case still falls out naturally. Picks up three additional cases the previous narrower check missed: - `text * more` -- whitespace-flanked mid-line - `trailing *\n` -- asterisk at end of line - `partial *` -- asterisk at end of buffer Same `_remend` over-counting failure mode as the original issue #69 bugs, just different surface forms. Renamed the helper to `_is_excluded_asterisk` to reflect the broader scope. Remaining remend divergences (word-internal asterisks, math-block contents, escaped sequences, multi-backtick spans, setext headings, indented code, raw HTML, footnotes) are tracked on issue #69 for the parser-strategy discussion -- out of scope for this PR.

…ath/escape in remend) Pulls in the chat-realistic subset of the issue #69 gap list, scoped to what LLMs actually emit in agent/chat contexts. Builds on #99's _is_excluded_asterisk helper rather than restating it. Parser side (markdown_parser.py): - Inline regexes get `(?<!\\)` lookbehinds so escaped openers (`\*`, `\[`, `\_`, `\~~`) stay literal. Bracket content tolerates inner `\]` / `\[` via `(?:[^\]\\]|\\.)+?`. - New `_unescape_punct` runs at text-leaf emission and image-alt emission, resolving `\X` to `X` for ASCII-punctuation X. Inline-code values are deliberately not unescaped (CommonMark: backticks treat backslashes literally). - `_collect_list_items` detects the GFM task-marker prefix (`[ ]` or `[x]/[X]`) on unordered items and stores `checked: bool` on the resulting listItem. Plain items have no `checked` attr. - `make_list_item` extended with optional `checked` kwarg. Streaming side (streaming_markdown.py): - New `_strip_math_regions` pass drops `$...$` (inline) and `$$...$$` (display) before the inline-marker counters. Markers inside math are literal, not delimiters. - New `_strip_escape_sequences` pass removes `\X` pairs from the counting buffer so the tilde/backtick/bracket counters don't fire on escaped delimiters. - Both passes run after `_strip_fenced_code` in `_remend`. Word-internal asterisk exclusion was prototyped (closes `5*3=15`) but reverted: it also breaks paired emphasis like `text*foo*` whose two asterisks are both word-flanked. Distinguishing those needs a proper CommonMark delimiter-stack -- documented as the next Option A item in the helper's docstring. 20 new tests (15 parser + 11 streaming): escape behaviors on every delimiter family, image alt unescape, inline-code preservation, checked/unchecked/uppercase-X task items, mixed task lists, ordered lists are NOT task-coerced, math regions don't pollute counters, escape-aware counters, composition cases. No existing tests changed. 3,621 pass on the full suite; the lone pre-existing GitHub webhook failure is unrelated (see #83 test plan).

coderabbitai · 2026-05-22T05:30:02Z

Warning

Review limit reached

@patrick-chinchill, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 25 minutes and 9 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 72f311b6-8aeb-48eb-acc6-a4acb298d97a

📥 Commits

Reviewing files that changed from the base of the PR and between 03033b4 and 4fda9b4.

📒 Files selected for processing (4)

src/chat_sdk/shared/markdown_parser.py
src/chat_sdk/shared/streaming_markdown.py
tests/test_markdown_faithful.py
tests/test_streaming_markdown.py

📝 Walkthrough

Walkthrough

This PR implements GFM task-list support, escape-aware inline parsing/stringification for faithful round-trips, and streaming delimiter repair (escape & TeX-math awareness) plus table-holding changes so append-only streaming emits header+separator+first-row atomically.

Changes

Markdown Fidelity: Task Lists, Escaped Characters, and Streaming Fixes

Layer / File(s)	Summary
Task list AST and parsing support `src/chat_sdk/shared/markdown_parser.py`	`make_list_item` accepts optional `checked`; unordered list collection recognizes `- [ ]`/`- [x]` and sets `listItem.checked`, preserving empty task items.
Task list stringification and BaseFormat rendering `src/chat_sdk/shared/markdown_parser.py`, `src/chat_sdk/shared/base_format_converter.py`	Stringifier emits `[x]`/`[ ]` for unordered items, preserves empty task items, handles nested-list-before-text by emitting parent prefix+marker, and `_render_list` preserves checkbox markers in formatted output.
Escape-protected inline parsing `src/chat_sdk/shared/markdown_parser.py`	Adds sentinel-based escape protection/restoration, escape-aware inline regexes, and consistent restoration of escaped captures for links/images/inlineCode; hard breaks emitted consistently while preserving escapes.
Stringify escaping and paragraph safety `src/chat_sdk/shared/markdown_parser.py`	Adds `_escape_text_leaf` and `_escape_paragraph_block_markers`, escapes image alt text, and ensures parse→stringify→parse roundtrips preserve escaped punctuation and block-start markers.
Streaming delimiter repair (escape & math aware) `src/chat_sdk/shared/streaming_markdown.py`	Preprocesses text by removing backslash-escape sequences and TeX math regions before delimiter counting; excludes whitespace-flanked single `*` runs from emphasis tallying; updates `_close_emphasis` and `_remend`.
Table append-only holding behavior `src/chat_sdk/shared/streaming_markdown.py`	`_get_committable_prefix` records separator position and holds header+separator until the first body row arrives, emitting header+separator+first-row atomically for append-only consumers.
Streaming markdown tests and regressions `tests/test_streaming_markdown.py`	Updates tests to expect separator-only pushes to be held (empty delta) and first-body-row to release the assembled table; adds regressions for issue `#69` covering `_remend` and table chunking.
Escaped characters & task-list tests `tests/test_markdown_faithful.py`	Adds `TestEscapedCharacters` and `TestTaskListItems`; extends `TestStringifyMarkdown` with roundtrip assertions for escaped punctuation, block-marker safety, task-list state, and empty task items.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

A rabbit hops through markdown lands,
Where checkboxes march and escapes stand—
Tasks now kept whole, math and stars tamed,
Streams held steady, and text un-maimed;
Fidelity prances, tests applauded, 🐇✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 23.91% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title clearly summarizes the primary changes: fixing markdown handling for escaped characters, task lists, and math/escape sequences in the _remend function, which directly aligns with the main objectives.
Linked Issues check	✅ Passed	The PR successfully implements Option C from issue `#69` by delivering focused fixes for escaped characters and GFM task lists with comprehensive tests, while deferring lower-ROI gaps as recommended.
Out of Scope Changes check	✅ Passed	All changes are scoped to the documented objectives: escaped character handling, task list parsing/rendering, and math/escape-aware streaming fixes; no unrelated modifications detected.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist

Code Review

This pull request introduces support for GFM task lists and backslash-escaped punctuation in the markdown parser. It also improves the streaming renderer's accuracy by ignoring escaped characters and math regions when calculating unclosed delimiters, and by ensuring tables are only emitted once a body row is present. Feedback was provided regarding the inline math regex, which is currently too broad and may incorrectly match currency patterns, potentially leading to incorrect delimiter closing in the streaming output.

gemini-code-assist · 2026-05-22T05:32:23Z

+    # Display math: $$...$$ may span multiple lines.
+    text = re.sub(r"\$\$.*?\$\$", "", text, flags=re.DOTALL)
+    # Inline math: $...$ within a single line, non-greedy, no embedded $.
+    text = re.sub(r"\$[^\$\n]+\$", "", text)


The regex for inline math is too broad and will cause false positives with common text patterns like currency (e.g., "prices are $5 and $10"). This can lead to incorrect marker counting in _remend and potentially corrupt the output by appending unnecessary closing markers.

Following standard markdown math heuristics (like those in Pandoc or MathJax), the opening $ should be followed by a non-whitespace character, and the closing $ should be preceded by a non-whitespace character. It should also account for escaped dollar signs.

Suggested change

text = re.sub(r"\$[^\$\n]+\$", "", text)

text = re.sub(r"(?<!\\)\$(?!\s)(?:\\.|[^\$\n\\])+?(?<!\s)\$", "", text)

The spike concluded with Option A (PRs #99 + #101) being the right call for the SDK's current chat-only scope. Documenting the concrete triggers under which that conclusion should be re-examined: - Non-chat input source (RAG, user-authored memory, markdown-file ingestion, GitHub-body parsing -- anywhere humans write the input) - Long-form artifact output surface (citations + footnotes; math rendering; one-shot parse where latency is less critical) - Web rendering surface (port of upstream `@chat-adapter/web`) - New chat platform demanding richer parsing Plus a playbook for re-running the bake-off (author fixture, run parity + benchmark harness, compare silent-drop count / parse time) and a decision threshold for promoting markdown-it-py into runtime if a trigger materialises. Spot-checked upstream's `packages/` at the time of writing: only `adapter-web` is the relevant new package (added in v4.27, Python port deferred); no artifact / RAG / ingestion packages exist upstream. All triggers are forward-looking.

Independent review surfaced three actionable items: 1. **Math/escape ordering** (review #1) -- `_strip_math_regions` ran BEFORE `_strip_escape_sequences`, so input like `\$opener *unclosed text closer\$` had its two escaped `$`s paired as math, eating the `*` opener inside. Reordered: escape strip first, then math strip. Verified the original bug repro now correctly produces a closing `*`. 2. **Empty task items** (review #5) -- `r"^\[([ xX])\]\s+(.*)"` required at least one whitespace AFTER `]`, so `- [ ]` with no trailing content silently fell through to plain text. Loosened to `r"^\[([ xX])\](?:\s+(.*))?$"` -- trailing whitespace+content is now optional. `- [ ]` and `- [x]` (no trailing) both produce `listItem(checked=False/True, children=[])`. 3. **Strikethrough test strengthened** (review #7) -- the existing `test_escaped_tilde_does_not_form_strikethrough` only asserted the absence of a `delete` node. A buggy impl that dropped the tildes entirely would have passed. Added a content-shape assertion that `~~not strike~~` appears in the text leaves. Reviewer's #3 (the `r"a *b\* * c"` case) was investigated and determined NOT to be a bug -- the trailing `*` between two spaces is not a valid CommonMark closer (whitespace-flanked), so italic correctly stays open. Pre-fix behavior happened to balance by ignoring flanking; post-fix is CommonMark-correct. Pre-existing edge cases #4 (`\\*text*`) and #6 (link-text unescape implicit) are documented in the existing code; not blockers. Tests added: - `test_empty_unchecked_task_item_no_content` and `_checked_` variant - `test_remend_escaped_dollar_does_not_pair_with_unescaped_dollar` (the exact #1 repro) - `test_remend_escaped_dollar_does_not_create_phantom_math_region` - Strengthened `test_escaped_tilde_does_not_form_strikethrough` 3,625 pass / 1 pre-existing failure unrelated.

Downstream-impact subagent found two blockers in PR #101's new AST semantics: `stringify_markdown` and `_render_list` were not updated to emit the new state, causing data loss when adapters round-trip through the AST. Blocker 1: `listItem.checked` silently dropped. parse_markdown("- [x] done") → stringify_markdown → "* done\n" Affects every adapter that goes markdown → AST → stringify (linear/github/telegram/whatsapp + the generic BaseFormatConverter.to_markdown). On GitHub and Linear especially (native task-list rendering), this is observable data loss. Blocker 2: `_unescape_punct` + naive stringify produces re-parse drift. parse_markdown(r"Use \*literal\*") → text "Use *literal*" → stringify → "Use *literal*\n" → reparse → emphasis(literal) Telegram MarkdownV2 / WhatsApp / Slack image alt / Google Chat all render the un-escaped `*literal*` as bold/italic in the user's view. Fixes in `markdown_parser.py`: - New `_escape_text_leaf` re-escapes the seven delimiter chars (`* _ ~ \` [ ] \\`) on text-leaf emission. Inverse of `_unescape_punct`. Conservative: doesn't over-escape ordinary punct like `.` / `,` / `!`. - `_stringify_node("text")` calls `_escape_text_leaf` on the value. - `_stringify_node("image")` calls it on the `alt` attribute (alt is stored as a raw string, doesn't go through inline recursion). - `_stringify_node("list")` emits `[x] ` / `[ ] ` prefix when `item.checked` is set (unordered items only -- task lists are a bullet-only GFM convention). - Empty task items (`- [ ]` with no children) still get a line so the checkbox isn't lost on round-trip. Fixes in `base_format_converter.py`: - `_render_list` emits the same `[x] ` / `[ ] ` prefix so platform output (Slack mrkdwn, Discord, Teams, gchat, etc.) keeps the checkbox. Most platforms render the literal `[x]` as text; GitHub and Linear render it natively. Tests added (TestStringifyMarkdown): - `test_roundtrip_preserves_escaped_punctuation_in_text` (5 inputs: asterisk, bracket, backtick, tilde, backslash) - `test_roundtrip_preserves_task_list_state` (checked, unchecked, mixed) - `test_roundtrip_preserves_empty_task_item` - `test_stringify_does_not_over_escape_ordinary_text` 3,629 pass / 1 pre-existing unrelated failure.

Round-trip verification subagent surfaced a residual gap in the stringify path: while inline delimiter chars (`*`, `_`, `~`, `\``, `[`, `]`, `\`) are now re-escaped on text-leaf emission, block-level markers at paragraph start (`#`, `>`, `-`, `+`, ordered-list `digits.`, thematic-break `---`) weren't. Result: AST drift on roundtrip for inputs like `\# not heading` -> paragraph -> "# not heading" -> heading. The gap is narrow but real: it only fires when the very first text leaf of a paragraph starts with a block-marker character, which happens when a user typed an escape sequence to disambiguate from a block construct. Five cases drifted (heading / blockquote / unordered list / ordered list / thematic break). The sixth (`*`) was already covered by `_escape_text_leaf` since `*` is in the inline-delimiter set. Fix in `markdown_parser.py`: - New `_escape_paragraph_block_markers` helper applied to the joined output of `paragraph` nodes only. Mid-text occurrences of `#` / `>` / `-` / `1.` are safe and left alone (block markers only fire at line start). - Heading / blockquote / unordered-list / thematic-break: prepend `\`. - Ordered list: insert `\` between the digit run and the marker char (`.` or `)`), since digits aren't in CommonMark's escapable-punct set so `\1` wouldn't round-trip via `_unescape_punct`. Tests added (TestStringifyMarkdown): - `test_roundtrip_preserves_escaped_block_markers_at_paragraph_start` (8 inputs: all six block-marker families plus already-handled `*` and `---`) - `test_stringify_does_not_over_escape_mid_text_block_markers` (4 inputs verifying `\` doesn't appear when block-marker chars occur mid-text) 3,631 pass / 1 pre-existing unrelated failure.

Fourth-round review subagent found that the previous fix only checked position 0 of the joined paragraph text, but paragraph text leaves can contain embedded `\n` (from `_unescape_punct` resolving an escaped marker after a soft line break, or from joining text leaves around a hard break). Block markers at the start of line 2+ of a paragraph silently re-form constructs on re-parse. Concrete drift cases the previous fix missed: - `"Here is a paragraph\n\\# 1 is not a heading"` -> paragraph -> stringify -> "Here is a paragraph\n# 1 is not a heading\n" -> re-parse -> paragraph + heading. - Same for `\>`, `\-`, `\+`, ordered-list `1\.`, thematic-break `\---` appearing on line 2+ of a multi-line paragraph. Fix splits the joined paragraph text on `\n` and applies the single-line escape per line, then rejoins. Single-line paragraphs (the common case) take a no-`\n` fast path. Adds a one-test `test_roundtrip_preserves_escaped_block_markers_after_soft_break` pinning all five drift cases. 3,632 pass / 1 pre-existing unrelated failure.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e67aa14f53

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-22T22:04:03Z

+                continue
            for ci, child in enumerate(item_children):
                child_type = child.get("type")
                if child_type == "list":


Preserve task parent when first child is a nested list

In list stringification, nested list children are emitted without first writing the parent item prefix, so a task like - [ ] followed by an indented sublist (e.g. - [ ]\n - child) loses the parent item entirely and re-parses as plain text (" * child"). This is a new regression from task-list parsing because empty task items now legitimately produce listItem.checked with only nested list children, and the current branch drops both the checkbox state and list structure.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-22T22:04:03Z

+            # Empty task item -- still emit a line so the marker isn't lost.
+            if is_first_content and task_marker:
+                lines.append(f"{indent}{prefix} {task_marker}".rstrip())


Emit task checkbox before nested list content

When a checked/unchecked item has no text node and only nested list children, is_first_content stays true and this fallback appends the checkbox line after rendering nested children, producing reversed output order (for Slack/Discord/Teams/Google Chat converters) such as nested bullets first and - [ ] last. That corrupts the visible hierarchy for checklist parents that only group subtasks.

Useful? React with 👍 / 👎.

PR #99 review (chatgpt-codex P1): in `_get_committable_prefix`, the "hold pre-separator block" backward walk uses `TABLE_ROW_RE` to extend `table_start`, but separator lines also match that pattern. For a stream like `|A|B|\n|---|---|\n|1|2|\n|C|D|\n|---|---|\n` (two tables with no blank line between them), the walk crosses the first table's separator and collapses `_get_committable_prefix` back to "", violating the monotonic append-only contract of `get_committable_text()`. Adapters that compute deltas from prior committed length then emit incorrect/no deltas. Fix: the backward walk now stops at empty lines (existing), non-row content (existing), AND at prior separators (new). When it hits a prior separator -- meaning the candidate "new header" row above it was already committed as a body row of the prior table -- the function falls back to `return text` instead of holding. That emits one "stray separator" on the rollback delta, which is broken markup but the lesser evil compared to non-monotonic rollback. The fix preserves the well-formed multi-table case where the second table is separated from the first by a blank line; the empty-line break in the walk fires before reaching the prior separator. Tests added: - `test_back_to_back_tables_keep_committable_monotonic` (the exact reviewer repro) - `test_second_table_after_blank_line_still_holds_header` (verifies the fix doesn't regress the blank-line-separated case) The outdated comment from gemini-code-assist on `_is_excluded_asterisk` suggesting we also include `\n` and end-of-string in the list-marker exclusion is already addressed in the second commit on this branch, which broadened the helper to match remend's `shouldSkipAsterisk` exactly. 79 streaming tests pass / 3598 total / 1 pre-existing unrelated failure.

Three findings actioned: 1. **Math regex too broad** (gemini-code-assist, high). The inline-math pattern `\$[^\$\n]+\$` matched plain currency text like `prices are $5 and $10`, eating the `$5 and $10` portion and corrupting downstream inline-marker counts. Tightened to require non-whitespace immediately after the opening `$` and before the closing `$` (standard pandoc / markdown-it heuristic). Display math also tightened (`.+?` instead of `.*?`) to require non-empty content and avoid matching bare `$$$$`. 2. **Task item with only nested list children loses parent prefix** (chatgpt-codex P1). `parse_markdown("- [ ]\n - child")` produces a listItem with `checked: False` and only a nested-list child. Stringify hit the nested-list branch first, indented and appended, never wrote the `* [ ]` parent line. Re-parse then dropped the parent entirely. Fixed by tracking `prefix_emitted` across the item's children -- if the first non-skipped child is a nested list, emit the parent prefix line first. 3. **Same scenario in `_render_list` produces reversed output** (chatgpt-codex P2). `is_first_content` stayed True through the nested-list branch, so the empty-task fallback appended the checkbox line AFTER the nested rendering. Fixed by emitting the prefix before recursing into the nested list when it's the first content, then dropping the `if task_marker` qualifier on the fallback so any item with zero rendered children still gets a line (was previously only firing for task items). Tests added: - `test_task_item_with_only_nested_list_keeps_parent_prefix` - `test_remend_unescaped_currency_does_not_pair_as_math` 3,634 pass / 1 pre-existing unrelated failure.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 12fed105f8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-23T01:05:05Z

+    ("delete", re.compile(r"(?<!\\)~~((?:[^\\]|\\.)+?)~~")),
    # Emphasis: *text*  (not preceded/followed by *)
-    ("emphasis_star", re.compile(r"(?<!\*)\*(?!\*)(.+?)(?<!\*)\*(?!\*)")),
+    ("emphasis_star", re.compile(r"(?<![*\\])\*(?!\*)((?:[^\\]|\\.)+?)(?<!\*)\*(?!\*)")),


Treat doubled backslashes as non-escaping delimiters

The new (?<!\\) opener guards only check the immediately previous character, so delimiters after an even number of backslashes are incorrectly treated as escaped. For example, \\*real italic* (literal backslash + italic) now parses as plain text instead of emphasis, and the same regression applies to links/code/strike patterns using this guard. This changes valid markdown semantics and will silently drop formatting in common path-like text that precedes inline markdown with a doubled backslash.

Useful? React with 👍 / 👎.

Confirmed real bug — verified \\*real italic* parses as plain text instead of literal \ + italic. Deferring the fix to a follow-up rather than blocking this PR. Reasoning:

The fix can't be done with the existing (?<!\\) lookbehind approach — Python's re doesn't support variable-width lookbehind, and any fixed-width alternation (e.g. (?:(?<!\\)|(?<=\\\\))) handles even-count backslashes but breaks 3+ backslashes (which should be escape-backslash + escape-delimiter = literal). The proper fix is a placeholder-based pre-pass: substitute every \X pair with a private-use Unicode placeholder before regex runs, then restore at text-leaf and inline-code emission. That's a moderate refactor of _parse_inline, _unescape_punct, and several inline regexes; safer as its own PR.

For now, added a comment block at _INLINE_PATTERNS (commit 9a2027f) documenting the limitation and the recommended fix path. The pattern is rare in LLM/chat output (which is this PR's scope) — backslash-emphasis sequences typically appear in technical docs explaining regex/glob syntax, which LLMs usually wrap in code fences. Will revisit when we have a concrete production case or when we tackle the broader parser-rewrite question from #100.

Generated by Claude Code

Update: actually fixed in 48fa4f3 rather than deferring. The placeholder-based pre-pass was smaller than I'd estimated (~50 LOC of helpers + ~10 LOC of wiring) and didn't break any existing tests.

Approach: _protect_escapes runs at the start of _parse_inline and replaces every \X escape with <SENTINEL>X (SENTINEL = U+0001, a control char that doesn't appear in chat markdown). The regex lookbehinds change from (?<!\\) to (?<!\x01). Because the sentinel marks the escape unambiguously, any number of preceding backslashes is handled correctly:

\* → <S>* — lookbehind sees <S>, skips. Correct (escaped).

\\* → <S>\* — the * is preceded by \ (not <S>), lookbehind passes, emphasis opens. Correct.

\\\* → <S>\<S>* — the * is preceded by <S>, lookbehind skips. Correct (3 backslashes = escaped \ + escaped *, both literal).

Two restore functions handle the emission modes:

_restore_escapes (text leaves, image alt, link URL): <S>X → X

_restore_escapes_as_literal_pair (inline code, where CommonMark preserves backslashes): <S>X → \X

Tests added: test_escaped_backslash_followed_by_real_emphasis pins your exact repro, plus a triple-backslash case for the harder counting. 3,636 / 3,637 total tests pass.

Generated by Claude Code

PR #101 review (chatgpt-codex P2): the inline regex lookbehinds only check one character back, so an escaped backslash followed by a real delimiter (e.g. `\\*foo*`) is treated as an escaped delimiter and falls through to plain text -- losing the intended emphasis. Fixing this properly requires variable-width lookbehind (not supported by Python's `re`) or a placeholder-based pre-pass that substitutes escape sequences before the regex runs. Both are moderate refactors of the inline parser. The pattern is rare in LLM / chat output (technical docs explaining backslash + emphasis typically use code fences), so deferring to a future chat-completeness pass. Added a comment block near `_INLINE_PATTERNS` documenting the limitation and the recommended fix path so the next person to touch this code sees it immediately.

coderabbitai

🧹 Nitpick comments (1)

src/chat_sdk/shared/markdown_parser.py (1)
475-475: ⚡ Quick win

Replace or-defaulting with an explicit None check.

Line 475 uses task_m.group(2) or ""; please switch to explicit None handling to avoid truthiness traps.
Proposed patch
-                    item_text = task_m.group(2) or ""
+                    task_text = task_m.group(2)
+                    item_text = task_text if task_text is not None else ""
As per coding guidelines, src/**/*.py: Use x if x is not None else default instead of x or default to avoid truthiness traps when porting from TypeScript.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/chat_sdk/shared/markdown_parser.py` at line 475, Replace the
truthy-default pattern used when extracting the second regex group: instead of
using "task_m.group(2) or ''" assign item_text using an explicit None check
(e.g., set item_text to task_m.group(2) if task_m.group(2) is not None else ""),
locating the assignment to item_text that uses task_m and updating it
accordingly to avoid truthiness traps.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/chat_sdk/shared/markdown_parser.py`:
- Line 475: Replace the truthy-default pattern used when extracting the second
regex group: instead of using "task_m.group(2) or ''" assign item_text using an
explicit None check (e.g., set item_text to task_m.group(2) if task_m.group(2)
is not None else ""), locating the assignment to item_text that uses task_m and
updating it accordingly to avoid truthiness traps.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 21754e16-9544-4bcd-bedb-c09a8d4b38ac

📥 Commits

Reviewing files that changed from the base of the PR and between 08c42fa and 9a2027f.

📒 Files selected for processing (5)

src/chat_sdk/shared/base_format_converter.py
src/chat_sdk/shared/markdown_parser.py
src/chat_sdk/shared/streaming_markdown.py
tests/test_markdown_faithful.py
tests/test_streaming_markdown.py

Closes the PR #101 review #4 finding (chatgpt-codex P2) properly rather than deferring. `\\*foo*` (escaped backslash + real italic) now parses correctly as text(`\`) + emphasis(`foo`); previously the fixed-width `(?<!\\)` lookbehind couldn't distinguish odd vs even backslash runs, so the `*` was wrongly treated as escaped. Approach: a `_protect_escapes` pre-pass runs at the start of `_parse_inline`, replacing every `\X` (X in `_ESCAPABLE_PUNCT`) with `<SENTINEL>X` where SENTINEL = U+0001 (a control codepoint that never legitimately appears in chat markdown). The inline regex lookbehinds now check for the sentinel instead of `\`, which trivially handles any number of preceding backslashes: - `\*` -> `<SENTINEL>*`. Regex lookbehind sees SENTINEL, skips. - `\\*` -> `<SENTINEL>\*`. The `*` is preceded by `\` (not SENTINEL), so the lookbehind passes and emphasis opens correctly. - `\\\*` -> `<SENTINEL>\<SENTINEL>*`. The `*` is preceded by SENTINEL, lookbehind skips. Correct (3 backslashes = escaped backslash + escaped asterisk = both literal). The complementary restore functions handle the two emission modes: - `_restore_escapes(text)`: SENTINEL pairs -> literal X (text leaves and image alt -- the backslash is dropped because the escape resolved to a literal char). - `_restore_escapes_as_literal_pair(text)`: SENTINEL pairs -> `\X` (inline code -- CommonMark preserves backslashes inside code spans). Changes: - New helpers `_protect_escapes`, `_restore_escapes`, `_restore_escapes_as_literal_pair`. ~50 LOC. - `_INLINE_PATTERNS` lookbehinds changed from `(?<!\\)` to `(?<!\x01)`; emphasis-star/_under add `\x01` to the exclusion sets; content groups for link/image use `(?:[^\]\x01]|\x01.)+?` to consume sentinel pairs without terminating early. - `_parse_inline` calls `_protect_escapes` on first entry and `_restore_escapes` at emission sites (text leaves, image alt, link URL/title). Inline code uses `_restore_escapes_as_literal_pair`. - Recursive calls into link text / emphasis content pass a `_already_protected=True` kwarg so the pre-pass doesn't run twice. - Removed `_unescape_punct` (no longer used); updated docstring references to point at `_restore_escapes` instead. - Removed the limitation comment added in 9a2027f. Tests added: - `test_escaped_backslash_followed_by_real_emphasis` (the reviewer's exact repro — `\\*real italic*` now produces text + emphasis) - `test_triple_backslash_before_star_is_fully_escaped` 3,636 pass / 1 pre-existing unrelated failure. No existing tests broke (escape behaviour preserved, inline code preservation preserved, image alt unescape preserved).

Second round of subagent review on commit 48fa4f3 surfaced three issues: 1. **U+0001 sentinel could be mangled by adapter-supplied input** (blocker). The pre-pass used `\x01` as the escape sentinel and assumed it would never appear in chat content -- but no adapter strips control characters, so input like `before\x01*after*` would have the sentinel paired with the asterisk and dropped, corrupting the visible text. Switched the sentinel to U+FDD0 (a Unicode noncharacter -- one of the 32 BMP codepoints permanently reserved by the Unicode standard for process-internal application use; never assigned to a real character). Also added a strip step at `_protect_escapes` entry: any literal U+FDD0 in input is removed before the escape walk runs, as defense-in-depth. Updated all the `(?<!\x01)` lookbehinds in `_INLINE_PATTERNS` to use the new sentinel. 2. **Silent URL behavior change** (medium). The PR refactored link / image URL handling to call `_restore_escapes(url)` (CommonMark says URLs honour backslash escapes too). This is a correctness improvement but wasn't called out -- added `test_link_url_resolves_backslash_escapes` to pin the new contract so future regressions are caught. 3. **Weak triple-backslash test** (nit). The existing `test_triple_backslash_before_star_is_fully_escaped` only asserted `"emphasis" not in types` -- a hypothetical buggy impl that dropped all input would pass. Strengthened to assert the exact AST shape: `[{"type": "text", "value": "\\*foo*"}]`. Tests added / strengthened: - `test_triple_backslash_before_star_is_fully_escaped` -- exact AST assertion - `test_sentinel_codepoint_in_input_is_stripped` -- adversarial input with U+FDD0 is sanitised - `test_link_url_resolves_backslash_escapes` -- new URL behavior pinned 3,638 pass / 1 pre-existing unrelated failure.

patrick-chinchill · 2026-05-23T04:50:26Z

@codex review

Substantive changes since your last review:

12fed10 — fixed your P1 (task item with only nested list children loses parent prefix) in _stringify_node and the P2 mirror in _render_list; fixed gemini's math-regex currency false-positive.
48fa4f3 — fixed your P2 (doubled-backslash drops emphasis) with a placeholder-based pre-pass. Replaced fixed-width (?<!\\) lookbehinds with (?<!<SENTINEL>) where SENTINEL marks each \X escape. Correctly handles any count of preceding backslashes.
03033b4 — hardened the sentinel against adversarial input: switched from U+0001 to U+FDD0 (a Unicode noncharacter that's permanently reserved for internal use); added a strip step so pre-existing sentinels in input can't smuggle in fake escape sequences. Also pinned the new URL-escape behavior in a test.

Particularly interested in scrutiny of:

The sentinel choice and the strip-at-entry approach
The two restore variants (_restore_escapes for text leaves vs _restore_escapes_as_literal_pair for inline code per CommonMark)
Whether any callers feed parse_markdown content that's already been through an upstream escape pass

Generated by Claude Code

chatgpt-codex-connector · 2026-05-23T04:54:44Z

Codex Review: Didn't find any major issues. Can't wait for the next one!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

src/chat_sdk/shared/markdown_parser.py (1)

265-299: ⚡ Quick win

Avoid hardcoding the sentinel glyph inside inline regexes.

_INLINE_PATTERNS repeats the literal � in several places instead of deriving from _ESCAPE_SENTINEL. If the sentinel is changed again, escape matching can silently drift.

♻️ Proposed refactor

+_ESC = re.escape(_ESCAPE_SENTINEL)
+
 _INLINE_PATTERNS = [
-    ("inlineCode", re.compile(r"(?<!�)`([^`]+)`")),
+    ("inlineCode", re.compile(rf"(?<!{_ESC})`([^`]+)`")),
     # ...apply the same replacement for all other `(?<!�)` and `�.` usages...
 ]

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/chat_sdk/shared/markdown_parser.py` around lines 265 - 299, The inline
regexes in _INLINE_PATTERNS currently hardcode the sentinel glyph "�"; update
them to derive the sentinel from the module-level _ESCAPE_SENTINEL so changes
won't drift: build each pattern string using re.escape(_ESCAPE_SENTINEL) (or a
variable like_ESC = re.escape(_ESCAPE_SENTINEL)) and interpolate that escaped
sentinel into the lookbehind/lookaround fragments (e.g., (?<!{ESC}) ), then
compile those constructed pattern strings into the same named entries in
_INLINE_PATTERNS; ensure behavior with _protect_escapes remains identical (i.e.,
escaped delimiters still match the sentinel form).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_markdown_faithful.py`:
- Around line 251-260: The test currently only checks that a link node exists;
update test_link_url_resolves_backslash_escapes to also assert the link node's
normalized URL value so regressions in unescaping fail the test: locate the
single link node from link_nodes and assert its URL equals
"http://example.com/path)more" (access the URL property on the node, e.g.
link_nodes[0].get('target') or link_nodes[0].get('destination') as appropriate
for your node shape) instead of only checking len(link_nodes) == 1.

---

Nitpick comments:
In `@src/chat_sdk/shared/markdown_parser.py`:
- Around line 265-299: The inline regexes in _INLINE_PATTERNS currently hardcode
the sentinel glyph "�"; update them to derive the sentinel from the module-level
_ESCAPE_SENTINEL so changes won't drift: build each pattern string using
re.escape(_ESCAPE_SENTINEL) (or a variable like_ESC =
re.escape(_ESCAPE_SENTINEL)) and interpolate that escaped sentinel into the
lookbehind/lookaround fragments (e.g., (?<!{ESC}) ), then compile those
constructed pattern strings into the same named entries in _INLINE_PATTERNS;
ensure behavior with _protect_escapes remains identical (i.e., escaped
delimiters still match the sentinel form).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 9c91ee92-602d-4d60-b27d-3614c4a169a6

📥 Commits

Reviewing files that changed from the base of the PR and between 9a2027f and 03033b4.

📒 Files selected for processing (2)

src/chat_sdk/shared/markdown_parser.py
tests/test_markdown_faithful.py

coderabbitai · 2026-05-23T04:57:12Z

+    def test_link_url_resolves_backslash_escapes(self):
+        # PR #101 review #3 (downstream-impact reviewer): the inline
+        # parser now resolves backslash escapes in link URLs per
+        # CommonMark spec. This is a behavior change from the previous
+        # raw pass-through; pin it so the contract is explicit.
+        children = self._para_children(r"[t](http://example.com/path\)more)")
+        link_nodes = [c for c in children if c.get("type") == "link"]
+        # The `\)` inside the URL is now resolved to literal `)`.
+        # Previously this URL would have been truncated at the first `)`.
+        assert len(link_nodes) == 1


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assert the normalized URL value, not just link-node presence.

Line 260 only checks node count, so this can still pass when URL unescaping/regression is wrong. Add a concrete URL assertion for the \) case.

✅ Suggested test tightening

children = self._para_children(r"[t](http://example.com/path\)more)") link_nodes = [c for c in children if c.get("type") == "link"] @@ assert len(link_nodes) == 1 + assert link_nodes[0]["url"] == "http://example.com/path)more"

As per coding guidelines, tests/**/*.py: "Every test must fail when the code is wrong — no assert True stubs, no bare truthiness checks when specific values are available".

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_markdown_faithful.py` around lines 251 - 260, The test currently only checks that a link node exists; update test_link_url_resolves_backslash_escapes to also assert the link node's normalized URL value so regressions in unescaping fail the test: locate the single link node from link_nodes and assert its URL equals "http://example.com/path)more" (access the URL property on the node, e.g. link_nodes[0].get('target') or link_nodes[0].get('destination') as appropriate for your node shape) instead of only checking len(link_nodes) == 1.

…ackslash Holistic re-review blocker: a code span ending in a backslash (e.g. `` `C:\` ``, `` `\d+\` ``) left the escape sentinel (U+FDD0) dangling in the captured content, leaking the noncharacter into stringify_markdown / ast_to_plain_text output as a replacement glyph. Root cause: `_protect_escapes` rewrites `\X` -> `<SENTINEL>X`. For `` `C:\` `` the `\`+backtick becomes `<SENTINEL>`+backtick, and the inline-code regex `` (?<!ï·�)`([^`]+)` `` captures `C:<SENTINEL>` -- the sentinel's paired backtick falls outside the group as the code-span closer. Both restore functions guarded on `i + 1 < len(text)`, so a trailing sentinel with no following char was emitted raw. Fix: both restore functions now handle a dangling sentinel explicitly. `_restore_escapes_as_literal_pair` (code spans) emits a bare backslash -- matching CommonMark's "backslash is literal inside code spans" rule, so `` `C:\` `` correctly yields code content `C:\`. `_restore_escapes` (text leaves / URLs) likewise emits a backslash rather than the raw sentinel. Neither can ever emit U+FDD0 now. Test added: `test_code_span_ending_in_backslash_does_not_leak_sentinel` covers three realistic code-span-ending-in-backslash inputs and asserts the sentinel codepoint never reaches output. 3,638 pass / 1 pre-existing unrelated failure.

…at-completeness # Conflicts: # tests/test_streaming_markdown.py

Adversarial re-review surfaced two pre-existing stringify round-trip bugs that are commonly hit by LLM-produced markdown. Fixing them in this PR rather than filing follow-ups (per the team preference for clean PRs). Both bugs predate PR #101 -- the same drift reproduces on `main`. They were never exercised by the chat-completeness work specifically, but they sit next to the escape-roundtrip code this PR added and break the same general "parse -> stringify -> parse stable" contract the rest of the PR pins. Fix 1 (list-item continuation indent): - `- item1\n para2` parsed to one list with one item containing one paragraph `"item1\npara2"`. Stringify emitted `* item1\npara2\n` -- no indent on `para2` -- so re-parse split into `[list, paragraph]`. - `_stringify_node`'s list branch now splits each child's stringified text on `\n` and applies the standard 2-space continuation indent to lines after the first. Empty continuation lines stay empty so intra-item blank lines round-trip too. - Affects every adapter that uses `from_markdown` / `to_markdown` to translate (linear, github, telegram, whatsapp + the generic `BaseFormatConverter.to_markdown`). Fix 2 (hard break stringify): - A `break` node (parsed from `Step \nline2`) stringified as just `\n`, which re-parses as a soft break -- collapsing the `[text, break, text]` triple into a single text leaf with embedded `\n`. The semantic distinction between hard and soft breaks was silently lost on round-trip. - `_stringify_node`'s break branch now emits `" \n"` per CommonMark. Tests added: - `test_roundtrip_preserves_list_item_continuation_indent` (4 inputs: single-line continuation, multi-line continuation, multiple items with continuations, ordered list with continuation) - `test_roundtrip_preserves_hard_break_in_paragraph` (asserts both AST stability AND that the `break` node survives) Docstring note added near `_INLINE_PATTERNS` for the third finding (O(n^2) on adversarial bracket-heavy input). Pre-existing in the underlying linear-search-per-position; this PR's alternation regex adds a constant-factor slowdown only on ~1000+ unclosed brackets, sub-millisecond on typical chat content. Deferred to a future character-level walker if a non-chat input surface ever lands. 3,643 pass / 1 pre-existing unrelated failure.

…mpleteness # Conflicts: # src/chat_sdk/shared/streaming_markdown.py # tests/test_streaming_markdown.py

claude added 3 commits May 22, 2026 03:47

gemini-code-assist Bot reviewed May 22, 2026

View reviewed changes

claude added 4 commits May 22, 2026 08:41

patrick-chinchill marked this pull request as ready for review May 22, 2026 21:55

chatgpt-codex-connector Bot reviewed May 22, 2026

View reviewed changes

claude added 2 commits May 23, 2026 00:42

patrick-chinchill marked this pull request as draft May 23, 2026 00:59

patrick-chinchill marked this pull request as ready for review May 23, 2026 00:59

chatgpt-codex-connector Bot reviewed May 23, 2026

View reviewed changes

coderabbitai Bot reviewed May 23, 2026

View reviewed changes

github-advanced-security AI found potential problems May 23, 2026

View reviewed changes

Comment thread src/chat_sdk/shared/markdown_parser.py Fixed

Comment thread src/chat_sdk/shared/markdown_parser.py Fixed

coderabbitai Bot reviewed May 23, 2026

View reviewed changes

claude added 3 commits May 23, 2026 15:05

Merge branch 'claude/wizardly-heisenberg-VLInV' into claude/parser-ch…

97ef356

…at-completeness # Conflicts: # tests/test_streaming_markdown.py

ci: retrigger after pre-existing test_github_webhook flake

c3b6962

patrick-chinchill mentioned this pull request May 28, 2026

Revisit hand-rolled markdown parser: close setext / footnotes / HTML / indented-code gaps #69

Closed

5 tasks

claude and others added 2 commits May 28, 2026 04:37

Merge remote-tracking branch 'origin/main' into claude/parser-chat-co…

3eb87a0

…mpleteness # Conflicts: # src/chat_sdk/shared/streaming_markdown.py # tests/test_streaming_markdown.py

Merge branch 'main' into claude/parser-chat-completeness

4fda9b4

patrick-chinchill merged commit b1004b1 into main May 28, 2026
10 checks passed

patrick-chinchill mentioned this pull request May 28, 2026

fix(adapters): accept dict StreamChunks in slack/github/google_chat streams #105

Merged

5 tasks

	text = re.sub(r"\$[^\$\n]+\$", "", text)
	text = re.sub(r"(?<!\\)\$(?!\s)(?:\\.\|[^\$\n\\])+?(?<!\s)\$", "", text)

Conversation

patrick-chinchill commented May 22, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Scope (chat-realistic gaps from the issue #69 follow-up)

Parser side (markdown_parser.py)

Streaming side (streaming_markdown.py::_remend)

What's deliberately NOT here

Test plan

Scope notes

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 22, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 22, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 22, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 23, 2026

Choose a reason for hiding this comment

Uh oh!

patrick-chinchill May 23, 2026

Choose a reason for hiding this comment

Uh oh!

patrick-chinchill May 23, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

patrick-chinchill commented May 23, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 23, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 23, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

patrick-chinchill commented May 22, 2026 •

edited by coderabbitai Bot

Loading

Parser side (`markdown_parser.py`)

Streaming side (`streaming_markdown.py::_remend`)

coderabbitai Bot commented May 22, 2026 •

edited

Loading