Coverage gap — documented non-parity with upstream remark.
Current state
src/chat_sdk/shared/markdown_parser.py is a ~880-line hand-rolled regex parser producing mdast-shaped output. It covers ATX headings, fenced code blocks, thematic breaks, blockquotes, ordered/unordered lists, GFM tables, inline emphasis/strong/delete/code/links/images, hard line breaks.
What it doesn't cover (listed at docs/UPSTREAM_SYNC.md:442 as "by design, won't fix"):
- Setext headings (
===== / ----- underlines)
- Footnotes (
[^1]: ...)
- Raw HTML nodes (block + inline)
- Indented code blocks (4-space indent)
- Escaped characters (
\*, \[, \\, …)
- Backtick spans longer than 1 (
code with ` inside )
The decision is recorded at docs/DECISIONS.md:7-21. Summary of the original rationale: (1) mdast compatibility — no Python parser produces mdast natively; (2) round-trip fidelity; (3) subset-is-sufficient; (4) zero runtime deps; (5) streaming renderer needs to understand the same constructs.
Why reopen
Each "by design, won't fix" row accumulates risk:
- LLM-generated content: models frequently emit setext headings and escaped characters; these currently get rendered as raw text, breaking display on every adapter that pipes through our parser (all of them).
- Upstream drift: upstream remark handles these correctly. Every time upstream tests something that depends on them, we have to explicitly skip; the skip list grows.
- Fragility: the hand-rolled regex parser is already 880 lines and has had several correctness fixes for edge cases (see git history on
shared/markdown_parser.py). Each new construct adds risk of subtle inline-span bugs; each fix risks regressing another construct.
- Downstream asks: consumers periodically ask why
\*not italic\* doesn't render; the answer "we don't handle escaped characters" doesn't age well.
Options
Option A — close the gaps in the existing parser
Add setext, indented code, raw HTML, footnotes, escaped chars, and multi-backtick spans to shared/markdown_parser.py. Estimate: ~300–400 LOC, new edge cases to test. Keeps zero runtime deps.
Option B — swap the backend
Evaluate mistune (has GFM + footnote plugins) or markdown-it-py (CommonMark + plugin ecosystem) with a thin translation layer to mdast. The original "no mdast from Python libs" objection stands, but the translation layer has shrunk in scope — we only need the node shapes we already emit (paragraph, heading, code, list, listItem, blockquote, table, tableRow, tableCell, thematicBreak, emphasis, strong, delete, inlineCode, link, image, text, break). A translator for those nodes is probably ~250 LOC. The parser itself shrinks to a call into the library.
Key concern with Option B: the StreamingMarkdownRenderer in shared/streaming_markdown.py needs to understand code-fence state and inline-marker state for mid-stream closure. If the parser library is used only for static parses (after-the-fact mdast), and streaming keeps its own state machine, they could diverge. Scope this carefully.
Option C — close specific high-value gaps only
Order of ROI:
- Escaped characters — common in LLM output; small implementation (~30 LOC).
- Indented code blocks — common in technical content; small (~40 LOC).
- Setext headings — common; medium (~60 LOC).
- Multi-backtick spans — less common; small (~30 LOC).
- Footnotes — rare in chat; defer.
- Raw HTML — rare in chat; defer.
Option C is pragmatic: close items 1–3 in one PR, document 4–6 as remaining gaps, come back if consumers ask.
Recommendation
Start with C. Benchmark against mistune in a separate spike to see whether the translator layer lands under 250 LOC; if yes, B becomes attractive for the 0.5 cycle.
Acceptance
Coverage gap — documented non-parity with upstream remark.
Current state
src/chat_sdk/shared/markdown_parser.pyis a ~880-line hand-rolled regex parser producing mdast-shaped output. It covers ATX headings, fenced code blocks, thematic breaks, blockquotes, ordered/unordered lists, GFM tables, inline emphasis/strong/delete/code/links/images, hard line breaks.What it doesn't cover (listed at
docs/UPSTREAM_SYNC.md:442as "by design, won't fix"):=====/-----underlines)[^1]: ...)\*,\[,\\, …)code with ` inside)The decision is recorded at
docs/DECISIONS.md:7-21. Summary of the original rationale: (1) mdast compatibility — no Python parser produces mdast natively; (2) round-trip fidelity; (3) subset-is-sufficient; (4) zero runtime deps; (5) streaming renderer needs to understand the same constructs.Why reopen
Each "by design, won't fix" row accumulates risk:
shared/markdown_parser.py). Each new construct adds risk of subtle inline-span bugs; each fix risks regressing another construct.\*not italic\*doesn't render; the answer "we don't handle escaped characters" doesn't age well.Options
Option A — close the gaps in the existing parser
Add setext, indented code, raw HTML, footnotes, escaped chars, and multi-backtick spans to
shared/markdown_parser.py. Estimate: ~300–400 LOC, new edge cases to test. Keeps zero runtime deps.Option B — swap the backend
Evaluate
mistune(has GFM + footnote plugins) ormarkdown-it-py(CommonMark + plugin ecosystem) with a thin translation layer to mdast. The original "no mdast from Python libs" objection stands, but the translation layer has shrunk in scope — we only need the node shapes we already emit (paragraph,heading,code,list,listItem,blockquote,table,tableRow,tableCell,thematicBreak,emphasis,strong,delete,inlineCode,link,image,text,break). A translator for those nodes is probably ~250 LOC. The parser itself shrinks to a call into the library.Key concern with Option B: the
StreamingMarkdownRendererinshared/streaming_markdown.pyneeds to understand code-fence state and inline-marker state for mid-stream closure. If the parser library is used only for static parses (after-the-fact mdast), and streaming keeps its own state machine, they could diverge. Scope this carefully.Option C — close specific high-value gaps only
Order of ROI:
Option C is pragmatic: close items 1–3 in one PR, document 4–6 as remaining gaps, come back if consumers ask.
Recommendation
Start with C. Benchmark against
mistunein a separate spike to see whether the translator layer lands under 250 LOC; if yes, B becomes attractive for the 0.5 cycle.Acceptance
docs/UPSTREAM_SYNC.mdnon-parity tabledocs/DECISIONS.md:7-21updated to reflect the narrowed gap list