fix(doc): preserve round-trip formatting in fetch output#469
Conversation
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a nested-list indentation normalization pass and a UTF‑8‑aware emphasis spacing normalizer; both run only outside fenced code blocks (and inline code spans). Tests added for emphasis trimming and nested list indentation behaviors. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@shortcuts/doc/markdown_fix.go`:
- Around line 307-329: The current normalizeNestedListIndentation function
converts any even-space-indented list marker regardless of surrounding context;
change it to only rewrite indentation when the marker is nested under a
preceding list item. Before replacing indent in normalizeNestedListIndentation,
find the previous non-blank line and confirm it is a list item (use a list-item
detection regex similar to nestedListIndentRe but allowing leading tabs/spaces
and matching markers like [-*+]|\d+[.)]); if there is no previous non-blank list
item (start of file or top-level paragraph), skip conversion so top-level
even-space indents are preserved. Keep all other behavior (even-pair -> tabs
conversion, skipping odd counts) the same and reference nestedListIndentRe and
normalizeNestedListIndentation when making the change.
- Around line 126-129: The leading-space regexes (boldLeadingSpaceRe and
italicLeadingSpaceRe) are too permissive and match ambiguous spans like "* x *";
tighten them by requiring the captured inner text to end with a non-space
character (change the second group from ([^*\n].*?) to ([^*\n].*?\S)) so the
leading-space pass won't consume text that has a trailing space before the
closer and thus won't convert literal " * x * " / " ** x ** " into emphasis;
keep boldTrailingSpaceRe and italicTrailingSpaceRe as-is.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 43193614-e642-48df-a695-4340398bc951
📒 Files selected for processing (2)
shortcuts/doc/markdown_fix.goshortcuts/doc/markdown_fix_test.go
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
- trim leading spaces inside bold and italic emphasis exported by docs +fetch - normalize nested list indentation to avoid flattening and literal text on re-import - add regression tests for emphasis spacing and nested list indentation
- keep literal * x * and ** x ** text unchanged - only normalize indented nested list markers when a parent list item exists - add regression coverage for both CodeRabbit findings
511d8bc to
f0ce393
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
shortcuts/doc/markdown_fix.go (1)
114-133: Docstring forfixBoldSpacingis stale after the regex expansion.The comment still enumerates only "two issues" (trailing
**whitespace, redundant heading bold), but the function now also normalizes leading whitespace inside**…**, and the full set of italic (*…*) leading/trailing cases viaitalicLeadingSpaceRe/italicTrailingSpaceRe. Consider refreshing the doc block so future readers can see at a glance what each regex covers (and that italic spans are handled too).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/doc/markdown_fix.go` around lines 114 - 133, Update the docblock for fixBoldSpacing to reflect all current behaviors: list normalization of leading whitespace inside bold markers (boldLeadingSpaceRe), trimming trailing whitespace before closing bold (boldTrailingSpaceRe), analogous leading/trailing normalization for italic spans (italicLeadingSpaceRe and italicTrailingSpaceRe), and removal of redundant bold in ATX headings (headingBoldRe); keep the note about skipping inline code spans and briefly describe each regex's purpose so readers can quickly see what each symbol handles.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@shortcuts/doc/markdown_fix.go`:
- Around line 114-133: Update the docblock for fixBoldSpacing to reflect all
current behaviors: list normalization of leading whitespace inside bold markers
(boldLeadingSpaceRe), trimming trailing whitespace before closing bold
(boldTrailingSpaceRe), analogous leading/trailing normalization for italic spans
(italicLeadingSpaceRe and italicTrailingSpaceRe), and removal of redundant bold
in ATX headings (headingBoldRe); keep the note about skipping inline code spans
and briefly describe each regex's purpose so readers can quickly see what each
symbol handles.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 23241f93-3586-4a2b-8f1e-e6534f0df5ea
📒 Files selected for processing (2)
shortcuts/doc/markdown_fix.goshortcuts/doc/markdown_fix_test.go
✅ Files skipped from review due to trivial changes (1)
- shortcuts/doc/markdown_fix_test.go
fangshuyu-768
left a comment
There was a problem hiding this comment.
Thanks for the formatting fix. I left a few edge-case concerns around nested list normalization and the missing regression coverage.
- 遇到空行时停止向上查找父级列表项,避免把 loose list sibling 误改成嵌套列表 - 避免把列表项中的四空格缩进代码块误改成 tab 缩进列表项 - 补充两个回归测试,并更新 fixBoldSpacing 注释使其与当前实现一致
fangshuyu-768
left a comment
There was a problem hiding this comment.
My earlier comments on the nested-list heuristic are all resolved — confirmed by code + new test cases. Taking a fresh look at the new emphasis (bold/italic) leading-space rules, I spotted three issues worth addressing before merge. All verified by tracing the regexes by hand against the code paths in fixBoldSpacingLine.
- 将 fixBoldSpacingLine 改为按星号 run 扫描,修复 ** hello **、* hello * 和同一行多个 italic span 的空格清理 - 保留 inline code、heading 和 *** hello** 这类近邻字面量,避免误改 emphasis nesting
There was a problem hiding this comment.
🧹 Nitpick comments (1)
shortcuts/doc/markdown_fix.go (1)
218-222: Docstring slightly misdescribes the scanner's payload rule.The comment says "any candidate whose payload contains another asterisk" is ignored, but by construction
payload := seg[openEnd:closeStart]wherecloseStartis the next asterisk run — payloads can never contain*. The real rule is that a mismatched closing run length (e.g. opener*, next run**) aborts normalization and the opener is emitted literally, which is what keeps nested‑emphasis‑like text untouched. Consider rewording for accuracy; no code change needed.📝 Proposed docstring tweak
// fixEmphasisSpacingSegment trims only the whitespace immediately inside simple // *...* and **...** spans. It deliberately ignores runs of 3+ asterisks and -// any candidate whose payload contains another asterisk so nested emphasis-like -// text remains untouched. When both inner sides contain whitespace, single-rune -// payloads are preserved as literal text (for example "* x *" and "** x **"). +// any candidate whose next asterisk run has a different length than the opener, +// so nested or unbalanced emphasis-like text remains untouched. When both inner +// sides contain whitespace, single-rune payloads are preserved as literal text +// (for example "* x *" and "** x **").🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/doc/markdown_fix.go` around lines 218 - 222, The docstring for fixEmphasisSpacingSegment is inaccurate about payloads containing asterisks; update it to state that payload := seg[openEnd:closeStart] can never contain '*' because closeStart points to the next asterisk run, and the actual rule that prevents normalization is a mismatched closing run length (e.g., opener '*' vs next run '**') which causes the opener to be emitted literally and preserves nested-emphasis-like text—adjust the comment to describe this exact behavior and reference variables seg, openEnd, closeStart and the mismatched run length rule instead of saying the payload may contain another asterisk.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@shortcuts/doc/markdown_fix.go`:
- Around line 218-222: The docstring for fixEmphasisSpacingSegment is inaccurate
about payloads containing asterisks; update it to state that payload :=
seg[openEnd:closeStart] can never contain '*' because closeStart points to the
next asterisk run, and the actual rule that prevents normalization is a
mismatched closing run length (e.g., opener '*' vs next run '**') which causes
the opener to be emitted literally and preserves nested-emphasis-like
text—adjust the comment to describe this exact behavior and reference variables
seg, openEnd, closeStart and the mismatched run length rule instead of saying
the payload may contain another asterisk.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d2bea55d-59a6-4a59-bd2f-a85abcbc299f
📒 Files selected for processing (2)
shortcuts/doc/markdown_fix.goshortcuts/doc/markdown_fix_test.go
✅ Files skipped from review due to trivial changes (1)
- shortcuts/doc/markdown_fix_test.go
Adds 5 invariant-level tests on top of #469's transforms: - TestFixExportedMarkdownIdempotent — f(f(x)) == f(x) across rich fixtures (kitchen sink, CJK, nested containers). Protects the core round-trip promise from future transform interactions that rewrite their own output. - TestFixExportedMarkdownPreservesFencedCodeByteForByte — packs every pipeline-touching shape into a fence and asserts byte-identical output. Code samples must never be silently rewritten by a formatting pass. - TestFixExportedMarkdownPreservesCRLF — CRLF input preserves line endings AND still triggers transforms. Windows-authored markdown should not be silently LF-normalized. - TestFixExportedMarkdownTransformInteractions — composition regressions: nested-list + trailing-space bold, text→list transition, callout containing list with emphasis, heading vs paragraph bold. - TestNormalizeNestedListIndentationDocumentedSkips — locks in the deliberate no-op branches (odd-space indent, blank-line loose-list sibling, 4-space indented code block, parentless two-space) as an explicit spec so future heuristic tweaks surface in the test diff. All transforms, fixtures, and expectations are derived from the head of PR #469. No production code changes.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #469 +/- ##
==========================================
+ Coverage 60.19% 60.42% +0.22%
==========================================
Files 390 393 +3
Lines 33433 33657 +224
==========================================
+ Hits 20125 20336 +211
- Misses 11426 11433 +7
- Partials 1882 1888 +6 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
🚀 PR Preview Install Guide🧰 CLI updatenpm i -g https://pkg.pr.new/larksuite/cli/@larksuite/cli@2f734f8937590efec4218f371a56a72eb7b84eda🧩 Skill updatenpx skills add GaoSSR/cli#fix/docs-fetch-roundtrip-formatting -y -g |
|
@fangshuyu-768 Thanks for merge |
Adds 5 invariant-level tests on top of larksuite#469's transforms: - TestFixExportedMarkdownIdempotent — f(f(x)) == f(x) across rich fixtures (kitchen sink, CJK, nested containers). Protects the core round-trip promise from future transform interactions that rewrite their own output. - TestFixExportedMarkdownPreservesFencedCodeByteForByte — packs every pipeline-touching shape into a fence and asserts byte-identical output. Code samples must never be silently rewritten by a formatting pass. - TestFixExportedMarkdownPreservesCRLF — CRLF input preserves line endings AND still triggers transforms. Windows-authored markdown should not be silently LF-normalized. - TestFixExportedMarkdownTransformInteractions — composition regressions: nested-list + trailing-space bold, text→list transition, callout containing list with emphasis, heading vs paragraph bold. - TestNormalizeNestedListIndentationDocumentedSkips — locks in the deliberate no-op branches (odd-space indent, blank-line loose-list sibling, 4-space indented code block, parentless two-space) as an explicit spec so future heuristic tweaks surface in the test diff. All transforms, fixtures, and expectations are derived from the head of PR larksuite#469. No production code changes.
Adds 5 invariant-level tests on top of #469's transforms: - TestFixExportedMarkdownIdempotent — f(f(x)) == f(x) across rich fixtures (kitchen sink, CJK, nested containers). Protects the core round-trip promise from future transform interactions that rewrite their own output. - TestFixExportedMarkdownPreservesFencedCodeByteForByte — packs every pipeline-touching shape into a fence and asserts byte-identical output. Code samples must never be silently rewritten by a formatting pass. - TestFixExportedMarkdownPreservesCRLF — CRLF input preserves line endings AND still triggers transforms. Windows-authored markdown should not be silently LF-normalized. - TestFixExportedMarkdownTransformInteractions — composition regressions: nested-list + trailing-space bold, text→list transition, callout containing list with emphasis, heading vs paragraph bold. - TestNormalizeNestedListIndentationDocumentedSkips — locks in the deliberate no-op branches (odd-space indent, blank-line loose-list sibling, 4-space indented code block, parentless two-space) as an explicit spec so future heuristic tweaks surface in the test diff. All transforms, fixtures, and expectations are derived from the head of PR #469. No production code changes. Co-authored-by: fangshuyu-768 <shuyufang768@outlook.com>
Summary
Fixes a docs round-trip formatting regression in
docs +fetchoutput. Some exported markdown could not be safely re-imported withdocs +create/docs +update, which could flatten nested lists and surface inline formatting as literal markdown or text tags.Changes
Test Plan
go test ./shortcuts/docgo test ./shortcuts/doc -run 'TestFixExportedMarkdown|TestFixTopLevelSoftbreaks|TestFixBoldSpacing|TestNormalizeNestedListIndentation'go run github.com/golangci/golangci-lint/v2/cmd/golangci-lint@v2.1.6 run --new-from-rev=origin/mainmake build./lark-cli docs +fetch --doc 'KHCydlulPoZosix0m0ccKPWnnsg' --as usernow emits**SFT ...**instead of** SFT ...**, and nested ordered list items are tab-indented instead of ambiguous space-pair indentationmake unit-teststill fails locally in unchangedshortcuts/minutesdownload tests withblocked download URL: local/internal host is not allowedRelated Issues
Summary by CodeRabbit
New Features
Bug Fixes
Tests