Fix/im markdown emphasis spacing#667
Conversation
Change-Id: I0e94087876bd20869cf69beffc4b69ffd26f5040
Change-Id: Ic8700ad015f459d0e95e4341224ce70855f94fe4
Change-Id: Ic2609ca45653b48e21c303d2e41deb0902870bf4
Change-Id: I0e26a4eda85180de287ebef48d7689583a41a242
Change-Id: Ib84e8c31589dc35c29abd8e01b5b8bff1821020a
Change-Id: Ied3e03e8b3fff21ebf019babb40d3e7eab011c23
Change-Id: I5334e1c690d822a5e5735bfd1f9e8d14f65ddff3
|
张金杰 seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
📝 WalkthroughWalkthroughThe PR introduces markdown emphasis-spacing normalization with inline-code protection. New functions protect inline code via backtick-span scanning, normalize emphasis spacing by trimming whitespace inside simple emphasis spans ( Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
shortcuts/im/helpers_test.go (1)
522-564: Solid emphasis-normalization coverage.Subtests cover symmetric spacing for
*,**,***, plus negative cases for inline code, fenced code, and embedded non-emphatic spaced asterisks. Asymmetric forms (e.g.,**foo **,** foo**) and list-item edge cases aren’t directly exercised — consider adding one of each so future refactors don’t regress them.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/im/helpers_test.go` around lines 522 - 564, Tests exercise symmetric spacing and negative cases but miss asymmetric emphasis forms and list-item boundaries; add new subtests in helpers_test.go using wrapMarkdownAsPost and decodePostParagraphForTest that assert normalization for asymmetric examples like "**foo **" -> "**foo**" and "** foo**" -> "**foo**", and add at least one list-item edge-case (e.g., "- ** foo**" or "1. **foo **") to ensure list parsing/normalization preserves or normalizes emphasis correctly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@shortcuts/im/helpers.go`:
- Around line 942-992: The function normalizeEmphasisSpacingSegment incorrectly
treats a leading list-item `* ` as an emphasis opener; to fix, add a guard
(either inside hasSimpleEmphasisBoundaries or immediately after discovering an
opener in normalizeEmphasisSpacingSegment) that returns false / skips
normalization when openStart is at a line start (openStart == 0 or the byte
before openStart is '\n'), the marker length is 1 (openEnd-openStart == 1) and
the byte/rune immediately after openEnd is whitespace; update
hasSimpleEmphasisBoundaries or the caller logic around the
nextAsteriskRun/openStart/openEnd check so list-item-style `* ` openers are not
treated as emphasis openers.
- Around line 908-940: Add a unit test to document and assert the behavior of
scanInlineCodeSpans/wrapMarkdownAsPost when an unmatched backtick run appears
before a balanced code span: create a test (e.g.,
TestWrapMarkdownAsPost_UnmatchedBacktickThenBalanced) that feeds a string like
"text ` middle `code` end" into wrapMarkdownAsPost (or directly into
scanInlineCodeSpans) and assert that the unmatched single backtick does not form
a span and the subsequent "`code`" is detected as a span (and that emphasis
normalization skips the unmatched run); locate code around scanInlineCodeSpans
and wrapMarkdownAsPost to add the test so it fails if the scanner incorrectly
treats the unmatched backtick as a code opener.
---
Nitpick comments:
In `@shortcuts/im/helpers_test.go`:
- Around line 522-564: Tests exercise symmetric spacing and negative cases but
miss asymmetric emphasis forms and list-item boundaries; add new subtests in
helpers_test.go using wrapMarkdownAsPost and decodePostParagraphForTest that
assert normalization for asymmetric examples like "**foo **" -> "**foo**" and
"** foo**" -> "**foo**", and add at least one list-item edge-case (e.g., "- **
foo**" or "1. **foo **") to ensure list parsing/normalization preserves or
normalizes emphasis correctly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 088ddd7a-592a-4b5d-b431-fd21eda86c9e
📒 Files selected for processing (4)
shortcuts/im/helpers.goshortcuts/im/helpers_test.goskills/lark-im/references/lark-im-messages-reply.mdskills/lark-im/references/lark-im-messages-send.md
| func scanInlineCodeSpans(line string) [][2]int { | ||
| var spans [][2]int | ||
| i := 0 | ||
| for i < len(line) { | ||
| if line[i] != '`' { | ||
| i++ | ||
| continue | ||
| } | ||
| start := i | ||
| for i < len(line) && line[i] == '`' { | ||
| i++ | ||
| } | ||
| delim := line[start:i] | ||
| j := i | ||
| for j <= len(line)-len(delim) { | ||
| if line[j] == '`' { | ||
| k := j | ||
| for k < len(line) && line[k] == '`' { | ||
| k++ | ||
| } | ||
| if k-j == len(delim) { | ||
| spans = append(spans, [2]int{start, k}) | ||
| i = k | ||
| break | ||
| } | ||
| j = k | ||
| } else { | ||
| j++ | ||
| } | ||
| } | ||
| } | ||
| return spans | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
rg -nP --type=go -C2 'scanInlineCodeSpans|protectMarkdownInlineCode' shortcuts/im/Repository: larksuite/cli
Length of output: 1352
🏁 Script executed:
fd -e _test.go shortcuts/im/ | head -20Repository: larksuite/cli
Length of output: 415
🏁 Script executed:
rg -nP --type=go 'scanInlineCodeSpans|protectMarkdownInlineCode|unmatched.*backtick|backtick.*unmatched' shortcuts/im/ -A 5 -B 5Repository: larksuite/cli
Length of output: 2493
🏁 Script executed:
fd --full-path '.*helpers.*test.*\.go' shortcuts/im/Repository: larksuite/cli
Length of output: 205
🏁 Script executed:
rg -nP --type=go 'func Test.*' shortcuts/im/ | grep -i 'inline\|backtick\|code'Repository: larksuite/cli
Length of output: 39
🏁 Script executed:
cat -n shortcuts/im/helpers_test.go | head -150Repository: larksuite/cli
Length of output: 6334
🏁 Script executed:
rg -n 'protectMarkdownInlineCode|scanInlineCodeSpans' shortcuts/im/helpers_test.go -A 20Repository: larksuite/cli
Length of output: 39
🏁 Script executed:
wc -l shortcuts/im/helpers_test.goRepository: larksuite/cli
Length of output: 90
🏁 Script executed:
rg -n 'func Test.*' shortcuts/im/helpers_test.go | head -50Repository: larksuite/cli
Length of output: 1671
🏁 Script executed:
rg -n 'normalizeMarkdownEmphasisSpacing|protectMarkdownInlineCode|scanInlineCodeSpans|unmatched.*backtick' shortcuts/im/helpers_test.go -A 10 -B 2Repository: larksuite/cli
Length of output: 39
🏁 Script executed:
sed -n '319,375p' shortcuts/im/helpers_test.goRepository: larksuite/cli
Length of output: 1475
🏁 Script executed:
rg -n 'backtick|inline.*code|\`' shortcuts/im/helpers_test.go | head -30Repository: larksuite/cli
Length of output: 170
🏁 Script executed:
sed -n '906,940p' shortcuts/im/helpers.goRepository: larksuite/cli
Length of output: 738
🏁 Script executed:
rg -n 'backtick|inline' shortcuts/im/helpers_test.goRepository: larksuite/cli
Length of output: 210
🏁 Script executed:
grep -n 'Test.*[Mm]arkdown\|Test.*[Ee]mphasis\|Test.*[Cc]ode' shortcuts/im/helpers_test.goRepository: larksuite/cli
Length of output: 575
🏁 Script executed:
sed -n '538,565p' shortcuts/im/helpers_test.goRepository: larksuite/cli
Length of output: 1180
🏁 Script executed:
sed -n '506,560p' shortcuts/im/helpers_test.goRepository: larksuite/cli
Length of output: 2282
🏁 Script executed:
rg -n 'unmatched' shortcuts/im/helpers_test.goRepository: larksuite/cli
Length of output: 39
🏁 Script executed:
sed -n '822,850p' shortcuts/im/helpers.goRepository: larksuite/cli
Length of output: 942
🏁 Script executed:
cat -n shortcuts/im/helpers.go | sed -n '908,940p'Repository: larksuite/cli
Length of output: 812
🏁 Script executed:
python3 << 'EOF'
# Simulate scanInlineCodeSpans behavior with unmatched backticks
line = "hello ` world" # unmatched opening backtick
spans = []
i = 0
while i < len(line):
if line[i] != '`':
i += 1
continue
start = i
while i < len(line) and line[i] == '`':
i += 1
delim = line[start:i]
j = i
found = False
while j <= len(line) - len(delim):
if line[j] == '`':
k = j
while k < len(line) and line[k] == '`':
k += 1
if k - j == len(delim):
spans.append((start, k))
i = k
found = True
break
j = k
else:
j += 1
if not found:
print(f"No match found for delimiter at {start}: {repr(delim)}")
print(f"Loop exited with i={i}, continuing outer loop")
print(f"Final spans for '{line}': {spans}")
print()
# Test case: unmatched at start, then balanced backticks later
line2 = "start ` middle `code` end"
spans2 = []
i = 0
while i < len(line2):
if line2[i] != '`':
i += 1
continue
start = i
while i < len(line2) and line2[i] == '`':
i += 1
delim = line2[start:i]
j = i
found = False
while j <= len(line2) - len(delim):
if line2[j] == '`':
k = j
while k < len(line2) and line2[k] == '`':
k += 1
if k - j == len(delim):
spans2.append((start, k))
i = k
found = True
break
j = k
else:
j += 1
if not found:
print(f"No match found for delimiter at {start}: {repr(delim)}")
print(f"Continuing outer loop from i={i}")
print(f"Final spans for '{line2}': {spans2}")
EOFRepository: larksuite/cli
Length of output: 304
Add a test case for unmatched backticks in inline code scanning.
The scan logic correctly handles unmatched opening backticks: when no closing run is found, i remains at the post-run position, allowing the outer loop to continue and subsequent backticks on the same line to form spans independently. This fallback behavior matches CommonMark's intent of not requiring the first backtick run to be the code opener.
However, the existing test suite for wrapMarkdownAsPost covers only balanced delimiters (`code` and ``nested``). Add a test case with an unmatched backtick followed by a balanced pair (e.g., "text middle code end"`) to document this behavior and ensure the emphasis normalization skips over it correctly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@shortcuts/im/helpers.go` around lines 908 - 940, Add a unit test to document
and assert the behavior of scanInlineCodeSpans/wrapMarkdownAsPost when an
unmatched backtick run appears before a balanced code span: create a test (e.g.,
TestWrapMarkdownAsPost_UnmatchedBacktickThenBalanced) that feeds a string like
"text ` middle `code` end" into wrapMarkdownAsPost (or directly into
scanInlineCodeSpans) and assert that the unmatched single backtick does not form
a span and the subsequent "`code`" is detected as a span (and that emphasis
normalization skips the unmatched run); locate code around scanInlineCodeSpans
and wrapMarkdownAsPost to add the test so it fails if the scanner incorrectly
treats the unmatched backtick as a code opener.
| func normalizeEmphasisSpacingSegment(seg string) string { | ||
| if !strings.Contains(seg, "*") { | ||
| return seg | ||
| } | ||
|
|
||
| var sb strings.Builder | ||
| pos := 0 | ||
| for pos < len(seg) { | ||
| openStart, openEnd, ok := nextAsteriskRun(seg, pos) | ||
| if !ok { | ||
| sb.WriteString(seg[pos:]) | ||
| break | ||
| } | ||
|
|
||
| sb.WriteString(seg[pos:openStart]) | ||
|
|
||
| markerLen := openEnd - openStart | ||
| if markerLen != 1 && markerLen != 2 && markerLen != 3 { | ||
| sb.WriteString(seg[openStart:openEnd]) | ||
| pos = openEnd | ||
| continue | ||
| } | ||
|
|
||
| closeStart, closeEnd, ok := nextAsteriskRun(seg, openEnd) | ||
| if !ok || closeEnd-closeStart != markerLen { | ||
| sb.WriteString(seg[openStart:openEnd]) | ||
| pos = openEnd | ||
| continue | ||
| } | ||
| if !hasSimpleEmphasisBoundaries(seg, openStart, closeEnd) { | ||
| sb.WriteString(seg[openStart:closeEnd]) | ||
| pos = closeEnd | ||
| continue | ||
| } | ||
|
|
||
| payload := seg[openEnd:closeStart] | ||
| normalized, shouldNormalize := normalizeEmphasisPayload(payload) | ||
| if !shouldNormalize { | ||
| sb.WriteString(seg[openStart:closeEnd]) | ||
| pos = closeEnd | ||
| continue | ||
| } | ||
|
|
||
| marker := seg[openStart:openEnd] | ||
| sb.WriteString(marker) | ||
| sb.WriteString(normalized) | ||
| sb.WriteString(marker) | ||
| pos = closeEnd | ||
| } | ||
| return sb.String() | ||
| } |
There was a problem hiding this comment.
Edge case: list items containing a stray * get rewritten as emphasis.
normalizeEmphasisSpacingSegment doesn't distinguish a leading * that is a list-item marker from an opening emphasis flanker. For lines that start with * and have any later * whose closing boundary is non-word-like, the leading list marker is consumed as an opener and the line is rewritten.
Examples (single line each):
* Use the * operator to multiply→*Use the* operator to multiply(list item lost)* one item * here→*one item* here
Bold/triple-asterisk variants inside such list items still work fine because the marker-length check filters them out — the issue is specific to lines that begin with * + space and contain another single *.
A simple guard would be to skip normalization when openStart == 0 (or when the byte before openStart is \n) AND the opener is followed by whitespace (CommonMark list-item shape ^\* ). The same pattern is what the existing test preserves embedded non emphatic spaced asterisks already relies on — the boundary check just needs the symmetric “opener immediately followed by whitespace” guard.
🛡️ Sketch of guard (add to `hasSimpleEmphasisBoundaries` or its caller)
// Reject list-item-style openers: line-start `*` followed by whitespace.
isLineStart := openStart == 0 || s[openStart-1] == '\n'
if isLineStart && openEnd-openStart == 1 && openEnd < len(s) {
if r, _ := utf8.DecodeRuneInString(s[openEnd:]); unicode.IsSpace(r) {
return false
}
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@shortcuts/im/helpers.go` around lines 942 - 992, The function
normalizeEmphasisSpacingSegment incorrectly treats a leading list-item `* ` as
an emphasis opener; to fix, add a guard (either inside
hasSimpleEmphasisBoundaries or immediately after discovering an opener in
normalizeEmphasisSpacingSegment) that returns false / skips normalization when
openStart is at a line start (openStart == 0 or the byte before openStart is
'\n'), the marker length is 1 (openEnd-openStart == 1) and the byte/rune
immediately after openEnd is whitespace; update hasSimpleEmphasisBoundaries or
the caller logic around the nextAsteriskRun/openStart/openEnd check so
list-item-style `* ` openers are not treated as emphasis openers.
Summary
Changes
Test Plan
lark xxxcommand works as expectedRelated Issues
Summary by CodeRabbit
New Features
** text **→**text**,* italic *→*italic*).Documentation