feat(doc): add pre-write semantic warnings to docs +update #569

Merged
fangshuyu-768 merged 3 commits into main from feat/docs-update-semantic-check
Apr 21, 2026

@herbertliu herbertliu commented Apr 20, 2026

Summary

Add two static semantic checks to docs +update that run before the MCP
update-doc call. They inform the user about CLI/MCP contract edges that
commonly cause confusing round-trip results, without blocking execution.

Changes

  • shortcuts/doc/docs_update_check.go (new): docsUpdateWarnings(mode, markdown) returns human-readable warnings based on:
    1. replace_* + blank-line markdown: replace_range / replace_all only swap text inside an existing block, so \n\n renders as literal text, not a paragraph break. Hint: use delete_range + insert_before.
    2. Combined bold+italic emphases (***text***, **_text_**, _**text**_) — Lark stores only one of the two emphases; these patterns silently downgrade. Hint: split into two separate emphases.
  • shortcuts/doc/docs_update.go: wire the checks into Execute before CallMCPTool, writing each warning to stderr prefixed with warning:.
  • shortcuts/doc/docs_update_check_test.go (new): table-driven tests for both checks plus aggregation and empty-input cases.

Behavior

  • Warnings only — the update still proceeds after printing them.
  • Not emitted in dry-run mode (kept quiet during planning).
  • replace_*-only: the blank-line check does NOT fire on insert_* / append / overwrite where multi-paragraph markdown is valid.

Test Plan

  • go test ./shortcuts/doc/... passes
  • go vet ./shortcuts/... clean
  • gofmt -l . clean
  • golangci-lint run --new-from-rev=origin/main shows 0 issues
  • Manual: sample invocations with replace_range + \n\n, with ***text***, and both together each emit the expected warning lines

Related

First batch of the AI-Agent pitfall review (Cases 1 and 5). Case 2
(heading-type warning) is intentionally deferred — it requires an extra
block fetch and should be a separate PR with a --strict / verbose
opt-in, so it doesn't add latency to every docs +update.

Summary by CodeRabbit

  • New Features

    • Pre-submit validation for documentation updates now warns about potential formatting issues (multiline paragraph breaks in replace modes and combined bold+italic patterns).
    • Warnings are printed to the console prefixed with "warning:" before running the update, and ignore fenced code and inline code to reduce false positives.
  • Tests

    • Added unit and end-to-end tests covering the new semantic warnings and verifying that dry-run executions suppress those warnings.

Two static checks run before the MCP update-doc call:

1. replace_* + blank-line markdown: replace_range / replace_all only
   swap text inside an existing block — a \n\n in the payload will
   render as literal text, not a paragraph break. Hint to use
   delete_range + insert_before instead.

2. Combined bold+italic emphases (***text***, **_text_**, _**text**_)
   cannot round-trip through Lark and are silently downgraded to a
   single emphasis. Hint to split into two separate emphases.

Both warnings go to stderr and never block the update — they inform,
not gate. Adds table-driven tests for each check plus an aggregation
test, and wires the checks into Execute right before CallMCPTool.

Closes the first batch of items from the docs +update pitfalls
review (Cases 1 and 5).
@github-actions github-actions Bot added the domain/ccm (PR touches the ccm domain) and size/M (Single-domain feat or fix with limited business impact) labels Apr 20, 2026

coderabbitai Bot commented Apr 20, 2026

📝 Walkthrough

Performs pre-request semantic validation for docs +update by computing mode and markdown, invoking docsUpdateWarnings(mode, markdown), printing any warnings to stderr, and passing the precomputed values into the MCP invocation.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Execution integration: shortcuts/doc/docs_update.go | Precompute mode and markdown, call docsUpdateWarnings(mode, markdown), print each warning to runtime.IO().ErrOut prefixed with warning:, and pass precomputed values into MCP args (conditionally include markdown). |
| Validation logic: shortcuts/doc/docs_update_check.go | New validator docsUpdateWarnings with checkDocsUpdateReplaceMultilineMarkdown and checkDocsUpdateBoldItalic; helpers to strip/mask fenced code regions and inline code spans, normalize CRLF, and ignore escaped emphasis. Detects paragraph breaks for replace modes and combined bold+italic patterns. |
| Unit tests: shortcuts/doc/docs_update_check_test.go | Table-driven tests covering replace-mode blank-line detection (CRLF, fence handling, indentation edge cases), emphasis-pattern detection (multiple combined forms, escaped markers, ignoring code regions), aggregation and empty cases. |
| E2E test: tests/cli_e2e/docs/docs_update_dryrun_test.go | Adds dry-run E2E asserting that semantic warnings are suppressed during --dry-run execution and that command exits successfully. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI as "DocsUpdate.Execute"
    participant Validator as "docsUpdateWarnings"
    participant MCP as "MCP tool"
    participant Stderr as "runtime.IO().ErrOut"

    User->>CLI: run `docs +update` (mode, markdown)
    CLI->>Validator: docsUpdateWarnings(mode, markdown)
    Validator-->>CLI: []warnings
    alt warnings found
        CLI->>Stderr: print "warning: ..." for each
    end
    CLI->>MCP: invoke with precomputed args (mode[, markdown])
    MCP-->>CLI: response
    CLI-->>User: result

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

size/L

Suggested reviewers

  • liujinkun2025

Poem

🐰 I nibble through markdown, soft and bright,
Finding blank breaks and tangled bolds at night,
I thump a warning, gentle and clear,
So updates hop safely, far and near,
Hooray — I guard each doc with delight.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding semantic warnings to the docs +update command. |
| Description check | ✅ Passed | The description includes all required sections with complete details: a clear summary explaining the motivation, a comprehensive list of changes with file names and implementation details, thorough test plan with checkmarks, and related issue context. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

codecov Bot commented Apr 20, 2026

Codecov Report

❌ Patch coverage is 89.43089% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 60.74%. Comparing base (9acd121) to head (8bec6eb).
⚠️ Report is 9 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| shortcuts/doc/docs_update.go | 0.00% | 7 Missing ⚠️ |
| shortcuts/doc/docs_update_check.go | 94.82% | 4 Missing and 2 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #569      +/-   ##
==========================================
+ Coverage   60.19%   60.74%   +0.55%     
==========================================
  Files         390      394       +4     
  Lines       33433    33910     +477     
==========================================
+ Hits        20125    20600     +475     
+ Misses      11426    11400      -26     
- Partials     1882     1910      +28     


@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
shortcuts/doc/docs_update.go (1)

125-125: Reuse the local markdown variable.

You already captured markdown := runtime.Str("markdown") at the top of Execute; re-reading it here is harmless but inconsistent with the rest of the function (lines 105–108 were updated to use mode/markdown locals).

Proposed tweak
-		normalizeDocsUpdateResult(result, runtime.Str("markdown"))
+		normalizeDocsUpdateResult(result, markdown)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/doc/docs_update.go` at line 125, In Execute, instead of calling
runtime.Str("markdown") inline, reuse the already-captured local variable
markdown (declared as markdown := runtime.Str("markdown")) when invoking
normalizeDocsUpdateResult; update the call normalizeDocsUpdateResult(result,
runtime.Str("markdown")) to normalizeDocsUpdateResult(result, markdown) so it is
consistent with other uses of mode/markdown within Execute and avoids redundant
runtime lookups.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 86c38d34-1637-4b04-8e2b-cdff402a350f

📥 Commits

Reviewing files that changed from the base of the PR and between 9acd121 and 03b9732.

📒 Files selected for processing (3)
  • shortcuts/doc/docs_update.go
  • shortcuts/doc/docs_update_check.go
  • shortcuts/doc/docs_update_check_test.go

github-actions Bot commented Apr 20, 2026

🚀 PR Preview Install Guide

🧰 CLI update

npm i -g https://pkg.pr.new/larksuite/cli/@larksuite/cli@8bec6eb85ce459ca9dc19c718edefabad020d191

🧩 Skill update

npx skills add larksuite/cli#feat/docs-update-semantic-check -y -g

…checks (#578)

* fix(doc): exclude code regions and escaped markers from docs +update checks

Addresses the three review comments on #569: the blank-line paragraph
check and the bold+italic emphasis check both operate on the raw
markdown string, so fenced code blocks / inline code spans / literal
escaped markers produce false-positive warnings on content users
expect to pass through verbatim.

Changes:

- Add proseHasBlankLine(): fence-aware detector that returns true only
  when a blank line sits outside of ```...``` or ~~~...~~~ regions.
  Replaces the raw strings.Contains("\n\n") check in
  checkDocsUpdateReplaceMultilineMarkdown.

- Add stripMarkdownCodeRegions(): blanks out fenced code lines and
  masks inline code spans (via scanInlineCodeSpans from markdown_fix.go)
  with equal-length whitespace so byte offsets outside the stripped
  regions are preserved.

- Add stripEscapedEmphasisMarkers(): removes "\*" and "\_" so literal
  sequences like "\***text***" — which CommonMark renders as a literal
  asterisk plus bold — don't match the combined bold+italic regex.

- Wire both helpers into checkDocsUpdateBoldItalic(): the regex now runs
  on stripEscapedEmphasisMarkers(stripMarkdownCodeRegions(markdown)),
  so code samples and escaped markers are sanitized away before
  detection.
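
The "equal-length whitespace" masking can be illustrated with a small sketch. `maskInlineCode` is a hypothetical helper written for this example; it does not use scanInlineCodeSpans (whose signature is not shown here) and handles only inline spans, not fenced blocks.

```go
package main

import "fmt"

// maskInlineCode replaces every inline code span (delimiters included) with
// spaces of the same byte length, so offsets outside the spans do not move.
// A span is a backtick run closed by a later run of exactly the same length.
func maskInlineCode(s string) string {
	out := []byte(s)
	i := 0
	for i < len(s) {
		if s[i] != '`' {
			i++
			continue
		}
		open := i
		for i < len(s) && s[i] == '`' {
			i++
		}
		runLen := i - open

		// Search for a closing run with the same number of backticks.
		closeEnd := -1
		j := i
		for j < len(s) {
			if s[j] != '`' {
				j++
				continue
			}
			start := j
			for j < len(s) && s[j] == '`' {
				j++
			}
			if j-start == runLen {
				closeEnd = j
				break
			}
		}
		if closeEnd < 0 {
			continue // unmatched run stays literal; i is already past it
		}
		for k := open; k < closeEnd; k++ {
			out[k] = ' '
		}
		i = closeEnd
	}
	return string(out)
}

func main() {
	in := "use `***x***` here, but ***this*** is prose"
	fmt.Printf("%q\n", maskInlineCode(in))
}
```

Keeping the byte length constant means a later regex match on the masked string still reports positions that are valid in the original input.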

Shared fence-parsing helpers (codeFenceOpenMarker, isCodeFenceClose,
leadingRun) are kept local to this file to avoid touching files outside
the scope of the reviewed PR. If a future change wants to reuse them
across the doc package, they can be promoted then.

Tests:

- TestCheckDocsUpdateReplaceMultilineMarkdown: add 4 negative/positive
  cases — blank line inside backtick and tilde fences (no flag), blank
  line in prose while fence also has blanks (flag wins), fenced code
  with no blank lines (no flag).

- TestCheckDocsUpdateBoldItalic: add 9 cases — ***text*** / **_text_** /
  _**text**_ inside fenced code (backtick and tilde), inside inline
  code spans, and escaped \***text*** / \*\*_text_\*\* (none flagged);
  plus two positive cases to verify the strip doesn't over-sanitize
  (real emphasis in prose still fires when inline/fenced code is nearby).

* fix(doc): close CommonMark gaps and add three more combined-emphasis shapes

Self-review of the first commit turned up three issues:

- isCodeFenceClose was strict on exact marker length. Per CommonMark
  §4.5, a closing fence must be at least as long as the opener, not
  exactly the same length. A 3-backtick open legitimately closed by a
  4-backtick closer (used to embed triple-backticks inside the code
  sample) was left open-ended, causing the rest of the document to be
  treated as code and both checks to silently skip it.

- Both fence helpers accepted any amount of leading whitespace because
  they ran on strings.TrimSpace(line). CommonMark allows 0..3 leading
  spaces before a fence marker; 4+ spaces (or any tab in leading
  position, which expands to 4 columns) makes the line indented code
  block content, not a fence open/close. Indented fence-like lines now
  correctly remain prose and blank lines around them are detected.

- The bold/italic check only covered three of the six documented
  combined-emphasis shapes. Added ___text___, __*text*__, and
  *__text__* so parity with the asterisk variants is complete. The
  regex set is now table-driven (combinedEmphasisPatterns) to make
  adding future shapes a one-line change.

Implementation changes:

- New fenceIndentOK(line) helper: returns (body, true) for 0..3 leading
  spaces with no tabs, else (_, false). Used by both codeFenceOpenMarker
  and isCodeFenceClose.
- isCodeFenceClose now counts the fence-char run and accepts any run
  length >= len(marker), with trailing whitespace only.
- checkDocsUpdateBoldItalic replaced three named var regexes with a
  table of six {shape, re} entries and a single early-exit loop.
- Updated docsUpdateWarnings top docstring to list all six shapes.
- Noted the known limitation of stripEscapedEmphasisMarkers around
  doubled backslash escapes ("\\***text***"), which is a false negative
  we accept in exchange for keeping this a simple string replace.
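
One way to picture the table-driven detection is the sketch below. The type and variable names (`emphasisShape`, `combinedShapes`, `firstCombinedEmphasis`) and the regexes themselves are assumptions made for illustration; code-region masking is omitted, and the doubled-backslash limitation noted above applies here as well.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// emphasisShape pairs a human-readable shape label with its detector.
type emphasisShape struct {
	shape string
	re    *regexp.Regexp
}

// combinedShapes lists the six combined bold+italic shapes. Adding another
// shape is a one-line change. The patterns are illustrative guesses.
var combinedShapes = []emphasisShape{
	{"***text***", regexp.MustCompile(`\*\*\*[^*\n]+\*\*\*`)},
	{"**_text_**", regexp.MustCompile(`\*\*_[^_\n]+_\*\*`)},
	{"_**text**_", regexp.MustCompile(`_\*\*[^*\n]+\*\*_`)},
	{"___text___", regexp.MustCompile(`___[^_\n]+___`)},
	{"__*text*__", regexp.MustCompile(`__\*[^*\n]+\*__`)},
	{"*__text__*", regexp.MustCompile(`\*__[^_\n]+__\*`)},
}

// escapedMarkers drops backslash-escaped asterisks and underscores so that
// literal markers cannot take part in a match.
var escapedMarkers = strings.NewReplacer(`\*`, "", `\_`, "")

// firstCombinedEmphasis returns the label of the first shape found in the
// markdown, or "" when none matches. The loop exits on the first hit.
func firstCombinedEmphasis(markdown string) string {
	cleaned := escapedMarkers.Replace(markdown)
	for _, s := range combinedShapes {
		if s.re.MatchString(cleaned) {
			return s.shape
		}
	}
	return ""
}

func main() {
	inputs := []string{
		"***both***",
		"__*both*__",
		`\***literal star plus bold***`,
		"**bold** and _italic_",
	}
	for _, in := range inputs {
		fmt.Printf("%q -> %q\n", in, firstCombinedEmphasis(in))
	}
}
```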

Test additions (docs_update_check_test.go):

- Fence close: longer-marker close correctly ends fence; real prose
  blank after a longer-close fence is still detected.
- Indentation: 4-space indented fence-like line is not a fence open,
  so a surrounding blank line still flags; tab-indented variant same;
  3-space indented fence is still a real fence.
- New shapes: ___text___ positive + all three negative-guards (fenced
  code, inline code, escaped); __*text*__ and *__text__* positive +
  fenced/inline negative-guards; plus two composition tests to ensure
  the strip does not over-sanitize across the six-regex alternative set.

All 53 sub-tests in this file pass; go vet and gofmt are clean.

---------

Co-authored-by: fangshuyu-768 <shuyufang768@outlook.com>

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@shortcuts/doc/docs_update_check_test.go`:
- Around line 150-151: The test currently only checks for "delete_range" in the
remediation hint (in the table-driven test where tt.wantHint and got are used),
which could miss missing "insert_before"; update the assertion to validate the
complete remediation text—either assert strings.Contains(got, "delete_range") &&
strings.Contains(got, "insert_before") or compare got to the exact expected hint
string so the test fails if either half is omitted. Ensure you change the
t.Errorf message to reflect the full expected remediation.
- Around line 357-375: Add a new E2E dry-run test that invokes the CLI's docs
+update with a warning-causing mode (e.g., "replace_range") and the --dry-run
flag, then assert that the output does NOT contain the warnings produced by
docsUpdateWarnings; specifically, create a test under the CLI E2E tests for docs
that runs the command (using the existing E2E test harness/runner used by other
CLI tests), passes input that would ordinarily trigger
docsUpdateWarnings("***opening***\n\n...") and verifies the stdout/stderr
contains no warning strings, ensuring the helper's warnings are suppressed in
dry-run mode.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5c9c33ee-a97e-4f6b-891d-ede361bda88b

📥 Commits

Reviewing files that changed from the base of the PR and between 03b9732 and 199aadb.

📒 Files selected for processing (2)
  • shortcuts/doc/docs_update_check.go
  • shortcuts/doc/docs_update_check_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • shortcuts/doc/docs_update_check.go

Two CodeRabbit nits from #569:

1. Unit test hint assertion only checked for `delete_range` in the
   remediation message; the companion `insert_before` half of the
   guidance could regress undetected. Broaden the assertion to require
   both tokens so a future edit that drops half the remediation
   produces an immediate test failure.

2. No E2E coverage proved the dry-run contract in the PR description
   ("Not emitted in dry-run mode — kept quiet during planning"). The
   helper itself is unit-tested, but nothing caught a regression where
   a later refactor wired docsUpdateWarnings into the DryRun path.

   Add tests/cli_e2e/docs/docs_update_dryrun_test.go:
   TestDocs_UpdateDryRunSuppressesSemanticWarnings invokes
   `docs +update --dry-run --mode=replace_range --markdown "***x***\n\nb"`
   — an input crafted to trip BOTH pre-write warnings — and asserts
   neither the "warning:" prefix, the blank-line message, nor the
   combined-emphasis message appears on stdout or stderr.

   Note: the file needs -f to add because .gitignore has a bare
   `docs/` rule that accidentally matches tests/cli_e2e/docs/. The
   existing tracked files under that directory predate the rule; new
   additions have to be force-added until the ignore pattern is
   narrowed. Not worth rewriting .gitignore for one file.

Verified manually that the new E2E fails cleanly when warnings are
injected into DryRun and passes again after reverting — the test has
real regression-detection power, not just a sticker.
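
The shape of the dry-run assertion can be sketched as a small helper. `leakedMarkers` is a hypothetical name, and the marker strings in `main` are placeholders rather than the real warning messages; the actual E2E test runs the CLI through the project's own harness, which is not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// leakedMarkers returns every forbidden substring that shows up in either
// output stream. An empty result means the dry-run stayed quiet.
func leakedMarkers(stdout, stderr string, forbidden []string) []string {
	var leaked []string
	for _, f := range forbidden {
		if strings.Contains(stdout, f) || strings.Contains(stderr, f) {
			leaked = append(leaked, f)
		}
	}
	return leaked
}

func main() {
	// Placeholder markers: the real test checks the actual message text.
	forbidden := []string{"warning:", "blank line", "bold+italic"}
	fmt.Println(leakedMarkers("dry-run plan printed", "", forbidden))
	fmt.Println(leakedMarkers("", "warning: blank line in payload", forbidden))
}
```

Checking both streams matters because a regression could route the warnings to either one.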

Co-authored-by: fangshuyu-768 <shuyufang768@outlook.com>

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
shortcuts/doc/docs_update_check_test.go (1)

357-365: Optionally assert per-warning content, not just the count.

TestDocsUpdateWarningsAggregates only verifies len(warnings) == 2, so a future regression that emitted two copies of the same warning (e.g., both from the emphasis check) would still pass. Consider asserting each expected warning body appears once, mirroring the tokens the individual checks already produce (delete_range / insert_before for the replace hint, combined bold+italic for the emphasis hint).

♻️ Suggested tightening
 	warnings := docsUpdateWarnings("replace_range", "***opening***\n\nsecond paragraph")
 	if len(warnings) != 2 {
 		t.Fatalf("expected 2 warnings, got %d: %v", len(warnings), warnings)
 	}
+	joined := strings.Join(warnings, "\n")
+	for _, needle := range []string{"delete_range", "insert_before", "combined bold+italic"} {
+		if !strings.Contains(joined, needle) {
+			t.Errorf("expected aggregated warnings to contain %q, got: %v", needle, warnings)
+		}
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/doc/docs_update_check_test.go` around lines 357 - 365,
TestDocsUpdateWarningsAggregates only checks len(warnings) but should also
assert each distinct warning message is present once to prevent duplicate/silent
regressions; update the test that calls docsUpdateWarnings("replace_range",
"***opening***\n\nsecond paragraph") to assert that one warning contains the
replace hint tokens (e.g., references to "delete_range" and/or "insert_before")
and one contains the emphasis hint (e.g., "combined bold+italic"), verifying
each expected warning body appears exactly once in the warnings slice instead of
only checking the count.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e0eca455-04fb-4495-aa0e-f31ca9435473

📥 Commits

Reviewing files that changed from the base of the PR and between 199aadb and 8bec6eb.

📒 Files selected for processing (2)
  • shortcuts/doc/docs_update_check_test.go
  • tests/cli_e2e/docs/docs_update_dryrun_test.go
