feat(doc): normalize exported sheet and video tags by HomyeeKing · Pull Request #672 · larksuite/cli

HomyeeKing · 2026-04-27T06:16:42Z

变更说明

将导出的电子表格标签从 sheet-id 规范化为 id
将导出的视频结构从 <figure><source .../></figure> 规范化为 <video> 标签
为真实的 docs_ai 导出结构补充对应测试

导出格式变更对比

电子表格

场景	导出形式
PR 修改前	`<sheet sheet-id="jkxrFs" token="FhJTsZVIihsBEStE5RVcr6Fbnrf"></sheet>`
PR 修改后	`<sheet id="jkxrFs" token="FhJTsZVIihsBEStE5RVcr6Fbnrf"></sheet>`

视频

场景	导出形式
PR 修改前	`<figure view-type="Preview"><source href="..." mime="video/quicktime" origin-height="1892.000000" origin-width="3016.000000" token="RXwJbGRG9orDqhxuYKvch0TqnBf"/></figure>`
PR 修改后	`<video controls src="feishu://media/RXwJbGRG9orDqhxuYKvch0TqnBf" data-mime="video/quicktime" data-view-type="Preview"></video>`

测试

go test ./shortcuts/doc

coderabbitai · 2026-04-27T06:16:57Z

📝 Walkthrough

Walkthrough

This PR introduces a bidirectional content normalization layer for Feishu document operations. It adds a compatibility transformation system that converts Feishu-specific video markup to file tokens on input (create/update operations) and reverses this conversion on output (fetch operations) to maintain format compatibility across different doc formats.

Changes

Cohort / File(s)	Summary
Content Normalization Layer `shortcuts/doc/docs_content_compat.go`, `shortcuts/doc/docs_content_compat_test.go`	Implements `normalizeDocInputContent` function that rewrites `<video>` tags with `feishu://media/` sources into self-closing `<file>` elements with extracted `token`, `name`, and `view-type` attributes for xml/markdown formats; includes comprehensive test coverage for format-specific behavior and attribute preservation.
Create Operation Integration `shortcuts/doc/docs_create_v2.go`, `shortcuts/doc/docs_create_test.go`	Modifies `buildCreateBody` to capture `doc-format` and pass it through `normalizeDocInputContent` before sending content payload; adds test validating video markup normalization during creation.
Fetch Operation Post-Processing `shortcuts/doc/docs_fetch_v2.go`, `shortcuts/doc/docs_fetch_v2_test.go`	Adds `normalizeFetchedDocumentContent` function to reverse normalization on fetched content: converts `<file>` and `<figure><source>` elements back to `<video>` tags with proper attributes, and rewrites `<sheet sheet-id=...>` to `<sheet id=...>`; includes extensive test coverage for video/sheet tag conversions.
Update Operation Integration `shortcuts/doc/docs_update_v2.go`, `shortcuts/doc/docs_update_test.go`	Modifies `buildUpdateBody` to capture `doc-format` and normalize content payload via `normalizeDocInputContent` before transmission; adds test validating content transformation during updates.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

feat(doc): add v2 API for docs +create / +fetch / +update #638: Directly related as it modifies the same docs v2 flow with normalization of video/file/sheet tags and integration points in create/update/fetch operations.

Suggested labels

size/M, domain/ccm

Suggested reviewers

fangshuyu-768
liangshuo-1

Poem

🐰 Through formats wide, a rabbit hops with glee,
Converting videos and sheets with parsing spree!
From feishu tokens, tags transform and dance,
Normalizing up and down—a document's sweet chance! 📹✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 16.13% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check	⚠️ Warning	The PR description is provided in Chinese but lacks the required English sections (Summary, Changes, Test Plan) from the template and misses key structural information.	Provide a complete description in the required template format with English sections including a brief summary, list of main changes, test plan with checkboxes, and related issues.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically summarizes the main change: normalizing exported sheet and video tags, which aligns with the core functionality across the changeset.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

CLAassistant · 2026-04-27T06:17:04Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

coderabbitai

🧹 Nitpick comments (3)

shortcuts/doc/docs_fetch_v2.go (2)
83-87: Order coupling: <file> regex requires token before name.

fetchedVideoFileTagRe only matches <file> tags where token="…" appears before name="…". That happens to match what normalizeInputVideoTags in shortcuts/doc/docs_content_compat.go emits (line 42-49), and the test fixtures all use that order. If the docs_ai export ever serializes attributes in a different order, the file→video rewrite will silently no-op (the tag passes through unchanged) rather than fail — which would be a hard-to-spot regression. Two low-cost options:

Decouple the regex (match <file …/>) and pull token/name via fetchedAttrValue, mirroring how view-type is already extracted at line 150.

Or add a regression test asserting that <file name="…" token="…"/> is also rewritten, so the coupling is at least pinned.

Not blocking since current callers are consistent.
♻️ Sketch: attribute-order-independent file regex
-	fetchedVideoFileTagRe   = regexp.MustCompile(`<file\s+([^>]*\btoken="([^"]+)"[^>]*\bname="([^"]+)"[^>]*)/>`)
+	fetchedVideoFileTagRe   = regexp.MustCompile(`<file\s+[^>]*/>`)
Then in normalizeFetchedVideoTags extract via fetchedAttrValue(tag, "token") / fetchedAttrValue(tag, "name") and early-return when either is empty.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/doc/docs_fetch_v2.go` around lines 83 - 87, The <file> tag regex
fetchedVideoFileTagRe is order-coupled (requires token before name) and can miss
tags with attributes in different orders; update normalizeFetchedVideoTags to
use a relaxed regex for fetchedVideoFileTagRe that simply matches <file .../>
(mirror how view-type is handled) and then extract token and name using
fetchedAttrValue(tag, "token") and fetchedAttrValue(tag, "name"), returning
early if either is empty; alternatively add a regression test ensuring a tag
like <file name="..." token="..."/> is rewritten to pin the current behavior.
185-203: Optional: consider a single regex over the suffix list.

Six strings.HasSuffix branches read fine, but a single switch over filepath.Ext(name) would compress this and make the supported-extensions list scannable. Purely a style/readability nit.
♻️ Suggested simplification
 func isFetchedVideoFilename(name string) bool {
-	name = strings.ToLower(strings.TrimSpace(name))
-	switch {
-	case strings.HasSuffix(name, ".mp4"):
-		return true
-	case strings.HasSuffix(name, ".mov"):
-		return true
-	case strings.HasSuffix(name, ".m4v"):
-		return true
-	case strings.HasSuffix(name, ".webm"):
-		return true
-	case strings.HasSuffix(name, ".avi"):
-		return true
-	case strings.HasSuffix(name, ".mkv"):
-		return true
-	default:
-		return false
-	}
+	switch strings.ToLower(filepath.Ext(strings.TrimSpace(name))) {
+	case ".mp4", ".mov", ".m4v", ".webm", ".avi", ".mkv":
+		return true
+	default:
+		return false
+	}
 }
Requires adding "path/filepath" to imports.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/doc/docs_fetch_v2.go` around lines 185 - 203, The
isFetchedVideoFilename function currently checks many suffixes with repeated
strings.HasSuffix calls; replace that with a single extraction of the file
extension (use filepath.Ext(name) after trimming/ToLower) and switch or map over
the extension (e.g., case ".mp4", ".mov", etc.) to return true for supported
extensions; update imports to include "path/filepath" and keep the existing
normalization (strings.TrimSpace/ToLower) before calling filepath.Ext.
shortcuts/doc/docs_content_compat.go (1)
12-12: Optional: name overlap with fetch-side regexes.

fetchedVideoTagRe here matches input <video> tags being normalized back into <file/>, while fetchedVideoFileTagRe / fetchedVideoFigureTagRe in shortcuts/doc/docs_fetch_v2.go operate on fetch output. The shared fetched* prefix is technically accurate (these tags originate from a prior fetch) but on a quick read it can look like input/output regexes are intermixed. A name like inputVideoTagRe (or moving all three regexes into a single file with directional prefixes inboundVideoTagRe / outboundVideo*TagRe) would make the direction obvious. No behavior change.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/doc/docs_content_compat.go` at line 12, Rename the ambiguous regex
variable fetchedVideoTagRe to a direction-explicit name (e.g., inputVideoTagRe
or inboundVideoTagRe) to avoid confusion with fetchedVideoFileTagRe and
fetchedVideoFigureTagRe; update the declaration (var fetchedVideoTagRe = ...)
and all references/usages across the codebase to the new identifier so the regex
that normalizes incoming <video> tags is clearly distinguished from the
fetch-output regexes.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@shortcuts/doc/docs_content_compat.go`:
- Line 12: Rename the ambiguous regex variable fetchedVideoTagRe to a
direction-explicit name (e.g., inputVideoTagRe or inboundVideoTagRe) to avoid
confusion with fetchedVideoFileTagRe and fetchedVideoFigureTagRe; update the
declaration (var fetchedVideoTagRe = ...) and all references/usages across the
codebase to the new identifier so the regex that normalizes incoming <video>
tags is clearly distinguished from the fetch-output regexes.

In `@shortcuts/doc/docs_fetch_v2.go`:
- Around line 83-87: The <file> tag regex fetchedVideoFileTagRe is order-coupled
(requires token before name) and can miss tags with attributes in different
orders; update normalizeFetchedVideoTags to use a relaxed regex for
fetchedVideoFileTagRe that simply matches <file .../> (mirror how view-type is
handled) and then extract token and name using fetchedAttrValue(tag, "token")
and fetchedAttrValue(tag, "name"), returning early if either is empty;
alternatively add a regression test ensuring a tag like <file name="..."
token="..."/> is rewritten to pin the current behavior.
- Around line 185-203: The isFetchedVideoFilename function currently checks many
suffixes with repeated strings.HasSuffix calls; replace that with a single
extraction of the file extension (use filepath.Ext(name) after trimming/ToLower)
and switch or map over the extension (e.g., case ".mp4", ".mov", etc.) to return
true for supported extensions; update imports to include "path/filepath" and
keep the existing normalization (strings.TrimSpace/ToLower) before calling
filepath.Ext.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2374a61d-e7dc-4673-a740-6e0af88020f1

📥 Commits

Reviewing files that changed from the base of the PR and between 1e2144e and c23c163.

📒 Files selected for processing (8)

shortcuts/doc/docs_content_compat.go
shortcuts/doc/docs_content_compat_test.go
shortcuts/doc/docs_create_test.go
shortcuts/doc/docs_create_v2.go
shortcuts/doc/docs_fetch_v2.go
shortcuts/doc/docs_fetch_v2_test.go
shortcuts/doc/docs_update_test.go
shortcuts/doc/docs_update_v2.go

HomyeeKing added 2 commits April 26, 2026 22:08

feat(doc): add video and embedded sheet compat

8fd8a2b

feat(doc): normalize exported sheet and video tags

c23c163

github-actions Bot added domain/ccm PR touches the ccm domain size/M Single-domain feat or fix with limited business impact labels Apr 27, 2026

coderabbitai Bot reviewed Apr 27, 2026

View reviewed changes

HomyeeKing closed this Apr 27, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(doc): normalize exported sheet and video tags#672

feat(doc): normalize exported sheet and video tags#672
HomyeeKing wants to merge 2 commits intolarksuite:mainfrom
HomyeeKing:feat/export

HomyeeKing commented Apr 27, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented Apr 27, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (2 warnings)

Uh oh!

CLAassistant commented Apr 27, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

HomyeeKing commented Apr 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

变更说明

导出格式变更对比

电子表格

视频

测试

Uh oh!

coderabbitai Bot commented Apr 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (2 warnings)

Uh oh!

CLAassistant commented Apr 27, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

HomyeeKing commented Apr 27, 2026 •

edited

Loading

coderabbitai Bot commented Apr 27, 2026 •

edited

Loading