feat(ci): refine PR business area labels and introduce skill format check #148

fengzhangchi-bytedance merged 37 commits into larksuite:main
Conversation
Replaced synthetic or open PR samples with actual merged/closed PRs from the repository to provide a more accurate reflection of the size label categorization. Added 4 samples each for sizes S, M, and L covering docs, fixes, ci, and features.
Based on user feedback, fine-grained domain labels (like `domain/base`) are too detailed for the early stages. This change adds support for applying `area/*` tags to indicate which important top-level modules a PR touches. Currently tracked areas:

- `area/shortcuts`
- `area/skills`
- `area/cmd`

Minor modules like docs, ci, and tests are intentionally excluded to keep tags focused on critical architectural components.
To avoid polluting the root `scripts/` directory, moved `sync_pr_labels.js` and `sync_pr_labels.samples.json` into a new `scripts/sync-pr-labels/` folder. Added a dedicated README to document its usage and behavior. Updated `.github/workflows/pr-labels.yml` to reflect the new path.
Renamed `scripts/sync-pr-labels/` to `scripts/pr-labels/` to keep directory names concise. Updated internal references and GitHub workflow files to point to the new path.
Added `expected_areas` lists to each sample in `samples.json` to reflect the newly added `area/*` high-level module tagging logic. Allows testing to accurately check both `size/*` and `area/*` outputs.
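For illustration, one such sample entry might look like the sketch below; every value here is hypothetical, and only the `expected_label` / `expected_areas` field names are taken from this PR's description:

```json
{
  "name": "docs-typo-fix",
  "number": 0,
  "title": "docs: fix typo in README",
  "expected_label": "size/S",
  "expected_areas": ["area/ccm"]
}
```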
- Reorganized code into logical sections with clear comments
- Encapsulated GitHub API interactions into a reusable `GitHubClient` class
- Extracted and centralized classification logic into a pure `evaluateRules` function
- Replaced magic numbers with named constants (`THRESHOLD_L`, `THRESHOLD_XL`)
- Fixed `ROOT` path resolution logic
- Simplified conditional statements and control flow
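As a rough sketch of what the extracted pure classifier could look like: the rule logic below follows the size flowchart later in this thread, but the threshold values and input shape are assumptions; only the names `evaluateRules`, `THRESHOLD_L`, and `THRESHOLD_XL` come from the commit message.

```javascript
// Assumed values for illustration; the real constants live in scripts/pr-labels/index.js.
const THRESHOLD_L = 300;
const THRESHOLD_XL = 1200;

// Pure function: no API calls, so it can be tested against samples.json directly.
function evaluateRules({ effectiveChanges, domains, sensitive, lowRiskOnly, refactor }) {
  const multiDomain = domains.length > 1;
  if (lowRiskOnly) return 'size/S';
  if (effectiveChanges > THRESHOLD_XL || (multiDomain && sensitive)) return 'size/XL';
  if (refactor || effectiveChanges >= THRESHOLD_L || multiDomain || sensitive) return 'size/L';
  return 'size/M';
}
```

Keeping the classification pure is what makes the sample-based testing described above practical: each sample only needs the computed inputs, not a live PR.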
- Add `PATH_TO_AREA_MAP` to map shortcuts/skills paths to business areas (im, vc, ccm, base, mail, calendar, task, contact)
- Replace `importantAreas` with `businessAreas` throughout the codebase
- Remove `area/shortcuts`, `area/skills`, `area/cmd` generic labels
- Now generates specific labels like `area/im`, `area/vc`, `area/ccm`, etc.
- Update samples.json `expected_areas` to match new behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
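A minimal sketch of how such a path-to-area mapping can drive label generation. The map keys and the helper name `areasForFiles` are hypothetical; only the `PATH_TO_AREA_MAP` constant name and the business-area names come from the commit message.

```javascript
// Hypothetical prefixes; the real map in scripts/pr-labels/index.js has its own keys.
const PATH_TO_AREA_MAP = {
  'shortcuts/im/': 'im',
  'shortcuts/vc/': 'vc',
  'skills/lark-docs/': 'ccm',
  'skills/lark-base/': 'base',
};

// Collect de-duplicated area/* labels for a list of changed file paths.
function areasForFiles(filenames) {
  const areas = new Set();
  for (const name of filenames) {
    const normalized = name.replaceAll('\\', '/'); // tolerate Windows-style paths
    for (const [prefix, area] of Object.entries(PATH_TO_AREA_MAP)) {
      if (normalized.startsWith(prefix)) areas.add(`area/${area}`);
    }
  }
  return [...areas].sort();
}
```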
> Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Adds two CI automations: a PR label synchronizer that analyzes changed files to manage PR labels, and a skill format check.
Sequence Diagram(s)

```mermaid
sequenceDiagram
  autonumber
  participant GHA as "GitHub Actions (workflow)"
  participant Script as "PR Labels Script"
  participant GitHubAPI as "GitHub REST API"
  participant Repo as "Repository (label defs / files)"
  GHA->>Script: Trigger with PR event payload
  Script->>GitHubAPI: Fetch PR metadata and file list
  GitHubAPI-->>Script: Return files, statuses
  Script->>Script: Classify files, compute effective LOC, detect areas/domains
  Script->>Repo: Read local label definitions / samples
  Script->>GitHubAPI: Create or update label definitions (POST/PATCH)
  GitHubAPI-->>Script: Confirm label ops
  Script->>GitHubAPI: Add missing size/* and area/* labels to PR
  GitHubAPI-->>Script: Labels applied
  Script->>GitHubAPI: Remove managed labels no longer applicable
  GitHubAPI-->>Script: Labels removed
  Script->>GHA: Append PR Size Classification to GITHUB_STEP_SUMMARY
  GHA-->>GHA: Workflow completes
```
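The add/remove steps in the diagram boil down to a set difference restricted to managed label prefixes. A hedged sketch of that reconciliation (function and constant names here are illustrative, not the script's actual API):

```javascript
// Only size/* and area/* are managed; any other label on the PR is left untouched.
const MANAGED_PREFIXES = ['size/', 'area/'];

function diffManagedLabels(currentLabels, desiredLabels) {
  const isManaged = (label) => MANAGED_PREFIXES.some((p) => label.startsWith(p));
  const current = new Set(currentLabels.filter(isManaged));
  const desired = new Set(desiredLabels.filter(isManaged));
  return {
    toAdd: [...desired].filter((l) => !current.has(l)),    // missing managed labels
    toRemove: [...current].filter((l) => !desired.has(l)), // stale managed labels
  };
}
```

Computing the diff first keeps the API calls idempotent: re-running the workflow on an unchanged PR adds and removes nothing.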
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Greptile Summary

This PR introduces two new CI/CD automation pipelines for the repository.
Confidence Score: 5/5

Safe to merge; all remaining findings are P2 quality/reliability suggestions that do not affect correctness in the current repo state. Prior P1 concerns (`String.includes` false positives in the skill validator, missing `pull-requests: write` permission, stale README area labels) are all resolved. Remaining items are a hard exit on a missing skills directory (P2, harmless given the directory exists in this repo) and live-API test calls without rate-limit handling (P2, a cosmetic test-reliability concern). Neither blocks merge.

Files to watch: scripts/skill-format-check/index.js (missing-directory exit code), scripts/pr-labels/test.js (live API calls without rate-limit handling)
Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
  A[PR event / push to main] --> B{Workflow trigger}
  B -->|pr-labels.yml\npull_request_target| C[Checkout base branch]
  C --> D[node scripts/pr-labels/index.js]
  D --> E[Fetch PR files via GitHub API]
  E --> F{Classify PR}
  F --> G[effectiveChanges / domains / coreAreas / sensitiveKeywords]
  G -->|lowRiskOnly| H[size/S]
  G -->|effectiveChanges > 1200 or multiDomain+sensitive| I[size/XL]
  G -->|refactor / effectiveChanges >= 300 / multiDomain / sensitive| J[size/L]
  G -->|default| K[size/M]
  H & I & J & K --> L[Sync domain/* labels via PATH_TO_DOMAIN_MAP]
  L --> M[Add new labels / remove stale managed labels]
  M --> N[Write GitHub Step Summary]
  B -->|skill-format-check.yml\npull_request / push| O[Checkout branch]
  O --> P[node scripts/skill-format-check/index.js]
  P --> Q{skills/ dir exists?}
  Q -->|No| R[exit 1]
  Q -->|Yes| S[For each skill dir]
  S --> T{SKILL.md exists?}
  T -->|No| U[error: missing SKILL.md]
  T -->|Yes| V{Frontmatter valid?}
  V -->|name + description present| W[exit 0]
  V -->|missing required field| X[error: missing field]
  U & X --> Y[exit 1]
  W --> Z[All skills passed]
```
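The skill-check branch of the flowchart can be sketched as a small pure function. This is an illustration that folds in the review suggestions from this thread (CRLF normalization, strict closing delimiter, line-anchored field regexes), not the actual scripts/skill-format-check/index.js:

```javascript
// Returns a list of validation errors for one SKILL.md's raw contents;
// an empty list means the skill passes.
function checkFrontmatter(raw) {
  // Normalize CRLF so offsets are stable on any platform.
  const content = raw.replace(/\r\n/g, '\n');
  if (!content.startsWith('---\n')) return ['missing frontmatter'];
  // '\n---\n' will not match a '----' line; a file ending in a bare '---'
  // without a trailing newline is treated as unclosed in this sketch.
  const end = content.indexOf('\n---\n', 3);
  if (end === -1) return ['unclosed frontmatter'];
  const frontmatter = content.substring(4, end);
  const errors = [];
  // Line-anchored regexes avoid substring false positives like 'hostname:'.
  if (!/^\s*name:/m.test(frontmatter)) errors.push("missing 'name'");
  if (!/^\s*description:/m.test(frontmatter)) errors.push("missing 'description'");
  return errors;
}
```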
Reviews (12): Last reviewed commit: "Merge branch 'larksuite:main' into main"
Actionable comments posted: 5
🧹 Nitpick comments (6)
scripts/skill-format-check/README.md (1)
7-10: Clarify metadata requirement wording in docs.

“Required fields” currently lists `metadata`, but the next line says missing `metadata` is warning-only. Consider rephrasing to “checked fields” (or explicitly split required vs optional-warning) to avoid ambiguity.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@scripts/skill-format-check/README.md` around lines 7 - 10, The README wording is ambiguous about `metadata`; update the description to clearly split required vs warning-only fields: in the paragraph describing `index.js` and `skill-template/skill-template.md`, change "required fields" to something like "checked fields" or explicitly list "Required: `name`, `description`; Checked (warning-only if missing): `metadata`" so readers understand `metadata` triggers a warning rather than a build failure; reference `index.js`, `SKILL.md`, and `skill-template/skill-template.md` in the sentence you edit.

scripts/skill-format-check/test.sh (1)
40-42: Consider checking for any non-zero exit code instead of exactly `1`.

The negative test expects exactly exit code `1`, but if the Node script crashes for an unrelated reason (e.g., syntax error, missing dependency), it might exit with a different non-zero code. This could mask real issues as false "passes."

♻️ Proposed fix

```diff
-    if [ $? -eq 1 ]; then
+    local exit_code=$?
+    if [ $exit_code -ne 0 ]; then
```

Alternatively, if you want to be strict about exit code `1` specifically, consider adding a comment explaining why.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@scripts/skill-format-check/test.sh` around lines 40 - 42, The test currently checks for an exact exit code 1 after running node "$INDEX_JS" "$DIR/tests/temp_test_dir" by using if [ $? -eq 1 ]; change this to detect any non-zero failure (e.g., if [ $? -ne 0 ]) so that crashes or other errors also cause the negative test to fail; update the conditional around the node invocation (referencing INDEX_JS and DIR) and optionally add a comment if you truly intend to require exactly 1.

scripts/skill-format-check/index.js (3)
52-59: Field validation using `includes()` can produce false positives.

Using `frontmatter.includes('name:')` will match substrings like `hostname:` or YAML values containing `name:`. This could incorrectly pass validation when required fields are actually missing. Consider using a regex that matches the field at the start of a line:

♻️ Proposed fix using regex

```diff
-      if (!frontmatter.includes('name:')) {
+      if (!/^name:/m.test(frontmatter)) {
         console.error(`❌ [${skill}] YAML frontmatter missing 'name'`);
         hasErrors = true;
       }
-      if (!frontmatter.includes('description:')) {
+      if (!/^description:/m.test(frontmatter)) {
         console.error(`❌ [${skill}] YAML frontmatter missing 'description'`);
         hasErrors = true;
       }
-      if (!frontmatter.includes('metadata:')) {
+      if (!/^metadata:/m.test(frontmatter)) {
         console.warn(`⚠️ [${skill}] YAML frontmatter missing 'metadata' (Warning only)`);
       }
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@scripts/skill-format-check/index.js` around lines 52 - 59, The current validation uses frontmatter.includes('name:') and includes('description:') which can false-positive on substrings; update the checks that reference frontmatter (and set hasErrors/console.error for skill) to test for fields at line start instead — e.g., use a regex like /^\s*name:\b/m and /^\s*description:\b/m or split frontmatter into lines and check lines.some(l => l.trim().startsWith('name:')) to ensure the keys exist as top-level frontmatter fields before clearing hasErrors or logging success for skill.
41-51: Frontmatter extraction may behave unexpectedly with `\r\n` line endings.

When the file starts with `---\r\n`, `content.substring(3, endOfFrontmatter)` begins at byte 3, which is `\r`. This includes the carriage return in the frontmatter content. Additionally, `indexOf('\n---', 3)` won't match `\r\n---` correctly on files with inconsistent line endings. Consider normalizing line endings before parsing:

♻️ Proposed fix

```diff
     const content = fs.readFileSync(skillFile, 'utf-8');
+    // Normalize line endings to simplify parsing
+    const normalizedContent = content.replace(/\r\n/g, '\n');
     // Check YAML frontmatter
-    if (!content.startsWith('---\n') && !content.startsWith('---\r\n')) {
+    if (!normalizedContent.startsWith('---\n')) {
       console.error(`❌ [${skill}] SKILL.md must start with YAML frontmatter (---)`);
       hasErrors = true;
     } else {
-      // Handle different line endings
-      let endOfFrontmatter = content.indexOf('\n---', 3);
+      let endOfFrontmatter = normalizedContent.indexOf('\n---', 4);
       if (endOfFrontmatter === -1) {
         console.error(`❌ [${skill}] SKILL.md has unclosed YAML frontmatter`);
         hasErrors = true;
       } else {
-        const frontmatter = content.substring(3, endOfFrontmatter);
+        const frontmatter = normalizedContent.substring(4, endOfFrontmatter);
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@scripts/skill-format-check/index.js` around lines 41 - 51, The frontmatter extraction logic can mis-handle CRLF endings: normalize line endings at the start (e.g., replace all '\r\n' with '\n') before doing the content.startsWith checks and before calling indexOf('\n---', 3) and substring; update the flow around content.startsWith('---\n') / content.startsWith('---\r\n') to operate on the normalized content so endOfFrontmatter = content.indexOf('\n---', 3) and frontmatter = content.substring(3, endOfFrontmatter) always get consistent results (or alternatively compute the frontmatter start offset based on the actual starting marker), ensuring indexOf and substring use the same normalized/new offsets.
16-18: Minor: Potential race condition with `statSync` after `readdirSync`.

If a directory entry is deleted between `readdirSync` and `statSync`, this will throw. For a CI script that runs on a static checkout, this is unlikely to be an issue in practice.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@scripts/skill-format-check/index.js` around lines 16 - 18, The current skills listing uses fs.readdirSync(SKILLS_DIR) followed by fs.statSync(path.join(SKILLS_DIR, file)) which can throw if an entry disappears between calls; update the logic in the skills variable to call fs.readdirSync with the { withFileTypes: true } option and use Dirent.isDirectory() (or wrap statSync in a try/catch) so you avoid the race and eliminate the separate statSync call when building the skills array.

scripts/pr-labels/samples.json (1)
1-134: Add samples for the non-trivial path heuristics.

This corpus never exercises `skills/**`, renamed files between business areas, or a CCM-only PR spanning multiple subpaths like `doc` + `sheets`/`drive`. Those are the easiest branches in the classifier to regress, so the sample suite will not catch some of the highest-risk edge cases introduced here.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@scripts/pr-labels/samples.json` around lines 1 - 134, The samples.json test corpus doesn't cover high-risk edge cases the classifier handles (renamed files across business areas, PRs touching skills/**, and CCM-only PRs spanning subpaths like doc + sheets/drive), so add new sample entries to scripts/pr-labels/samples.json that explicitly exercise those branches: create at least one sample PR with files under skills/**, one sample that includes file renames moving code between different areas (so expected_areas changes), and one CCM-only multi-subpath PR (e.g., doc + sheets or drive) with expected_label and expected_areas reflecting CCM; ensure each sample includes unique "name", "number", "title", "pr_url", "status"/"merged_at", "expected_label", "expected_areas", and a review_note describing why the sample covers the edge case so the classifier tests will exercise those code paths.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/pr-labels.yml:
- Around line 11-15: Remove the unnecessary write scope by deleting the
"pull-requests: write" permission from the workflow and keep "contents: read"
and "issues: write" only, since the pr-labels job only reads repository contents
and calls the issues/labels endpoints; update the permissions block to include
just "contents: read" and "issues: write" so the token is trimmed to least
privilege while still allowing label changes via the issues API.
- Around line 4-9: The workflow for pull_request_target currently lists event
types (the pull_request_target block with types - opened, reopened, synchronize,
ready_for_review) but omits the edited event so PR title changes won't retrigger
the classifier; update the pull_request_target types to include edited so
changes to the PR title (which drive prType) will rerun the workflow and update
labels accordingly.
In `@scripts/pr-labels/index.js`:
- Around line 252-261: The helper isLowRiskPath currently treats any .md under
skills/** as low-risk; add an explicit guard in isLowRiskPath to exclude files
under the skills directory before the DOC_SUFFIXES check (e.g., check if
normalized startsWith 'skills/' or matches /^skills\// and return false). Update
isLowRiskPath so that files under the skills/ prefix are not classified as
low-risk even if they match DOC_SUFFIXES, using the existing normalized variable
and LOW_RISK_PREFIXES/DOC_SUFFIXES checks to place the new guard at the top of
the function.
- Around line 400-431: The code derives domains from raw item.filename values
which misses renames and unnormalized paths; update the logic to use a
normalized filename (e.g., const name = (item.filename || item.previous_filename
|| "").replaceAll("\\", "/")) for all domain/business-area calculations and pass
that normalized name into shortcutDomainForPath, skillDomainForPath,
getBusinessArea, collectCoreAreas, collectSensitiveKeywords and when building
filenames/newShortcutDomain so renamed or Windows-style paths map to the correct
domain; ensure filenames array and any places that call detectNewShortcutDomain
use these normalized names (or a files array with a normalized filename
property) instead of the raw item.filename.
In `@scripts/pr-labels/README.md`:
- Around line 19-25: The README's "Area Tags" section is out of date: update the
text to match the labels actually emitted by index.js (replace the retired
examples `area/cmd`, `area/shortcuts`, `area/skills` with the current business
labels such as `area/im`, `area/base`, `area/ccm`), remove or mark retired
labels as deprecated, and add a short note that index.js is the source of truth
so future changes to the label set should be mirrored in README.md; reference
index.js and the "Area Tags" section in README when making the edits.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 785ea73d-14e4-4ad8-9099-1c64059ff0ee
📒 Files selected for processing (14)
- .github/workflows/pr-labels.yml
- .github/workflows/skill-format-check.yml
- scripts/pr-labels/README.md
- scripts/pr-labels/index.js
- scripts/pr-labels/samples.json
- scripts/skill-format-check/README.md
- scripts/skill-format-check/index.js
- scripts/skill-format-check/test.sh
- scripts/skill-format-check/tests/bad-skill-no-frontmatter/SKILL.md
- scripts/skill-format-check/tests/bad-skill-unclosed-frontmatter/SKILL.md
- scripts/skill-format-check/tests/bad-skill/SKILL.md
- scripts/skill-format-check/tests/good-skill-complex/SKILL.md
- scripts/skill-format-check/tests/good-skill-minimal/SKILL.md
- scripts/skill-format-check/tests/good-skill/SKILL.md
- Add `edited` event to PR labels workflow to trigger on title changes
- Add security warning comment in pr-labels.yml workflow
- Update pr-labels README with latest business area labels
- Exclude `skills/lark-*` paths from low risk doc classification
- Handle renamed files properly in PR path classification
- Fix YAML frontmatter extraction to handle CRLF line endings
- Use precise regex for YAML key validation instead of substring match
- Fix exit code checking logic in skill-format-check test script
- Translate Chinese comments in skill-format-check to English
Actionable comments posted: 2
🧹 Nitpick comments (1)
.github/workflows/pr-labels.yml (1)
30-33: Keep best-effort behavior, but surface failures in the job summary.

With `continue-on-error: true`, label drift can be easy to miss. Consider adding a small post-step warning when sync fails.

Suggested patch

```diff
       - name: Sync managed PR labels
+        id: sync_pr_labels
         # Labeling is best-effort and must not block PR merges.
         continue-on-error: true
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
         run: node scripts/pr-labels/index.js
+
+      - name: Warn when label sync fails
+        if: ${{ always() && steps.sync_pr_labels.outcome == 'failure' }}
+        run: |
+          echo "⚠️ PR label sync failed; labels may be stale." >> "$GITHUB_STEP_SUMMARY"
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.github/workflows/pr-labels.yml around lines 30 - 33, The "Sync managed PR labels" step should keep continue-on-error: true but be given an id (e.g., id: sync_labels), then add a follow-up step with if: always() that checks steps.sync_labels.outcome (or conclusion) and when it is not "success" appends a short warning message to the job summary (by echoing into $GITHUB_STEP_SUMMARY) and emits a workflow-level warning (e.g., using echo "::warning::...") so failures are visible in the job summary while not blocking the job.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/skill-format-check/index.js`:
- Around line 49-54: The frontmatter closing delimiter detection using
normalizedContent.indexOf('\n---', 4) is too permissive (matches '----' etc.);
change the logic that computes endOfFrontmatter to locate a line that is exactly
'---' (e.g., scan lines after the opening '---' and find the first line where
trimmedLine === '---' or use a regex that matches '\n---\n' or
'(?:\r?\n)---(?:\r?\n)') and set endOfFrontmatter to the correct index before
extracting frontmatter; update the code that uses endOfFrontmatter and
frontmatter (variables normalizedContent, endOfFrontmatter, frontmatter)
accordingly and keep the error branch for missing/invalid closing delimiter.
In `@scripts/skill-format-check/test.sh`:
- Around line 15-18: The fixture copy can fail and allow node "$INDEX_JS" to run
against an empty temp dir; update the script around the mkdir/cp/node sequence
(the block that uses "$DIR/tests/temp_test_dir", "$DIR/tests/$test_name", and
"$INDEX_JS") to check the exit status of the cp command (and any mkdir) and
abort with a non-zero exit and a clear error message if copying fails; apply the
same check to the other occurrence of this pattern later in the script (the
block around lines 36-40) so the validator never runs on an incomplete or empty
fixture.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4a858bf6-c1ca-4463-81c2-1b0ebe38c54a
📒 Files selected for processing (5)
- .github/workflows/pr-labels.yml
- scripts/pr-labels/README.md
- scripts/pr-labels/index.js
- scripts/skill-format-check/index.js
- scripts/skill-format-check/test.sh
🚧 Files skipped from review as they are similar to previous changes (1)
- scripts/pr-labels/index.js
- Fix frontmatter closing delimiter detection to strictly match `---` using regex, preventing invalid closing tags like `----` from passing.
- Improve test fixture reliability by failing tests immediately if fixture preparation fails, avoiding false positives.
🧹 Nitpick comments (2)
scripts/skill-format-check/index.js (1)
38-39: Avoid hard-crashing on an unreadable `SKILL.md`; report it as a validation error instead.

A `readFileSync` exception currently aborts the whole run immediately, which skips consolidated error reporting for other skills.

Proposed patch

```diff
-    const content = fs.readFileSync(skillFile, 'utf-8');
+    let content;
+    try {
+      content = fs.readFileSync(skillFile, 'utf-8');
+    } catch (err) {
+      console.error(`❌ [${skill}] Failed to read SKILL.md: ${err.message}`);
+      hasErrors = true;
+      return;
+    }
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@scripts/skill-format-check/index.js` around lines 38 - 39, Wrap the synchronous read of SKILL.md in a try/catch around the fs.readFileSync(skillFile, 'utf-8') call so a thrown error doesn't crash the whole script; on catch push a validation error (including skillFile path and error.message) into the existing errors collection used by the script and continue processing other skills, ensure the rest of the logic that expects `content` handles a missing/empty value (e.g., skip further checks for this skill) instead of calling process.exit or rethrowing.

scripts/skill-format-check/test.sh (1)
53-63: Preserve validator output for easier triage in negative tests.

Suppressing all output makes unexpected behavior harder to debug. Capture output to a temp log and print it only on failure.

Proposed patch

```diff
-  # Run the script and suppress error output since we expect it to fail
-  node "$INDEX_JS" "$TEMP_DIR" > /dev/null 2>&1
+  # Capture output for diagnostics while still treating non-zero as expected
+  local log_file="$TEMP_DIR/.validator.log"
+  node "$INDEX_JS" "$TEMP_DIR" > "$log_file" 2>&1
   local exit_code=$?
@@
   else
     echo "❌ Failed! Expected $test_name to fail but it passed."
+    if [ -s "$log_file" ]; then
+      echo "--- Validator output ---"
+      cat "$log_file"
+    fi
     rm -rf "$TEMP_DIR"
     exit 1
   fi
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@scripts/skill-format-check/test.sh` around lines 53 - 63, The negative test currently discards all node output, making failures hard to triage; create a temp log (e.g., TMP_LOG=$(mktemp)), run node "$INDEX_JS" "$TEMP_DIR" > "$TMP_LOG" 2>&1, capture exit_code as now, and on the success branch remove TMP_LOG and return as before, but on the failure branch print the preserved output (cat "$TMP_LOG") before echoing the failure message and exiting; ensure TMP_LOG is removed in both branches.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5f662769-0302-405a-991b-7425a3270838
📒 Files selected for processing (2)
- scripts/skill-format-check/index.js
- scripts/skill-format-check/test.sh
- ci: warn when PR label sync fails in job summary
- test(skill-format-check): capture validator output for negative tests
- fix(skill-format-check): catch errors when reading SKILL.md to avoid hard crashes
Actionable comments posted: 1
♻️ Duplicate comments (1)
.github/workflows/pr-labels.yml (1)
15-18: ⚠️ Potential issue | 🟠 Major — Use `issues: write` for label sync; drop PR write scope.

The label sync implementation writes repository labels and PR issue-labels, so this permission block should grant `issues: write`. Keeping only `pull-requests: write` can leave sync failing and labels stale.

Suggested fix

```diff
 permissions:
   contents: read
-  pull-requests: write
+  issues: write
+  pull-requests: read
```

```shell
#!/bin/bash
# Verify workflow permissions and label-writing API usage (read-only).
rg -n -C2 '^\s*permissions:|pull-requests:|issues:' .github/workflows/pr-labels.yml
rg -n -C2 '/repos/\$\{this\.repo\}/labels|/issues/\$\{this\.prNumber\}/labels' scripts/pr-labels/index.js
```

For GitHub Actions GITHUB_TOKEN permissions, which scope is required to create/update repository labels and add/remove labels on pull requests: `issues: write` or `pull-requests: write`?

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.github/workflows/pr-labels.yml around lines 15 - 18, The workflow permissions currently grant "pull-requests: write" which is incorrect for label sync; change the permissions block so it keeps "contents: read" and replaces "pull-requests: write" with "issues: write" (i.e., ensure the permissions stanza contains "contents: read" and "issues: write" and remove the pull-requests write scope) so the workflow can create/update repository labels and add/remove labels on PRs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/skill-format-check/index.js`:
- Around line 16-18: Replace the current directory enumeration that uses
fs.readdirSync(SKILLS_DIR).filter(...) and fs.statSync(path.join(SKILLS_DIR,
file)) with a try/catch around a fs.readdirSync(SKILLS_DIR, { withFileTypes:
true }) call, then build skills by keeping entries where dirent.isDirectory() is
true (avoiding extra stat calls); ensure errors from readdirSync are caught and
logged/reported the same way other file read errors are handled in this file
(see the pattern around the file-read error handling), so SKILLS_DIR enumeration
fails gracefully instead of throwing on permission errors or broken symlinks.
---
Duplicate comments:
In @.github/workflows/pr-labels.yml:
- Around line 15-18: The workflow permissions currently grant "pull-requests:
write" which is incorrect for label sync; change the permissions block so it
keeps "contents: read" and replaces "pull-requests: write" with "issues: write"
(i.e., ensure the permissions stanza contains "contents: read" and "issues:
write" and remove the pull-requests write scope) so the workflow can
create/update repository labels and add/remove labels on PRs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 667c23d2-fb4a-4141-b21f-c7bf785865ff
📒 Files selected for processing (3)
- .github/workflows/pr-labels.yml
- scripts/skill-format-check/index.js
- scripts/skill-format-check/test.sh
✅ Files skipped from review due to trivial changes (1)
- scripts/skill-format-check/test.sh
- refactor: use `fs.readdirSync` with `{ withFileTypes: true }` to avoid extra stat calls
- fix: catch and report errors gracefully during skills directory enumeration instead of crashing
🧹 Nitpick comments (2)
scripts/skill-format-check/index.js (2)
54-55: Consider handling all line ending variants.

The current normalization handles CRLF but not standalone CR (old Mac-style). While rare, a more robust pattern would handle both:

```diff
- const normalizedContent = content.replace(/\r\n/g, '\n');
+ const normalizedContent = content.replace(/\r\n?/g, '\n');
```

This matches `\r\n` (CRLF) or `\r` alone (CR), converting both to `\n`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/skill-format-check/index.js` around lines 54 - 55, The line-ending normalization only replaces CRLF; update the replacement on normalizedContent to convert both CRLF and standalone CR by using a regex that matches \r\n or \r (e.g., change the call on content.replace(...) so it uses a pattern like /\r\n?\/g or /\r\n?/, ensuring all CRLF and old Mac CR are converted to '\n'), keeping the variable normalizedContent and the same assignment using content.replace.
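The suggested normalization can be sketched as a small helper (the function name `normalizeLineEndings` is illustrative):

```javascript
// Normalize CRLF and lone CR to LF before parsing frontmatter.
// In /\r\n?/g the optional "\n" matches greedily, so "\r\n" is
// consumed as one unit and a bare "\r" still matches.
function normalizeLineEndings(content) {
  return content.replace(/\r\n?/g, '\n');
}
```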
76-79: Add test coverage for metadata warning case.

The `metadata` check at lines 76-79 is intentionally warning-only and does not set `hasErrors`, allowing skills to pass validation when metadata is missing. However, all current `good-skill*` fixtures include metadata, so the warning path is never tested. Add a `good-skill-no-metadata` fixture (or similar) to `scripts/skill-format-check/tests/` to verify that a skill with valid `name` and `description` but missing `metadata` exits with code 0 and prints the warning message. This confirms the warning-only behavior doesn't cause CI failure.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/skill-format-check/index.js` around lines 76 - 79, Add a new test fixture under scripts/skill-format-check/tests (e.g., good-skill-no-metadata) whose skill file includes valid YAML frontmatter with name and description but omits metadata, then update or add a test that runs the skill-format-check script (index.js) against that fixture and asserts the process exits with code 0 and that the console output includes the warning string "YAML frontmatter missing 'metadata'". Make sure the test targets the metadata check in scripts/skill-format-check/index.js (the /^metadata:/m regex branch) so the warning-only path is exercised without setting hasErrors.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@scripts/skill-format-check/index.js`:
- Around line 54-55: The line-ending normalization only replaces CRLF; update
the replacement on normalizedContent to convert both CRLF and standalone CR by
using a regex that matches \r\n or \r (e.g., change the call on
content.replace(...) so it uses a pattern like /\r\n?\/g or /\r\n?/, ensuring
all CRLF and old Mac CR are converted to '\n'), keeping the variable
normalizedContent and the same assignment using content.replace.
- Around line 76-79: Add a new test fixture under
scripts/skill-format-check/tests (e.g., good-skill-no-metadata) whose skill file
includes valid YAML frontmatter with name and description but omits metadata,
then update or add a test that runs the skill-format-check script (index.js)
against that fixture and asserts the process exits with code 0 and that the
console output includes the warning string "YAML frontmatter missing
'metadata'". Make sure the test targets the metadata check in
scripts/skill-format-check/index.js (the /^metadata:/m regex branch) so the
warning-only path is exercised without setting hasErrors.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a532529b-b5bb-46f4-842a-53c6c18b5b1c
📒 Files selected for processing (1)
scripts/skill-format-check/index.js
test(pr-labels): add edge case samples for skills paths, CCM multi-paths, and renames
- use PR larksuite#134 to test skill path behaviors
- use PR larksuite#57 to test multi-path CCM resolution
- use PR larksuite#11 to test tracking renames across domains
/review
- Replaced `area/` prefix with `domain/` for PR labeling to align with existing GitHub labels
- Renamed internal constants and variables from `area` to `domain` (e.g. `PATH_TO_AREA_MAP` to `PATH_TO_DOMAIN_MAP`)
- Updated `samples.json` test data to use new `domain/` format and `expected_domains` key
- Added `scripts/pr-labels/test.js` runner script for continuous validation of labeling logic against PR samples
- Corrected expected size label for PR larksuite#134 test sample
…n skill-format-check
- Updated README.md to reflect the new `domain/` label prefix instead of `area/`
- Removed duplicate domain array interpolation in printDryRunResult
- Added process.env.GITHUB_TOKEN guard in test.js to prevent ambiguous failures from API rate limits
- Added `issues: write` permission to pr-labels workflow, which is strictly required by the GitHub REST API to modify labels on pull requests
- Reordered script execution in `index.js` to apply/remove labels on the PR *before* attempting to sync repository-level label definitions (colors/descriptions). The definition sync is now a trailing best-effort step with error catching so transient repo-level API failures don't abort the critical path.
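The reordering described above might be sketched as follows; the `GitHubClient` method names are assumed from the commit messages, not verified against the script:

```javascript
// Apply label changes on the PR first (the critical path), then sync
// repository-level label definitions as a best-effort trailing step
// whose failure is logged but never aborts the run.
async function syncLabels(client, toAdd, toRemove, allDefinitions) {
  await client.addLabels(toAdd);
  await client.removeLabels(toRemove);
  try {
    await client.syncLabelDefinitions(allDefinitions); // colors/descriptions
  } catch (err) {
    console.warn(`Label definition sync failed (non-fatal): ${err.message}`);
  }
}
```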
- Added missing `skills/lark-task/` to `PATH_TO_DOMAIN_MAP` to properly detect task domain modifications
- Updated GitHub REST API error checking in `syncLabelDefinition` to reliably match `error.status === 422` rather than loosely checking substring
- Moved token presence check in `main()` to happen before `resolveContext` to avoid triggering unauthenticated 401 API limits when GITHUB_TOKEN is omitted locally
- Removed duplicate PR entries (larksuite#11 and larksuite#57) to reduce redundant API calls during testing
- Renamed sample test cases to correctly reflect their expected labels (e.g. `size-l-skill-format-check` -> `size-m-skill-format-check`)
- Prior changes correctly made full label sync best-effort, but broke the flow for brand new domains
- GitHub API returns a 422 error if you attempt to attach a label to an Issue/PR that does not exist in the repository
- Added a targeted bootstrap loop to create/sync specifically the labels in `toAdd` before attempting `client.addLabels()`
- Left the remaining global label synchronization as a best-effort trailing action
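A minimal sketch of such a bootstrap loop, assuming a hypothetical `client.createLabel` method and definitions map:

```javascript
// Ensure every label in toAdd exists in the repository before
// client.addLabels() runs; attaching a nonexistent label makes the
// GitHub API return HTTP 422.
async function bootstrapLabels(client, toAdd, definitions) {
  for (const name of toAdd) {
    const def = definitions[name] || { color: 'ededed', description: '' };
    try {
      await client.createLabel(name, def);
    } catch (err) {
      // A 422 from the create endpoint means the label already exists;
      // anything else is a real failure and should propagate.
      if (err.status !== 422) throw err;
    }
  }
}
```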
- Added a dedicated GitHub Actions workflow (`pr-labels-test.yml`) to automatically run `test.js` against `samples.json` whenever the labeling logic is updated
- Documented local testing instructions in `scripts/pr-labels/README.md`
🚀 PR Preview Install Guide

🧰 CLI update:

    npm i -g https://pkg.pr.new/larksuite/cli/@larksuite/cli@3946fcf439d5a14d15ea9ef05aea5684f7804542

🧩 Skill update:

    npx skills add williamfzc/cli#main -y -g
…heck (#148)

* feat(ci): add PR size label pipeline
* chore(ci): make PR label sync non-blocking
* feat(ci): add dry-run mode for PR label sync
* feat(ci): add PR label dry-run samples
* test(ci): update PR label samples with real historical merged PRs

  Replaced synthetic or open PR samples with actual merged/closed PRs from the repository to provide a more accurate reflection of the size label categorization. Added 4 samples each for sizes S, M, and L covering docs, fixes, ci, and features.

* feat(ci): add high-level area tags for PRs

  Based on user feedback, fine-grained domain labels (like `domain/base`) are too detailed for the early stages. This change adds support for applying `area/*` tags to indicate which important top-level modules a PR touches. Currently tracked areas:
  - `area/shortcuts`
  - `area/skills`
  - `area/cmd`

  Minor modules like docs, ci, and tests are intentionally excluded to keep tags focused on critical architectural components.

* refactor(ci): extract pr-label-sync logic to a dedicated directory

  To avoid polluting the root `scripts/` directory, moved `sync_pr_labels.js` and `sync_pr_labels.samples.json` into a new `scripts/sync-pr-labels/` folder. Added a dedicated README to document its usage and behavior. Updated `.github/workflows/pr-labels.yml` to reflect the new path.

* refactor(ci): rename pr label script directory for simplicity

  Renamed `scripts/sync-pr-labels/` to `scripts/pr-labels/` to keep directory names concise. Updated internal references and GitHub workflow files to point to the new path.

* ci: add GitHub Actions workflow to check skill format
* test(ci): update sample json to include expected_areas

  Added `expected_areas` lists to each sample in `samples.json` to reflect the newly added `area/*` high-level module tagging logic. Allows testing to accurately check both `size/*` and `area/*` outputs.

* refactor(scripts): move skill format check to isolated directory and add README
* test(scripts): add positive and negative tests for skill format check
* fix(scripts): revert skill changes and downgrade version/metadata checks to warnings
* fix(scripts): completely remove version check and skip lark-shared
* refactor(ci): improve pr-labels script readability and maintainability

  - Reorganized code into logical sections with clear comments
  - Encapsulated GitHub API interactions into a reusable `GitHubClient` class
  - Extracted and centralized classification logic into a pure `evaluateRules` function
  - Replaced magic numbers with named constants (`THRESHOLD_L`, `THRESHOLD_XL`)
  - Fixed `ROOT` path resolution logic
  - Simplified conditional statements and control flow

* ci: fix setup-node version in pr-labels workflow
* tmp
* refactor(ci): replace generic area labels with business-specific ones

  - Add PATH_TO_AREA_MAP to map shortcuts/skills paths to business areas (im, vc, ccm, base, mail, calendar, task, contact)
  - Replace importantAreas with businessAreas throughout the codebase
  - Remove area/shortcuts, area/skills, area/cmd generic labels
  - Now generates specific labels like area/im, area/vc, area/ccm, etc.
  - Update samples.json expected_areas to match new behavior

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(ci): address PR review feedback for label scripts and workflows

  - Add `edited` event to PR labels workflow to trigger on title changes
  - Add security warning comment in pr-labels.yml workflow
  - Update pr-labels README with latest business area labels
  - Exclude `skills/lark-*` paths from low risk doc classification
  - Handle renamed files properly in PR path classification
  - Fix YAML frontmatter extraction to handle CRLF line endings
  - Use precise regex for YAML key validation instead of substring match
  - Fix exit code checking logic in skill-format-check test script
  - Translate Chinese comments in skill-format-check to English

* fix(skill-format-check): address CodeRabbit review feedback

  - Fix frontmatter closing delimiter detection to strictly match '---' using regex, preventing invalid closing tags like '----' from passing.
  - Improve test fixture reliability by failing tests immediately if fixture preparation fails, avoiding false positives.

* fix: address review comments from PR 148

  - ci: warn when PR label sync fails in job summary
  - test(skill-format-check): capture validator output for negative tests
  - fix(skill-format-check): catch errors when reading SKILL.md to avoid hard crashes

* fix: add error handling for directory enumeration in skill-format-check

  - refactor: use `fs.readdirSync` with `{ withFileTypes: true }` to avoid extra stat calls
  - fix: catch and report errors gracefully during skills directory enumeration instead of crashing

* docs(skill-format-check): clarify `metadata` requirement in README

  test(pr-labels): add edge case samples for skills paths, CCM multi-paths, and renames

* test(pr-labels): add real PR edge case samples

  - use PR #134 to test skill path behaviors
  - use PR #57 to test multi-path CCM resolution
  - use PR #11 to test tracking renames across domains

* refactor(ci): migrate pr labels from area to domain prefix

  - Replaced `area/` prefix with `domain/` for PR labeling to align with existing GitHub labels
  - Renamed internal constants and variables from `area` to `domain` (e.g. `PATH_TO_AREA_MAP` to `PATH_TO_DOMAIN_MAP`)
  - Updated `samples.json` test data to use new `domain/` format and `expected_domains` key
  - Added `scripts/pr-labels/test.js` runner script for continuous validation of labeling logic against PR samples
  - Corrected expected size label for PR #134 test sample

* test: use execFileSync instead of execSync in pr-labels test script
* fix: resolve target path against process.cwd() instead of __dirname in skill-format-check
* docs: correct label prefix in PR label workflow README

  - Updated README.md to reflect the new `domain/` label prefix instead of `area/`

* fix(ci): fix dry-run console output formatting and enforce auth in tests

  - Removed duplicate domain array interpolation in printDryRunResult
  - Added process.env.GITHUB_TOKEN guard in test.js to prevent ambiguous failures from API rate limits

* fix(ci): ensure PR labels can be applied reliably

  - Added `issues: write` permission to pr-labels workflow, which is strictly required by the GitHub REST API to modify labels on pull requests
  - Reordered script execution in `index.js` to apply/remove labels on the PR *before* attempting to sync repository-level label definitions (colors/descriptions). The definition sync is now a trailing best-effort step with error catching so transient repo-level API failures don't abort the critical path.

* fix(ci): fix edge cases in pr-label index script

  - Added missing `skills/lark-task/` to `PATH_TO_DOMAIN_MAP` to properly detect task domain modifications
  - Updated GitHub REST API error checking in `syncLabelDefinition` to reliably match `error.status === 422` rather than loosely checking substring
  - Moved token presence check in `main()` to happen before `resolveContext` to avoid triggering unauthenticated 401 API limits when GITHUB_TOKEN is omitted locally

* test(ci): clean up PR label test samples

  - Removed duplicate PR entries (#11 and #57) to reduce redundant API calls during testing
  - Renamed sample test cases to correctly reflect their expected labels (e.g. `size-l-skill-format-check` -> `size-m-skill-format-check`)

* fix(ci): bootstrap new labels before applying to PRs

  - Prior changes correctly made full label sync best-effort, but broke the flow for brand new domains
  - GitHub API returns a 422 error if you attempt to attach a label to an Issue/PR that does not exist in the repository
  - Added a targeted bootstrap loop to create/sync specifically the labels in `toAdd` before attempting `client.addLabels()`
  - Left the remaining global label synchronization as a best-effort trailing action

* test(ci): automate PR label regression testing

  - Added a dedicated GitHub Actions workflow (`pr-labels-test.yml`) to automatically run `test.js` against `samples.json` whenever the labeling logic is updated
  - Documented local testing instructions in `scripts/pr-labels/README.md`

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
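The stricter frontmatter delimiter detection mentioned in the commit list ("strictly match '---' using regex") could be implemented along these lines; `extractFrontmatter` is an illustrative sketch, not the script's actual function:

```javascript
// Extract YAML frontmatter, requiring the closing delimiter to be
// exactly "---" on its own line so "----" or "--- extra" do not
// accidentally close the block. CRLF/CR are normalized first.
function extractFrontmatter(content) {
  const lines = content.replace(/\r\n?/g, '\n').split('\n');
  if (lines[0] !== '---') return null; // no frontmatter at all
  const end = lines.findIndex((line, i) => i > 0 && /^---$/.test(line));
  return end === -1 ? null : lines.slice(1, end).join('\n');
}
```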
This PR enhances the CI/CD pipeline by refining how PRs are labeled with business areas and introduces a new automated check to ensure skill formatting compliance.
Key Changes:
- Replaced the generic module labels (`area/shortcuts`, `area/skills`, `area/cmd`) with precise, business-specific tags (e.g., `area/im`, `area/vc`, `area/ccm`, `area/base`, `area/mail`, `area/calendar`, `area/task`, `area/contact`). Added `PATH_TO_AREA_MAP` to automatically map file paths to their correct domain.
- Added a new workflow (`skill-format-check.yml`) that triggers on `main` branch pushes and PRs. It runs `scripts/skill-format-check/index.js` to automatically validate the format of files under the `skills/**` directory.
- Added the `pull-requests: write` permission to the PR labeling workflow so it can correctly apply tags.
- Updated workflow actions to `v4` tags.

Why are these changes needed?

- Mapping file paths directly to their business areas improves context and code ownership visibility.
- The skill format check prevents malformed skill configurations from being merged.
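A hypothetical sketch of how such a path-to-area mapping can drive labeling; the prefixes and labels below are examples for illustration, not the real `PATH_TO_AREA_MAP` contents:

```javascript
// Example prefix-to-label mapping; the real map in scripts/pr-labels
// covers more shortcuts/skills paths.
const PATH_TO_AREA_MAP = {
  'skills/lark-im/': 'area/im',
  'skills/lark-ccm/': 'area/ccm',
  'shortcuts/vc/': 'area/vc',
};

// Collect the distinct area labels touched by a PR's changed files.
function areasForFiles(files) {
  const areas = new Set();
  for (const file of files) {
    for (const [prefix, label] of Object.entries(PATH_TO_AREA_MAP)) {
      if (file.startsWith(prefix)) areas.add(label);
    }
  }
  return [...areas].sort();
}
```

Files outside any mapped prefix (docs, ci, tests) simply contribute no area label, matching the stated intent of keeping tags focused on business modules.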
Test
Size + Biz label check
https://bytedance.larkoffice.com/wiki/G8jWwVPzviCs8nk15mQcREDFnRh
Skill check
- good-skill
- good-skill-minimal
- good-skill-complex
- bad-skill
- bad-skill-no-frontmatter
- bad-skill-unclosed-frontmatter

Sample: williamfzc#5