
feat: (lark-doc) Doc Search advanced boolean and intitle search syntax#210

Merged: fangshuyu-768 merged 1 commit into main from docs/enhance-doc-search-syntax on Apr 8, 2026

@MakeLarkGreatAgain (Collaborator) commented Apr 2, 2026

Summary

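Currently, lark-cli docs +search exposes only basic keyword search, so the AI Agent is unaware of the advanced logical syntax that the underlying search engine supports. As a result, for complex intents such as "exact title match" or "exclude specific content", the Agent can only pull back large amounts of redundant data and filter it again locally, which readily causes token blow-up and hallucinated total counts. This PR updates the Skill description file to formally expose the advanced Boolean and intitle: syntax to the AI Agent.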

Changes

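  • Parameter description enhancement: in skills/lark-doc/references/lark-doc-search.md, added an explanation of the advanced Boolean syntax to the --query parameter (AND, OR, - exclusion, and "" exact match).
  • Decision rule rework: updated the query-semantics logic under ## 决策规则 (Decision Rules), explicitly requiring the Agent to prefer the intitle: syntax so that filtering is pushed down to the server, and strictly prohibiting a local second-pass comparison whenever server-side filtering is available.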

Test Plan

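  • Format check: ensure the Markdown syntax and tables render correctly.
  • Intent translation test (simulating Agent behavior):
    • Scenario 1 (negative exclusion): given the input "找飞书项目文档不要纪要" ("find Feishu project docs, but not meeting minutes"), the Agent successfully generates lark-cli docs +search --query "飞书 -纪要".
    • Scenario 2 (logical OR): given the input "查找A团队介绍或B团队介绍的文档" ("find docs for the Team A introduction or the Team B introduction"), the Agent generates lark-cli docs +search --query ""A团队介绍" OR "B团队介绍"".
    • Scenario 3 (exact title match): given the input "查找标题明确为 2026规划 的文档" ("find docs whose title is explicitly 2026规划"), the Agent generates lark-cli docs +search --query "intitle:"2026规划"".
  • Compatibility verification: manually run the generated commands above in a terminal and confirm that the Feishu OpenAPI returns the expected results without errors.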
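
As written, the nested double quotes in scenarios 2 and 3 would not survive a POSIX shell. The following is a minimal sketch of composing the three query strings and quoting them safely. It assumes the operators behave as this PR describes, which the review comments in this thread flag as unconfirmed. The helper names `phrase`, `intitle`, and `build_command` are hypothetical and not part of lark-cli.

```python
import shlex


def phrase(text: str) -> str:
    """Exact-phrase form: wrap in double quotes (inner quotes dropped)."""
    return '"' + text.replace('"', "") + '"'


def intitle(text: str) -> str:
    """Title filter form as described in this PR, e.g. intitle:"2026规划"."""
    return "intitle:" + phrase(text)


def build_command(query: str) -> str:
    """Render a copy-pasteable command; shlex.join quotes the query for POSIX shells."""
    return shlex.join(["lark-cli", "docs", "+search", "--query", query])


exclusion = "飞书 -纪要"                                    # scenario 1
either = phrase("A团队介绍") + " OR " + phrase("B团队介绍")   # scenario 2
title = intitle("2026规划")                                 # scenario 3

for q in (exclusion, either, title):
    print(build_command(q))
```

Passing the query as a single argument vector element (for example via `subprocess.run` with a list) avoids shell quoting altogether; `shlex.join` is only needed when the command must be shown or pasted as one string.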

Summary by CodeRabbit

  • Documentation
    • Enhanced --query docs to describe advanced Boolean syntax: space=AND, OR, term exclusion (-), exact-phrase ("..."), and title-specific intitle: filtering.
    • Clarified query semantics: prefer service-side advanced filtering; allow client-side filtering only for complex local matches and require stripping highlight tags before title comparisons.

@MakeLarkGreatAgain MakeLarkGreatAgain added domain/doc Docs domain size/S Low-risk docs, CI, test, or chore only changes labels Apr 2, 2026
@github-actions github-actions Bot added domain/ccm PR touches the ccm domain size/M Single-domain feat or fix with limited business impact and removed domain/doc Docs domain labels Apr 2, 2026

coderabbitai Bot commented Apr 2, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9df3e024-3636-493b-9326-e8ba22a4fad4

📥 Commits

Reviewing files that changed from the base of the PR and between b1771a2 and f0e9c19.

📒 Files selected for processing (1)
  • skills/lark-doc/references/lark-doc-search.md
✅ Files skipped from review due to trivial changes (1)
  • skills/lark-doc/references/lark-doc-search.md

📝 Walkthrough


Documentation clarifies --query advanced Boolean syntax (space=AND, OR, - exclusion, "" exact phrase, intitle: title match) and mandates service-side filtering via --query; client-side filtering is allowed only for complex local matches after stripping highlight tags.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Search Documentation: skills/lark-doc/references/lark-doc-search.md | Expanded --query docs with explicit Boolean syntax examples; updated query semantics to require using service-side --query (e.g., intitle:"X") instead of positional keywords; limited client-side filtering to complex local cases and required stripping highlight tags from title_highlighted before comparison. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Suggested labels

documentation

Suggested reviewers

  • fangshuyu-768

Poem

🐰 I hopped through clauses, tidy and spry,

turning fuzzy trails into clear-sighted sky.
Service sniffs titles, exact by design,
Client trims tags when the match must refine.
A rabbit's small cheer — docs neat and fine.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main change: documentation updates exposing advanced Boolean and intitle search syntax to AI Agents. |
| Description check | ✅ Passed | The description comprehensively covers motivation, changes, and test scenarios, though the final compatibility verification item is incomplete (appears cut off mid-sentence). |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@github-actions github-actions Bot removed the size/S Low-risk docs, CI, test, or chore only changes label Apr 2, 2026

greptile-apps Bot commented Apr 2, 2026

Greptile Summary

This PR updates skills/lark-doc/references/lark-doc-search.md to expose advanced Boolean search operators (AND, OR, -, "", intitle:) in the --query parameter description and rewrites the query-semantics decision rule to require server-side filtering over client-side post-processing.

Two prior review threads remain unresolved and are worth tracking: the underlying Lark Search v2 API's actual support for these operators has not been confirmed, and the prohibition on client-side filtering for "exact title match" is logically incorrect because intitle:X is a contains predicate — not an equals predicate — which removes a necessary safeguard without replacing it.
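
Because intitle:X is a contains predicate, an "exactly equals" intent still needs a final client-side equality check. The sketch below illustrates that safeguard under stated assumptions: each result is a dict carrying the `title_highlighted` field named in this thread, and highlight markup takes the form of angle-bracket tags (the exact tag name is not given in the thread, so the regex strips any tag). The function names are hypothetical.

```python
import html
import re

# Assumption: highlight markup looks like <h>...</h> or similar angle-bracket tags.
TAG_RE = re.compile(r"<[^>]+>")


def plain_title(title_highlighted: str) -> str:
    """Strip highlight tags and unescape HTML entities from a highlighted title."""
    return html.unescape(TAG_RE.sub("", title_highlighted)).strip()


def exact_title_matches(results: list, wanted: str) -> list:
    """Keep only results whose tag-free title equals `wanted` exactly."""
    return [r for r in results if plain_title(r["title_highlighted"]) == wanted]


# Hypothetical results for a query such as intitle:"2026规划"
results = [
    {"title_highlighted": "<h>2026规划</h>"},
    {"title_highlighted": "<h>2026规划</h>草案"},
]
print(exact_title_matches(results, "2026规划"))
```

Server-side filtering still narrows the candidate set; the local comparison only removes titles that merely contain the phrase. Titles that legitimately contain angle brackets would need a stricter tag pattern than the generic one used here.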

Confidence Score: 4/5

The documentation change is low-risk in isolation but two prior P1 threads remain unresolved — unconfirmed API operator support and a logical regression in the exact-title-match rule — making merge premature.

Score of 4 reflects that no new P0/P1 issues were found in this review pass, but the two open P1 threads from the previous review cycle (API support unconfirmed, intitle: is not an exact-equals operator) have not been addressed. Until those are resolved or explicitly acknowledged with a mitigation plan, the decision rule change poses a real risk of agent misbehavior for exact-title-match intents.

skills/lark-doc/references/lark-doc-search.md — specifically the query-semantics decision rule and the undocumented API operator support assumption.

Vulnerabilities

No security concerns identified. The change is documentation-only (a Markdown skill reference file) and does not touch authentication, data handling, or executable code.

Important Files Changed

| Filename | Overview |
| --- | --- |
| skills/lark-doc/references/lark-doc-search.md | Enhanced --query docs with Boolean/intitle syntax and tightened decision rules; prior open threads flag that API support for these operators is unconfirmed and that intitle: is a contains-not-equals predicate, making the exact-title prohibition on client-side filtering a potential regression. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[User Query Intent] --> B{Intent type}
    B -->|Simple keyword| C[query: keyword]
    B -->|AND multiple terms| D[query: termA termB]
    B -->|Logical OR| E[query: termA OR termB]
    B -->|Exclude term| F[query: termA -termB]
    B -->|Phrase match| G[query: exact phrase in quotes]
    B -->|Title contains X| H[query: intitle X]
    B -->|Title exactly equals X| I[query: intitle X — WARNING: still a contains match]
    C & D & E & F & G & H & I --> J[POST search/v2/doc_wiki/search]
    J --> K{has_more?}
    K -->|No| L[Return results]
    K -->|Yes and user wants all| M{Pages fetched less than 5?}
    M -->|Yes| J
    M -->|No| N[Report progress and ask user]
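
The pagination branch of the flowchart can be sketched as a loop with a page cap. This is an illustration only: `fetch_page` is a hypothetical callable standing in for the search request, and the response keys `items`, `has_more`, and `page_token` are assumed names, not confirmed fields of the API.

```python
from typing import Callable, Optional

MAX_PAGES = 5  # page cap taken from the flowchart


def collect(fetch_page: Callable[[Optional[str]], dict], want_all: bool):
    """Return (items, needs_confirmation) following the flowchart's paging rule."""
    items, token, pages = [], None, 0
    while True:
        page = fetch_page(token)
        pages += 1
        items.extend(page.get("items", []))
        # No more pages, or the user only wanted the first page: return results.
        if not page.get("has_more") or not want_all:
            return items, False
        # Cap reached: the caller should report progress and ask the user.
        if pages >= MAX_PAGES:
            return items, True
        token = page.get("page_token")
```

Returning a `needs_confirmation` flag instead of prompting inside the loop keeps the paging logic testable and leaves the "ask the user" step to the caller.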

Reviews (3): Last reviewed commit: "docs(lark-doc): document advanced boolea..." | Re-trigger Greptile

Comment thread skills/lark-doc/references/lark-doc-search.md
Comment thread skills/lark-doc/references/lark-doc-search.md Outdated
Comment thread skills/lark-doc/references/lark-doc-search.md Outdated

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
skills/lark-doc/references/lark-doc-search.md (1)

62-62: Consider adding concrete examples for the client-side filtering exception.

The decision rule states (translated): "Client-side filtering is allowed only for complex local matching scenarios that server-side syntax cannot cover."

This exception clause provides necessary flexibility but lacks concrete examples. Without specific scenarios, different developers or the AI Agent might interpret "complex local matching scenarios" inconsistently, potentially leading to unnecessary client-side filtering when server-side syntax would suffice.

📝 Suggested clarification with examples

Consider appending specific examples after the exception clause, such as:

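-Query semantics: you must prefer the advanced --query syntax (such as intitle:, "", -) to push filtering down to the server. When the user asks for "title exactly equals X", use --query "intitle:\"X\"" directly; running a fuzzy search first and then filtering a second time on the client is strictly prohibited. Client-side filtering is allowed only for complex local comparison scenarios that server-side syntax cannot cover, and the highlight tags in title_highlighted must be stripped before comparison.
+Query semantics: you must prefer the advanced --query syntax (such as intitle:, "", -) to push filtering down to the server. When the user asks for "title exactly equals X", use --query "intitle:\"X\"" directly; running a fuzzy search first and then filtering a second time on the client is strictly prohibited. Client-side filtering is allowed only for complex local comparison scenarios that server-side syntax cannot cover (for example: multi-field combined conditions, regular-expression matching, custom scoring and ranking), and the highlight tags in title_highlighted must be stripped before comparison.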
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/lark-doc/references/lark-doc-search.md` at line 62, Add concrete
examples illustrating the allowed "client-side filtering" exception right after
the sentence that permits client filtering when server syntax cannot cover
complex local matching; specifically, append 2–3 short scenarios (e.g., fuzzy
unicode normalization across diacritics, language-specific tokenization
differences, or multi-field proximity matches that the server lacks) and show
the minimal client-side steps (strip HTML tags from title_highlighted, then
apply the local comparison) and when to prefer --query with intitle:"X" instead;
reference the existing terms --query, intitle:, and title_highlighted so readers
can locate the rule and understand exact inputs and the required pre-filtering
step.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@skills/lark-doc/references/lark-doc-search.md`:
- Line 49: Update the `--query <text>` documentation to remove the unsupported
Boolean/operator examples and state that the Feishu Search v2 endpoint
`/open-apis/search/v2/doc_wiki/search` accepts only a simple text string (basic
keyword matching, max ~50 characters) — i.e., replace the current list of
Boolean operators (AND via spaces, OR, -, "", intitle:) with a concise note
about plain keyword matching; also note that `shortcuts/doc/docs_search.go`
passes the query directly to the API so no client-side parsing/validation is
performed.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b910cee2-9ac6-47c0-9b7a-a73187c77dc7

📥 Commits

Reviewing files that changed from the base of the PR and between 79f43dc and 80c1917.

📒 Files selected for processing (1)
  • skills/lark-doc/references/lark-doc-search.md

Comment thread skills/lark-doc/references/lark-doc-search.md Outdated

github-actions Bot commented Apr 2, 2026

🚀 PR Preview Install Guide

🧰 CLI update

npm i -g https://pkg.pr.new/larksuite/cli/@larksuite/cli@f0e9c196ae86b3a53b1e01be7e3b748e0ed15722

🧩 Skill update

npx skills add larksuite/cli#docs/enhance-doc-search-syntax -y -g


CLAassistant commented Apr 2, 2026

CLA assistant check
All committers have signed the CLA.

fangshuyu-768 previously approved these changes Apr 7, 2026
…or AI agents

Change-Id: I647ffad4579c503711a7ea220c390dca760cd6de
@wittam-01 wittam-01 force-pushed the docs/enhance-doc-search-syntax branch from b1771a2 to f0e9c19 Compare April 8, 2026 13:24
@fangshuyu-768 fangshuyu-768 merged commit d5d31f0 into main Apr 8, 2026
12 checks passed
@fangshuyu-768 fangshuyu-768 deleted the docs/enhance-doc-search-syntax branch April 8, 2026 14:01
@coderabbitai coderabbitai Bot mentioned this pull request Apr 29, 2026
2 tasks
Labels

domain/ccm PR touches the ccm domain size/M Single-domain feat or fix with limited business impact

3 participants