feat: (lark-doc) Doc Search advanced boolean and intitle search syntax#210
fangshuyu-768 merged 1 commit into main
CodeRabbit: No actionable comments were generated in the recent review. 🎉
Files selected for processing: 1. Files skipped from review due to trivial changes: 1.
📝 Walkthrough: Documentation clarifies …
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~3 minutes
🚥 Pre-merge checks: ✅ 3 passed
Greptile Summary
This PR updates … Two prior review threads remain unresolved and are worth tracking: the underlying Lark Search v2 API's actual support for these operators has not been confirmed, and the prohibition on client-side filtering for "exact title match" is logically incorrect because …

Confidence Score: 4/5
The documentation change is low-risk in isolation, but two prior P1 threads remain unresolved (unconfirmed API operator support and a logical regression in the exact-title-match rule), making merge premature.

The score of 4 reflects that no new P0/P1 issues were found in this review pass, but the two open P1 threads from the previous review cycle (API support unconfirmed; `intitle:` is not an exact-equals operator) have not been addressed. Until those are resolved or explicitly acknowledged with a mitigation plan, the decision rule change poses a real risk of agent misbehavior for exact-title-match intents.

skills/lark-doc/references/lark-doc-search.md: specifically the query-semantics decision rule and the undocumented API operator support assumption.
| Filename | Overview |
|---|---|
| skills/lark-doc/references/lark-doc-search.md | Enhanced --query docs with Boolean/intitle syntax and tightened decision rules; prior open threads flag that API support for these operators is unconfirmed and that intitle: is a contains-not-equals predicate, making the exact-title prohibition on client-side filtering a potential regression. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[User Query Intent] --> B{Intent type}
B -->|Simple keyword| C[query: keyword]
B -->|AND multiple terms| D[query: termA termB]
B -->|Logical OR| E[query: termA OR termB]
B -->|Exclude term| F[query: termA -termB]
B -->|Phrase match| G[query: exact phrase in quotes]
B -->|Title contains X| H[query: intitle X]
B -->|Title exactly equals X| I[query: intitle X — WARNING: still a contains match]
C & D & E & F & G & H & I --> J[POST search/v2/doc_wiki/search]
J --> K{has_more?}
K -->|No| L[Return results]
K -->|Yes and user wants all| M{Pages fetched less than 5?}
M -->|Yes| J
M -->|No| N[Report progress and ask user]
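The pagination branch of the flowchart can be sketched roughly as follows. This is a minimal illustration, not the CLI's actual implementation: `fetch_page`, the `items` / `has_more` / `page_token` field names, and the behaviour when the user does not want all results (return the first page) are assumptions made for the example. Only the five-page cap and the "ask the user" fallback come from the flowchart.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

MAX_PAGES = 5  # page cap taken from the flowchart

# Hypothetical page shape: {"items": [...], "has_more": bool, "page_token": str}
Page = Dict[str, Any]


def collect_results(
    fetch_page: Callable[[Optional[str]], Page],
    want_all: bool,
) -> Tuple[List[Any], bool]:
    """Fetch pages until done; return (items, needs_user_confirmation).

    `fetch_page` is a hypothetical callable that takes a page token
    (None for the first page) and returns one page of search results.
    """
    items: List[Any] = []
    token: Optional[str] = None
    pages = 0
    while True:
        page = fetch_page(token)
        pages += 1
        items.extend(page.get("items", []))
        if not page.get("has_more"):
            return items, False  # no more pages: return results
        if not want_all:
            return items, False  # assumption: stop after the first page
        if pages >= MAX_PAGES:
            return items, True   # cap reached: report progress, ask the user
        token = page.get("page_token")
```

The boolean in the return value lets the caller decide how to report progress, rather than having the loop prompt the user itself.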
Reviews (3): Last reviewed commit: "docs(lark-doc): document advanced boolea..."
Actionable comments posted: 1
🧹 Nitpick comments (1)
skills/lark-doc/references/lark-doc-search.md (1)
62-62: Consider adding concrete examples for the client-side filtering exception.
The decision rule states: "Client-side filtering is allowed only for complex local matching scenarios that the server syntax cannot cover."
This exception clause provides necessary flexibility but lacks concrete examples. Without specific scenarios, different developers or the AI Agent might interpret "complex local matching scenarios" inconsistently, potentially leading to unnecessary client-side filtering when server-side syntax would suffice.
📝 Suggested clarification with examples
Consider appending specific examples after the exception clause, such as:
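- Query semantics: always prefer the advanced `--query` syntax (such as `intitle:`, `""`, `-`) to push filtering down to the server. When the user asks for "title exactly equals X", use `--query "intitle:\"X\""` directly; running a fuzzy search first and then filtering a second time on the client is strictly forbidden. Client-side filtering is allowed only for complex local matching scenarios that the server syntax cannot cover, and the highlight tags in `title_highlighted` must be stripped before comparing.
+ Query semantics: always prefer the advanced `--query` syntax (such as `intitle:`, `""`, `-`) to push filtering down to the server. When the user asks for "title exactly equals X", use `--query "intitle:\"X\""` directly; running a fuzzy search first and then filtering a second time on the client is strictly forbidden. Client-side filtering is allowed only for complex local matching scenarios that the server syntax cannot cover (for example: multi-field combined conditions, regular-expression matching, custom scoring and ranking), and the highlight tags in `title_highlighted` must be stripped before comparing.
🤖 Prompt for AI Agents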
Verify each finding against the current code and only fix it if needed. In `@skills/lark-doc/references/lark-doc-search.md` at line 62, Add concrete examples illustrating the allowed "client-side filtering" exception right after the sentence that permits client filtering when server syntax cannot cover complex local matching; specifically, append 2–3 short scenarios (e.g., fuzzy unicode normalization across diacritics, language-specific tokenization differences, or multi-field proximity matches that the server lacks) and show the minimal client-side steps (strip HTML tags from title_highlighted, then apply the local comparison) and when to prefer --query with intitle:"X" instead; reference the existing terms --query, intitle:, and title_highlighted so readers can locate the rule and understand exact inputs and the required pre-filtering step.
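The client-side step mentioned in this comment (strip the highlight tags from `title_highlighted`, then compare locally) might look like the sketch below. It assumes the highlight markup is HTML-like tags and that each result is a dict carrying a `title_highlighted` string; neither assumption is confirmed by this thread, and the helper names are hypothetical.

```python
import html
import re
from typing import Dict, List

# Assumption: highlight markup consists of HTML-like tags such as <em>...</em>.
_TAG_RE = re.compile(r"<[^>]+>")


def plain_title(title_highlighted: str) -> str:
    """Remove highlight tags, then decode HTML entities and trim whitespace."""
    without_tags = _TAG_RE.sub("", title_highlighted)
    return html.unescape(without_tags).strip()


def exact_title_matches(items: List[Dict[str, str]], wanted: str) -> List[Dict[str, str]]:
    """Keep only items whose de-highlighted title equals `wanted` exactly."""
    return [
        item for item in items
        if plain_title(item.get("title_highlighted", "")) == wanted
    ]
```

Tags are stripped before entities are decoded so that an escaped `&lt;` in a real title is not mistaken for markup. A filter like this is also what would separate "title equals X" from "title contains X" if `intitle:` turns out to be a contains match, as the Greptile review suggests.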
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@skills/lark-doc/references/lark-doc-search.md`:
- Line 49: Update the `--query <text>` documentation to remove the unsupported
Boolean/operator examples and state that the Feishu Search v2 endpoint
`/open-apis/search/v2/doc_wiki/search` accepts only a simple text string (basic
keyword matching, max ~50 characters) — i.e., replace the current list of
Boolean operators (AND via spaces, OR, -, "", intitle:) with a concise note
about plain keyword matching; also note that `shortcuts/doc/docs_search.go`
passes the query directly to the API so no client-side parsing/validation is
performed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b910cee2-9ac6-47c0-9b7a-a73187c77dc7
📒 Files selected for processing (1)
skills/lark-doc/references/lark-doc-search.md
🚀 PR Preview Install Guide
🧰 CLI update: `npm i -g https://pkg.pr.new/larksuite/cli/@larksuite/cli@f0e9c196ae86b3a53b1e01be7e3b748e0ed15722`
🧩 Skill update: `npx skills add larksuite/cli#docs/enhance-doc-search-syntax -y -g`
…or AI agents Change-Id: I647ffad4579c503711a7ea220c390dca760cd6de
b1771a2 to f0e9c19
Summary
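Currently, `lark-cli docs +search` exposes only basic keyword search, so AI Agents are unaware of the advanced logical syntax that the underlying search engine supports. As a result, when handling complex intents such as "exact title match" or "exclude specific content", the Agent can only pull back large amounts of redundant data and filter it a second time locally, which easily causes token blow-up and hallucinated total counts. This PR updates the Skill description file to formally expose the advanced Boolean and `intitle:` syntax to AI Agents.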
Changes
Test Plan
Summary by CodeRabbit
`--query` docs to describe advanced Boolean syntax: space = AND, `OR`, term exclusion (`-`), exact-phrase (`"..."`), and title-specific `intitle:` filtering.
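As an illustration of the syntax listed above, a query string could be assembled as in the sketch below. `build_query` and its parameters are hypothetical helpers, not part of `lark-cli`. The review threads on this PR note that server-side support for these operators is unconfirmed, so the sketch only shows how the strings are composed, not that the API honours them. How `OR` binds relative to space-separated terms is also not specified here.

```python
from typing import Iterable, Optional


def _quote_if_needed(term: str) -> str:
    """Wrap a term in double quotes when it contains whitespace."""
    return f'"{term}"' if any(ch.isspace() for ch in term) else term


def build_query(
    all_of: Iterable[str] = (),
    any_of: Iterable[str] = (),
    exclude: Iterable[str] = (),
    phrase: Optional[str] = None,
    intitle: Optional[str] = None,
) -> str:
    """Compose a --query string from the documented operators (hypothetical helper)."""
    parts = list(all_of)                      # space-separated terms = AND
    any_terms = list(any_of)
    if any_terms:
        parts.append(" OR ".join(any_terms))  # logical OR
    parts.extend(f"-{term}" for term in exclude)  # term exclusion
    if phrase:
        parts.append(f'"{phrase}"')           # exact phrase
    if intitle:
        parts.append(f"intitle:{_quote_if_needed(intitle)}")  # title filter
    return " ".join(parts)
```

For example, `build_query(all_of=["report"], exclude=["draft"])` produces `report -draft`, which would then be passed as the value of `--query`.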