Make llms-full output monolithic #26
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough
This PR consolidates LLM full-context artifact generation from per-topic bundles (
Changes
LLM Full-Context Artifact Refactoring
Eval Infrastructure & Fixtures
Sequence Diagram
sequenceDiagram
participant Runner as run-llms-eval
participant Sandbox as LlmsSandbox
participant Gateway as LLM Gateway
participant Eval as Vitest/EVAL.ts
participant FS as Filesystem/Archive
Runner->>Sandbox: createLlmsSandbox(fixture, variant)
Runner->>Gateway: generateText({SYSTEM_PROMPT, PROMPT.md, sandbox tools})
Gateway-->>Runner: text, steps, tokens, toolCalls
Runner->>Sandbox: write transcript.json (includes variant/benchmark)
Runner->>Eval: runVitest(EVAL.ts, TRANSCRIPT_PATH)
Eval-->>Runner: { passed, output }
Runner->>FS: archive transcript + sandbox files -> results/<fixture>/<variant>/...
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 12
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/authoring/frontmatter.mdx`:
- Line 27: The sentence claiming "Each page declares a single group slug" is
incorrect; update the wording in docs/authoring/frontmatter.mdx to reflect that
the frontmatter key "group" may accept multiple group slugs (e.g., an array) and
that pages can be shared across multiple groups defined in docs.config.ts; keep
the rest of the explanation about intersection producing the nav tree, llms.txt
section headings, search metadata, and AGENTS.md grouping but change "single" to
language like "one or more" or "one or more group slugs (pages may belong to
multiple groups)".
In `@docs/build/connect-docs-site.mdx`:
- Line 12: The Mermaid component is HTML-escaped so MDX shows the literal text;
replace the escaped tag string (&lt;Mermaid ... /&gt;) with an actual JSX/MDX
component invocation (<Mermaid ... />) so the diagram renders, ensure the prop
expression (the flowchart string passed via {`...`} or a string variable)
remains intact, and confirm the Mermaid component is imported/available in the
doc (check for a Mermaid import or add one) so the component renders correctly.
In `@docs/how-it-works.mdx`:
- Line 46: Update the docs text so Agent Readability output is docs-scoped:
change the standalone "agent-readability.json" entry to
"docs/agent-readability.json" (and similarly ensure any other occurrences in the
file, e.g., the repeated block at lines ~81-83, reference the docs-scoped path).
Leave existing entries that already have "docs/..." as-is and ensure the
sentence consistently lists all generated artifacts under the docs/ namespace
where appropriate.
In `@evals/lib/llms-eval.ts`:
- Around line 18-22: The code currently sets projectRoot to "" when
process.env.TRANSCRIPT_PATH is missing which lets answerPath resolve from CWD;
change the initialization so you require TRANSCRIPT_PATH and fail fast: if
process.env.TRANSCRIPT_PATH is falsy, throw a clear Error (or call
process.exit(1)) stating "TRANSCRIPT_PATH must be set", otherwise compute
projectRoot = resolve(dirname(process.env.TRANSCRIPT_PATH), "..") and then
compute answerPath; update the symbols projectRoot and answerPath in
evals/lib/llms-eval.ts accordingly so no default empty string is used.
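A minimal sketch of the fail-fast initialization this comment asks for. The helper name `resolveProjectRoot` and the `Env` type are hypothetical; only the error message and the `resolve(dirname(...), "..")` expression come from the comment, and the `answerPath` derivation is left out because it is not shown here.

```typescript
import { dirname, resolve } from "node:path";

type Env = Record<string, string | undefined>;

// Hypothetical helper: the real module may compute projectRoot inline.
function resolveProjectRoot(env: Env): string {
  const transcriptPath = env.TRANSCRIPT_PATH;
  // Fail fast instead of falling back to "" (which would resolve from CWD).
  if (!transcriptPath) {
    throw new Error("TRANSCRIPT_PATH must be set");
  }
  // The transcript is assumed to sit one directory below the project root.
  return resolve(dirname(transcriptPath), "..");
}
```

Passing the environment in as a parameter is only a convenience for testing; calling it with `process.env` would match the behavior described above.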
In `@evals/lib/llms-metrics.ts`:
- Around line 101-105: The returned `passed` value currently only checks
`reasons.length === 0` and ignores `wrongGroupReads`, allowing false positives;
update the returned object so `passed` is true only when both `reasons` and
`wrongGroupReads` are empty—e.g., change the `passed` expression to check both
`reasons.length === 0` and `wrongGroupReads` is empty (use a safe check like
`(wrongGroupReads?.length ?? 0) === 0`) while leaving `reasons` and
`wrongGroupReads` fields unchanged in the returned object.
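The stricter `passed` rule might look like the sketch below. The `VariantCheck` shape and the `buildVariantCheck` name are assumptions for illustration; only the field names and the `(wrongGroupReads?.length ?? 0) === 0` check come from the comment.

```typescript
// Assumed result shape; the real type in llms-metrics.ts may carry more fields.
interface VariantCheck {
  passed: boolean;
  reasons: string[];
  wrongGroupReads?: string[];
}

// Hypothetical helper that assembles the returned object.
function buildVariantCheck(reasons: string[], wrongGroupReads?: string[]): VariantCheck {
  const hasNoReasons = reasons.length === 0;
  const hasNoWrongGroupReads = (wrongGroupReads?.length ?? 0) === 0;
  return {
    // Pass only when both lists are empty.
    passed: hasNoReasons && hasNoWrongGroupReads,
    reasons,
    wrongGroupReads,
  };
}
```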
In `@evals/lib/llms-variants.ts`:
- Around line 68-70: The eval corpus in evals/lib/llms-variants.ts still
references deprecated outputs (e.g. "docs/llms-full*" and the
identifier/docsLlmsFullTxt) that contradict the new monolithic contract; locate
every occurrence of the literals and the symbol docsLlmsFullTxt (and the
website/bundle mode description strings around the other noted blocks) and
remove or update them to reflect the current contract (no docs/llms-full*
artifacts, update to the actual files produced by website or bundle modes), and
update the fixture documents used by the eval answers so they no longer encode
the old behavior; ensure all variant descriptions and fixture content (the
strings in the blocks at the other noted locations) are consistent with the
monolithic output format.
In `@evals/llms/ambiguous-output-routing/PROMPT.md`:
- Line 1: Add a top-level H1 heading above the existing body text so the file
satisfies markdownlint MD041; prepend a single H1 line (for example "# Using the
Leadtype docs site") before the sentence starting "Using the Leadtype docs site,
I need the agent-facing output APIs..." ensuring the original text becomes the
first paragraph under that H1.
In `@evals/llms/cross-group-agent-flows/PROMPT.md`:
- Line 1: Add a top-level heading to the file PROMPT.md so the first line is an
H1 (e.g., "# Cross-group agent flows: hosted vs npm bundle") to satisfy MD041;
update the existing first-paragraph content to follow that H1 and ensure the
file now begins with the heading before any body text.
In `@evals/llms/negative-vector-index/PROMPT.md`:
- Line 1: Change the first line "Using the Leadtype docs site, answer this: does
Leadtype include a hosted database-backed vector index by default? If not, what
does it use by default and when would embeddings be added?" into a top-level H1
by prefixing it with "# " so the file's PROMPT.md starts with an H1 header to
satisfy MD041 (i.e., replace the plain paragraph at the top with an H1).
In `@evals/llms/single-group-authoring/PROMPT.md`:
- Line 1: Update the prompt wording in PROMPT.md to reflect the current artifact
contract: replace the phrase "full-context bundles" with "monolithic output
(`/llms-full.txt`)" and adjust the surrounding sentence so it asks the model to
summarize how frontmatter `group` controls navigation and the monolithic output;
keep the requirement to include at least two optional frontmatter fields. Ensure
any mention of deprecated artifact shapes is removed and the prompt explicitly
references `/llms-full.txt` as the target artifact format.
In `@evals/llms/single-page-cli-flag/PROMPT.md`:
- Line 1: Add a top-level Markdown heading as the first line of PROMPT.md to
satisfy markdownlint rule MD041 (first-line-heading); prepend a descriptive H1
(for example, "# Using the Leadtype docs site" or similar) before the existing
plain-text prompt so the file begins with a single-line heading.
In `@evals/run-llms-eval.ts`:
- Around line 55-61: The parsePositiveInt function silently accepts a missing
flag value by defaulting undefined to "1"; change parsePositiveInt to explicitly
detect value === undefined and throw an Error like `${flag} requires a value`
instead of defaulting, then parse and validate the provided string as an integer
(keep the existing Number.isInteger and >0 check for non-numeric or non-positive
inputs). Apply the same fix to the other CLI parsing helper(s) used around lines
85-87 so they also reject undefined/missing flag values rather than defaulting.
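One way the stricter parser could look, as a sketch. The `(flag, value)` signature and the wording of the second error message are assumptions; the missing-value message follows the comment.

```typescript
// Assumed signature: the flag name is passed in for error messages.
function parsePositiveInt(flag: string, value: string | undefined): number {
  // Reject a missing value explicitly instead of defaulting to "1".
  if (value === undefined) {
    throw new Error(`${flag} requires a value`);
  }
  const parsed = Number(value);
  // Covers non-numeric input (NaN), fractions, zero, and negatives.
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`${flag} must be a positive integer, got "${value}"`);
  }
  return parsed;
}
```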
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: c0c4447b-53b6-4c13-9a84-ffb3de989ee2
⛔ Files ignored due to path filters (4)
- apps/example/src/generated/agent-readability.json is excluded by !**/generated/**
- apps/example/src/generated/docs-nav.json is excluded by !**/generated/**
- apps/example/src/generated/docs-search-content.json is excluded by !**/generated/**
- apps/example/src/generated/docs-search-index.json is excluded by !**/generated/**
📒 Files selected for processing (54)
- apps/example/scripts/llm-generate-real.ts
- apps/example/scripts/llm-generate.ts
- apps/example/scripts/mdx-convert.ts
- apps/example/src/routeTree.gen.ts
- apps/example/src/routes/docs/reference/evals.tsx
- apps/example/tests/e2e/smoke.e2e.ts
- docs/authoring/components.mdx
- docs/authoring/frontmatter.mdx
- docs/build/bundle-package-docs.mdx
- docs/build/connect-docs-site.mdx
- docs/build/optimize-docs-for-agents.mdx
- docs/docs.config.ts
- docs/how-it-works.mdx
- docs/index.mdx
- docs/methodology.mdx
- docs/quickstart.mdx
- docs/reference/cli.mdx
- docs/reference/evals.mdx
- docs/reference/llm.mdx
- docs/reference/remark.mdx
- evals/README.md
- evals/lib/llms-eval.ts
- evals/lib/llms-metrics.test.ts
- evals/lib/llms-metrics.ts
- evals/lib/llms-sandbox.ts
- evals/lib/llms-variants.ts
- evals/lib/transcript.ts
- evals/llms/ambiguous-output-routing/EVAL.ts
- evals/llms/ambiguous-output-routing/PROMPT.md
- evals/llms/ambiguous-output-routing/expected.json
- evals/llms/cross-group-agent-flows/EVAL.ts
- evals/llms/cross-group-agent-flows/PROMPT.md
- evals/llms/cross-group-agent-flows/expected.json
- evals/llms/exact-symbol-readability/EVAL.ts
- evals/llms/exact-symbol-readability/PROMPT.md
- evals/llms/exact-symbol-readability/expected.json
- evals/llms/negative-vector-index/EVAL.ts
- evals/llms/negative-vector-index/PROMPT.md
- evals/llms/negative-vector-index/expected.json
- evals/llms/single-group-authoring/EVAL.ts
- evals/llms/single-group-authoring/PROMPT.md
- evals/llms/single-group-authoring/expected.json
- evals/llms/single-page-cli-flag/EVAL.ts
- evals/llms/single-page-cli-flag/PROMPT.md
- evals/llms/single-page-cli-flag/expected.json
- evals/package.json
- evals/run-eval.ts
- evals/run-llms-eval.ts
- evals/vitest.config.ts
- packages/leadtype/src/cli.test.ts
- packages/leadtype/src/cli/generate.ts
- packages/leadtype/src/llm/llm.test.ts
- packages/leadtype/src/llm/llm.ts
- packages/leadtype/src/llm/readability.ts
📜 Review details
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{ts,tsx}: Use explicit types for function parameters and return values when they enhance clarity
Prefer `unknown` over `any` when the type is genuinely unknown
Use const assertions (`as const`) for immutable values and literal types
Leverage TypeScript's type narrowing instead of type assertions
Files:
- evals/llms/negative-vector-index/EVAL.ts
- apps/example/scripts/llm-generate.ts
- evals/llms/cross-group-agent-flows/EVAL.ts
- evals/llms/single-page-cli-flag/EVAL.ts
- evals/llms/single-group-authoring/EVAL.ts
- apps/example/scripts/llm-generate-real.ts
- evals/vitest.config.ts
- evals/lib/llms-eval.ts
- evals/llms/ambiguous-output-routing/EVAL.ts
- evals/llms/exact-symbol-readability/EVAL.ts
- apps/example/scripts/mdx-convert.ts
- apps/example/tests/e2e/smoke.e2e.ts
- docs/docs.config.ts
- evals/lib/llms-sandbox.ts
- apps/example/src/routes/docs/reference/evals.tsx
- packages/leadtype/src/llm/readability.ts
- evals/run-eval.ts
- evals/lib/llms-metrics.test.ts
- evals/lib/llms-metrics.ts
- evals/lib/transcript.ts
- apps/example/src/routeTree.gen.ts
- packages/leadtype/src/cli.test.ts
- packages/leadtype/src/cli/generate.ts
- evals/lib/llms-variants.ts
- packages/leadtype/src/llm/llm.test.ts
- packages/leadtype/src/llm/llm.ts
- evals/run-llms-eval.ts
**/*.{js,ts,jsx,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{js,ts,jsx,tsx}: Use meaningful variable names instead of magic numbers - extract constants with descriptive names
Use arrow functions for callbacks and short functions
Prefer `for...of` loops over `.forEach()` and indexed `for` loops
Use optional chaining (`?.`) and nullish coalescing (`??`) for safer property access
Prefer template literals over string concatenation
Use destructuring for object and array assignments
Use `const` by default, `let` only when reassignment is needed, never `var`
Always `await` promises in async functions - don't forget to use the return value
Use `async/await` syntax instead of promise chains for better readability
Handle errors appropriately in async code with try-catch blocks
Don't use async functions as Promise executors
Remove `console.log`, `debugger`, and `alert` statements from production code
Throw `Error` objects with descriptive messages, not strings or other values
Use `try-catch` blocks meaningfully - don't catch errors just to rethrow them
Prefer early returns over nested conditionals for error cases
Extract complex conditions into well-named boolean variables
Use early returns to reduce nesting
Prefer simple conditionals over nested ternary operators
Don't use `eval()` or assign directly to `document.cookie`
Avoid spread syntax in accumulators within loops
Use top-level regex literals instead of creating them in loops
Prefer specific imports over namespace imports
Use descriptive names for functions, variables, and types for meaningful naming
Add comments for complex logic, but prefer self-documenting code
Files:
- evals/llms/negative-vector-index/EVAL.ts
- apps/example/scripts/llm-generate.ts
- evals/llms/cross-group-agent-flows/EVAL.ts
- evals/llms/single-page-cli-flag/EVAL.ts
- evals/llms/single-group-authoring/EVAL.ts
- apps/example/scripts/llm-generate-real.ts
- evals/vitest.config.ts
- evals/lib/llms-eval.ts
- evals/llms/ambiguous-output-routing/EVAL.ts
- evals/llms/exact-symbol-readability/EVAL.ts
- apps/example/scripts/mdx-convert.ts
- apps/example/tests/e2e/smoke.e2e.ts
- docs/docs.config.ts
- evals/lib/llms-sandbox.ts
- apps/example/src/routes/docs/reference/evals.tsx
- packages/leadtype/src/llm/readability.ts
- evals/run-eval.ts
- evals/lib/llms-metrics.test.ts
- evals/lib/llms-metrics.ts
- evals/lib/transcript.ts
- apps/example/src/routeTree.gen.ts
- packages/leadtype/src/cli.test.ts
- packages/leadtype/src/cli/generate.ts
- evals/lib/llms-variants.ts
- packages/leadtype/src/llm/llm.test.ts
- packages/leadtype/src/llm/llm.ts
- evals/run-llms-eval.ts
**/*.{jsx,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{jsx,tsx}: Use function components over class components in React
Call hooks at the top level only, never conditionally
Specify all dependencies in hook dependency arrays correctly
Use the `key` prop for elements in iterables (prefer unique IDs over array indices)
Nest children between opening and closing tags instead of passing as props
Don't define components inside other components
Avoid `dangerouslySetInnerHTML` unless absolutely necessary
Use proper image components (e.g., Next.js `<Image>`) over `<img>` tags
Use Next.js `<Image>` component for images
Use `next/head` or App Router metadata API for head elements in Next.js
Use Server Components for async data fetching instead of async Client Components in Next.js
Use ref as a prop instead of `React.forwardRef` in React 19+
Files:
apps/example/src/routes/docs/reference/evals.tsx
**/*.{jsx,tsx,html}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{jsx,tsx,html}: Use semantic HTML and ARIA attributes for accessibility: provide meaningful alt text for images, use proper heading hierarchy, add labels for form inputs, include keyboard event handlers alongside mouse events, use semantic elements instead of divs with roles
Add `rel="noopener"` when using `target="_blank"` on links
Files:
apps/example/src/routes/docs/reference/evals.tsx
**/*.{test,spec}.{js,ts,jsx,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{test,spec}.{js,ts,jsx,tsx}: Write assertions inside `it()` or `test()` blocks
Avoid done callbacks in async tests - use async/await instead
Don't use `.only` or `.skip` in committed code
Keep test suites reasonably flat - avoid excessive `describe` nesting
Files:
- evals/lib/llms-metrics.test.ts
- packages/leadtype/src/cli.test.ts
- packages/leadtype/src/llm/llm.test.ts
🪛 ast-grep (0.42.1)
evals/lib/llms-eval.ts
[warning] 46-46: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(pattern, "i")
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
[warning] 49-49: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(pattern, "i")
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
🪛 LanguageTool
docs/how-it-works.mdx
[uncategorized] ~73-~73: Did you mean the formatting language “Markdown” (= proper noun)?
Context: ...lms.txt over HTTP and follow page-level markdown links first. The root /llms-full.txt fi...
(MARKDOWN_NNP)
docs/authoring/components.mdx
[uncategorized] ~9-~9: Did you mean the formatting language “Markdown” (= proper noun)?
Context: ...ipeline can flatten each component into markdown for agents, search, and llms-full.txt...
(MARKDOWN_NNP)
docs/build/connect-docs-site.mdx
[uncategorized] ~163-~163: Did you mean the formatting language “Markdown” (= proper noun)?
Context: ... public/llms-full.txt — all generated markdown docs flattened into one fallback file. ...
(MARKDOWN_NNP)
docs/reference/remark.mdx
[uncategorized] ~102-~102: Did you mean the formatting language “Markdown” (= proper noun)?
Context: ...file, finds the named type, and emits a markdown table with one row per property. The re...
(MARKDOWN_NNP)
docs/reference/llm.mdx
[uncategorized] ~12-~12: Did you mean the formatting language “Markdown” (= proper noun)?
Context: ....txtfallback containing all generated markdown docs. Pairs withgenerateLlmsTxt`. - *...
(MARKDOWN_NNP)
[uncategorized] ~37-~37: Did you mean the formatting language “Markdown” (= proper noun)?
Context: ...| <out>/llms-full.txt | All generated markdown docs flattened into one fallback file. ...
(MARKDOWN_NNP)
🪛 markdownlint-cli2 (0.22.1)
evals/llms/cross-group-agent-flows/PROMPT.md
[warning] 1-1: First line in a file should be a top-level heading
(MD041, first-line-heading, first-line-h1)
evals/llms/single-page-cli-flag/PROMPT.md
[warning] 1-1: First line in a file should be a top-level heading
(MD041, first-line-heading, first-line-h1)
evals/llms/negative-vector-index/PROMPT.md
[warning] 1-1: First line in a file should be a top-level heading
(MD041, first-line-heading, first-line-h1)
evals/llms/exact-symbol-readability/PROMPT.md
[warning] 1-1: First line in a file should be a top-level heading
(MD041, first-line-heading, first-line-h1)
evals/llms/ambiguous-output-routing/PROMPT.md
[warning] 1-1: First line in a file should be a top-level heading
(MD041, first-line-heading, first-line-h1)
evals/llms/single-group-authoring/PROMPT.md
[warning] 1-1: First line in a file should be a top-level heading
(MD041, first-line-heading, first-line-h1)
🔍 Remote MCP Context7
Summary of Additional Context for PR Review
Based on the provided changes, here are the key technical details that support the PR's implementation:
TanStack Router File-Based Routing Pattern
The new `/docs/reference/evals` route follows the standard TanStack Router `createFileRoute` pattern. The implementation in `apps/example/src/routes/docs/reference/evals.tsx` correctly:
- Imports `createFileRoute` from `@tanstack/react-router`
- Passes the route path (`'/docs/reference/evals'`) as the argument
- Exports a `Route` constant with the component and head configuration
- The generated route tree (`routeTree.gen.ts`) automatically includes module augmentation via `declare module '@tanstack/react-router'`, which matches the documented TanStack Router module augmentation pattern.
Vitest Configuration Pattern
The `evals/vitest.config.ts` configuration follows Vitest's documented include pattern by:
- Using glob patterns to include both `lib/**/*.test.ts` (standard test files) and `evals/**/llms/**` (custom eval entrypoints)
- This aligns with Vitest's support for custom test discovery patterns without requiring `.test.` or `.spec.` suffixes in all files, allowing the `EVAL.ts` files to be included as test fixtures.
Eval Harness Architecture
The eval harness implementation introduces a novel pattern where:
- Fixture structure: Each eval under `evals/llms/<fixture>/` contains `PROMPT.md` (instructions), `EVAL.ts` (test runner), and `expected.json` (assertions)
- Metrics pipeline: The `evals/lib/llms-metrics.ts` module loads expectations from `expected.json`, parses filesystem reads from transcript logs, and validates them against variant patterns
- Sandbox isolation: The `createLlmsSandbox()` utility creates isolated temp directories with cleaned fixtures, enabling safe concurrent eval runs with variant materialization
Key Changes to Output Artifacts
The PR consolidates the llms-full output from a distributed multi-file structure to a single monolithic root file:
- Before: `public/docs/llms-full/<group>.txt` (per-group bundles) + router files
- After: `public/llms-full.txt` (single flattened fallback)
- The `readability.ts` pattern change ensures only `/llms-full.txt` is treated as an agent-readability artifact, not `/docs/llms-full/*` variants
This architectural shift is validated by eval results across both Claude Opus and GPT-5.5 models, as documented in the new docs/reference/evals.mdx page.
🔇 Additional comments (41)
docs/index.mdx (1)
17-17: Artifact terminology update is consistent with the new monolithic output contract.
These docs changes correctly reflect root `llms-full.txt` in site mode.
Also applies to: 56-56, 65-65
docs/methodology.mdx (1)
25-25: The methodology wording now aligns with root-level `llms-full.txt` behavior.
Also applies to: 40-40
docs/docs.config.ts (1)
7-7: Config copy updates are coherent with the new public artifact surface.
Also applies to: 10-10, 23-23, 48-48
packages/leadtype/src/llm/readability.ts (1)
22-23: Regex scope reduction matches the monolithic root artifact design.
This correctly stops treating `/docs/llms-full*` as readability artifacts.
evals/llms/exact-symbol-readability/expected.json (1)
1-11: Fixture expectations are aligned with the updated readability/artifact contract.
docs/reference/evals.mdx (1)
23-30: The evals reference content is clear and maps well to the new default artifact strategy.
Also applies to: 47-57, 66-70
apps/example/src/routes/docs/reference/evals.tsx (1)
7-14: Route wiring and MDX page integration look good.
packages/leadtype/src/llm/llm.ts (3)
594-600: LGTM!
The `stripLeadingTitleHeading` helper correctly handles the edge case of avoiding duplicate title headings when flattening content. The implementation properly checks for exact match and preserves content when no match is found.
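For readers without the diff open, the behavior described here (drop a leading H1 only on an exact title match, otherwise keep the content as-is) could be sketched as follows. This is an illustrative reconstruction, not the code at lines 594-600: the `(content, title)` signature and the whitespace handling are assumptions.

```typescript
// Hypothetical reconstruction of the described behavior; the real helper may differ.
function stripLeadingTitleHeading(content: string, title: string): string {
  const [firstLine = "", ...rest] = content.trimStart().split("\n");
  // Only an exact "# <title>" heading is treated as a duplicate title.
  if (firstLine.trim() !== `# ${title}`) {
    return content;
  }
  // Drop the heading and any blank lines that followed it.
  return rest.join("\n").trimStart();
}
```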
602-636: LGTM!
The `renderFullContextDocument` function cleanly consolidates the full-context generation into a single flattened document. Good use of:
- Template literals for content block formatting
- The new `stripLeadingTitleHeading` helper to avoid duplicate titles
- Consistent link rendering with the existing `renderLink` helper
665-691: LGTM!
The refactored `generateLLMFullContextFiles` correctly:
- Validates groups via `resolveGroups()` (line 682) even though the result isn't used for output
- Cleans up stale docs-scoped artifacts (`llms-full/` directory and `docs/llms-full.txt`)
- Writes the single root-level `llms-full.txt`
This aligns with the PR objective of consolidating to a monolithic output.
apps/example/scripts/llm-generate-real.ts (1)
35-36: LGTM!
The agent guidance text correctly directs to `/llms-full.txt` as the broad context fallback, consistent with the new monolithic output structure.
packages/leadtype/src/cli/generate.ts (2)
76-76: LGTM!
The type change from `docsLlmsFullTxt` to `llmsFullTxt` correctly reflects the new root-level artifact location.
463-463: LGTM!
The path correctly points to the root output directory for `llms-full.txt`, aligning with the consolidated artifact structure.
apps/example/tests/e2e/smoke.e2e.ts (1)
186-188: LGTM!
The updated assertions correctly verify the new monolithic `llms-full.txt` format:
- "Full Context" matches the document header from `renderFullContextDocument`
- "Quickstart" confirms actual page content is included
- Removal of "Full Context Router" expectation aligns with the router elimination
evals/llms/negative-vector-index/expected.json (1)
1-15: LGTM!
The eval fixture correctly tests a negative case (Leadtype does NOT include a hosted database-backed vector index). The pattern structure with both allowed and forbidden patterns provides good coverage for validating agent understanding.
docs/quickstart.mdx (2)
50-52: LGTM!
The step title and description clearly communicate the new artifact structure with the root-level `llms-full.txt` fallback file.
67-81: LGTM!
The output tree and explanatory text accurately document the new artifact layout with `llms-full.txt` at the root level.
docs/build/optimize-docs-for-agents.mdx (1)
13-14: LGTM!
Documentation consistently updated across all sections:
- Output tree includes root-level `llms-full.txt`
- Static file lists updated
- Minimal checklist includes `llms-full.txt`
- Added reference to evals page for design rationale
Also applies to: 52-52, 103-103, 225-225, 264-264
docs/reference/cli.mdx (2)
28-29: LGTM!
CLI flag descriptions accurately document:
- `--bundle` skips `llms-full.txt` (website-only artifact)
- `--base-url` affects full-context fallback URLs
65-65: LGTM!
The JSON output example correctly shows `llmsFullTxt` pointing to the root-level path, matching the updated `GenerateResult.files` type.
packages/leadtype/src/cli.test.ts (1)
122-125: Monolithic artifact contract coverage is strong.
These assertions correctly pin the new root `llms-full.txt` behavior while guarding against legacy `docs/llms-full*` regressions.
Also applies to: 147-150, 177-195
packages/leadtype/src/llm/llm.test.ts (1)
209-268: Great migration-focused test updates.
The new expectations validate both the root-only full-context output and stale docs-scoped artifact cleanup, which are the key behavioral guarantees for this change.
Also applies to: 303-348, 748-753
evals/llms/single-group-authoring/expected.json (1)
1-11: Fixture shape and intent look good.
The expected groups/pages and pattern list are consistent with a focused single-group authoring eval.
docs/reference/llm.mdx (1)
12-13: Docs now consistently reflect the new root-only full-context model.
The updated API reference and artifact table are aligned with the PR’s generation contract.
Also applies to: 37-44, 211-212, 346-347, 375-383
evals/lib/transcript.ts (1)
14-20: Typed transcript extension looks good.
Making `benchmark` and `variant` optional keeps compatibility while improving eval metadata quality.
evals/run-eval.ts (2)
43-51: CLI required-value parsing is a solid hardening change.
This closes a common argument parsing edge case and yields clearer failures for missing `--fixture`/`--model` values.
Also applies to: 67-72
194-194: Benchmark tagging in transcript is correctly wired.
`benchmark: "package"` cleanly disambiguates this runner from LLMS eval runs.
evals/lib/llms-sandbox.ts (1)
11-44: Sandbox lifecycle implementation is clean and robust.
Typed handle, descriptive copy errors, and deterministic cleanup make this harness utility reliable.
evals/llms/negative-vector-index/EVAL.ts (1)
1-3: Fixture entrypoint wiring looks correct.
This matches the shared harness pattern and keeps fixture execution consistent.
evals/llms/cross-group-agent-flows/EVAL.ts (1)
1-3: Consistent fixture harness bootstrap.
This follows the expected LLMS eval fixture entrypoint shape.
evals/package.json (1)
9-9: New LLMS eval script is clear and appropriately scoped.
Good addition to expose the dedicated runner directly from package scripts.
evals/llms/ambiguous-output-routing/EVAL.ts (1)
1-3: Eval fixture entrypoint wiring looks correct.
This correctly delegates to the shared fixture assertion with the local fixture URL.
evals/llms/single-group-authoring/EVAL.ts (1)
1-3: Shared eval harness integration is clean.
This is consistent with the other fixture entrypoints and correctly awaits the assertion.
evals/vitest.config.ts (1)
3-7: Vitest include configuration looks good.
The targeted include patterns are clear and aligned with the eval harness layout.
evals/llms/single-page-cli-flag/EVAL.ts (1)
1-3: Entrypoint implementation is correct.
The fixture is wired into the shared assertion flow as expected.
evals/llms/cross-group-agent-flows/expected.json (1)
1-11: Fixture expectations are well-formed.
Patterns and expected routing targets are clear and consistent for this eval.
evals/llms/single-page-cli-flag/expected.json (1)
1-5: Expected fixture contract looks good.
The patterns/group/page assertions are straightforward and correctly structured.
evals/llms/exact-symbol-readability/EVAL.ts (1)
1-3: Fixture bootstrap is clean and correct.
This is a clear, deterministic eval entrypoint for fixture-local assertions.
evals/llms/ambiguous-output-routing/expected.json (1)
1-10: Expected fixture payload looks consistent.
Patterns/groups/page targets are aligned with the scenario intent.
evals/README.md (1)
57-83: New benchmark documentation is clear and actionable.
The added commands and variant table make the llms eval flow easy to run and compare.
evals/lib/llms-metrics.test.ts (1)
52-149: Test coverage for llms variants is strong.
Good breadth across positive/negative paths and read summarization behavior.
Actionable comments posted: 6
♻️ Duplicate comments (1)
docs/how-it-works.mdx (1)
81-81: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Normalize `<out>/` prefix across all Agent Readability paths.
Line 81 currently prefixes only the first path with `<out>/`, which makes the other three look inconsistent with the rest of this table.
Proposed fix
- type: "<out>/docs/sitemap.xml + docs/sitemap.md + docs/robots.txt + docs/agent-readability.json",
+ type: "<out>/docs/sitemap.xml + <out>/docs/sitemap.md + <out>/docs/robots.txt + <out>/docs/agent-readability.json",
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.
In `@docs/how-it-works.mdx` at line 81, the Agent Readability path string currently only prefixes the first entry with "<out>/", producing "type: \"<out>/docs/sitemap.xml + docs/sitemap.md + docs/robots.txt + docs/agent-readability.json\""; update this value so each path is consistently prefixed (e.g., "<out>/docs/sitemap.xml + <out>/docs/sitemap.md + <out>/docs/robots.txt + <out>/docs/agent-readability.json") to normalize the "<out>/" prefix across all entries in that line.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@evals/lib/llms-eval.ts`:
- Around line 33-39: The test "used an llms context source" in
evals/lib/llms-eval.ts misses section-index reads: update the assertion to
include summary.sectionIndexReads (e.g., check summary.sectionIndexReads.length
> 0 or truthiness) alongside summary.readLlmsTxt, summary.readRootFull,
summary.pageReads.length > 0, and summary.groupReads.length > 0 so
section-index-only runs pass; modify the condition used in that expect(...) to
OR in summary.sectionIndexReads.
In `@evals/lib/llms-variants.ts`:
- Around line 199-202: isLlmsVariant uses a type assertion (value as
LlmsVariant) which is avoidable; remove the assertion and instead make the
runtime check use LLMS_VARIANTS.includes(value) by ensuring LLMS_VARIANTS is
typed to accept strings (e.g. declare LLMS_VARIANTS as string[] or readonly
string[]) so the includes call accepts a string without casting; update the
LLMS_VARIANTS declaration accordingly and simplify isLlmsVariant to: return
typeof value === "string" && LLMS_VARIANTS.includes(value).
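One way the assertion-free guard could look is sketched below. This is an illustration only: the variant names are the ones visible in the quoted `materializeLlmsVariant` code, the real `LLMS_VARIANTS` list may differ, and `VARIANT_NAMES` is a hypothetical helper constant that keeps the literal union type intact while giving `includes` a string-typed view.

```typescript
// Assumed variant list, taken from the names visible in this review;
// the actual declaration in evals/lib/llms-variants.ts may differ.
const LLMS_VARIANTS = [
  "monolith",
  "router",
  "section-indexes",
  "explicit-bundles",
] as const;

type LlmsVariant = (typeof LLMS_VARIANTS)[number];

// Hypothetical widened, read-only view used only for the membership test.
// Assigning a readonly tuple of literals to readonly string[] needs no cast.
const VARIANT_NAMES: readonly string[] = LLMS_VARIANTS;

// Type guard without `value as LlmsVariant`: narrowing comes from the
// `value is LlmsVariant` return type, not from an assertion.
function isLlmsVariant(value: unknown): value is LlmsVariant {
  return typeof value === "string" && VARIANT_NAMES.includes(value);
}
```

Keeping the `as const` tuple as the source of truth preserves the `LlmsVariant` union, which typing `LLMS_VARIANTS` directly as `string[]` would lose.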
In `@evals/run-llms-eval.ts`:
- Around line 66-74: parseRequiredFlagValue currently only rejects values that
start with "--", so short flags like "-h" can be mistaken for a value; update
the guard in parseRequiredFlagValue to treat any string starting with "-" (short
or long flag) as a missing value except legitimate negative numeric values.
Replace the check `!value || value.startsWith("--")` with a test that throws
when value is falsy or matches a leading dash that is not a negative number
(e.g., use a regex such as /^-(?!\d)/ to detect flag-like tokens) so flags like
"-h" are rejected but negative numbers like "-1" are accepted.
- Around line 282-287: The loop writes files using transcript.filesModified
entries without sanitizing paths, allowing path traversal; to fix,
validate/sanitize each rel before writing: ensure rel is not absolute, normalize
it (e.g. path.normalize), reject or skip entries that contain traversal (e.g.
'..') or where path.relative(path.join(dir, "files"), dest) starts with '..' or
is outside the intended base, then compute dest = path.join(dir, "files",
safeRel) and proceed to mkdir/writeFile; reference transcript.filesModified,
tempDir, path.join, path.normalize, path.relative, mkdir, and writeFile to
locate and update the code.
- Around line 299-341: The spawn call in the Promise (proc, settle, settle
function) can hang indefinitely; add a timeout timer after proc is created that,
after a configurable duration (e.g., N ms), kills the child (proc.kill() or
proc.kill('SIGKILL')) and calls settle({ passed: false, output:
`${output}\ntimeout after ${N}ms` }); also ensure you clear the timer inside
proc.on('close') and proc.on('error') so the timer doesn’t fire after normal
termination; keep the existing settle guard (settled) to avoid double-settling.
- Around line 136-140: The getModel signature currently types the model field as
any, losing type safety; update the return type so model uses ReturnType<typeof
gateway> instead of any (keep Provider and modelId unchanged) to capture the
concrete LanguageModelV2 shape returned by gateway and remove the need for an
explicit any; ensure the biome-ignore comment is adjusted/removed if no longer
required and update the object return type in getModel to reflect this new model
type.
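The path-traversal finding for `transcript.filesModified` could be addressed with a containment check along these lines. This is a minimal sketch, not the PR's code: `resolveInside` is a hypothetical helper name, and it assumes the archive loop can simply skip entries for which it returns `null`.

```typescript
import { isAbsolute, relative, resolve, sep } from "node:path";

// Hypothetical helper (not from the PR): returns the absolute destination for
// `rel` inside `baseDir`, or null when `rel` is absolute or escapes `baseDir`.
function resolveInside(baseDir: string, rel: string): string | null {
  if (isAbsolute(rel)) {
    return null;
  }
  const base = resolve(baseDir);
  const dest = resolve(base, rel);
  // Path from the base to the destination; ".." segments mean it escaped.
  const fromBase = relative(base, dest);
  const escapesBase =
    fromBase === "" ||
    fromBase === ".." ||
    fromBase.startsWith(`..${sep}`) ||
    isAbsolute(fromBase);
  return escapesBase ? null : dest;
}
```

Resolving first and comparing with `relative` afterwards also catches entries such as `docs/../../secret.txt`, which a plain substring check for a leading `..` would miss.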
---
Duplicate comments:
In `@docs/how-it-works.mdx`:
- Line 81: The Agent Readability path string currently only prefixes the first
entry with "<out>/", producing "type: \"<out>/docs/sitemap.xml + docs/sitemap.md
+ docs/robots.txt + docs/agent-readability.json\""; update this value so each
path is consistently prefixed (e.g., "<out>/docs/sitemap.xml +
<out>/docs/sitemap.md + <out>/docs/robots.txt +
<out>/docs/agent-readability.json") to normalize the "<out>/" prefix across all
entries in that line.
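For the child-process finding earlier in this prompt (around lines 299-341 of `evals/run-llms-eval.ts`), a timeout guard might be sketched as follows. The helper name `runWithTimeout`, the `CommandResult` shape, and the choice of `SIGKILL` are assumptions for illustration; the real runner's signature is not shown in this review.

```typescript
import { spawn } from "node:child_process";

interface CommandResult {
  passed: boolean;
  output: string;
}

// Hypothetical helper: runs a command and resolves with a failed result if it
// has not exited within `timeoutMs`.
function runWithTimeout(
  command: string,
  args: readonly string[],
  timeoutMs: number
): Promise<CommandResult> {
  return new Promise((resolvePromise) => {
    let output = "";
    let settled = false;
    let timer: ReturnType<typeof setTimeout> | undefined;

    // Guard against double-settling and always clear the pending timer.
    const settle = (result: CommandResult): void => {
      if (settled) {
        return;
      }
      settled = true;
      clearTimeout(timer);
      resolvePromise(result);
    };

    const proc = spawn(command, args);
    timer = setTimeout(() => {
      proc.kill("SIGKILL");
      settle({ passed: false, output: `${output}\ntimeout after ${timeoutMs}ms` });
    }, timeoutMs);

    proc.stdout.on("data", (chunk: Buffer) => {
      output += chunk.toString();
    });
    proc.stderr.on("data", (chunk: Buffer) => {
      output += chunk.toString();
    });
    proc.on("error", (err) => {
      settle({ passed: false, output: `${output}\n${err.message}` });
    });
    proc.on("close", (code) => {
      settle({ passed: code === 0, output });
    });
  });
}
```

Because `settle` clears the timer on every path, a normal `close` or `error` cannot be followed by a late timeout, and the `close` event that follows a kill is ignored by the `settled` guard.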
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 57b11678-63fe-4ead-965e-7519de779c22
⛔ Files ignored due to path filters (3)
- apps/example/src/generated/agent-readability.json is excluded by !**/generated/**
- apps/example/src/generated/docs-search-content.json is excluded by !**/generated/**
- apps/example/src/generated/docs-search-index.json is excluded by !**/generated/**
📒 Files selected for processing (14)
- docs/authoring/frontmatter.mdx
- docs/build/connect-docs-site.mdx
- docs/how-it-works.mdx
- evals/lib/llms-eval.ts
- evals/lib/llms-metrics.ts
- evals/lib/llms-variants.ts
- evals/llms/ambiguous-output-routing/PROMPT.md
- evals/llms/ambiguous-output-routing/expected.json
- evals/llms/cross-group-agent-flows/PROMPT.md
- evals/llms/negative-vector-index/PROMPT.md
- evals/llms/single-group-authoring/PROMPT.md
- evals/llms/single-page-cli-flag/PROMPT.md
- evals/llms/single-page-cli-flag/expected.json
- evals/run-llms-eval.ts
📜 Review details
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{ts,tsx}: Use explicit types for function parameters and return values when they enhance clarity
Prefer `unknown` over `any` when the type is genuinely unknown
Use const assertions (as const) for immutable values and literal types
Leverage TypeScript's type narrowing instead of type assertions
Files:
- evals/lib/llms-eval.ts
- evals/lib/llms-metrics.ts
- evals/run-llms-eval.ts
- evals/lib/llms-variants.ts
**/*.{js,ts,jsx,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{js,ts,jsx,tsx}: Use meaningful variable names instead of magic numbers - extract constants with descriptive names
Use arrow functions for callbacks and short functions
Prefer `for...of` loops over `.forEach()` and indexed `for` loops
Use optional chaining (?.) and nullish coalescing (??) for safer property access
Prefer template literals over string concatenation
Use destructuring for object and array assignments
Use `const` by default, `let` only when reassignment is needed, never `var`
Always `await` promises in async functions - don't forget to use the return value
Use `async/await` syntax instead of promise chains for better readability
Handle errors appropriately in async code with try-catch blocks
Don't use async functions as Promise executors
Remove `console.log`, `debugger`, and `alert` statements from production code
Throw `Error` objects with descriptive messages, not strings or other values
Use `try-catch` blocks meaningfully - don't catch errors just to rethrow them
Prefer early returns over nested conditionals for error cases
Extract complex conditions into well-named boolean variables
Use early returns to reduce nesting
Prefer simple conditionals over nested ternary operators
Don't use `eval()` or assign directly to `document.cookie`
Avoid spread syntax in accumulators within loops
Use top-level regex literals instead of creating them in loops
Prefer specific imports over namespace imports
Use descriptive names for functions, variables, and types for meaningful naming
Add comments for complex logic, but prefer self-documenting code
Files:
- evals/lib/llms-eval.ts
- evals/lib/llms-metrics.ts
- evals/run-llms-eval.ts
- evals/lib/llms-variants.ts
🪛 ast-grep (0.42.1)
evals/lib/llms-eval.ts
[warning] 47-47: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(pattern, "i")
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
[warning] 50-50: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(pattern, "i")
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
🪛 LanguageTool
docs/how-it-works.mdx
[uncategorized] ~73-~73: Did you mean the formatting language “Markdown” (= proper noun)?
Context: ...lms.txt over HTTP and follow page-level markdown links first. The root /llms-full.txt fi...
(MARKDOWN_NNP)
docs/build/connect-docs-site.mdx
[uncategorized] ~163-~163: Did you mean the formatting language “Markdown” (= proper noun)?
Context: ... public/llms-full.txt — all generated markdown docs flattened into one fallback file. ...
(MARKDOWN_NNP)
🪛 markdownlint-cli2 (0.22.1)
evals/llms/single-group-authoring/PROMPT.md
[warning] 1-1: First line in a file should be a top-level heading
(MD041, first-line-heading, first-line-h1)
🔍 Remote MCP Context7
Additional relevant facts for reviewing PR #26
- TanStack Router: createFileRoute usage and module augmentation
  - createFileRoute is the documented pattern for file-based routes; examples show exporting `const Route = createFileRoute('/path')({ component: ... })` and using the Vite plugin to keep path strings aligned with file paths. Module augmentation for `@tanstack/react-router` is the recommended way to surface router types via `declare module '@tanstack/react-router' { interface Register { router: typeof router } }`. These docs validate the PR’s generated `createFileRoute` usage and the `declare module '@tanstack/react-router'` augmentation in routeTree.gen.ts.
- Vitest: configure `include` to pick up non-standard test files
  - Vitest’s `defineConfig({ test: { include: [...] } })` supports arbitrary glob patterns resolved relative to the project root (tinyglobby). This is the documented way to include EVAL.ts entrypoints and custom test locations (e.g., `lib/**/*.test.ts` and `evals/**`), matching the change in `evals/vitest.config.ts`.
Tool lookups performed
- Resolved TanStack Router library ID and fetched docs covering createFileRoute and module augmentation.
- Resolved Vitest library ID and fetched docs for `defineConfig`/`test.include`.
🔇 Additional comments (5)
docs/build/connect-docs-site.mdx (2)
12-12: Past issue resolved: Mermaid component is now correctly formatted.
The previous HTML-escaped tag has been fixed and the diagram now properly reflects the consolidated artifact structure with `llms.txt · llms-full.txt`.
184-185: LGTM: Evals card added.
The new Evals card correctly documents the evaluation framework introduced in this PR and follows the established card pattern.
evals/lib/llms-metrics.ts (1)
101-105: `passed` now correctly reflects both mismatch dimensions.
Line 102 correctly requires both `reasons` and `wrongGroupReads` to be empty before passing.
evals/llms/single-page-cli-flag/expected.json (1)
1-5: Fixture expectations look consistent with the single-page CLI scenario.
evals/llms/single-group-authoring/PROMPT.md (1)
1-3: Prompt contract update looks correct for monolithic `/llms-full.txt` evaluation.
c775eb6 to c98b608
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@evals/lib/llms-variants.ts`:
- Around line 205-233: materializeLlmsVariant currently writes llms files to the
temp root, so the expected /docs/llms.txt artifact is never created; update the
calls in materializeLlmsVariant to write into the docs folder (e.g., use
writeTextFile(tempDir, "docs/llms.txt", renderLlmsTxt(variant))) and likewise
place the full manifests created by renderMonolith() and renderRootRouter() into
docs (writeTextFile(tempDir, "docs/llms-full.txt", ...)); modify the calls that
use writeTextFile in materializeLlmsVariant (and keep using renderLlmsTxt,
renderMonolith, renderRootRouter) so the filesystem produced matches the /docs
contract under test.
In `@evals/run-llms-eval.ts`:
- Around line 166-257: The sandbox created by createLlmsSandbox is only cleaned
in the inner finally (after runVitest), so if writeTranscript or earlier awaits
throw the temp dir is leaked; wrap the entire run (everything after const
sandbox = await createLlmsSandbox(...)) in a single try { ... } finally { await
sandbox.cleanup(); } block so sandbox.cleanup() always runs, keeping the
existing inner logic (generateText, writeTranscript, runVitest,
archiveTranscript, and the return) intact; refer to sandbox, createLlmsSandbox,
writeTranscript, archiveTranscript, and sandbox.cleanup when locating where to
move the finally.
- Around line 56-64: The parsePositiveInt function currently uses
Number.parseInt which accepts malformed inputs like "1.5" or "1foo"; update
parsePositiveInt to validate the raw string before parsing (e.g., require it
match a positive integer regex such as /^\d+$/ or /^[1-9]\d*$/), throw the same
`${flag} must be a positive integer, got ${value}` error for non-matching input,
then safely parse and return the integer; reference the parsePositiveInt
function and its flag parameter to locate and update the validation logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 265890c4-3af7-47d5-b0e5-b36dc31ea47f
⛔ Files ignored due to path filters (3)
- apps/example/src/generated/agent-readability.json is excluded by !**/generated/**
- apps/example/src/generated/docs-search-content.json is excluded by !**/generated/**
- apps/example/src/generated/docs-search-index.json is excluded by !**/generated/**
📒 Files selected for processing (4)
- docs/how-it-works.mdx
- evals/lib/llms-eval.ts
- evals/lib/llms-variants.ts
- evals/run-llms-eval.ts
📜 Review details
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Files:
- evals/lib/llms-eval.ts
- evals/run-llms-eval.ts
- evals/lib/llms-variants.ts
**/*.{js,ts,jsx,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Files:
- evals/lib/llms-eval.ts
- evals/run-llms-eval.ts
- evals/lib/llms-variants.ts
🪛 ast-grep (0.42.1)
evals/lib/llms-eval.ts
[warning] 48-48: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(pattern, "i")
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
[warning] 51-51: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(pattern, "i")
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
🪛 LanguageTool
docs/how-it-works.mdx
[uncategorized] ~73-~73: Did you mean the formatting language “Markdown” (= proper noun)?
Context: ...lms.txt over HTTP and follow page-level markdown links first. The root /llms-full.txt fi...
(MARKDOWN_NNP)
🔍 Remote MCP Context7
Relevant facts for reviewing PR #26
- TanStack Router: createFileRoute is the documented pattern for file-based routes and supports exporting `const Route = createFileRoute('/path')({ component: ... })`; module augmentation via `declare module '@tanstack/react-router' { interface Register { router: typeof router } }` is the recommended way to surface router types globally (validates the generated routeTree.gen.ts createFileRoute usage and the module augmentation).
- Vitest: using `defineConfig({ test: { include: [...] } })` with custom glob patterns is the supported method to pick up non-standard test files (e.g., EVAL.ts). Including patterns like `evals/**` or explicit `**/EVAL.ts` in `test.include` will cause Vitest to discover and run those files. The PR’s vitest.config.ts change to include `lib/**/*.test.ts` plus `EVAL.ts` under `evals/**` matches documented usage.
Sources queried: Context7 documentation for TanStack Router and Vitest (createFileRoute/module augmentation; defineConfig.test.include).
🔇 Additional comments (3)
docs/how-it-works.mdx (3)
127-132: No actionable issue in this terminology refinement; the glossary updates are coherent with the rest of the page.
19-19: Monolithic root `llms-full.txt` docs are consistent with implementation intent.
These updates clearly and consistently describe the new site-mode behavior: single root fallback (`/llms-full.txt`) instead of per-group/per-leaf full bundles.
Also applies to: 70-73, 154-155
46-51: Website-vs-bundle artifact boundaries are now clearly documented.
The updated wording cleanly separates site-only outputs (`docs/...` discovery/search/readability artifacts) from bundle-mode outputs (`AGENTS.md` + `docs/*.md`), which matches the PR contract.
Also applies to: 81-81
export async function materializeLlmsVariant(options: {
  tempDir: string;
  variant: LlmsVariant;
}): Promise<void> {
  const { tempDir, variant } = options;
  await writeDocsPages(tempDir);

  await writeTextFile(tempDir, "llms.txt", renderLlmsTxt(variant));

  if (
    variant === "explicit-bundles" ||
    variant === "router" ||
    variant === "section-indexes"
  ) {
    await writeTopicBundles(tempDir);
  }

  if (variant === "section-indexes") {
    await writeSectionIndexes(tempDir);
  }

  if (variant === "monolith") {
    await writeTextFile(tempDir, "llms-full.txt", renderMonolith());
  }

  if (variant === "router") {
    await writeTextFile(tempDir, "llms-full.txt", renderRootRouter());
  }
}
Materialize /docs/llms.txt in the sandbox.
The fixture corpus in this file says hosted website mode publishes /docs/llms.txt, but materializeLlmsVariant() never writes it. That makes the eval filesystem diverge from the contract under test, and agents that follow the docs-scoped map will hit a missing file instead of the expected artifact.
Proposed fix
 export async function materializeLlmsVariant(options: {
   tempDir: string;
   variant: LlmsVariant;
 }): Promise<void> {
   const { tempDir, variant } = options;
   await writeDocsPages(tempDir);
   await writeTextFile(tempDir, "llms.txt", renderLlmsTxt(variant));
+  await writeTextFile(tempDir, "docs/llms.txt", renderDocsLlmsTxt());
   if (
     variant === "explicit-bundles" ||
     variant === "router" ||
     variant === "section-indexes"
@@
   if (variant === "router") {
     await writeTextFile(tempDir, "llms-full.txt", renderRootRouter());
   }
 }
+
+function renderDocsLlmsTxt(): string {
+  return [
+    "# Leadtype Docs",
+    "",
+    "> Docs-scoped markdown map for hosted websites.",
+    "",
+    ...renderPageSections(),
+  ].join("\n");
+}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@evals/lib/llms-variants.ts` around lines 205 - 233, materializeLlmsVariant
currently writes llms files to the temp root, so the expected /docs/llms.txt
artifact is never created; update the calls in materializeLlmsVariant to write
into the docs folder (e.g., use writeTextFile(tempDir, "docs/llms.txt",
renderLlmsTxt(variant))) and likewise place the full manifests created by
renderMonolith() and renderRootRouter() into docs (writeTextFile(tempDir,
"docs/llms-full.txt", ...)); modify the calls that use writeTextFile in
materializeLlmsVariant (and keep using renderLlmsTxt, renderMonolith,
renderRootRouter) so the filesystem produced matches the /docs contract under
test.
  const sandbox = await createLlmsSandbox({ fixtureDir, variant });
  const start = Date.now();
  const transcriptCalls: ToolCall[] = [];
  const filesModified = new Set<string>();
  const tools = scopedTools({
    tempDir: sandbox.tempDir,
    transcript: transcriptCalls,
    filesModified,
  });

  const { provider, model } = getModel(modelId);
  const errors: string[] = [];
  let finalText = "";
  let steps = 0;
  let inputTokens = 0;
  let outputTokens = 0;

  try {
    const result = await generateText({
      model,
      system: SYSTEM_PROMPT,
      prompt: promptText,
      tools,
      stopWhen: stepCountIs(STEP_LIMIT),
    });
    finalText = result.text ?? "";
    steps = result.steps?.length ?? 0;
    inputTokens = result.usage?.inputTokens ?? 0;
    outputTokens = result.usage?.outputTokens ?? 0;
  } catch (err) {
    errors.push(err instanceof Error ? err.message : String(err));
  }

  const durationMs = Date.now() - start;
  const transcript: Transcript = {
    fixture,
    benchmark: "llms",
    mode: "treatment",
    variant,
    agent: { provider, model: modelId },
    toolCalls: transcriptCalls,
    filesModified: [...filesModified].sort(),
    finalText,
    durationMs,
    steps,
    errors,
    tokens: { input: inputTokens, output: outputTokens },
  };
  await writeTranscript(sandbox.tempDir, transcript);

  try {
    const expected = loadLlmsExpected(fixtureDir);
    const selection = selectionMatchesVariant(transcript, expected);
    const evalResult = await runVitest(fixture, sandbox.tempDir);
    const passed = evalResult.passed;

    process.stdout.write(
      ` ${passed ? "ok" : "fail"} ${(durationMs / 1000).toFixed(1)}s · ${transcriptCalls.length} calls · context ${selection.passed ? "ok" : "miss"} · ${inputTokens}in/${outputTokens}out\n`
    );
    if (errors.length > 0) {
      for (const error of errors) {
        process.stdout.write(` ! ${error}\n`);
      }
    }
    if (!passed) {
      const tailLines = evalResult.output.split("\n").slice(-25).join("\n");
      process.stdout.write(`${tailLines}\n`);
    }

    await archiveTranscript({
      fixture,
      variant,
      runIndex,
      tempDir: sandbox.tempDir,
      transcript,
    });

    return {
      fixture,
      variant,
      passed,
      contextMatched: selection.passed,
      wrongGroupReads: selection.wrongGroupReads.length,
      durationMs,
      toolCalls: transcriptCalls.length,
      inputTokens,
      outputTokens,
      evalOutput: evalResult.output,
    };
  } finally {
    await sandbox.cleanup();
  }
Wrap sandbox cleanup around the whole run.
sandbox.cleanup() is only guaranteed after execution reaches Line 216. If writeTranscript() or any earlier awaited step fails after the sandbox is created, the temp directory is leaked.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@evals/run-llms-eval.ts` around lines 166 - 257, The sandbox created by
createLlmsSandbox is only cleaned in the inner finally (after runVitest), so if
writeTranscript or earlier awaits throw the temp dir is leaked; wrap the entire
run (everything after const sandbox = await createLlmsSandbox(...)) in a single
try { ... } finally { await sandbox.cleanup(); } block so sandbox.cleanup()
always runs, keeping the existing inner logic (generateText, writeTranscript,
runVitest, archiveTranscript, and the return) intact; refer to sandbox,
createLlmsSandbox, writeTranscript, archiveTranscript, and sandbox.cleanup when
locating where to move the finally.
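The cleanup pattern this comment asks for can be illustrated in isolation. The sketch below is not the PR's `runOne`; `withSandbox` is a hypothetical wrapper, and `create` and `run` stand in for `createLlmsSandbox` and everything that currently follows it.

```typescript
interface Sandbox {
  tempDir: string;
  cleanup: () => Promise<void>;
}

// Hypothetical wrapper: the try block starts immediately after the sandbox
// exists, so any later failure (including one in writeTranscript) still
// reaches the finally block and removes the temp directory.
async function withSandbox<T>(
  create: () => Promise<Sandbox>,
  run: (sandbox: Sandbox) => Promise<T>
): Promise<T> {
  const sandbox = await create();
  try {
    return await run(sandbox);
  } finally {
    await sandbox.cleanup();
  }
}
```

Note the `return await` inside the try: it keeps the promise settling within the protected region, so cleanup runs only after the work has finished or failed.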
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/leadtype/src/cli.test.ts (1)
409-412: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win
Add a bundle-mode assertion for missing `docs/llms-full/` directory.
Line 411 asserts `docs/llms-full.txt` is absent, but the directory absence is also part of the “no website artifacts” guarantee and is worth locking down.
Suggested diff
 expect(existsSync(path.join(outDir, "llms.txt"))).toBe(false);
 expect(existsSync(path.join(outDir, "llms-full.txt"))).toBe(false);
 expect(existsSync(path.join(outDir, "docs", "llms.txt"))).toBe(false);
 expect(existsSync(path.join(outDir, "docs", "llms-full.txt"))).toBe(false);
+expect(existsSync(path.join(outDir, "docs", "llms-full"))).toBe(false);
 expect(existsSync(path.join(outDir, "docs", "sitemap.xml"))).toBe(false);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.
In `@packages/leadtype/src/cli.test.ts` around lines 409 - 412, Add an assertion to ensure the docs/llms-full directory is absent in the test that checks for "no website artifacts": locate the block in packages/leadtype/src/cli.test.ts where existsSync assertions run (the test using outDir and path.join for "llms-full.txt", "llms.txt", and "sitemap.xml") and add expect(existsSync(path.join(outDir, "docs", "llms-full"))).toBe(false); matching the style of the existing assertions so the directory (not just the file) is asserted missing.
♻️ Duplicate comments (3)
evals/run-llms-eval.ts (3)
166-257: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Wrap sandbox cleanup around the entire `runOne` scope.
`sandbox.cleanup()` is only called in the `finally` block starting at line 255. If `writeTranscript` (line 214) or any earlier awaited call throws after the sandbox is created (line 166), the temp directory leaks.
Proposed fix: move try-finally to cover full sandbox lifetime
 async function runOne(options: { ... }): Promise<RunResult> {
   ...
   const sandbox = await createLlmsSandbox({ fixtureDir, variant });
+  try {
   const start = Date.now();
   const transcriptCalls: ToolCall[] = [];
   ...
   await writeTranscript(sandbox.tempDir, transcript);
-  try {
   const expected = loadLlmsExpected(fixtureDir);
   ...
   return { ... };
   } finally {
     await sandbox.cleanup();
   }
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.
In `@evals/run-llms-eval.ts` around lines 166 - 257, The temp sandbox created by createLlmsSandbox is not guaranteed to be cleaned up because the current try/finally only wraps the evaluation and archive block; move the try/finally so the sandbox is cleaned up for the entire runOne scope (i.e., immediately after sandbox is created) by wrapping everything from after createLlmsSandbox through the final await sandbox.cleanup() in a single try { ... } finally { await sandbox.cleanup(); } block; ensure writeTranscript, loadLlmsExpected, runVitest, archiveTranscript, and any early returns (the returned result object) remain inside the try so cleanup always runs.
56-65: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
`parsePositiveInt` accepts malformed inputs like `"1.5"` or `"1foo"`.
`Number.parseInt("1.5", 10)` returns `1`, and `Number.parseInt("1foo", 10)` also returns `1`. This silently accepts invalid CLI input instead of failing fast.
Proposed fix: validate format before parsing
 function parsePositiveInt(value: string | undefined, flag: string): number {
   if (value === undefined) {
     throw new Error(`${flag} requires a value`);
   }
-  const parsed = Number.parseInt(value, 10);
-  if (!Number.isInteger(parsed) || parsed < 1) {
+  const trimmed = value.trim();
+  if (!/^[1-9]\d*$/.test(trimmed)) {
     throw new Error(`${flag} must be a positive integer, got ${value}`);
   }
-  return parsed;
+  return Number(trimmed);
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.
In `@evals/run-llms-eval.ts` around lines 56 - 65, parsePositiveInt currently uses Number.parseInt which accepts strings like "1.5" or "1foo" and returns 1; update parsePositiveInt to validate the input format first (e.g., test value against a regex like /^[1-9]\d*$/ to ensure it is a whole positive integer string) and only then convert to a number (Number or parseInt) and return it; if the regex fails or value is undefined, throw the existing error messages. Use the function name parsePositiveInt as the location to change.
183-197: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Add timeout to `generateText` to prevent indefinite hangs.
The `generateText` call lacks timeout protection. If a provider stalls or the network hangs, the entire eval matrix blocks indefinitely. The `ai` SDK supports a `timeout` option (available in v6.0.16+).
Proposed fix
+const GENERATE_TIMEOUT_MS = 120_000;
+
 try {
   const result = await generateText({
     model,
     system: SYSTEM_PROMPT,
     prompt: promptText,
     tools,
     stopWhen: stepCountIs(STEP_LIMIT),
+    timeout: GENERATE_TIMEOUT_MS,
   });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.
In `@evals/run-llms-eval.ts` around lines 183 - 197, The generateText call in run-llms-eval.ts can hang—add the ai SDK timeout option to the call to bound how long the provider can stall: update the generateText invocation (the one using model, SYSTEM_PROMPT, promptText, tools, stopWhen: stepCountIs(STEP_LIMIT)) to include a timeout: <ms> property (e.g., timeout: TIMEOUT_MS) and define TIMEOUT_MS as a configurable constant or env-backed value; ensure the project uses ai SDK v6.0.16+ so the timeout option is supported and let existing catch logic continue to record timeout errors.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@evals/lib/llms-sandbox.ts`:
- Around line 21-25: The current filter uses path.basename(src) which excludes
any nested files named PROMPT.md, EVAL.ts, or expected.json; change the
predicate in the filter (the anonymous function passed to filter) to only
exclude those names when they live at the fixture root (i.e., when the file is
directly inside the fixture directory). Implement this by computing the file's
path relative to the fixture root (or checking path.dirname(src) /
path.relative(fixtureRoot, src) and ensuring there are no path separators) and
only apply the basename exclusion when the relative path has no directory
component; keep the same excluded names (PROMPT.md, EVAL.ts, expected.json) but
scope them to the fixture root.
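A minimal sketch of the root-scoped exclusion described above, assuming the filter receives an absolute source path and that the fixture root is known. The helper name `shouldCopyFixtureFile` and its signature are hypothetical and are not taken from `evals/lib/llms-sandbox.ts`.

```typescript
import { basename, dirname, join, relative } from "node:path";

// File names the comment above wants excluded, but only at the fixture root.
const ROOT_ONLY_EXCLUDED_NAMES = new Set(["PROMPT.md", "EVAL.ts", "expected.json"]);

// Hypothetical helper: returns true when `src` should be copied into the sandbox.
function shouldCopyFixtureFile(fixtureRoot: string, src: string): boolean {
  const relativePath = relative(fixtureRoot, src);
  // A direct child of the root has no directory component in its relative path.
  const isDirectChildOfRoot = relativePath !== "" && dirname(relativePath) === ".";
  if (!isDirectChildOfRoot) {
    return true;
  }
  return !ROOT_ONLY_EXCLUDED_NAMES.has(basename(relativePath));
}

// Usage with an illustrative fixture path.
const exampleRoot = join("/fixtures", "demo");
const copiesNestedPrompt = shouldCopyFixtureFile(exampleRoot, join(exampleRoot, "docs", "PROMPT.md"));
const copiesRootPrompt = shouldCopyFixtureFile(exampleRoot, join(exampleRoot, "PROMPT.md"));
```

With this shape, a nested `docs/PROMPT.md` is kept while the root-level `PROMPT.md` is still skipped.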
In `@evals/llms/exact-symbol-readability/PROMPT.md`:
- Line 1: The file PROMPT.md fails markdownlint MD041 because the first line is
body text instead of a top-level H1; add a single top-level heading as the very
first line (e.g., "# Prevent agent-readability artifacts" or a short summary of
the prompt) so the first-line-heading check passes and the rest of the prompt
content follows the heading. Ensure the H1 is before any existing text on the
first line.
In `@evals/llms/single-group-authoring/PROMPT.md`:
- Line 1: The file PROMPT.md is missing a top-level heading causing markdownlint
MD041; add a single top-level heading line at the very top (for example "#
Prompt" or "# Summary") before the existing body text to satisfy markdownlint,
ensuring the heading is the first line of the file so the rest of the content
(the prompt about Leadtype docs, frontmatter `group`, and optional fields)
remains unchanged.
In `@evals/run-eval.ts`:
- Around line 47-48: The function parseRequiredFlagValue currently treats any
token starting with "--" as a missing value but still accepts short-flag tokens
like "-h"; update parseRequiredFlagValue so it rejects any token that begins
with a single dash as a missing value (e.g., change the check from
value.startsWith("--") to value.startsWith("-") or equivalent) so short flags
cannot be consumed as values for required flags like --fixture/--model; ensure
the thrown Error message remains `${flag} requires a value`.
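The stricter check can be sketched as follows. The `(flag, value)` signature is an assumption, since the actual parameters of `parseRequiredFlagValue` in `evals/run-eval.ts` are not shown in this review; only the error message format comes from the comment above.

```typescript
// Hypothetical signature; the real function in evals/run-eval.ts may differ.
function parseRequiredFlagValue(flag: string, value: string | undefined): string {
  // Any token that starts with a dash is treated as another flag, not a value.
  if (value === undefined || value.startsWith("-")) {
    throw new Error(`${flag} requires a value`);
  }
  return value;
}

// Usage: a short flag following --fixture is rejected instead of being consumed.
let rejectedShortFlag = false;
try {
  parseRequiredFlagValue("--fixture", "-h");
} catch {
  rejectedShortFlag = true;
}
```

One trade-off of this rule is that values which legitimately begin with a dash, such as a bare `-` or a negative number, are also rejected; that seems acceptable for flags like `--fixture` and `--model`.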
---
Outside diff comments:
In `@packages/leadtype/src/cli.test.ts`:
- Around line 409-412: Add an assertion to ensure the docs/llms-full directory
is absent in the test that checks for "no website artifacts": locate the block
in packages/leadtype/src/cli.test.ts where existsSync assertions run (the test
using outDir and path.join for "llms-full.txt", "llms.txt", and "sitemap.xml")
and add expect(existsSync(path.join(outDir, "docs", "llms-full"))).toBe(false);
matching the style of the existing assertions so the directory (not just the
file) is asserted missing.
---
Duplicate comments:
In `@evals/run-llms-eval.ts`:
- Around line 166-257: The temp sandbox created by createLlmsSandbox is not
guaranteed to be cleaned up because the current try/finally only wraps the
evaluation and archive block; move the try/finally so the sandbox is cleaned up
for the entire runOne scope (i.e., immediately after sandbox is created) by
wrapping everything from after createLlmsSandbox through the final await
sandbox.cleanup() in a single try { ... } finally { await sandbox.cleanup(); }
block; ensure writeTranscript, loadLlmsExpected, runVitest, archiveTranscript,
and any early returns (the returned result object) remain inside the try so
cleanup always runs.
- Around line 56-65: parsePositiveInt currently uses Number.parseInt which
accepts strings like "1.5" or "1foo" and returns 1; update parsePositiveInt to
validate the input format first (e.g., test value against a regex like
/^[1-9]\d*$/ to ensure it is a whole positive integer string) and only then
convert to a number (Number or parseInt) and return it; if the regex fails or
value is undefined, throw the existing error messages. Use the function name
parsePositiveInt as the location to change.
- Around line 183-197: The generateText call in run-llms-eval.ts can hang—add
the ai SDK timeout option to the call to bound how long the provider can stall:
update the generateText invocation (the one using model, SYSTEM_PROMPT,
promptText, tools, stopWhen: stepCountIs(STEP_LIMIT)) to include a timeout: <ms>
property (e.g., timeout: TIMEOUT_MS) and define TIMEOUT_MS as a configurable
constant or env-backed value; ensure the project uses ai SDK v6.0.16+ so the
timeout option is supported and let existing catch logic continue to record
timeout errors.
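As an illustration of the second item above, a strict `parsePositiveInt` might look like the sketch below. The signature and both error messages are placeholders, because the existing messages in `evals/run-llms-eval.ts` are not quoted in this review.

```typescript
// Whole positive integers only: no sign, no decimals, no trailing characters.
const POSITIVE_INTEGER_PATTERN = /^[1-9]\d*$/;

// Hypothetical signature and messages; the real implementation may differ.
function parsePositiveInt(flag: string, value: string | undefined): number {
  if (value === undefined) {
    throw new Error(`${flag} requires a value`);
  }
  if (!POSITIVE_INTEGER_PATTERN.test(value)) {
    throw new Error(`${flag} must be a positive integer, got "${value}"`);
  }
  return Number(value);
}

// Usage: "3" parses, while "1.5" and "1foo" throw instead of silently becoming 1.
const parsedRuns = parsePositiveInt("--runs", "3");
```

Validating the string before converting avoids the `Number.parseInt` behavior of accepting a numeric prefix and discarding the rest.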
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 5b48bb3b-7690-4b02-b6a1-a026c50dbf46
⛔ Files ignored due to path filters (4)
apps/example/src/generated/agent-readability.json is excluded by !**/generated/**
apps/example/src/generated/docs-nav.json is excluded by !**/generated/**
apps/example/src/generated/docs-search-content.json is excluded by !**/generated/**
apps/example/src/generated/docs-search-index.json is excluded by !**/generated/**
📒 Files selected for processing (54)
apps/example/scripts/llm-generate-real.ts
apps/example/scripts/llm-generate.ts
apps/example/scripts/mdx-convert.ts
apps/example/src/routeTree.gen.ts
apps/example/src/routes/docs/reference/evals.tsx
apps/example/tests/e2e/smoke.e2e.ts
docs/authoring/components.mdx
docs/authoring/frontmatter.mdx
docs/build/bundle-package-docs.mdx
docs/build/connect-docs-site.mdx
docs/build/optimize-docs-for-agents.mdx
docs/docs.config.ts
docs/how-it-works.mdx
docs/index.mdx
docs/methodology.mdx
docs/quickstart.mdx
docs/reference/cli.mdx
docs/reference/evals.mdx
docs/reference/llm.mdx
docs/reference/remark.mdx
evals/README.md
evals/lib/llms-eval.ts
evals/lib/llms-metrics.test.ts
evals/lib/llms-metrics.ts
evals/lib/llms-sandbox.ts
evals/lib/llms-variants.ts
evals/lib/transcript.ts
evals/llms/ambiguous-output-routing/EVAL.ts
evals/llms/ambiguous-output-routing/PROMPT.md
evals/llms/ambiguous-output-routing/expected.json
evals/llms/cross-group-agent-flows/EVAL.ts
evals/llms/cross-group-agent-flows/PROMPT.md
evals/llms/cross-group-agent-flows/expected.json
evals/llms/exact-symbol-readability/EVAL.ts
evals/llms/exact-symbol-readability/PROMPT.md
evals/llms/exact-symbol-readability/expected.json
evals/llms/negative-vector-index/EVAL.ts
evals/llms/negative-vector-index/PROMPT.md
evals/llms/negative-vector-index/expected.json
evals/llms/single-group-authoring/EVAL.ts
evals/llms/single-group-authoring/PROMPT.md
evals/llms/single-group-authoring/expected.json
evals/llms/single-page-cli-flag/EVAL.ts
evals/llms/single-page-cli-flag/PROMPT.md
evals/llms/single-page-cli-flag/expected.json
evals/package.json
evals/run-eval.ts
evals/run-llms-eval.ts
evals/vitest.config.ts
packages/leadtype/src/cli.test.ts
packages/leadtype/src/cli/generate.ts
packages/leadtype/src/llm/llm.test.ts
packages/leadtype/src/llm/llm.ts
packages/leadtype/src/llm/readability.ts
📜 Review details
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{ts,tsx}: Use explicit types for function parameters and return values when they enhance clarity
Prefer `unknown` over `any` when the type is genuinely unknown
Use const assertions (as const) for immutable values and literal types
Leverage TypeScript's type narrowing instead of type assertions
Files:
evals/llms/negative-vector-index/EVAL.ts
evals/llms/cross-group-agent-flows/EVAL.ts
evals/llms/single-page-cli-flag/EVAL.ts
evals/vitest.config.ts
docs/docs.config.ts
apps/example/scripts/llm-generate.ts
evals/lib/llms-sandbox.ts
apps/example/scripts/llm-generate-real.ts
evals/llms/exact-symbol-readability/EVAL.ts
apps/example/scripts/mdx-convert.ts
evals/lib/transcript.ts
evals/lib/llms-metrics.ts
evals/llms/single-group-authoring/EVAL.ts
apps/example/tests/e2e/smoke.e2e.ts
evals/run-eval.ts
packages/leadtype/src/cli/generate.ts
packages/leadtype/src/cli.test.ts
evals/llms/ambiguous-output-routing/EVAL.ts
evals/lib/llms-metrics.test.ts
apps/example/src/routes/docs/reference/evals.tsx
packages/leadtype/src/llm/readability.ts
packages/leadtype/src/llm/llm.ts
evals/lib/llms-eval.ts
apps/example/src/routeTree.gen.ts
evals/lib/llms-variants.ts
packages/leadtype/src/llm/llm.test.ts
evals/run-llms-eval.ts
**/*.{js,ts,jsx,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{js,ts,jsx,tsx}: Use meaningful variable names instead of magic numbers - extract constants with descriptive names
Use arrow functions for callbacks and short functions
Prefer `for...of` loops over `.forEach()` and indexed `for` loops
Use optional chaining (?.) and nullish coalescing (??) for safer property access
Prefer template literals over string concatenation
Use destructuring for object and array assignments
Use `const` by default, `let` only when reassignment is needed, never `var`
Always `await` promises in async functions - don't forget to use the return value
Use `async/await` syntax instead of promise chains for better readability
Handle errors appropriately in async code with try-catch blocks
Don't use async functions as Promise executors
Remove `console.log`, `debugger`, and `alert` statements from production code
Throw `Error` objects with descriptive messages, not strings or other values
Use `try-catch` blocks meaningfully - don't catch errors just to rethrow them
Prefer early returns over nested conditionals for error cases
Extract complex conditions into well-named boolean variables
Use early returns to reduce nesting
Prefer simple conditionals over nested ternary operators
Don't use `eval()` or assign directly to `document.cookie`
Avoid spread syntax in accumulators within loops
Use top-level regex literals instead of creating them in loops
Prefer specific imports over namespace imports
Use descriptive names for functions, variables, and types for meaningful naming
Add comments for complex logic, but prefer self-documenting code
Files:
evals/llms/negative-vector-index/EVAL.ts
evals/llms/cross-group-agent-flows/EVAL.ts
evals/llms/single-page-cli-flag/EVAL.ts
evals/vitest.config.ts
docs/docs.config.ts
apps/example/scripts/llm-generate.ts
evals/lib/llms-sandbox.ts
apps/example/scripts/llm-generate-real.ts
evals/llms/exact-symbol-readability/EVAL.ts
apps/example/scripts/mdx-convert.ts
evals/lib/transcript.ts
evals/lib/llms-metrics.ts
evals/llms/single-group-authoring/EVAL.ts
apps/example/tests/e2e/smoke.e2e.ts
evals/run-eval.ts
packages/leadtype/src/cli/generate.ts
packages/leadtype/src/cli.test.ts
evals/llms/ambiguous-output-routing/EVAL.ts
evals/lib/llms-metrics.test.ts
apps/example/src/routes/docs/reference/evals.tsx
packages/leadtype/src/llm/readability.ts
packages/leadtype/src/llm/llm.ts
evals/lib/llms-eval.ts
apps/example/src/routeTree.gen.ts
evals/lib/llms-variants.ts
packages/leadtype/src/llm/llm.test.ts
evals/run-llms-eval.ts
**/*.{test,spec}.{js,ts,jsx,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{test,spec}.{js,ts,jsx,tsx}: Write assertions inside `it()` or `test()` blocks
Avoid done callbacks in async tests - use async/await instead
Don't use `.only` or `.skip` in committed code
Keep test suites reasonably flat - avoid excessive `describe` nesting
Files:
packages/leadtype/src/cli.test.ts
evals/lib/llms-metrics.test.ts
packages/leadtype/src/llm/llm.test.ts
**/*.{jsx,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{jsx,tsx}: Use function components over class components in React
Call hooks at the top level only, never conditionally
Specify all dependencies in hook dependency arrays correctly
Use the `key` prop for elements in iterables (prefer unique IDs over array indices)
Nest children between opening and closing tags instead of passing as props
Don't define components inside other components
Avoid `dangerouslySetInnerHTML` unless absolutely necessary
Use proper image components (e.g., Next.js `<Image>`) over `<img>` tags
Use Next.js `<Image>` component for images
Use `next/head` or App Router metadata API for head elements in Next.js
Use Server Components for async data fetching instead of async Client Components in Next.js
Use ref as a prop instead of `React.forwardRef` in React 19+
Files:
apps/example/src/routes/docs/reference/evals.tsx
**/*.{jsx,tsx,html}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{jsx,tsx,html}: Use semantic HTML and ARIA attributes for accessibility: provide meaningful alt text for images, use proper heading hierarchy, add labels for form inputs, include keyboard event handlers alongside mouse events, use semantic elements instead of divs with roles
Add `rel="noopener"` when using `target="_blank"` on links
Files:
apps/example/src/routes/docs/reference/evals.tsx
🪛 ast-grep (0.42.1)
evals/lib/llms-eval.ts
[warning] 48-48: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(pattern, "i")
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
[warning] 51-51: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(pattern, "i")
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
🪛 LanguageTool
docs/build/connect-docs-site.mdx
[uncategorized] ~167-~167: Did you mean the formatting language “Markdown” (= proper noun)?
Context: ... public/llms-full.txt — all generated markdown docs flattened into one fallback file. ...
(MARKDOWN_NNP)
docs/reference/remark.mdx
[uncategorized] ~108-~108: Did you mean the formatting language “Markdown” (= proper noun)?
Context: ...file, finds the named type, and emits a markdown table with one row per property. The re...
(MARKDOWN_NNP)
docs/authoring/components.mdx
[uncategorized] ~9-~9: Did you mean the formatting language “Markdown” (= proper noun)?
Context: ...ipeline can flatten each component into markdown for agents, search, and llms-full.txt...
(MARKDOWN_NNP)
docs/how-it-works.mdx
[uncategorized] ~73-~73: Did you mean the formatting language “Markdown” (= proper noun)?
Context: ...lms.txt over HTTP and follow page-level markdown links first. The root /llms-full.txt fi...
(MARKDOWN_NNP)
docs/reference/llm.mdx
[uncategorized] ~12-~12: Did you mean the formatting language “Markdown” (= proper noun)?
Context: ....txtfallback containing all generated markdown docs. Pairs withgenerateLlmsTxt`. - *...
(MARKDOWN_NNP)
[uncategorized] ~37-~37: Did you mean the formatting language “Markdown” (= proper noun)?
Context: ...| <out>/llms-full.txt | All generated markdown docs flattened into one fallback file. ...
(MARKDOWN_NNP)
🪛 markdownlint-cli2 (0.22.1)
evals/llms/exact-symbol-readability/PROMPT.md
[warning] 1-1: First line in a file should be a top-level heading
(MD041, first-line-heading, first-line-h1)
evals/llms/single-group-authoring/PROMPT.md
[warning] 1-1: First line in a file should be a top-level heading
(MD041, first-line-heading, first-line-h1)
🔍 Remote MCP Context7
Summary of additional, review-relevant facts
-
TanStack Router
- createFileRoute(...) is the documented pattern for file-based routes and supports exporting Route constants built with createFileRoute('/path')({ component, loader, ... }). This matches the PR's new file route at /docs/reference/evals created with createFileRoute.
- TypeScript module augmentation (declare module '@tanstack/react-router' { interface Register { router: typeof router } }) is the recommended way to register router types globally so generated routeTree.gen.ts module augmentation is idiomatic.
-
Vitest
- defineConfig({ test: { include: [...] } }) supports arbitrary glob patterns resolved relative to the config root; including EVAL.ts (e.g., 'evals/**/EVAL.ts') is a supported way to discover non-standard test filenames. This corroborates the PR change adding EVAL.ts discovery to evals/vitest.config.ts.
Tools/sources used
- Context7_resolve-library-id → resolved /tanstack/router and /vitest-dev/vitest.
- Context7_query-docs → documentation pages for TanStack Router (createFileRoute, module augmentation) and Vitest (defineConfig + test.include).
🔇 Additional comments (45)
docs/reference/remark.mdx (1)
108-108: LGTM! Documentation accurately reflects the new architecture.
The updated wording correctly describes the new monolithic artifact layout, where `remarkTypeTableToMarkdown` output is emitted to the root `llms-full.txt` instead of topic-scoped bundles. This aligns with the PR's objective of consolidating to a single root fallback file.
(Note: The static analysis hint suggesting "Markdown" capitalization is a false positive; this file consistently uses lowercase "markdown" as a common noun throughout.)
docs/authoring/components.mdx (1)
9-9: Good terminology update for the monolithic full-context artifact.
This wording now correctly points readers to `llms-full.txt` as the root fallback output.
apps/example/scripts/mdx-convert.ts (1)
40-40: Clean output reset before conversion looks correct.
Removing `outDir` before regeneration is a solid safeguard against stale docs artifacts.
evals/llms/ambiguous-output-routing/PROMPT.md (1)
1-5: Prompt structure is clean and lint-friendly.
The H1 at Line 1 and explicit `ANSWER.md` instruction make this fixture clear and consistent.
apps/example/scripts/llm-generate.ts (1)
4-6: Comment clarifications are helpful and accurate.
These updates better explain shared artifact ownership and frontmatter-driven membership without changing behavior.
Also applies to: 64-64
evals/llms/negative-vector-index/PROMPT.md (1)
1-4: This prompt is well-formed and consistent with fixture conventions.
Heading + concise task framing + explicit `ANSWER.md` destination are all in good shape.
docs/build/bundle-package-docs.mdx (1)
62-62: Docs updates correctly reflect the new public artifact set.
The revised wording is consistent with the root `llms-full.txt` fallback model.
Also applies to: 178-178
evals/llms/cross-group-agent-flows/PROMPT.md (1)
1-5: Clear, scoped fixture prompt with good structure.
This is ready as-is for eval harness usage.
docs/index.mdx (1)
17-17: Documentation updates align with the new artifact structure.
The Mermaid diagram, step description, and quickstart link text are updated consistently to reflect the root `llms-full.txt` monolith output. The terminology shift from "generated bundle" to "generated output" matches the broader docs changes.
Also applies to: 56-56, 65-65
docs/methodology.mdx (1)
25-25: Documentation correctly reflects the simplified artifact surface.
The methodology page now accurately describes the root `llms-full.txt` fallback pattern instead of topic-scoped bundles, consistent with the PR's consolidation of full-context output.
Also applies to: 40-40
packages/leadtype/src/llm/readability.ts (1)
20-23: Pattern update correctly reflects the new artifact layout.
`DOCS_AGENT_ARTIFACT_PATTERN` now excludes `/docs/llms-full*.txt` paths since full-context output moved to root. The `ROOT_AGENT_ARTIFACT_PATTERN` still matches `/llms-full.txt`, so `isAgentReadabilityArtifactPath` will recognize the new monolithic artifact location.
docs/quickstart.mdx (1)
50-51: Quickstart accurately documents the simplified output structure.
The step title, file tree, and inspection instructions are updated to reflect the single `llms-full.txt` file at the output root, matching the implementation changes.
Also applies to: 68-68, 81-81
evals/llms/negative-vector-index/expected.json (1)
1-15: Eval fixture structure is reasonable for negative assertion testing.
The combination of requiring keywords like "does not" with `forbiddenAnswerPatterns` blocking the affirmative phrase provides a reasonable guard against false positives. The `expectedPages` targeting `docs/reference/search.md` correctly anchors the expected documentation source.
packages/leadtype/src/llm/llm.ts (3)
594-600: Helper correctly strips duplicate title headings.
`stripLeadingTitleHeading` safely handles edge cases (empty content, missing newlines) via the nullish coalescing on `lines[0]` and returns the original content when no match occurs.
602-636: Full-context document renderer produces clean consolidated output.
The function correctly:
- Generates an index of included pages with links
- Strips leading H1 from each page's content to avoid duplication with the injected title block
- Handles empty pages gracefully with a fallback message
666-690: Generation correctly moves full-context output to root and cleans stale artifacts.
The cleanup of both `llms-full/` directory and `docs/llms-full.txt` (lines 684-686) ensures migration from the old structure is clean. The `resolveGroups` call validates config consistency even though the groups aren't used for routing anymore.
packages/leadtype/src/cli/generate.ts (2)
76-76: Type and usage text correctly reflect the new artifact location.
The `GenerateResult.files` property rename from `docsLlmsFullTxt` to `llmsFullTxt` and the updated `GENERATE_USAGE` text accurately document that the full-context artifact is now at the output root.
Also applies to: 96-101
463-463: Site mode result correctly reports root-level path.
The `llmsFullTxt` path is set to `path.join(outDir, "llms-full.txt")`, matching the generation logic in `llm.ts`.
docs/reference/cli.mdx (1)
28-29: CLI reference documentation accurately reflects the implementation changes.
The flag descriptions, JSON output shape example, and cross-references are all updated to match the new monolithic `llms-full.txt` artifact at the output root. The `llmsFullTxt` property in the example JSON matches the `GenerateResult` type definition.
Also applies to: 47-47, 65-65, 100-100
evals/lib/llms-variants.ts (1)
212-213: `/docs/llms.txt` is still not materialized in the sandbox variant output.
Line 212 writes only root `llms.txt`; the docs-scoped index expected by the hosted contract is still absent from generated fixture files.
evals/llms/exact-symbol-readability/expected.json (1)
1-11: Fixture expectations are aligned with the new artifact model.
The patterns and expected page/group checks are coherent with validating root `llms-full.txt` behavior.
evals/llms/single-page-cli-flag/PROMPT.md (1)
1-5: Prompt structure and instructions look good.
The heading + explicit “write to `ANSWER.md`” instruction makes this fixture unambiguous for eval runs.
evals/llms/ambiguous-output-routing/EVAL.ts (1)
1-3: Fixture entrypoint wiring is clean and correct.
Using `new URL(".", import.meta.url)` here keeps fixture resolution local and reproducible.
evals/package.json (1)
9-9: New eval script is a good addition.
`evals:llms` is clear and directly maps to the new harness entrypoint.
docs/build/connect-docs-site.mdx (1)
12-12: Docs updates are consistent with the monolithic `llms-full.txt` change.
The flow diagram, verification checklist, and new evals reference are all aligned with the reduced public artifact surface.
Also applies to: 167-167, 188-188
evals/llms/single-group-authoring/EVAL.ts (1)
1-3: Standardized fixture entrypoint looks good.
This matches the shared `assertLlmsFixture` pattern and keeps test wiring consistent.
apps/example/tests/e2e/smoke.e2e.ts (1)
186-187: Updated e2e assertions correctly track the new full-context output.
These checks now validate the monolithic `/llms-full.txt` content instead of router-specific wording.
evals/llms/cross-group-agent-flows/EVAL.ts (1)
1-3: Entry-point implementation is solid.
This is concise and consistent with the other LLMS fixture evaluators.
evals/llms/single-page-cli-flag/EVAL.ts (1)
1-3: Looks good: fixture entrypoint is minimal and consistent.
This keeps fixture execution deterministic by resolving against the local eval directory.
evals/vitest.config.ts (1)
3-6: Config update is aligned with the eval harness goals.
Including both unit tests and `EVAL.ts` entrypoints in one place is clear and maintainable.
evals/llms/exact-symbol-readability/EVAL.ts (1)
1-3: Nice and consistent with the shared fixture pattern.
This keeps eval behavior centralized in `assertLlmsFixture`.
docs/docs.config.ts (1)
7-10: Documentation contract updates are coherent and internally consistent.
The wording now clearly communicates root-level `llms-full.txt` as fallback behavior.
Also applies to: 23-23, 48-48
evals/README.md (1)
57-83: Great addition: benchmark variants and invocation examples are clear.
The monolith vs router distinction is especially well documented.
apps/example/scripts/llm-generate-real.ts (1)
4-8: Good alignment with the new root fallback behavior.
Both header docs and `agentGuidance` now consistently describe the intended agent path.
Also applies to: 36-36
docs/reference/evals.mdx (1)
23-30: Strong reference doc: variant matrix, outcome table, and default contract are all clear.
This is a solid addition for explaining why the default artifact shape changed.
Also applies to: 33-43, 47-57
packages/leadtype/src/cli.test.ts (1)
122-125: Test contract updates for root `llms-full.txt` look solid.
These assertions correctly pin both filesystem shape and generated content semantics.
Also applies to: 147-150, 177-195
apps/example/src/routes/docs/reference/evals.tsx (1)
1-14: LGTM!
The route follows the established TanStack Router file-based routing pattern. The `createFileRoute` usage with `component` and `head` options is idiomatic, and the simple wrapper component correctly renders the MDX document.
docs/build/optimize-docs-for-agents.mdx (1)
13-14: LGTM!
Documentation updates correctly reflect the new monolithic `/llms-full.txt` artifact layout. The file tree, artifact lists, and checklist are all consistent with the PR's change from docs-scoped full-context files to a single root fallback.
Also applies to: 52-52, 103-103, 225-225, 264-264
packages/leadtype/src/llm/llm.test.ts (4)
209-268: LGTM!
Test correctly validates the new monolithic output model: checks that root `llms-full.txt` exists with expected content, and verifies docs-scoped full-context artifacts (`docs/llms-full.txt`, `docs/llms-full/`) are absent.
270-301: LGTM!
The multi-group page deduplication test correctly validates that shared pages appear exactly once in the monolithic `llms-full.txt` using a regex match count.
303-348: LGTM!
The stale cleanup test properly seeds legacy docs-scoped full-context files and verifies they're removed after generation while the new root `llms-full.txt` exists.
748-754: LGTM!
Artifact path recognition tests correctly updated: `/llms-full.txt` remains recognized while `/docs/llms-full.txt` and nested paths are no longer treated as agent-readability artifacts.
evals/run-llms-eval.ts (2)
303-317: LGTM!
The `toSafeArchivePath` helper correctly guards against path traversal by rejecting absolute paths, normalized traversal patterns, and paths containing `..` segments.
319-379: LGTM!
The Vitest spawn wrapper correctly implements timeout protection with `SIGKILL` after `VITEST_TIMEOUT_MS`, clears the timeout on normal completion, and handles spawn errors gracefully.
apps/example/src/routeTree.gen.ts (1)
28-28: LGTM!
Auto-generated route tree correctly includes the new `/docs/reference/evals` route across all type maps, module augmentations, and route children structures.
Also applies to: 123-127, 221-221, 251-251, 284-284, 318-318, 348-348, 380-380, 513-519, 627-627, 647-647
- Make llms variant dispatch exhaustive in materializeLlmsVariant and renderLlmsTxt so a new LLMS_VARIANT_VALUES entry fails to compile until both call sites handle it.
- Strip any leading H1 from page content (not only the exact frontmatter title) so the monolithic llms-full.txt does not double-print titles when the source markdown's H1 differs from the frontmatter title.
- Scope the llms eval toolset to read-only docs tools; npm is unused for the hosted-docs benchmark and only burned model steps.
- Unify run-eval.ts parsePositiveInt with the stricter run-llms-eval.ts version so missing values throw instead of silently defaulting to 1.
- Remove dead TRANSCRIPT_PATH guard after readTranscript and refresh the evals README layout to cover both benchmarks.
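The "strip any leading H1" behavior in the second bullet can be sketched roughly as below. This is an illustrative helper under assumed semantics (skip leading blank lines, drop the first line if it is an ATX-style H1) and is not the code in `packages/leadtype/src/llm/llm.ts`; the name `stripLeadingH1` is hypothetical.

```typescript
// Matches an ATX-style top-level heading such as "# Title" (but not "## Title").
const LEADING_H1_PATTERN = /^#\s+.+$/;

// Hypothetical helper: removes the first non-blank line when it is an H1.
function stripLeadingH1(content: string): string {
  const lines = content.split("\n");
  const firstContentIndex = lines.findIndex((line) => line.trim() !== "");
  if (firstContentIndex === -1) {
    return content;
  }
  if (!LEADING_H1_PATTERN.test(lines[firstContentIndex] ?? "")) {
    return content;
  }
  // Drop the heading, then trim the blank lines that followed it.
  return lines
    .slice(firstContentIndex + 1)
    .join("\n")
    .replace(/^\n+/, "");
}

// Usage: the heading is removed regardless of what the frontmatter title says.
const strippedBody = stripLeadingH1("# Source Title\n\nBody text.");
```

Because the check no longer compares against the frontmatter title, a page whose source H1 differs from that title would still have the heading removed before the injected title block is added.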
Summary
- `llms-full` generation to publish a single root `/llms-full.txt` full-context fallback instead of docs-scoped and per-group full-context artifacts.
- `AGENTS.md`, while cleaning stale generated `docs/llms-full*` artifacts during generation.
Why
The eval results showed the root `llms-full.txt` monolith was the only tested hosted-docs format that passed all six fixtures on both Claude Opus 4.7 and GPT-5.5. This keeps the default public artifact surface small while leaving grouped/router variants available in the eval harness for future larger-corpus testing.
Validation
- `bun run --filter example build`
- `bun test packages/leadtype/src/llm/llm.test.ts packages/leadtype/src/cli.test.ts`
- `cd evals && bun test lib`
- `bun x ultracite check`
- `git diff --check`
The branch is rebased onto `main` and the compare is 1 commit ahead, 0 behind.