
Support Vercel Agent Readability spec end-to-end (sitemap, llms.txt, markdown mirrors)#19

Merged
KayleeWilliams merged 5 commits into main from KayleeWilliams/support-vercel-agent-readability-spec-sitemap.xm
May 10, 2026

@KayleeWilliams
Collaborator

Summary

Wires up the Vercel Agent Readability discovery layer — /llms.txt, markdown mirrors, JSON-LD/canonical/alternate metadata, sitemap, robots, and an agent-readability.json manifest — and ships runtime helpers that work across edge runtimes, plus a complete dogfooded reference in apps/example running through nitro middleware in dev + preview + prod.

  • Build-time generation (leadtype/llm): generateAgentReadabilityArtifacts produces a versioned manifest plus docs-scoped sitemap.xml/sitemap.md/robots.txt. llms-full.txt routing files now use root-relative URLs so they're origin-agnostic.
  • Runtime helpers (leadtype/llm/readability, fs-free, edge-safe):
    • createAgentMarkdownResponse — async, returns a Web Response. Async-tolerant readMarkdownFile works in CF Workers / Vercel Edge / KV / R2.
    • createSitemapXmlResponse / createSitemapMarkdownResponse / createRobotsTxtResponse — rebase manifest URLs against the live requestOrigin, no string-replace hacks.
    • createDocsHead — framework-neutral { meta, links } for canonical, alternate, og:*, JSON-LD.
  • Polish: expanded AI user-agent default list (GPTBot/ClaudeBot/AmazonBot/Bingbot/MetaExternalAgent/ByteSpider/PerplexityBot/MistralBot/AppleBot/YouBot/…), configurable userAgentPattern, q-value-aware Accept parsing, CRLF-tolerant frontmatter, Cache-Control: public, max-age=300, must-revalidate defaults, manifest version runtime assertion. Fixed a bug where /llms-full.txt was being shadowed by the missing-page handler when an agent sent Accept: text/markdown.
  • Dogfooded end-to-end in apps/example/server/middleware/agent-readability.ts — a single nitro+h3 middleware handles every artifact path + markdown content negotiation in dev, preview, and production. Replaces a 200-line dev-only Vite plugin and the build-time URL string-replace.
  • Refactor: readability.ts is now the source of truth for runtime helpers; llm.ts re-exports from it (drops ~200 lines of duplication).
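
For reviewers unfamiliar with q-value-aware Accept parsing, here is a minimal sketch of the idea. The function name `acceptsMarkdown` is hypothetical and is not the helper exported by leadtype/llm/readability; it only assumes that `text/markdown` with `q=0` should be treated as a refusal.

```typescript
// Hypothetical helper, not leadtype's actual implementation.
// Returns true when the Accept header lists text/markdown with q > 0.
export function acceptsMarkdown(acceptHeader: string | null): boolean {
  if (!acceptHeader) {
    return false;
  }

  for (const part of acceptHeader.split(",")) {
    const [rawType, ...params] = part.trim().split(";");
    if ((rawType ?? "").trim().toLowerCase() !== "text/markdown") {
      continue;
    }

    // Quality defaults to 1 when no q parameter is present.
    let quality = 1;
    for (const param of params) {
      const [key, value] = param.trim().split("=");
      if ((key ?? "").trim().toLowerCase() === "q") {
        const parsed = Number.parseFloat(value ?? "");
        quality = Number.isNaN(parsed) ? 0 : parsed;
      }
    }
    return quality > 0;
  }

  return false;
}
```

A plain substring check would wrongly accept `text/markdown;q=0`, which is the case a q-value-aware parser is meant to catch.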

Test plan

  • bun test in packages/leadtype — 122 passing
  • bun run check-types in packages/leadtype and apps/example — clean
  • bun run build in packages/leadtype — emits dist/llm/readability.{js,d.ts} with the new helpers
  • Playwright e2e (apps/example/tests/e2e/smoke.e2e.ts) — 10 passing, including the new /llms-full.txt regression and Cache-Control assertion
  • Manual curl against the running dev server: /sitemap.xml and /robots.txt reflect the live origin (not the build-time https://docs.example.com); User-Agent: AmazonBot/1.0 triggers a markdown response; HEAD /docs/quickstart.md returns headers with empty body
  • Reviewer: run npx @vercel/agent-readability audit http://localhost:5173/docs against the dev server and confirm no regressions
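
The origin check in the manual curl step can be pictured with a small sketch of origin rebasing. `rebaseToOrigin` is a hypothetical name used only for illustration; the real helpers take a manifest and a requestOrigin and may work differently.

```typescript
// Hypothetical helper: keeps the path, query, and hash of a URL
// but swaps in the origin of the live request.
export function rebaseToOrigin(url: string, requestOrigin: string): string {
  // Parsing against the request origin also resolves root-relative paths.
  const parsed = new URL(url, requestOrigin);
  const { origin } = new URL(requestOrigin);
  return `${origin}${parsed.pathname}${parsed.search}${parsed.hash}`;
}
```

With this shape, a build-time URL such as https://docs.example.com/docs/quickstart becomes http://localhost:5173/docs/quickstart when the request arrives at the dev server, without any string replacement on the rendered file.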

🤖 Generated with Claude Code

…ase, dogfooded via nitro

- readability.ts becomes the source of truth (drops ~200 lines of duplication
  with llm.ts); fixes /llms-full.txt being shadowed by the missing-page handler
- createAgentMarkdownResponse is now async + returns Web Response; supports
  async readMarkdownFile so edge runtimes (CF Workers, Vercel Edge) can plug in
- Adds createSitemapXmlResponse / createSitemapMarkdownResponse /
  createRobotsTxtResponse runtime regenerators that rebase to the live origin,
  and createDocsHead for canonical/alternate/og/json-ld metadata
- Tightens AI UA list, q-value parsing on Accept, CRLF in frontmatter; adds
  configurable userAgentPattern, Cache-Control defaults, manifest version guard
- llms-full.txt routing files use root-relative URLs so they're origin-agnostic
- apps/example: server/middleware/agent-readability.ts handles every artifact
  path + markdown content negotiation in dev/preview/prod via nitro+h3,
  replacing the 200-line dev-only Vite plugin
@coderabbitai

coderabbitai Bot commented May 10, 2026

Review Change Stack

Caution

Review failed

Pull request was closed or merged during review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 059a5c8d-1b94-41aa-a0e6-09477c9fcddb

📥 Commits

Reviewing files that changed from the base of the PR and between 552ab85 and ab64bca.

📒 Files selected for processing (1)
  • apps/example/server/utils/agent-readability.ts
📜 Recent review details
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx}: Use explicit types for function parameters and return values when they enhance clarity
Prefer unknown over any when the type is genuinely unknown
Use const assertions (as const) for immutable values and literal types
Leverage TypeScript's type narrowing instead of type assertions

Files:

  • apps/example/server/utils/agent-readability.ts
**/*.{js,ts,jsx,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{js,ts,jsx,tsx}: Use meaningful variable names instead of magic numbers - extract constants with descriptive names
Use arrow functions for callbacks and short functions
Prefer for...of loops over .forEach() and indexed for loops
Use optional chaining (?.) and nullish coalescing (??) for safer property access
Prefer template literals over string concatenation
Use destructuring for object and array assignments
Use const by default, let only when reassignment is needed, never var
Always await promises in async functions - don't forget to use the return value
Use async/await syntax instead of promise chains for better readability
Handle errors appropriately in async code with try-catch blocks
Don't use async functions as Promise executors
Remove console.log, debugger, and alert statements from production code
Throw Error objects with descriptive messages, not strings or other values
Use try-catch blocks meaningfully - don't catch errors just to rethrow them
Prefer early returns over nested conditionals for error cases
Extract complex conditions into well-named boolean variables
Use early returns to reduce nesting
Prefer simple conditionals over nested ternary operators
Don't use eval() or assign directly to document.cookie
Avoid spread syntax in accumulators within loops
Use top-level regex literals instead of creating them in loops
Prefer specific imports over namespace imports
Use descriptive names for functions, variables, and types for meaningful naming
Add comments for complex logic, but prefer self-documenting code

Files:

  • apps/example/server/utils/agent-readability.ts
🔍 Remote MCP Context7

Summary of additional, concrete facts relevant to reviewing this PR

  • Vercel documents an "Agent Resources" surface and exposes a machine-readable llms-full.txt at https://vercel.com/docs/llms-full.txt (Agent Resources / llms-full.txt) — relevant to the PR's llms.txt / llms-full.txt outputs and markdown-mirror endpoints.

Sources / tool calls

  • Context7_resolve-library-id → resolved Vercel-related libraries (included Vercel AI SDK metadata)
  • Context7_query-docs → Vercel docs (Agent Resources / llms-full.txt reference)
🔇 Additional comments (2)
apps/example/server/utils/agent-readability.ts (2)

22-34: Harden forwarded-header origin derivation before emitting absolute URLs.

Line 23 and Line 26 still trust x-forwarded-host / x-forwarded-proto directly; this can allow origin poisoning if those headers are not sanitized upstream. Please validate protocol/host and fall back to getRequestURL(event).origin for malformed values.
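
One possible shape for that hardening, as a sketch only. `resolveRequestOrigin` and its parameters are hypothetical; the caller is assumed to pass the raw x-forwarded-proto and x-forwarded-host values plus getRequestURL(event).origin as the fallback.

```typescript
// Hypothetical sketch; the real utility may be structured differently.
const ALLOWED_PROTOCOLS = ["http", "https"] as const;
// Hostname with an optional port: letters, digits, dots, and hyphens only.
const HOST_PATTERN = /^[a-z0-9.-]+(?::\d{1,5})?$/i;

export function resolveRequestOrigin(
  forwardedProto: string | null,
  forwardedHost: string | null,
  fallbackOrigin: string
): string {
  // Proxies may append values; only the first entry is considered here.
  const proto = forwardedProto?.split(",")[0]?.trim().toLowerCase();
  const host = forwardedHost?.split(",")[0]?.trim().toLowerCase();

  const isValidProto =
    proto !== undefined &&
    ALLOWED_PROTOCOLS.some((allowed) => allowed === proto);
  const isValidHost = host !== undefined && HOST_PATTERN.test(host);

  if (!(isValidProto && isValidHost)) {
    return fallbackOrigin;
  }
  return `${proto}://${host}`;
}
```

Format validation only rejects malformed values. A well-formed but attacker-chosen host still passes, so deployments that cannot trust their proxy may prefer an explicit host allowlist.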


46-61: Good async read path and selective missing-file handling.

This now uses non-blocking file reads and correctly rethrows non-missing-file errors instead of masking runtime faults.
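
For reference, the pattern being praised looks roughly like the sketch below. The helper `isMissingFileError` and the choice of ENOENT and ENOTDIR as "missing" codes are assumptions, not a copy of the file under review.

```typescript
import { readFile } from "node:fs/promises";

// Hypothetical helper: treats only "not found" style errors as missing files.
export function isMissingFileError(error: unknown): boolean {
  if (typeof error !== "object" || error === null || !("code" in error)) {
    return false;
  }
  const { code } = error;
  return code === "ENOENT" || code === "ENOTDIR";
}

// Non-blocking read: null for a missing mirror, rethrow for anything else
// (permissions, I/O failures) so real faults are not masked as 404s.
export async function readMarkdownFile(
  filePath: string
): Promise<string | null> {
  try {
    return await readFile(filePath, "utf8");
  } catch (error) {
    if (isMissingFileError(error)) {
      return null;
    }
    throw error;
  }
}
```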


📝 Walkthrough

Summary by CodeRabbit

  • New Features
    • Agent-readability discovery layer: site exposes sitemap/robots/manifest and serves markdown to agents, plus generated docs head metadata and canonical/alternate links.
  • Documentation
    • New "Optimize Docs for Agents" guide and updated docs describing discovery artifacts, content negotiation, and verification steps.
  • Tests
    • End-to-end and CLI tests updated to cover agent-readability artifacts and docs/head behavior.

Walkthrough

Implements Vercel Agent Readability specification: adds build-time artifact generation and manifest, runtime readability helpers and response builders, package/CLI exports, example app middleware and docs-head wiring, route additions, expanded tests, and documentation.

Changes

Agent Readability Spec Implementation

| Layer | File(s) | Summary |
| --- | --- | --- |
| Build script & Example integration | apps/example/scripts/llm-generate.ts, apps/example/package.json | Build script now calls generateAgentReadabilityArtifacts, writes agent-readability.json, and removes stale sitemap/robots; example npm scripts run pipeline:build. |
| Server Middleware | apps/example/server/middleware/agent-readability.ts | Nitro middleware serves discovery endpoints (/sitemap.xml, /sitemap.md, /robots.txt, docs variants) and delegates other paths to createAgentMarkdownResponse. |
| Example Utilities | apps/example/server/utils/agent-readability.ts | Loads generated manifest as typed agentReadabilityManifest, derives request origin from forwarded headers, and reads markdown mirror files safely, returning null for missing-file errors. |
| Route Tree & New Page | apps/example/src/routeTree.gen.ts, apps/example/src/routes/docs/build/optimize-docs-for-agents.tsx | Adds new /docs/build/optimize-docs-for-agents route to generated route tree with typing and registers it under docs children. |
| Docs Head Wiring | apps/example/src/lib/docs-head.ts, apps/example/src/routes/docs/... | Adds typed createDocsHead(urlPath) and wires it into many docs routes to supply metadata (title, description, JSON-LD, canonical/alternate links). |
| Vite / TSConfig / Scripts | apps/example/vite.config.ts, apps/example/tsconfig.json, apps/example/package.json | Removes legacy Vite markdown negotiation plugin, sets nitro serverDir, includes server/**/* in TS include, adds path alias leadtype/llm/readability, and updates scripts to use pipeline:build. |
| Search Index Exclusion | packages/leadtype/src/search/node.ts | Excludes generated markdown files (e.g., sitemap.md) from readMarkdownDocs indexing. |
| LLM Build Changes | packages/leadtype/src/llm/llm.ts | Adds lastModified to MarkdownDoc, switches rendered links to markdown-relative url, computes lastModified from frontmatter/mtime, and implements generateAgentReadabilityArtifacts to emit docs/ sitemaps, robots, and manifest. |
| Readability Runtime Module | packages/leadtype/src/llm/readability.ts | New runtime-side module exporting types and helpers for agent readability: manifest/page/navigation types, Accept/user-agent negotiation, markdown mirror resolution, header builders, frontmatter enrichment, JSON-LD, sitemap/robots renderers, and createDocsHead. |
| Agent Markdown Response | packages/leadtype/src/llm/readability.ts | Implements createAgentMarkdownResponse with manifest validation, method gating, content negotiation, mirror resolution, markdown reading/enrichment, missing-page rendering, and response header assembly. |
| Discovery Response Builders | packages/leadtype/src/llm/readability.ts | Provides pure renderers and Response factories for sitemap XML/Markdown and robots.txt with origin rebasing and cache-control handling. |
| Package Exports & Build | packages/leadtype/package.json, packages/leadtype/rollup.config.ts, packages/leadtype/src/llm/index.ts, packages/leadtype/src/index.ts | Adds ./llm/readability export, Rollup entry for llm/readability, and reorganizes barrel exports to expose build-time generator and runtime helpers/types. |
| CLI Integration | packages/leadtype/src/cli/generate.ts, packages/leadtype/src/cli.test.ts | Extends GenerateResult.files with agent-readability outputs, updates CLI usage text, invokes generateAgentReadabilityArtifacts in site mode, and updates tests to assert new outputs. |
| Unit & Integration Tests | packages/leadtype/src/llm/llm.test.ts, packages/leadtype/src/cli.test.ts | Adds extensive tests for generateAgentReadabilityArtifacts, readability helpers (JSON-LD, negotiation, enrichment), artifact renderers, and createDocsHead behavior. |
| E2E Tests | apps/example/tests/e2e/smoke.e2e.ts | Playwright tests updated/expanded to validate discovery endpoints, sitemap/robots contents, llms endpoints, content negotiation for markdown, docs metadata, markdown mirrors, and markdown 404 behavior. |
| Documentation | docs/build/connect-docs-site.mdx, docs/build/optimize-docs-for-agents.mdx, docs/reference/cli.mdx, docs/reference/llm.mdx, docs/quickstart.mdx, docs/how-it-works.mdx, docs/index.mdx | Adds optimize-docs-for-agents guide, updates connect-docs-site with runtime wiring, updates CLI/LLM references and quickstart to include agent-readability artifacts and verification steps. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • inthhq/leadtype#14: Related through Rollup/package export and build configuration overlap.
  • inthhq/docs#11: Related via example app route/docs wiring and tests.
  • inthhq/leadtype#13: Related through overlapping docs/site pipeline and example app edits.

Poem

🐰 A small rabbit hops and reads the site,

Sitemaps, manifests glowing bright.
Markdown mirrors, headers neat—so clever!
Agents feast on docs now and ever.
Hooray for manifests—hop, hop, delight!


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 667a9075d2

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +844 to +848
renderSitemapMarkdown({
  product: { name: config.productName ?? config.manifest.product.name },
  navigation: config.navigation ?? config.manifest.navigation,
  pages: rebased,
}),

P2 Badge Include merged pages in the markdown sitemap

When a host passes pages: [...manifest.pages, ...marketingPages] to merge non-docs routes, this response still renders with the manifest navigation tree, and renderSitemapMarkdown only emits pages reachable from that navigation. As a result, the same merged pages that appear in /sitemap.xml are silently omitted from /sitemap.md, despite the shared pages option being documented for sitemap merging; this affects sites that add blog/marketing/changelog pages at runtime.
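
One way to address this, sketched under simplified assumptions: after rendering the navigation tree, append any pages that were never reached. The `SitemapPage` and `NavSection` types and the "Other pages" heading are hypothetical and much flatter than the real manifest types.

```typescript
// Hypothetical, simplified types for illustration only.
interface SitemapPage {
  title: string;
  url: string;
}

interface NavSection {
  title: string;
  urls: string[];
}

export function renderSitemapMarkdownWithExtras(
  navigation: NavSection[],
  pages: SitemapPage[]
): string {
  const pagesByUrl = new Map(pages.map((page) => [page.url, page] as const));
  const listedUrls = new Set<string>();
  const lines: string[] = ["# Sitemap"];

  // First pass: pages reachable from the navigation tree.
  for (const section of navigation) {
    lines.push("", `## ${section.title}`, "");
    for (const url of section.urls) {
      const page = pagesByUrl.get(url);
      if (!page) {
        continue;
      }
      listedUrls.add(url);
      lines.push(`- [${page.title}](${page.url})`);
    }
  }

  // Second pass: merged pages (blog, marketing, changelog) not in navigation.
  const unlistedPages = pages.filter((page) => !listedUrls.has(page.url));
  if (unlistedPages.length > 0) {
    lines.push("", "## Other pages", "");
    for (const page of unlistedPages) {
      lines.push(`- [${page.title}](${page.url})`);
    }
  }

  return `${lines.join("\n")}\n`;
}
```

This keeps /sitemap.md and /sitemap.xml in agreement on the set of pages while leaving the navigation-driven ordering untouched.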


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/example/server/utils/agent-readability.ts`:
- Around line 35-41: The readMarkdownFile function currently uses readFileSync
which blocks the event loop; change it to an asynchronous implementation by
replacing readFileSync with readFile imported from "node:fs/promises", update
the function signature readMarkdownFile to return Promise<string | null> and
make it async, and then update any callers/middleware that invoke
readMarkdownFile to await the result (e.g., where readMarkdownFile is used in
your middleware or handlers) so the non-blocking pattern is preserved.

In `@apps/example/src/lib/docs-head.ts`:
- Line 8: Replace the type assertion on the variable named manifest (currently
written as assigning agentReadability with "as AgentReadabilityManifest") with
an explicit type annotation so the declaration reads as manifest having type
AgentReadabilityManifest and is initialized from agentReadability; also apply
the same change to the other occurrence where agentReadability is asserted to
AgentReadabilityManifest (the similar manifest/assignment in
agent-readability.ts) so both sites use explicit type annotations instead of
"as" assertions.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 12fd800b-bb8e-4d94-8da4-5bfdc071b26a

📥 Commits

Reviewing files that changed from the base of the PR and between f64df00 and 667a907.

⛔ Files ignored due to path filters (4)
  • apps/example/src/generated/agent-readability.json is excluded by !**/generated/**
  • apps/example/src/generated/docs-nav.json is excluded by !**/generated/**
  • apps/example/src/generated/docs-search-content.json is excluded by !**/generated/**
  • apps/example/src/generated/docs-search-index.json is excluded by !**/generated/**
📒 Files selected for processing (45)
  • apps/example/package.json
  • apps/example/scripts/llm-generate.ts
  • apps/example/server/middleware/agent-readability.ts
  • apps/example/server/utils/agent-readability.ts
  • apps/example/src/lib/docs-head.ts
  • apps/example/src/routeTree.gen.ts
  • apps/example/src/routes/docs/authoring/components.tsx
  • apps/example/src/routes/docs/authoring/frontmatter.tsx
  • apps/example/src/routes/docs/build/bundle-package-docs.tsx
  • apps/example/src/routes/docs/build/connect-docs-site.tsx
  • apps/example/src/routes/docs/build/optimize-docs-for-agents.tsx
  • apps/example/src/routes/docs/build/validate-in-ci.tsx
  • apps/example/src/routes/docs/how-it-works.tsx
  • apps/example/src/routes/docs/index.tsx
  • apps/example/src/routes/docs/methodology.tsx
  • apps/example/src/routes/docs/quickstart.tsx
  • apps/example/src/routes/docs/reference/cli.tsx
  • apps/example/src/routes/docs/reference/convert.tsx
  • apps/example/src/routes/docs/reference/lint.tsx
  • apps/example/src/routes/docs/reference/llm.tsx
  • apps/example/src/routes/docs/reference/remark.tsx
  • apps/example/src/routes/docs/reference/search.tsx
  • apps/example/src/routes/index.tsx
  • apps/example/tests/e2e/smoke.e2e.ts
  • apps/example/tsconfig.json
  • apps/example/vite.config.ts
  • docs/build/connect-docs-site.mdx
  • docs/build/optimize-docs-for-agents.mdx
  • docs/docs.config.ts
  • docs/how-it-works.mdx
  • docs/index.mdx
  • docs/quickstart.mdx
  • docs/reference/cli.mdx
  • docs/reference/llm.mdx
  • packages/leadtype/package.json
  • packages/leadtype/src/cli.test.ts
  • packages/leadtype/src/cli/generate.ts
  • packages/leadtype/src/index.ts
  • packages/leadtype/src/internal/package-surface.test.ts
  • packages/leadtype/src/llm/index.ts
  • packages/leadtype/src/llm/llm.test.ts
  • packages/leadtype/src/llm/llm.ts
  • packages/leadtype/src/llm/readability.ts
  • packages/leadtype/src/search/node.ts
  • packages/leadtype/tsup.config.ts
📜 Review details
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx}: Use explicit types for function parameters and return values when they enhance clarity
Prefer unknown over any when the type is genuinely unknown
Use const assertions (as const) for immutable values and literal types
Leverage TypeScript's type narrowing instead of type assertions

Files:

  • apps/example/src/routes/docs/reference/search.tsx
  • apps/example/src/routes/docs/how-it-works.tsx
  • apps/example/src/routes/docs/build/bundle-package-docs.tsx
  • docs/docs.config.ts
  • apps/example/src/routes/docs/quickstart.tsx
  • apps/example/src/routes/docs/reference/convert.tsx
  • apps/example/src/routes/docs/reference/cli.tsx
  • apps/example/src/routes/docs/methodology.tsx
  • apps/example/src/routes/docs/reference/llm.tsx
  • apps/example/src/lib/docs-head.ts
  • packages/leadtype/tsup.config.ts
  • apps/example/src/routes/index.tsx
  • apps/example/src/routes/docs/reference/remark.tsx
  • apps/example/src/routes/docs/build/validate-in-ci.tsx
  • apps/example/src/routes/docs/build/optimize-docs-for-agents.tsx
  • apps/example/src/routes/docs/authoring/components.tsx
  • apps/example/src/routes/docs/build/connect-docs-site.tsx
  • apps/example/src/routes/docs/reference/lint.tsx
  • apps/example/src/routes/docs/authoring/frontmatter.tsx
  • apps/example/server/middleware/agent-readability.ts
  • packages/leadtype/src/index.ts
  • apps/example/src/routes/docs/index.tsx
  • apps/example/server/utils/agent-readability.ts
  • packages/leadtype/src/search/node.ts
  • apps/example/scripts/llm-generate.ts
  • packages/leadtype/src/internal/package-surface.test.ts
  • packages/leadtype/src/cli/generate.ts
  • packages/leadtype/src/cli.test.ts
  • packages/leadtype/src/llm/index.ts
  • apps/example/tests/e2e/smoke.e2e.ts
  • packages/leadtype/src/llm/llm.test.ts
  • apps/example/vite.config.ts
  • apps/example/src/routeTree.gen.ts
  • packages/leadtype/src/llm/readability.ts
  • packages/leadtype/src/llm/llm.ts
**/*.{js,ts,jsx,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{js,ts,jsx,tsx}: Use meaningful variable names instead of magic numbers - extract constants with descriptive names
Use arrow functions for callbacks and short functions
Prefer for...of loops over .forEach() and indexed for loops
Use optional chaining (?.) and nullish coalescing (??) for safer property access
Prefer template literals over string concatenation
Use destructuring for object and array assignments
Use const by default, let only when reassignment is needed, never var
Always await promises in async functions - don't forget to use the return value
Use async/await syntax instead of promise chains for better readability
Handle errors appropriately in async code with try-catch blocks
Don't use async functions as Promise executors
Remove console.log, debugger, and alert statements from production code
Throw Error objects with descriptive messages, not strings or other values
Use try-catch blocks meaningfully - don't catch errors just to rethrow them
Prefer early returns over nested conditionals for error cases
Extract complex conditions into well-named boolean variables
Use early returns to reduce nesting
Prefer simple conditionals over nested ternary operators
Don't use eval() or assign directly to document.cookie
Avoid spread syntax in accumulators within loops
Use top-level regex literals instead of creating them in loops
Prefer specific imports over namespace imports
Use descriptive names for functions, variables, and types for meaningful naming
Add comments for complex logic, but prefer self-documenting code

Files:

  • apps/example/src/routes/docs/reference/search.tsx
  • apps/example/src/routes/docs/how-it-works.tsx
  • apps/example/src/routes/docs/build/bundle-package-docs.tsx
  • docs/docs.config.ts
  • apps/example/src/routes/docs/quickstart.tsx
  • apps/example/src/routes/docs/reference/convert.tsx
  • apps/example/src/routes/docs/reference/cli.tsx
  • apps/example/src/routes/docs/methodology.tsx
  • apps/example/src/routes/docs/reference/llm.tsx
  • apps/example/src/lib/docs-head.ts
  • packages/leadtype/tsup.config.ts
  • apps/example/src/routes/index.tsx
  • apps/example/src/routes/docs/reference/remark.tsx
  • apps/example/src/routes/docs/build/validate-in-ci.tsx
  • apps/example/src/routes/docs/build/optimize-docs-for-agents.tsx
  • apps/example/src/routes/docs/authoring/components.tsx
  • apps/example/src/routes/docs/build/connect-docs-site.tsx
  • apps/example/src/routes/docs/reference/lint.tsx
  • apps/example/src/routes/docs/authoring/frontmatter.tsx
  • apps/example/server/middleware/agent-readability.ts
  • packages/leadtype/src/index.ts
  • apps/example/src/routes/docs/index.tsx
  • apps/example/server/utils/agent-readability.ts
  • packages/leadtype/src/search/node.ts
  • apps/example/scripts/llm-generate.ts
  • packages/leadtype/src/internal/package-surface.test.ts
  • packages/leadtype/src/cli/generate.ts
  • packages/leadtype/src/cli.test.ts
  • packages/leadtype/src/llm/index.ts
  • apps/example/tests/e2e/smoke.e2e.ts
  • packages/leadtype/src/llm/llm.test.ts
  • apps/example/vite.config.ts
  • apps/example/src/routeTree.gen.ts
  • packages/leadtype/src/llm/readability.ts
  • packages/leadtype/src/llm/llm.ts
**/*.{jsx,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{jsx,tsx}: Use function components over class components in React
Call hooks at the top level only, never conditionally
Specify all dependencies in hook dependency arrays correctly
Use the key prop for elements in iterables (prefer unique IDs over array indices)
Nest children between opening and closing tags instead of passing as props
Don't define components inside other components
Avoid dangerouslySetInnerHTML unless absolutely necessary
Use proper image components (e.g., Next.js <Image>) over <img> tags
Use Next.js <Image> component for images
Use next/head or App Router metadata API for head elements in Next.js
Use Server Components for async data fetching instead of async Client Components in Next.js
Use ref as a prop instead of React.forwardRef in React 19+

Files:

  • apps/example/src/routes/docs/reference/search.tsx
  • apps/example/src/routes/docs/how-it-works.tsx
  • apps/example/src/routes/docs/build/bundle-package-docs.tsx
  • apps/example/src/routes/docs/quickstart.tsx
  • apps/example/src/routes/docs/reference/convert.tsx
  • apps/example/src/routes/docs/reference/cli.tsx
  • apps/example/src/routes/docs/methodology.tsx
  • apps/example/src/routes/docs/reference/llm.tsx
  • apps/example/src/routes/index.tsx
  • apps/example/src/routes/docs/reference/remark.tsx
  • apps/example/src/routes/docs/build/validate-in-ci.tsx
  • apps/example/src/routes/docs/build/optimize-docs-for-agents.tsx
  • apps/example/src/routes/docs/authoring/components.tsx
  • apps/example/src/routes/docs/build/connect-docs-site.tsx
  • apps/example/src/routes/docs/reference/lint.tsx
  • apps/example/src/routes/docs/authoring/frontmatter.tsx
  • apps/example/src/routes/docs/index.tsx
**/*.{jsx,tsx,html}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{jsx,tsx,html}: Use semantic HTML and ARIA attributes for accessibility: provide meaningful alt text for images, use proper heading hierarchy, add labels for form inputs, include keyboard event handlers alongside mouse events, use semantic elements instead of divs with roles
Add rel="noopener" when using target="_blank" on links

Files:

  • apps/example/src/routes/docs/reference/search.tsx
  • apps/example/src/routes/docs/how-it-works.tsx
  • apps/example/src/routes/docs/build/bundle-package-docs.tsx
  • apps/example/src/routes/docs/quickstart.tsx
  • apps/example/src/routes/docs/reference/convert.tsx
  • apps/example/src/routes/docs/reference/cli.tsx
  • apps/example/src/routes/docs/methodology.tsx
  • apps/example/src/routes/docs/reference/llm.tsx
  • apps/example/src/routes/index.tsx
  • apps/example/src/routes/docs/reference/remark.tsx
  • apps/example/src/routes/docs/build/validate-in-ci.tsx
  • apps/example/src/routes/docs/build/optimize-docs-for-agents.tsx
  • apps/example/src/routes/docs/authoring/components.tsx
  • apps/example/src/routes/docs/build/connect-docs-site.tsx
  • apps/example/src/routes/docs/reference/lint.tsx
  • apps/example/src/routes/docs/authoring/frontmatter.tsx
  • apps/example/src/routes/docs/index.tsx
**/index.{js,ts,jsx,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Avoid barrel files (index files that re-export everything)

Files:

  • apps/example/src/routes/index.tsx
  • packages/leadtype/src/index.ts
  • apps/example/src/routes/docs/index.tsx
  • packages/leadtype/src/llm/index.ts
**/*.{test,spec}.{js,ts,jsx,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{test,spec}.{js,ts,jsx,tsx}: Write assertions inside it() or test() blocks
Avoid done callbacks in async tests - use async/await instead
Don't use .only or .skip in committed code
Keep test suites reasonably flat - avoid excessive describe nesting

Files:

  • packages/leadtype/src/internal/package-surface.test.ts
  • packages/leadtype/src/cli.test.ts
  • packages/leadtype/src/llm/llm.test.ts
🪛 ast-grep (0.42.1)
packages/leadtype/src/llm/readability.ts

[warning] 252-252: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(`^${name}\\s*:`, "m")
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html

(regexp-from-variable)


[warning] 263-263: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(`^${name}\\s*:\\s*['"]?([^'"\\r\\n]+)['"]?\\s*$`, "m")
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html

(regexp-from-variable)
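
If `name` can ever come from outside a fixed set of frontmatter keys, a common mitigation is to escape it before building the pattern. The sketch below is illustrative; `hasFrontmatterKey` is a hypothetical stand-in for the flagged code, not the function in readability.ts.

```typescript
// Matches every character with special meaning in a regular expression.
const REGEX_SPECIAL_CHARACTERS = /[.*+?^${}()|[\]\\]/g;

export function escapeRegExp(value: string): string {
  // "$&" re-inserts the matched character, prefixed with a backslash.
  return value.replace(REGEX_SPECIAL_CHARACTERS, "\\$&");
}

// Hypothetical helper mirroring the flagged pattern, with the key escaped
// so it is always matched literally.
export function hasFrontmatterKey(frontmatter: string, name: string): boolean {
  return new RegExp(`^${escapeRegExp(name)}\\s*:`, "m").test(frontmatter);
}
```

If the keys are a small fixed list known at build time, the warning is likely benign, and top-level regex literals per key would avoid dynamic construction entirely.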

🪛 LanguageTool
docs/build/optimize-docs-for-agents.mdx

[style] ~124-~124: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...our framework has a typed metadata API. Use renderJsonLdScript(page, manifest) if...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~150-~150: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ..., must-revalidate. readMarkdownFile` may be sync or async, so the same code work...

(MISSING_IT_THERE)

docs/how-it-works.mdx

[style] ~73-~73: Consider a different adjective to strengthen your wording.
Context: ...TTP and follow markdown mirror links to deeper context. Useless inside an npm tarball,...

(DEEP_PROFOUND)

docs/reference/llm.mdx

[style] ~37-~37: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
Context: ...ened into one file. Use for agents with very large context windows. | | `/docs/llms-f...

(EN_WEAK_ADJECTIVE)


[uncategorized] ~285-~285: Did you mean the formatting language “Markdown” (= proper noun)?
Context: ...hether an Accept header is asking for markdown (q-value aware). | | isAgentUserAgent...

(MARKDOWN_NNP)

🔍 Remote MCP


Summary of Key Implementation Details for PR Review

Based on my research, here is the relevant context and implementation guidance for reviewing this PR:

Vercel Agent Readability Spec Overview

AI agents look for llms.txt as a machine-readable index of a site's content, similar to how search engines use sitemap.xml. Without it, agents must crawl a site to find pages, which is slower and less reliable.

The spec covers three critical layers:

  1. Discovery Layer (site-wide): llms.txt, sitemaps, and robots.txt help agents find pages

  2. Structure Layer (per-page): Meta tags, headings, structured data, and markdown mirrors help agents parse pages and understand content

  3. Content Negotiation: AI agents use content negotiation, .md endpoints, agent auto-detection, llms.txt, sitemap.md, and MCP

Key Implementation Patterns from the PR

Based on the PR changes, the leadtype implementation aligns with:

  • Manifest Versioning: The PR includes manifest.version === 1 assertion that throws on mismatch (per llm.ts and readability.ts changes), ensuring runtime compatibility.

  • Agent Detection: User-agent matching checks against a maintained list of known AI agent strings (Claude, ChatGPT, GPTBot, Cursor, Copilot, and others), which is the most reliable signal

  • Markdown Response Headers: When an agent fetches a docs page, it should get markdown with frontmatter including title, description, canonical_url, md_url, and last_updated ISO timestamp

  • JSON-LD Metadata: Include a <script type="application/ld+json"> block with at minimum title, description, canonical URL, dateModified, and BreadcrumbList

  • Sitemap Strategy: Publish both sitemap.xml and sitemap.md to help agents understand site structure—XML sitemaps are standard for search crawlers, while markdown sitemaps give agents a structured, readable overview of documentation hierarchy
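The manifest versioning point can be illustrated with a minimal sketch. The `ManifestLike` interface, the `SUPPORTED_MANIFEST_VERSION` constant, and the `assertManifestVersion` name are assumptions made for this example; the PR only states that a `manifest.version === 1` assertion throws on mismatch.

```typescript
// Hypothetical manifest shape; only the `version` field matters for this check.
interface ManifestLike {
  version: number;
}

// Assumed constant name; the PR description says version 1 is the expected value.
const SUPPORTED_MANIFEST_VERSION = 1;

// Throws a descriptive Error when the manifest was produced by an incompatible build.
function assertManifestVersion(manifest: ManifestLike): void {
  if (manifest.version !== SUPPORTED_MANIFEST_VERSION) {
    throw new Error(
      `Unsupported manifest version ${manifest.version}; expected ${SUPPORTED_MANIFEST_VERSION}.`
    );
  }
}
```

Failing fast here means a stale manifest surfaces as a clear error rather than as subtly wrong sitemap or head metadata.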

Critical Review Points

  1. Cache-Control Defaults: The PR mentions "default Cache-Control: public, max-age=300, must-revalidate" which aligns with agent-friendly best practices for discovery files.

  2. Origin Rebasing: The runtime helpers (createSitemapXmlResponse, createRobotsTxtResponse) rebase manifest URLs against requestOrigin at runtime—critical for multi-domain deployments.

  3. Markdown Mirror Tolerance: Agents expect markdown links rather than HTML links. Link to the .md versions of pages when possible; otherwise link to HTML and accept the parse cost.

  4. llms.txt + llms-full.txt Pattern: The most common pattern in 2026 is llms.txt + llms-full.txt—the index for orientation and the full-text dump for deep ingestion (used by Anthropic, Vercel, and LangGraph)
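For the origin rebasing point in the list above, one way to avoid string replacement is to let the WHATWG `URL` parser do the work. This is only a sketch: `rebaseUrl` is a hypothetical name, and the PR's actual helpers may rebase differently.

```typescript
// Hypothetical helper: keep the path, query, and hash of a manifest URL,
// but swap its origin for the origin of the live request.
function rebaseUrl(manifestUrl: string, requestOrigin: string): string {
  // Parsing against requestOrigin also accepts root-relative manifest URLs.
  const parsed = new URL(manifestUrl, requestOrigin);
  const pathAndSuffix = `${parsed.pathname}${parsed.search}${parsed.hash}`;
  return new URL(pathAndSuffix, requestOrigin).toString();
}
```

With this shape, a manifest built for one domain can be served unchanged from a preview or custom domain.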

Testing Validation

The PR's e2e test additions (132 lines added to smoke.e2e.ts) validate:

  • Artifact serving (sitemap XML/MD, robots.txt, llms.txt content negotiation)
  • Markdown mirror availability
  • Canonical/alternate metadata headers
  • Accept: text/markdown request handling
  • Cache-control headers and structured meta tags
🔇 Additional comments (65)
docs/docs.config.ts (1)

19-19: LGTM!

The new best starting point entry follows the established pattern and aligns with the new agent optimization documentation page.

packages/leadtype/src/search/node.ts (1)

12-12: LGTM!

The Set provides O(1) lookup for generated files, and the early continue pattern keeps the loop clean. The design is extensible for additional generated files.

Also applies to: 150-152

apps/example/vite.config.ts (1)

9-9: LGTM!

The explicit serverDir: "./server" configuration clarifies where Nitro server code lives, and the import cleanup removes the now-unused PluginOption type.

Also applies to: 30-30

docs/index.mdx (1)

17-17: LGTM!

Documentation accurately reflects the new pipeline outputs and agent-readable discovery features added in this PR.

Also applies to: 39-40, 56-56

docs/quickstart.mdx (1)

44-44: LGTM!

The quickstart accurately documents the expanded five-step pipeline, the new Agent Readability artifacts in the output tree, and the updated navigation cards.

Also applies to: 56-58, 75-78, 97-109

docs/how-it-works.mdx (1)

19-19: LGTM!

The documentation comprehensively covers the new Agent Readability artifacts, updated pipeline diagram, vocabulary definitions, and execution order. The terminology is clear and consistent.

Also applies to: 46-46, 51-51, 73-73, 80-84, 142-142, 154-155

packages/leadtype/src/internal/package-surface.test.ts (1)

13-13: LGTM!

The test correctly expects the new ./llm/readability export path, ensuring the package surface remains consistent with the documented entry points.

packages/leadtype/tsup.config.ts (1)

9-9: LGTM!

The new tsup entry follows the established pattern and will produce the expected dist/llm/readability.js output, aligning with the package exports configuration.

apps/example/src/routes/docs/how-it-works.tsx (1)

4-9: Head metadata wiring looks correct.

Importing createDocsHead and attaching head: () => createDocsHead("/docs/how-it-works") is consistent with the docs-route metadata pattern.

apps/example/src/routes/docs/build/optimize-docs-for-agents.tsx (1)

1-14: New docs route is implemented cleanly.

Route registration, MDX rendering, and route-level head metadata are wired consistently and correctly.

apps/example/src/routes/docs/authoring/frontmatter.tsx (1)

4-9: Route-level head integration is correct.

This follows the same metadata strategy as other docs routes and keeps path mapping consistent.

apps/example/src/routes/docs/methodology.tsx (1)

4-9: Looks good for metadata hookup.

The new head callback is correctly connected and consistent with the route path.

apps/example/src/routes/docs/reference/llm.tsx (1)

4-9: Good update for route head metadata.

The createDocsHead import and head config are correctly applied for this docs page.

apps/example/src/routes/docs/reference/cli.tsx (1)

4-9: Head configuration change is solid.

This is a clean, consistent addition for docs metadata generation on the CLI reference route.

packages/leadtype/package.json (1)

28-31: New subpath export is correctly defined.

"./llm/readability" includes both ESM import and typings targets and matches the existing exports-map conventions.

apps/example/src/routes/docs/reference/remark.tsx (1)

4-9: This route metadata update looks good.

The head callback is added correctly and follows the same docs-route convention used elsewhere in this PR.

apps/example/tsconfig.json (1)

7-34: Config updates look correct and aligned with the readability wiring.

Including server/**/* and adding the leadtype/llm/readability path alias are consistent with the new server/runtime helper usage.

apps/example/src/routes/docs/build/bundle-package-docs.tsx (1)

4-9: Head metadata hookup is clean and correct for this route.

The head integration matches the intended docs readability pattern without altering route rendering behavior.

apps/example/src/routes/index.tsx (1)

6-17: Route behavior and metadata changes are consistent with the new homepage flow.

Switching from redirect to rendered docs shell and wiring route head metadata is implemented cleanly.

apps/example/src/routes/docs/build/validate-in-ci.tsx (1)

4-9: This route-level head integration looks good.

The createDocsHead usage is consistent and correctly scoped to /docs/build/validate-in-ci.

apps/example/src/routes/docs/index.tsx (1)

4-9: Docs index head wiring is correctly applied.

The route now participates in the shared docs metadata pipeline as expected.

apps/example/src/routes/docs/reference/convert.tsx (1)

4-9: Route metadata integration is correctly implemented here.

The new head callback follows the same docs-head contract used across the docs routes.

apps/example/src/routes/docs/reference/lint.tsx (1)

4-9: This head wiring is solid and consistent with the readability rollout.

No concerns with the added createDocsHead usage for this route.

apps/example/src/routes/docs/quickstart.tsx (1)

4-10: Good route-level head integration.

The head resolver is wired cleanly to the shared docs metadata helper while keeping the route component focused on rendering.

apps/example/src/routes/docs/authoring/components.tsx (1)

4-10: Consistent metadata wiring for docs route.

This follows the same centralized head-generation pattern and keeps the route implementation straightforward.

apps/example/src/routes/docs/reference/search.tsx (1)

4-10: Nice: shared head metadata added without extra route complexity.

Path-specific head generation is correctly attached to the file route.

packages/leadtype/src/index.ts (1)

5-8: Public type surface expansion looks good.

These re-exports make the agent-readability types available from the package root in a clear, curated way.

apps/example/src/routes/docs/build/connect-docs-site.tsx (1)

4-10: Clean adoption of centralized docs head helper.

The route remains simple while correctly adding path-specific metadata generation.

apps/example/scripts/llm-generate.ts (1)

52-101: Build pipeline integration is solid.

Generating the readability manifest and removing static sitemap/robots copies aligns well with the runtime rebase middleware behavior.

docs/build/optimize-docs-for-agents.mdx (1)

64-229: Great implementation guide coverage.

The middleware/regenerator guidance, negotiation behavior, and cache/Vary notes are practical and consistent with the new runtime helpers.

apps/example/server/middleware/agent-readability.ts (1)

19-66: Middleware routing and fallback flow look correct.

The artifact-path switch plus markdown-response fallback is clean and aligns with the expected runtime behavior for discovery + negotiation.

packages/leadtype/src/cli/generate.ts (1)

436-449: LGTM! Clean integration of agent readability artifact generation.

The new generateAgentReadabilityArtifacts call is correctly placed in site-mode-only flow (skipped in bundle mode as expected), and the returned file paths are properly assigned to the result object. The implementation aligns well with the PR's goal of supporting the Vercel Agent Readability spec.

packages/leadtype/src/cli.test.ts (1)

129-134: LGTM! Comprehensive test coverage for new artifacts.

The test suite correctly verifies that agent readability artifacts (sitemap.xml, sitemap.md, robots.txt, agent-readability.json) are generated in site mode and appropriately omitted in bundle mode.

apps/example/server/utils/agent-readability.ts (1)

20-33: LGTM! Robust origin resolution with proper fallbacks.

The getRequestOrigin function correctly prioritizes x-forwarded-* headers (essential for proxied deployments) and falls back gracefully to Nitro's built-in helpers. The protocol defaulting to "http" when neither forwarded-proto nor request protocol is available is a safe choice.

apps/example/package.json (1)

7-11: LGTM! Script updates align with the new pipeline flow.

The changes correctly wire pipeline:build (which includes the new agent readability artifact generation) into the dev, build, and test:e2e workflows.

docs/reference/cli.mdx (1)

68-79: LGTM! Clear documentation of new artifacts and their purpose.

The documentation effectively explains that sitemap/robots files are docs-scoped and meant to be merged with other site routes. The distinction between site mode and bundle mode outputs is well articulated.

apps/example/tests/e2e/smoke.e2e.ts (2)

135-188: Excellent E2E coverage for agent readability discovery layer.

This test comprehensively validates:

  • Sitemap XML/markdown serving with correct content and origin rebasing
  • Robots.txt with AI agent directives
  • llms.txt with root-relative markdown mirror links (line 171)
  • The critical edge case at lines 178-188 ensuring llms-full.txt isn't shadowed by markdown content negotiation

The test design aligns perfectly with the Vercel Agent Readability spec requirements.


190-246: LGTM! Thorough validation of content negotiation and metadata.

The test validates multiple critical aspects:

  • HTML pages include canonical, alternate, og:*, and JSON-LD metadata
  • Accept: text/markdown triggers markdown responses with proper headers
  • AI user-agent detection (ClaudeBot) automatically returns markdown
  • Cache-Control headers are set appropriately (line 212)
  • 404 pages return markdown for agent requests (lines 234-246)

This comprehensive coverage ensures the runtime helpers work correctly end-to-end.

docs/build/connect-docs-site.mdx (2)

103-141: LGTM! Clear integration guide for agent readability.

The new section effectively explains the runtime integration requirements and provides a concise code example using createAgentMarkdownResponse. The explanation of the discovery layer (llms.txt, sitemaps, robots.txt) and content negotiation is well-structured.


169-176: LGTM! Practical verification steps.

The curl examples provide concrete verification steps for developers to confirm their agent readability setup is working. The emphasis on checking the content-type header is particularly valuable for debugging content negotiation issues.

docs/reference/llm.mdx (1)

121-296: Excellent comprehensive documentation of agent readability APIs.

This documentation thoroughly covers:

  • Build-time generation with generateAgentReadabilityArtifacts
  • All runtime helpers with clear examples
  • Edge-runtime compatibility guarantees (line 182)
  • Cache-Control behavior and CDN considerations (lines 289-291)
  • Manifest version validation (lines 293-295)
  • Lower-level helper reference table (lines 274-287)

The documentation provides developers with everything they need to implement the Vercel Agent Readability spec correctly.

apps/example/src/routeTree.gen.ts (1)

1-526: Auto-generated file — no review required.

This is a TanStack Router generated route tree file (marked with @ts-nocheck and eslint-disable). The changes mechanically register the new /docs/build/optimize-docs-for-agents route, which is expected behavior when adding a new docs page.

packages/leadtype/src/llm/llm.test.ts (7)

1-66: LGTM!

Test setup is well-structured with proper temp directory cleanup and a reusable seedDocs helper for creating test fixtures.


120-131: LGTM!

Good test coverage for the URL format change — verifying that markdown mirror links use root-relative paths (](/docs/...md)) rather than absolute URLs with baseUrl.


416-513: LGTM!

Comprehensive test coverage for generateAgentReadabilityArtifacts — verifies file creation, sitemap XML/MD content, robots.txt directives, and manifest structure including page metadata.


553-572: LGTM!

Good verification that JSON-LD rendering properly escapes HTML special characters (<start> → \u003cstart\u003e) to prevent XSS in script tags.


598-637: LGTM!

Thorough test coverage for content negotiation — validates q-value parsing, browser-safety bias (HTML wins on ties), and configurable user-agent pattern matching.


782-920: LGTM!

Solid test coverage for runtime response helpers — verifies origin rebasing, Cache-Control customization, and manifest version validation.


922-995: LGTM!

createDocsHead tests properly verify the framework-neutral metadata output including JSON-LD key override support and graceful handling of unknown pages.

packages/leadtype/src/llm/llm.ts (7)

23-45: LGTM!

Clean organization of constants. Using a Set for GENERATED_MARKDOWN_FILES provides efficient lookup when filtering generated outputs.


302-323: LGTM!

Robust date normalization with proper NaN validation and sensible fallback chain (lastModified → last_updated → lastUpdated → file mtime).
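A fallback chain of this kind could look roughly like the following. The function names and the frontmatter record type are assumptions for illustration, not the code in llm.ts.

```typescript
// Hypothetical normalizer: returns an ISO timestamp, or undefined for invalid input.
function normalizeDate(value: unknown): string | undefined {
  if (typeof value !== "string" && !(value instanceof Date)) {
    return undefined;
  }
  const date = value instanceof Date ? value : new Date(value);
  // An invalid Date reports NaN from getTime(), so it is rejected here.
  return Number.isNaN(date.getTime()) ? undefined : date.toISOString();
}

// Frontmatter keys checked in priority order before falling back to the file mtime.
const DATE_KEYS = ["lastModified", "last_updated", "lastUpdated"] as const;

function resolveLastModified(
  frontmatter: Record<string, unknown>,
  fileMtime: Date
): string {
  for (const key of DATE_KEYS) {
    const normalized = normalizeDate(frontmatter[key]);
    if (normalized) {
      return normalized;
    }
  }
  return fileMtime.toISOString();
}
```

An unparseable value in an earlier key does not end the search; the next key in the chain is tried.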


291-293: LGTM!

Clean handling of the /docs → /docs/index.md special case for markdown URL generation.


346-392: LGTM!

Consistent migration to root-relative markdown URLs (toMarkdownUrlPath) in link rendering, supporting origin-agnostic routing as stated in the PR objectives.


472-516: LGTM!

Good additions: file stat() for mtime fallback, filtering of generated markdown files (like sitemap.md), and deterministic sorting by URL path.


937-998: LGTM!

generateAgentReadabilityArtifacts follows established patterns, includes a helpful error message for empty docs, and properly constructs the versioned manifest with all required fields.


743-765: LGTM!

Root full-context router now uses root-relative paths (/llms.txt, /docs/llms.txt) instead of absolute URLs, supporting origin-agnostic routing.

packages/leadtype/src/llm/index.ts (1)

1-63: LGTM!

Clear separation of build-time vs runtime exports. While this is a barrel file, the explicit two-block organization makes the API surface understandable and aligns with the module's dual purpose.

packages/leadtype/src/llm/readability.ts (8)

47-204: LGTM!

Comprehensive type definitions with good JSDoc comments. The version: 1 literal type in AgentReadabilityManifest enables compile-time version checking.


294-311: LGTM!

Correct XSS-safe escaping for embedding JSON in <script> tags — handles <>& and Unicode line/paragraph separators (U+2028/U+2029) that could break script parsing.
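The escaping described here can be sketched as follows. `serializeJsonForScript` and `SCRIPT_ESCAPES` are hypothetical names, and the real implementation in readability.ts may differ in detail.

```typescript
// Characters that can end a <script> block early or break older script parsers,
// mapped to their JSON unicode escapes.
const SCRIPT_ESCAPES: Record<string, string> = {
  "<": "\\u003c",
  ">": "\\u003e",
  "&": "\\u0026",
  "\u2028": "\\u2028",
  "\u2029": "\\u2029",
};

// Serialize a value so it can be embedded inside <script type="application/ld+json">.
// The output is still valid JSON: only characters inside strings are rewritten.
function serializeJsonForScript(value: unknown): string {
  return JSON.stringify(value).replace(
    /[<>&\u2028\u2029]/g,
    (character) => SCRIPT_ESCAPES[character] ?? character
  );
}
```

Because the replacements are JSON unicode escapes, `JSON.parse` on the result yields the original value.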


251-271: Static analysis false positive — controlled input.

The name parameter in frontmatterHasField and readFrontmatterField comes from hardcoded arrays within this module (e.g., ["canonical_url", "canonical"]), not from user input. The ReDoS warning from static analysis is a false positive in this context.
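If the field names ever stop being hardcoded, a common mitigation is to escape regex metacharacters before building the pattern. The sketch below uses hypothetical names (`escapeRegExp`, `readField`) and is not the module's actual `readFrontmatterField`.

```typescript
// Escape regex metacharacters so the field name is matched literally.
function escapeRegExp(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical reader: returns the value of a `name: value` line in a frontmatter block.
function readField(frontmatter: string, name: string): string | undefined {
  const pattern = new RegExp(`^${escapeRegExp(name)}:[ \\t]*(.*)$`, "m");
  const match = pattern.exec(frontmatter);
  return match?.[1]?.trim();
}
```

The resulting pattern has no nested quantifiers, so an escaped field name cannot introduce catastrophic backtracking.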


382-410: LGTM!

Well-documented content negotiation with intentional browser-safety bias — when Accept contains both text/html and text/markdown without explicit q-values, HTML wins to prevent browsers from accidentally receiving markdown.
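A q-value-aware check with the same tie-breaking rule might be sketched like this. The names `qualityFor` and `prefersMarkdown` are assumptions, and the sketch ignores wildcards such as `*/*`, which the real helper may treat differently.

```typescript
// Returns the highest q-value listed for an exact media type, or 0 when absent.
function qualityFor(accept: string, mediaType: string): number {
  let best = 0;
  for (const part of accept.split(",")) {
    const [type, ...params] = part
      .split(";")
      .map((piece) => piece.trim().toLowerCase());
    if (type !== mediaType) {
      continue;
    }
    // A media type without an explicit q parameter defaults to q=1.
    let quality = 1;
    for (const param of params) {
      if (param.startsWith("q=")) {
        const parsed = Number.parseFloat(param.slice(2));
        if (!Number.isNaN(parsed)) {
          quality = parsed;
        }
      }
    }
    best = Math.max(best, quality);
  }
  return best;
}

// Markdown wins only when it is strictly preferred, so ties go to HTML.
function prefersMarkdown(accept: string): boolean {
  const markdownQuality = qualityFor(accept, "text/markdown");
  const htmlQuality = qualityFor(accept, "text/html");
  return markdownQuality > 0 && markdownQuality > htmlQuality;
}
```

The strict comparison is what implements the browser-safety bias: a client that lists both types with equal weight still receives HTML.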


420-461: LGTM!

Good path traversal protection (!relativePath.split("/").includes("..")) and clean handling of the docs root special cases.
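The segment check quoted above can be placed in context with a small sketch. `toSafeRelativePath` is a hypothetical name, and decoding the path before the check is an assumption of this example rather than something the review confirms.

```typescript
// Hypothetical guard: returns a relative path, or null when the request
// tries to climb out of the docs directory or is malformed.
function toSafeRelativePath(pathname: string): string | null {
  let decoded: string;
  try {
    // Decode first so that encoded segments such as %2e%2e are also caught.
    decoded = decodeURIComponent(pathname);
  } catch {
    // Malformed percent-encoding is rejected outright.
    return null;
  }
  const relativePath = decoded.replace(/^\/+/, "");
  if (relativePath.split("/").includes("..")) {
    return null;
  }
  return relativePath;
}
```

Only whole `..` segments are rejected, so a file name that merely contains two dots is still served.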


582-654: LGTM!

Comprehensive markdown response handling with proper HEAD support, manifest version validation, and agent-friendly 404 behavior (returns 200 with structured "Page not found" markdown for agents).


743-772: LGTM!

Correctly generates robots.txt with AI crawler-specific User-agent blocks and explicit Allow directives for agent readability paths.
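As a rough illustration of that output shape, a robots.txt with per-agent blocks can be assembled as below. `renderRobotsTxt` and its parameters are hypothetical, and the exact layout the PR emits may differ.

```typescript
// Hypothetical renderer: one User-agent block per crawler, each with the same
// Allow directives, followed by a Sitemap line.
function renderRobotsTxt(
  userAgents: readonly string[],
  allowPaths: readonly string[],
  sitemapUrl: string
): string {
  const blocks = userAgents.map((agent) =>
    [`User-agent: ${agent}`, ...allowPaths.map((path) => `Allow: ${path}`)].join("\n")
  );
  // Blank lines separate the groups; the file ends with a trailing newline.
  return `${[...blocks, `Sitemap: ${sitemapUrl}`].join("\n\n")}\n`;
}
```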


920-949: LGTM!

Framework-neutral head metadata builder with configurable JSON-LD key (script:ld+json for TanStack Router), graceful empty-array fallback for unknown pages, and complete OG/canonical/alternate link coverage.

Comment thread apps/example/server/utils/agent-readability.ts Outdated
Comment thread apps/example/src/lib/docs-head.ts Outdated
…t-vercel-agent-readability-spec-sitemap.xm

# Conflicts:
#	packages/leadtype/tsup.config.ts

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/example/server/utils/agent-readability.ts`:
- Around line 45-46: The catch block that currently swallows all errors and
returns null should only map "file not found" errors to null; update the catch
to accept the error (e.g. catch (err)) and if err.code === 'ENOENT' (or other
platform-specific not-found codes you expect) return null, otherwise rethrow the
error (throw err) so real runtime faults surface; apply this change in the
try/catch surrounding the file-read logic in agent-readability.ts.
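A minimal sketch of the requested change, assuming Node's `readFile` from `node:fs/promises`. The names `isNotFoundError` and `readMarkdownOrNull` are hypothetical, and the real function in agent-readability.ts may have a different signature.

```typescript
import { readFile } from "node:fs/promises";

// Narrow an unknown error to "file not found" without a type assertion.
function isNotFoundError(error: unknown): boolean {
  return (
    typeof error === "object" &&
    error !== null &&
    "code" in error &&
    error.code === "ENOENT"
  );
}

// Returns null only when the file is missing; every other failure is rethrown
// so real runtime faults (permissions, I/O errors) stay visible.
async function readMarkdownOrNull(filePath: string): Promise<string | null> {
  try {
    return await readFile(filePath, "utf8");
  } catch (error) {
    if (isNotFoundError(error)) {
      return null;
    }
    throw error;
  }
}
```

A non-Node runtime would need its own not-found check, since `ENOENT` is specific to Node-style filesystem errors.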

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1a68850b-99b3-47a2-9d07-e8bd625a5566

📥 Commits

Reviewing files that changed from the base of the PR and between 667a907 and 7ecbdbb.

⛔ Files ignored due to path filters (3)
  • apps/example/src/generated/agent-readability.json is excluded by !**/generated/**
  • apps/example/src/generated/docs-search-content.json is excluded by !**/generated/**
  • apps/example/src/generated/docs-search-index.json is excluded by !**/generated/**
📒 Files selected for processing (2)
  • apps/example/server/utils/agent-readability.ts
  • apps/example/src/lib/docs-head.ts
📜 Review details
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx}: Use explicit types for function parameters and return values when they enhance clarity
Prefer unknown over any when the type is genuinely unknown
Use const assertions (as const) for immutable values and literal types
Leverage TypeScript's type narrowing instead of type assertions

Files:

  • apps/example/src/lib/docs-head.ts
  • apps/example/server/utils/agent-readability.ts
**/*.{js,ts,jsx,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{js,ts,jsx,tsx}: Use meaningful variable names instead of magic numbers - extract constants with descriptive names
Use arrow functions for callbacks and short functions
Prefer for...of loops over .forEach() and indexed for loops
Use optional chaining (?.) and nullish coalescing (??) for safer property access
Prefer template literals over string concatenation
Use destructuring for object and array assignments
Use const by default, let only when reassignment is needed, never var
Always await promises in async functions - don't forget to use the return value
Use async/await syntax instead of promise chains for better readability
Handle errors appropriately in async code with try-catch blocks
Don't use async functions as Promise executors
Remove console.log, debugger, and alert statements from production code
Throw Error objects with descriptive messages, not strings or other values
Use try-catch blocks meaningfully - don't catch errors just to rethrow them
Prefer early returns over nested conditionals for error cases
Extract complex conditions into well-named boolean variables
Use early returns to reduce nesting
Prefer simple conditionals over nested ternary operators
Don't use eval() or assign directly to document.cookie
Avoid spread syntax in accumulators within loops
Use top-level regex literals instead of creating them in loops
Prefer specific imports over namespace imports
Use descriptive names for functions, variables, and types for meaningful naming
Add comments for complex logic, but prefer self-documenting code

Files:

  • apps/example/src/lib/docs-head.ts
  • apps/example/server/utils/agent-readability.ts
🔍 Remote MCP Context7

Summary of additional, concrete facts relevant to reviewing this PR

  • Context7 returned several Vercel-related library IDs that surface Vercel agent/AI tooling (candidates found: /llmstxt/vercel_llms_txt, /llmstxt/vercel_llms-full_txt, /vercel/ai, /vercel/vercel, /websites/vercel). Use these when looking up Vercel docs or samples in Context7:

  • Vercel provides an agent-detection helper (determineAgent / @vercel/detect-agent) with this usage pattern: call determineAgent() → inspect { isAgent, agent } and adapt behavior. The package docs show examples for gating features, tracking, and environment setup based on agent detection:

  • The detect-agent docs list supported agents and mention ability to add custom agents via environment overrides; examples include Cursor, Claude Code, Gemini CLI, GitHub Copilot, Replit, and Vercel tooling:

Tools used

  • Context7_resolve-library-id (Context7) — to locate Vercel-related library IDs for further doc lookup.
  • Context7_query-docs (Context7) — to fetch Vercel agent-detection documentation and concrete usage snippets.
🔇 Additional comments (2)
apps/example/src/lib/docs-head.ts (1)

8-15: Looks good — typed manifest + wrapper delegation are solid.

This segment is clear, type-safe, and consistent with the readability helper contract.

apps/example/server/utils/agent-readability.ts (1)

37-44: Async file-read implementation looks good.

Using readFile with await here avoids event-loop blocking and matches the edge-safe async pattern described in the PR objective.

Comment on lines +22 to +34
export function getRequestOrigin(event: H3Event): string | undefined {
  const forwardedHost = getHeader(event, "x-forwarded-host")
    ?.split(",")[0]
    ?.trim();
  const forwardedProto = getHeader(event, "x-forwarded-proto")
    ?.split(",")[0]
    ?.trim();
  if (forwardedHost) {
    const protocol = forwardedProto || getRequestProtocol(event) || "http";
    return `${protocol}://${forwardedHost}`;
  }
  const url = getRequestURL(event);
  return url.origin;

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Harden forwarded-header origin derivation against spoofed hosts/protocols.

Line 23 and Line 26 trust x-forwarded-* directly. If those headers are not strictly sanitized by infrastructure, responses can emit attacker-controlled absolute URLs.

🔐 Suggested hardening
 export function getRequestOrigin(event: H3Event): string | undefined {
+  const requestUrl = getRequestURL(event);
   const forwardedHost = getHeader(event, "x-forwarded-host")
     ?.split(",")[0]
     ?.trim();
   const forwardedProto = getHeader(event, "x-forwarded-proto")
     ?.split(",")[0]
     ?.trim();
   if (forwardedHost) {
-    const protocol = forwardedProto || getRequestProtocol(event) || "http";
-    return `${protocol}://${forwardedHost}`;
+    const protocol =
+      forwardedProto === "https" || forwardedProto === "http"
+        ? forwardedProto
+        : getRequestProtocol(event) ?? "http";
+    try {
+      // Parse to reject malformed injected values.
+      return new URL(`${protocol}://${forwardedHost}`).origin;
+    } catch {
+      return requestUrl.origin;
+    }
   }
-  const url = getRequestURL(event);
-  return url.origin;
+  return requestUrl.origin;
 }
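
The same hardening idea can be tried outside Nitro with a framework-free sketch built on the standard `Headers` and `URL` globals. The names `firstHeaderValue` and `resolveRequestOrigin` are hypothetical, and rejecting hosts that carry a path, query, hash, or credentials is an extra assumption beyond the suggested diff.

```typescript
// Take the first entry of a possibly comma-separated forwarded header.
function firstHeaderValue(value: string | null): string | undefined {
  const first = value?.split(",")[0]?.trim();
  return first ? first : undefined;
}

// Hypothetical resolver: trusts x-forwarded-* only when the values parse
// into a plain origin, and otherwise falls back to the request's own origin.
function resolveRequestOrigin(headers: Headers, fallbackOrigin: string): string {
  const forwardedHost = firstHeaderValue(headers.get("x-forwarded-host"));
  if (!forwardedHost) {
    return fallbackOrigin;
  }
  const forwardedProto = firstHeaderValue(headers.get("x-forwarded-proto"));
  const protocol =
    forwardedProto === "https" || forwardedProto === "http"
      ? forwardedProto
      : new URL(fallbackOrigin).protocol.slice(0, -1);
  try {
    const candidate = new URL(`${protocol}://${forwardedHost}`);
    const hasExtraParts =
      candidate.pathname !== "/" ||
      candidate.search !== "" ||
      candidate.hash !== "" ||
      candidate.username !== "" ||
      candidate.password !== "";
    return hasExtraParts ? fallbackOrigin : candidate.origin;
  } catch {
    // Malformed host values cannot be parsed into a URL.
    return fallbackOrigin;
  }
}
```

Parsing alone does not make a spoofed host trustworthy; a deployment that is not behind a sanitizing proxy would likely also want an allowlist of expected hosts.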

Comment thread apps/example/server/utils/agent-readability.ts Outdated
@KayleeWilliams KayleeWilliams merged commit da8057e into main May 10, 2026
2 of 3 checks passed
@KayleeWilliams KayleeWilliams deleted the KayleeWilliams/support-vercel-agent-readability-spec-sitemap.xm branch May 10, 2026 06:25