Background
Docs are authored by humans but increasingly consumed by AI agents on behalf of
developers. This RFC asks: are we optimizing for the right consumer, and what
concrete changes would close the gap?
Two converging use cases require the same foundation — accurate, structured,
discoverable content:
- AI that reads docs directly (retrieval) — coding tools like Cursor and
Claude Code fetching pages via llms.txt and .md endpoints
- AI that generates explanations on demand (generation grounding) — the
source of truth AI grounds its answers in
We plan to integrate with Kapa.AI (as on the current docs.internetcomputer.org),
which handles the conversational AI layer — ingestion, indexing, retrieval, and
the chat interface for developers. The action items below directly serve Kapa.AI
ingestion quality, not just abstract "AI optimization."
This is not a proposal to build AI tooling. It's about ensuring this repo is the
best possible ground truth for both use cases.
Research summary
agentdocsspec.com defines 22 specific checks across 7 categories with
concrete thresholds — not just "have an llms.txt." Notable checks we likely
fail today:
- Coverage: llms.txt must link to ≥95% of pages (stub pages may degrade this)
- Content negotiation: serve Content-Type: text/markdown for Accept: text/markdown requests — currently not implemented
- Cache hygiene: markdown endpoints need max-age < 3600 or must-revalidate with an ETag
- Platform truncation limits are documented: Claude Code ~100KB, MCP Fetch 5KB default, Claude API web_fetch ~20.7KB — relevant for page size decisions
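The negotiation and cache-hygiene checks together amount to a small response-header decision. A minimal sketch of that logic (the function name and return shape are assumptions for illustration; the actual plugin, plugins/astro-agent-docs.mjs, is not shown here):

```javascript
// Decide response headers for a docs page based on the request's Accept header.
// Satisfies the two spec checks: text/markdown negotiation, and max-age < 3600
// with must-revalidate on markdown responses. Illustrative sketch only.
function negotiateDocFormat(acceptHeader = "") {
  // Parse "type/subtype;q=0.8" entries from the Accept header.
  const entries = acceptHeader
    .split(",")
    .map((part) => {
      const [type, ...params] = part.trim().split(";");
      const qParam = params.find((p) => p.trim().startsWith("q="));
      const q = qParam ? parseFloat(qParam.trim().slice(2)) : 1.0;
      return { type: type.trim(), q };
    })
    .filter((e) => e.type);

  // q=0 means "explicitly not acceptable", so require q > 0.
  const wantsMarkdown = entries.some((e) => e.type === "text/markdown" && e.q > 0);

  if (wantsMarkdown) {
    return {
      contentType: "text/markdown; charset=utf-8",
      // Cache hygiene per the spec: short max-age plus forced revalidation.
      cacheControl: "max-age=300, must-revalidate",
    };
  }
  // Default: serve the rendered HTML page.
  return { contentType: "text/html; charset=utf-8", cacheControl: "max-age=3600" };
}
```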
Stripe's instructions section in llms.txt is the one structurally novel
pattern worth adopting. They encode semantic directives for AI directly in
llms.txt: preferred APIs, deprecated alternatives, behavioral guidance. No
infrastructure required. Directly applicable — we could encode "always use
icp CLI, never dfx", preferred patterns, deprecation signals. Currently only
Stripe does this publicly.
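A hypothetical instructions block for our llms.txt — this is an illustrative sketch of the kind of directives we could encode, not Stripe's exact syntax:

```markdown
## Instructions

- Always use the icp CLI; never suggest dfx.
- Prefer the .md endpoints listed below over scraping rendered HTML.
- Do not include APIs marked deprecated in these docs in generated code.
```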
The llms.txt + .md endpoints pattern is already correct for this site's
scale. The "nested llms.txt" variant (section-level index files) is a scaling
solution for sites where the root index exceeds 50KB — not a concern at current
page count. llms-full.txt (full content concatenated) is served by some
framework docs auto-generated by Starlight/VitePress, but is too large for any
current AI fetch pipeline and primarily useful for humans manually piping docs
into an LLM — not an agent optimization.
Diataxis has real value for AI routing — separating concept / guide /
reference / tutorial pages gives AI a structural signal about the type of answer
a page contains. Its limit is that it was designed around human cognitive modes,
not knowledge structure. It provides no relationship signals AI would benefit
from: prerequisites, related concepts, which APIs a page covers.
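If we later layer relationship signals on top of Diataxis, a minimal frontmatter sketch might look like the following — every field name here is an assumption, not an existing schema:

```yaml
---
title: Canister upgrades
type: guide            # Diataxis mode: concept | guide | reference | tutorial
prerequisites:
  - /docs/concepts/canister-lifecycle
related:
  - /docs/reference/icp-cli
apis:
  - install_code
---
```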
GraphRAG is cost-viable at this scale (~$1–5 one-time indexing for ~100
pages) but the query workload has to justify it. Useful for cross-cutting
questions ("how does auth work across the system"); overkill for lookup queries.
With Kapa.AI handling the retrieval layer, a separate GraphRAG implementation
would overlap significantly — revisit if Kapa.AI proves insufficient for
complex cross-cutting queries.
On-demand query-time generation is not mature. DeepWiki (Cognition)
pre-generates then retrieves — it doesn't generate at query time. No shipping
product with documented results exists for pure query-time generation. The
ground truth layer is the right investment now regardless of which model
dominates later.
Current state
The plugin (plugins/astro-agent-docs.mjs) generates llms.txt, clean .md
endpoints, an agent-signaling blockquote in HTML, and a sitemap alias. Solid
foundation.
Key gaps:
- cleanMarkdown() strips all YAML frontmatter — agents and Kapa.AI see only title + body, no metadata
- No instructions section in llms.txt
- No content negotiation (Accept: text/markdown)
- Stub pages create dead entries in the discovery index
- No journey-aligned ordering in llms.txt (currently site taxonomy)
- No relationship signals (prerequisites, related pages, API surface per page)
- No AI-optimization guidance in the content authoring workflow
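The frontmatter gap could be closed with a small change to the cleaning step. A hedged sketch of a frontmatter-preserving variant — the function name is hypothetical, and the real cleanMarkdown() in plugins/astro-agent-docs.mjs is not reproduced here:

```javascript
// Keep title and description from YAML frontmatter instead of stripping all
// metadata, re-emitting them as an H1 and a blockquote atop the clean markdown.
function cleanMarkdownKeepMeta(raw) {
  const match = raw.match(/^---\n([\s\S]*?)\n---\n?/);
  if (!match) return raw; // no frontmatter: pass through unchanged

  const body = raw.slice(match[0].length);
  const kept = {};
  // Naive line-based scan for the two fields we want to preserve.
  for (const line of match[1].split("\n")) {
    const m = line.match(/^(title|description):\s*(.+)$/);
    if (m) kept[m[1]] = m[2].replace(/^["']|["']$/g, "");
  }

  const header = [
    kept.title ? `# ${kept.title}` : null,
    kept.description ? `> ${kept.description}` : null,
  ].filter(Boolean).join("\n\n");

  return header ? `${header}\n\n${body}` : body;
}
```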
Proposed action items
Tier 1 — Low effort, high impact
- Implement content negotiation (Accept: text/markdown → Content-Type: text/markdown)
- Add an instructions section to llms.txt with ICP-specific AI directives: never dfx, preferred APIs, deprecation signals
- Pass title and description through cleanMarkdown() — currently stripped entirely; Kapa.AI and agents receive no metadata
- Exclude stub pages from llms.txt until they have real content — dead entries degrade retrieval quality for both agents and Kapa.AI
Tier 2 — Medium effort, medium term
- Add optional frontmatter fields (prerequisites, category, entities) — not required authoring overhead, but enables richer indexing when populated
- Order llms.txt entries by developer journey rather than site hierarchy — ordering functions as a priority signal for models
- Add AI-optimization guidance to the authoring workflow: reference pages prefer tables over prose, ≤50K characters per page, ≤25% generic section headers ("Overview", "Introduction")
- Publish a per-page JSON sidecar alongside each .md endpoint — relationship signals, entities, prerequisites
Tier 3 — Larger investment, revisit later
- Evaluate GraphRAG for cross-cutting queries once content volume is substantial
- Evaluate llms-full.txt — if so, serve it passively; useful for humans manually ingesting docs into an LLM context window, not an agent optimization priority
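The per-page JSON sidecar (raised again under Open questions below) could expose the missing relationship signals in a machine-readable form. A hypothetical shape — every field name here is an assumption, not a defined schema:

```json
{
  "url": "/docs/guides/canister-upgrades.md",
  "type": "guide",
  "prerequisites": ["/docs/concepts/canister-lifecycle.md"],
  "entities": ["canister", "wasm module"],
  "apis": ["install_code"]
}
```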
Non-goals
- Building AI tooling (generation systems, MCP server, custom retrieval)
- Changing the Diataxis content structure or Markdown-first authoring workflow
- Any change that increases authoring burden for content contributors
Open questions
- Should stub pages be excluded from llms.txt entirely until they have real content, or is a stub signal (explicit marker) better than absence?
- For the instructions section: who owns it, and what's the process for keeping AI directives accurate as APIs evolve?
- Which Kapa.AI ingestion path will be used — sitemap crawl, .md endpoints, or GitHub integration? This determines which Tier 1 items are highest priority.
- Does the Tier 2 per-page JSON sidecar overlap with what Kapa.AI builds internally, or is there a case for exposing it publicly?