繁體中文版本請見 README-zh.md
Most knowledge bases are built around a retrieval pipeline — documents go in, embeddings come out, and the LLM is handed a handful of chunks to summarise. KnowDB is a prototype exploring a different premise: what if the knowledge layer were a knowledge database designed from the start for agent access patterns?
The goal is not a RAG pipeline or an LLM wiki. It is a database where documents are ingested with enforced structure, the index is maintained by the system (not the LLM), and the query API is designed around how an agent actually navigates knowledge — following the document tree, expanding context on demand, jumping across related sections.
This repository is Tier 1 of that idea: the simplest possible implementation that demonstrates the concept. The knowledge base is a folder of plain .md files. There is no backend. It runs in a browser.
A knowledge database designed for agents has three properties that distinguish it from a knowledge base:
Structure is preserved at ingest, not reconstructed at query time. When a document enters the system, its heading hierarchy becomes the chunk tree. Each chunk has a stable address (db/<docId>/<chunkId>.md) that encodes its position in the tree: 01 is the first top-level section, 01-02 its second subsection, 01-02-03 one level deeper.
The system owns the index. The agent's job is to query, not to maintain index integrity. The ingest pipeline produces _search_index.json and _index.md heading trees. The agent never writes to these.
The query API follows the document tree. An agent navigates by moving vertically (parent, expand) and horizontally (search, related). It starts with the smallest useful context and expands only as needed — not because the system decided the chunk size, but because the agent decided it needed more.
The Tier 1 implementation deliberately uses no infrastructure beyond the filesystem:
npm run ingest <file.md>parses headings and writes chunk files todb/_search_index.jsonis loaded once by the browser; all search is client-side- The UI is two panels: document navigator on the left, agent Q&A on the right
- The agent has seven tools:
get_instructions,list_docs,read_index,search,read_chunk,read_chunks,parent - No backend. Deployable to any static host.
The design is intended to scale across three tiers that share the same conceptual model — chunk tree, structural index, agent-native query API — while swapping the storage layer underneath.
| Tier | Storage | Scale | Status |
|---|---|---|---|
| 1 — Filesystem | Plain .md files + JSON index |
Tens to hundreds of documents | This prototype |
| 2 — Embedded DB | SQLite FTS5 (sqlite-wasm) | Thousands of documents | Planned |
| 3 — Enterprise | TBD | Hundreds of thousands and beyond | Future |
Detailed design and specification for Tier 2 and Tier 3 will be published separately.
Clone the repo first:
git clone https://github.com/kirisame-wang/knowdb.git
cd knowdb
npm installReplace the files in raw/ with your own Markdown files, then rebuild the knowledge base:
rm -rf db/
npm run ingest raw/Then choose how to query it:
Option A — Browser UI (requires an Anthropic API key)
npm run dev # open http://localhost:5173Paste your API key into the UI and ask questions. Build for static deployment with npm run build.
Option B — Local coding agent (no API key, no browser)
Reference SKILL.md in your prompt and the agent navigates db/ directly using bash and grep. See Using with a local coding agent for details.
SKILL.md in the repo root is a skill file for local coding agents (e.g. Claude Code). It describes how to query the knowledge base using bash and grep — no browser, no API key required.
Reference inline — mention the file directly in your prompt and the agent will read it on the spot:
@SKILL.md what is the revenue for 2023?
Register as a persistent skill — the agent loads the instructions automatically on every invocation:
The following commands target Claude Code. Other agents may use a different skill directory.
macOS / Linux
mkdir -p .claude/skills/knowdb-local-search
cp SKILL.md .claude/skills/knowdb-local-search/Skill.mdWindows (PowerShell)
New-Item -ItemType Directory -Force .claude\skills\knowdb-local-search
Copy-Item SKILL.md .claude\skills\knowdb-local-search\Skill.mdWindows (CMD)
mkdir .claude\skills\knowdb-local-search
copy SKILL.md .claude\skills\knowdb-local-search\Skill.mdEither way, the agent follows the same four-step workflow — discover → orient → search → read — to answer questions about documents ingested into db/.
Issues and pull requests are welcome.
A few things that would be especially useful:
- Bug reports — if ingestion produces unexpected chunk splits or the agent behaves oddly, a minimal reproduction is very helpful
- Ingest edge cases — Markdown structures that break the heading parser
- Agent tool feedback — observations on how the agent navigates, where it gets stuck, or tool designs that would improve retrieval
- Ideas toward Tier 2 — thoughts on the SQLite FTS5 path or the embedded DB query API
For larger changes, opening an issue to discuss the direction first is appreciated before writing code.