9 changes: 4 additions & 5 deletions .github/workflows/tests.yml
@@ -58,8 +58,7 @@ jobs:
 uv pip install --editable='.[develop,test]'

 - name: Run linter and software tests
-run: |
-poe check
-poe build
-cratedb-about --version
-cratedb-about list-questions
+run: poe check
+
+- name: Run build
+run: poe build
3 changes: 3 additions & 0 deletions .gitignore
@@ -1,7 +1,10 @@
.coverage
coverage.xml
.idea
.venv*
*.egg-info
*.lock
__pycache__
bdist.*
dist
public_html
5 changes: 5 additions & 0 deletions CHANGES.md
@@ -1,6 +1,11 @@
# About CrateDB changelog

## Unreleased
- Data backend: Refactored the source of truth for the documentation outline
into the package itself, to `cratedb-outline.yaml`
- CLI: Provided new subcommand `cratedb-about outline`
- API: Provided `cratedb_about.CrateDbKnowledgeOutline` for retrieving
information from the knowledge base outline within Python programs

## v0.0.2 - 2025-05-09
- Chore: Removed `sponge` command in `poe build`
51 changes: 44 additions & 7 deletions README.md
@@ -10,20 +10,55 @@ to relevant resources in the spirit of a curated knowledge backbone.

## What's inside

- The [cratedb-overview.md] file includes hints about what CrateDB is
- A few tidbits of _structured docs_.

- The [cratedb-outline.yaml] file indexes documents about what CrateDB is
and what you can do with it.

- The [about/v1] folder includes a few [llms.txt] files generated from
[cratedb-overview.md]. They can be used to provide better context
for conversations about CrateDB.
- The [about/v1] folder includes [llms.txt] files generated from
[cratedb-outline.yaml] by expanding all links. They can be used
to provide better context for conversations about CrateDB.

## Usage
## Install

Install `cratedb-about` package.
```shell
uv tool install --upgrade 'cratedb-about @ git+https://github.com/crate/about'
```

## Usage

### Outline

#### CLI
Convert documentation outline from `cratedb-outline.yaml` into Markdown format,
which is the source file for subsequently expanding it into an `llms.txt` file.
```shell
cratedb-about outline --format=markdown > outline.md
llms_txt2ctx --optional=true outline.md > llms-full.txt
```
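
The Markdown produced in the first step follows the layout described at llmstxt.org: an H1 title, a blockquote summary, then H2 sections holding link lists. A minimal sketch of such a rendering step is shown below. The `Item` and `Section` classes and the `to_llms_txt_markdown` function are hypothetical names for illustration and are not taken from the `cratedb-about` package.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    # Hypothetical outline entry: a titled link with an optional description.
    title: str
    link: str
    description: str = ""


@dataclass
class Section:
    # Hypothetical outline section, e.g. "Docs" or "Optional".
    name: str
    items: list[Item] = field(default_factory=list)


def to_llms_txt_markdown(title: str, summary: str, sections: list[Section]) -> str:
    """Render an outline into the Markdown layout described at llmstxt.org."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for section in sections:
        lines.append(f"## {section.name}")
        lines.append("")
        for item in section.items:
            entry = f"- [{item.title}]({item.link})"
            if item.description:
                entry += f": {item.description}"
            lines.append(entry)
        lines.append("")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


markdown = to_llms_txt_markdown(
    title="CrateDB",
    summary="Example summary.",
    sections=[
        Section("Docs", [Item("Overview", "https://cratedb.com/database", "What CrateDB is")]),
    ],
)
print(markdown)
```

A file in this shape is what `llms_txt2ctx` takes as input when it expands the links into a context file.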

#### API
Use the Python API to retrieve individual sets of outline items, for example,
by section name. The standard section names are: Docs, API, Examples, Optional.
The API can be used to feed information to a [Model Context Protocol (MCP)]
documentation server, for example, a subsystem of [cratedb-mcp].
```python
from cratedb_about import CrateDbKnowledgeOutline

# Load information from YAML file.
outline = CrateDbKnowledgeOutline.load()

# Retrieve information about resources from the "Docs" and "Examples" sections.
doc_items = outline.get_items("Docs", as_dict=True)
example_items = outline.get_items("Examples", as_dict=True)

# List available section names.
section_names = outline.section_names
```
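
To make the shape of this API easier to picture without installing the package, here is a small stand-in that mimics the two members used above, `get_items()` and `section_names`. Everything in it is an assumption for illustration: the `OutlineSketch` and `OutlineItem` classes, the item fields, and the error raised for an unknown section are hypothetical and may differ from `CrateDbKnowledgeOutline`.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class OutlineItem:
    # Hypothetical item fields; the real model may carry more or different ones.
    title: str
    link: str


@dataclass
class OutlineSketch:
    # Maps section name to its items; dicts preserve insertion order.
    sections: dict[str, list[OutlineItem]] = field(default_factory=dict)

    @property
    def section_names(self) -> list[str]:
        return list(self.sections)

    def get_items(self, section_name: str, as_dict: bool = False):
        # Assumption: unknown sections are reported as an error.
        if section_name not in self.sections:
            raise ValueError(f"Unknown section: {section_name}")
        items = self.sections[section_name]
        return [asdict(item) for item in items] if as_dict else list(items)


outline_sketch = OutlineSketch(
    {
        "Docs": [OutlineItem("Overview", "https://cratedb.com/database")],
        "Examples": [],
    }
)
print(outline_sketch.section_names)
print(outline_sketch.get_items("Docs", as_dict=True))
```

Returning plain dictionaries via `as_dict=True` is convenient when the items are handed to a serializer or to an MCP server that expects JSON-compatible data.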

### Query with LLM

Ask questions about CrateDB.
```shell
export OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>
@@ -36,10 +71,12 @@ cratedb-about list-questions
```

To configure a different context file, use the `CRATEDB_CONTEXT_URL` environment
variable.
variable. The default value is https://cdn.crate.io/about/v1/llms-full.txt.
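
A sketch of how such an environment override with a default is commonly resolved in Python. The `resolve_context_url` helper is a hypothetical name, and the package's actual lookup logic is not shown in this diff. Only the variable name and the default URL are taken from the text above.

```python
import os

DEFAULT_CONTEXT_URL = "https://cdn.crate.io/about/v1/llms-full.txt"


def resolve_context_url(environ=None) -> str:
    """Return the context URL, preferring the CRATEDB_CONTEXT_URL variable."""
    environ = os.environ if environ is None else environ
    # `or` also falls back to the default when the variable is set but empty.
    return environ.get("CRATEDB_CONTEXT_URL") or DEFAULT_CONTEXT_URL


print(resolve_context_url({}))
print(resolve_context_url({"CRATEDB_CONTEXT_URL": "https://example.org/llms.txt"}))
```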


[about/v1]: https://cdn.crate.io/about/v1/
[CrateDB]: https://cratedb.com/database
[cratedb-overview.md]: ./src/index/cratedb-overview.md
[cratedb-mcp]: https://github.com/crate/cratedb-mcp
[cratedb-outline.yaml]: https://github.com/crate/about/blob/main/src/cratedb_about/outline/cratedb-outline.yaml
[llms.txt]: https://llmstxt.org/
[Model Context Protocol (MCP)]: https://modelcontextprotocol.io/introduction
2 changes: 2 additions & 0 deletions docs/backlog.md
@@ -1,6 +1,8 @@
# Backlog

## Iteration +1
- Publish to PyPI
- Let the user optionally select a local `llms.txt` file
- Let the user select the model, reasoning effort, and other parameters
- JSON/YAML/Markdown output

75 changes: 68 additions & 7 deletions pyproject.toml
@@ -69,6 +69,7 @@ dynamic = [
"version",
]
dependencies = [
"cattrs<25",
"claudette",
"click<9",
"llms-txt==0.0.4",
@@ -82,12 +83,23 @@ optional-dependencies.develop = [
"ruff<0.12",
"validate-pyproject<1",
]
optional-dependencies.release = [
"build<2",
"twine<6",
]
optional-dependencies.test = [
"pytest<9",
"pytest-cov<6",
]
urls.Changelog = "https://github.com/crate/about/blob/main/CHANGES.md"
urls.Issues = "https://github.com/crate/about/issues"
urls.Repository = "https://github.com/crate/about"

scripts.cratedb-about = "cratedb_about.cli:cli"

[tool.setuptools.package-data]
cratedb_about = [ "*.yaml" ]

[tool.ruff]
line-length = 100

@@ -122,6 +134,45 @@ lint.select = [
"YTT",
]

lint.per-file-ignores."tests/*" = [
"S101", # Allow use of `assert`.
]

[tool.pytest.ini_options]
addopts = """
-rfEXs -p pytester --strict-markers --verbosity=3
--cov --cov-report=term-missing --cov-report=xml
"""
minversion = "2.0"
log_level = "DEBUG"
log_cli_level = "DEBUG"
log_format = "%(asctime)-15s [%(name)-36s] %(levelname)-8s: %(message)s"
pythonpath = [
"src",
]
xfail_strict = true
markers = [
]

[tool.coverage.paths]
source = [
"src/",
]

[tool.coverage.run]
branch = false
omit = [
"tests/*",
]

[tool.coverage.report]
fail_under = 0
show_missing = true
exclude_lines = [
"# pragma: no cover",
"raise NotImplemented",
]

[tool.mypy]
mypy_path = "src"
packages = [
@@ -144,9 +195,20 @@ describe-subst = "$Format:%(describe:match=v*)$"

[tool.poe.tasks]

build.env = { OUTDIR = "public_html" }
build.sequence = [
{ shell = "echo Generating content, target: ${OUTDIR}" },
{ shell = "mkdir -p ${OUTDIR}" },
{ shell = "cp src/content/about/llms-txt.md ${OUTDIR}/readme.md" },
{ shell = "cp src/cratedb_about/outline/cratedb-outline.yaml ${OUTDIR}/outline.yaml" },
{ shell = "cratedb-about outline --format markdown > ${OUTDIR}/outline.md" },
{ shell = "llms_txt2ctx --optional=false ${OUTDIR}/outline.md > ${OUTDIR}/llms.txt" },
{ shell = "llms_txt2ctx --optional=true ${OUTDIR}/outline.md > ${OUTDIR}/llms-full.txt" },
]

 check = [
 "lint",
-# "test",
+"test",
 ]

format = [
@@ -164,10 +226,9 @@ lint = [
 { cmd = "mypy" },
 ]

-build = [
-{ shell = "mkdir -p public_html/llm" },
-{ shell = "cp src/index/cratedb-overview.md public_html/llm/" },
-{ shell = "cp src/content/about/llms-txt.md public_html/llm/readme.md" },
-{ shell = "llms_txt2ctx --optional=false src/index/cratedb-overview.md > public_html/llm/llms.txt" },
-{ shell = "llms_txt2ctx --optional=true src/index/cratedb-overview.md > public_html/llm/llms-full.txt" },
+release = [
+{ cmd = "python -m build" },
+{ cmd = "twine upload --skip-existing dist/*" },
 ]
+
+test = { cmd = "pytest" }
3 changes: 2 additions & 1 deletion src/content/about/llms-txt.md
@@ -13,7 +13,8 @@ helping LLMs understand how to interpret this information in context.

## What's Inside

- `cratedb-overview.md`: The source file for generating `llms.txt`.
- `cratedb-outline.yaml`: The YAML source file for generating a Markdown file
`cratedb-outline.md` and subsequently an `llms.txt`.
- `llms.txt`: Standard `llms.txt` file.
- `llms-full.txt`: Full `llms.txt` file, including the "Optional" subsection.

5 changes: 5 additions & 0 deletions src/cratedb_about/__init__.py
@@ -0,0 +1,5 @@
from .outline import CrateDbKnowledgeOutline

__all__ = [
"CrateDbKnowledgeOutline",
]
22 changes: 22 additions & 0 deletions src/cratedb_about/cli.py
@@ -4,6 +4,7 @@

from cratedb_about.core import CrateDBConversation
from cratedb_about.model import Example
from cratedb_about.outline.model import CrateDbKnowledgeOutline


@click.group()
@@ -13,6 +14,27 @@ def cli(ctx: click.Context) -> None:
pass


@cli.command()
@click.option(
"--format", "-f", "format_", type=click.Choice(["markdown", "yaml", "json"]), default="markdown"
)
def outline(format_: t.Literal["markdown", "yaml", "json"] = "markdown"):
"""
Display the outline of the CrateDB documentation.

Available output formats: Markdown, YAML, JSON.
"""
cratedb_outline = CrateDbKnowledgeOutline.load()
if format_ == "json":
print(cratedb_outline.to_json()) # noqa: T201
elif format_ == "yaml":
print(cratedb_outline.to_yaml()) # noqa: T201
elif format_ == "markdown":
print(cratedb_outline.to_markdown()) # noqa: T201
else:
raise ValueError(f"Invalid output format: {format_}")


@cli.command()
@click.argument("question", type=str, required=False)
@click.option("--backend", type=click.Choice(["openai", "claude"]), default="openai")
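
Reviewer note on the `outline` command above: the `if`/`elif` chain could also be written as a lookup table, which keeps the list of formats in one place. The sketch below illustrates that alternative with the standard library only. `OutlineStub` and `render` are hypothetical stand-ins, not code from this PR, and the YAML branch is left out because it would need a third-party library.

```python
import json


class OutlineStub:
    """Hypothetical stand-in exposing method names that appear in the diff."""

    def __init__(self, data: dict):
        self.data = data

    def to_json(self) -> str:
        return json.dumps(self.data, indent=2)

    def to_markdown(self) -> str:
        # Simplified rendering: one H2 heading per section name.
        return "\n".join(f"## {name}" for name in self.data)


def render(outline: OutlineStub, format_: str = "markdown") -> str:
    """Pick the renderer from a table instead of an if/elif chain."""
    renderers = {
        "json": outline.to_json,
        "markdown": outline.to_markdown,
    }
    renderer = renderers.get(format_)
    if renderer is None:
        raise ValueError(f"Invalid output format: {format_}")
    return renderer()


stub = OutlineStub({"Docs": [], "Examples": []})
print(render(stub, "markdown"))
print(render(stub, "json"))
```

Since `click.Choice` already rejects unknown values before the function runs, the `ValueError` branch mainly guards direct calls from Python code.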
5 changes: 5 additions & 0 deletions src/cratedb_about/outline/__init__.py
@@ -0,0 +1,5 @@
from .model import CrateDbKnowledgeOutline

__all__ = [
"CrateDbKnowledgeOutline",
]