
Add JSON search, JSON-RPC bridge, and repo-scoped vector partitions to ccc #153

Open
faysou wants to merge 1 commit into cocoindex-io:main from faysou:feature/search-json-output

faysou commented Apr 30, 2026

Summary

  • Adds a --json option to ccc search.
  • Adds ccc bridge --jsonrpc as a long-running stdin/stdout bridge for external tools.
  • Adds --repo-key and repo_keys search filtering for workspace indexes containing multiple repositories.
  • Stores repo_key on indexed chunks and makes it a sqlite-vec partition key alongside language.
  • Derives repo_key from the nearest .git root relative to the CocoIndex project root, so nested repos such as ADK/a2a-samples are scoped exactly.
  • Preserves compatibility with existing indexes that do not have repo_key by falling back to the previous path-glob scan.
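
The repo_key derivation and the legacy fallback described above could look roughly like the sketch below. This is not the PR's code: `derive_repo_key` and `repo_keys_to_globs` are hypothetical names, and returning an empty string when no `.git` entry is found (or `"."` when the project root itself is the repo) is an assumption made for illustration.

```python
from pathlib import Path


def derive_repo_key(file_path: Path, project_root: Path) -> str:
    """Return the nearest enclosing .git root, relative to project_root.

    Hypothetical helper. Assumption: returns "" when no .git entry exists
    between the file and the project root (inclusive).
    """
    project_root = project_root.resolve()
    current = file_path.resolve().parent
    # Walk upward from the file's directory, never leaving the project root.
    while current == project_root or project_root in current.parents:
        # .git can be a directory or a file (worktrees, submodules).
        if (current / ".git").exists():
            return current.relative_to(project_root).as_posix()
        if current == project_root:
            break
        current = current.parent
    return ""


def repo_keys_to_globs(repo_keys: list[str]) -> list[str]:
    """Translate repo keys into file_path GLOB patterns for older indexes.

    Hypothetical helper mirroring the fallback for indexes without a
    repo_key column: "nautilus_trader" becomes "nautilus_trader/*".
    """
    return [f"{key.rstrip('/')}/*" for key in repo_keys]
```

Because the walk stops at the first `.git` it meets, a file inside a nested repo such as ADK/a2a-samples gets the inner repo's key rather than the outer one.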

Design decisions

  • The JSON formatter is separate from the existing human formatter to avoid changing terminal output.
  • The bridge uses newline-delimited JSON-RPC 2.0 on stdin/stdout so a parent process can keep one Python runtime alive across many requests.
  • The bridge currently supports ping, search, and shutdown; bridge errors are returned as JSON-RPC error objects rather than printed to stdout.
  • repo_key is a stored partition key so new workspace indexes can use fast KNN search with repo-level filtering instead of a full vector scan.
  • Existing indexes are not broken: if code_chunks_vec has no repo_key column, repo_keys=["nautilus_trader"] is translated to file_path GLOB "nautilus_trader/*".
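
To make the bridge design concrete, here is a minimal sketch of a newline-delimited JSON-RPC 2.0 loop that handles ping, search, and shutdown. It is an illustration under assumptions, not the PR's implementation: the `serve` and `_dispatch` names, the `"pong"` and `null` result values, and the injected `search` callable are all hypothetical. The error codes are the standard ones from the JSON-RPC 2.0 specification. As a simplification, the sketch answers every request, including ones without an `id`.

```python
import json
import sys
from typing import Any, Callable, TextIO

# Standard JSON-RPC 2.0 error codes.
PARSE_ERROR = -32700
INVALID_REQUEST = -32600
METHOD_NOT_FOUND = -32601
INTERNAL_ERROR = -32603


def _error(req_id: Any, code: int, message: str) -> dict[str, Any]:
    """Build a JSON-RPC error response object."""
    return {"jsonrpc": "2.0", "id": req_id, "error": {"code": code, "message": message}}


def _dispatch(
    request: Any, search: Callable[[dict[str, Any]], Any]
) -> tuple[dict[str, Any], bool]:
    """Handle one decoded request; return (response, should_stop)."""
    if not isinstance(request, dict) or not isinstance(request.get("method"), str):
        return _error(None, INVALID_REQUEST, "invalid request"), False
    req_id = request.get("id")
    method = request["method"]
    params = request.get("params") or {}
    try:
        if method == "ping":
            result: Any = "pong"  # assumed result value
        elif method == "search":
            result = search(params)  # hypothetical search backend
        elif method == "shutdown":
            return {"jsonrpc": "2.0", "id": req_id, "result": None}, True
        else:
            return _error(req_id, METHOD_NOT_FOUND, f"unknown method: {method}"), False
    except Exception as exc:
        # Failures become error objects instead of stray text on stdout.
        return _error(req_id, INTERNAL_ERROR, str(exc)), False
    return {"jsonrpc": "2.0", "id": req_id, "result": result}, False


def serve(
    search: Callable[[dict[str, Any]], Any],
    stdin: TextIO = sys.stdin,
    stdout: TextIO = sys.stdout,
) -> None:
    """Read one JSON request per line and write one JSON response per line."""
    for line in stdin:
        if not line.strip():
            continue
        stop = False
        try:
            request = json.loads(line)
        except json.JSONDecodeError as exc:
            response = _error(None, PARSE_ERROR, f"parse error: {exc.msg}")
        else:
            response, stop = _dispatch(request, search)
        stdout.write(json.dumps(response) + "\n")
        stdout.flush()  # the parent process reads responses as they arrive
        if stop:
            break
```

Flushing after every response matters here: the parent keeps the process alive across many requests and would otherwise block waiting on buffered output.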

Files changed

  • src/cocoindex_code/cli.py
  • src/cocoindex_code/client.py
  • src/cocoindex_code/daemon.py
  • src/cocoindex_code/indexer.py
  • src/cocoindex_code/project.py
  • src/cocoindex_code/protocol.py
  • src/cocoindex_code/query.py
  • src/cocoindex_code/schema.py
  • src/cocoindex_code/server.py
  • src/cocoindex_code/shared.py
  • tests/test_cli_helpers.py
  • tests/test_indexer_helpers.py
  • tests/test_protocol.py

Validation

  • Passed uv run pytest.
  • Passed uv run ruff check ... on the touched source and test files.
  • Passed uv run mypy.
  • Passed uv run prek run --all-files --verbose --show-diff-on-failure --skip pytest.
  • Smoke-tested ccc search --json --repo-key nautilus_trader --limit 1 streamwriter against the existing Nautilus workspace index and validated JSON output.

faysou force-pushed the feature/search-json-output branch from 47e7a1d to 589b07d on April 30, 2026 11:44
faysou changed the title from "Add JSON output for ccc search" to "Add machine-readable search interfaces for ccc" on Apr 30, 2026
faysou changed the title from "Add machine-readable search interfaces for ccc" to "Add JSON search, JSON-RPC bridge, and repo-scoped vector partitions to ccc" on Apr 30, 2026

faysou commented May 1, 2026

FYI, I opened this PR so I could integrate ccc with gitnexus for the semantic search part, and it works for me locally. Gitnexus works at the single-repo level, but I use one index for all my repos at once. The repo key lets gitnexus restrict ccc results to a single repo; without it, a query was taking 25 seconds instead of half a second.

faysou force-pushed the feature/search-json-output branch from af78391 to c662652 on May 12, 2026 07:47