
Implement Synix v0.9: multi-provider, concurrent, hybrid search#3

Merged
marklubin merged 4 commits into main from v0.9-implementation
Feb 8, 2026

Conversation

@marklubin
Owner

Summary

  • Module restructure: New layout (core/, build/, adapters/, cli/, search/) with backward-compat shims for old import paths
  • Multi-provider LLM (S03): LLMClient wrapping openai.OpenAI(base_url=...) — works with any OpenAI-compatible API (OpenAI, Anthropic, Ollama, vLLM, DeepSeek)
  • Concurrent execution (S04): ConcurrentExecutor with ThreadPool + Semaphore + exponential backoff. --concurrency/-j CLI flag
  • Embeddings & hybrid search (S05): EmbeddingProvider with content-hash caching, HybridRetriever with keyword/semantic/hybrid modes, RRF fusion (k=60)
  • Logging & observability (S01): SynixLogger with per-run JSONL file logs, RunLog/StepLog tracking LLM calls/tokens/cache hits, -v/-vv verbosity
  • synix plan (S06): Dry-run showing what would build, with per-layer LLM call/token/cost estimates
  • Shadow index swap (S07): Build to search_shadow.db, atomic os.replace() on success
  • Artifact diffing (S08): diff_builds(), diff_artifact() with unified diff output
  • synix verify (S09): 8 integrity checks including merge cross-customer contamination detection
  • Text adapter (S10): YAML frontmatter, date inference, turn detection, adapter registry
  • Merge transform: Jaccard similarity clustering with union-find, natural language constraints, cache key
  • Search CLI extensions: --step (layer filter), --trace (provenance tree), --customer (metadata filter)
  • Test infrastructure: Mock LLM server, 3 demo corpora (30/50/100 conversations), 477 tests passing

Test plan

  • uv run pytest tests/ -v — 477 passed, 0 skipped, 0 failed (~20s)
  • Unit tests cover all new modules (executor, LLM client, embeddings, retriever, diff, plan, verify, merge transform, text adapter, shadow index, logging, search CLI)
  • Integration tests: full pipeline runs, incremental rebuilds, config changes
  • E2E Demo 1 (personal): fresh build, search with --step/--trace, config swap, full cache hit (12 tests)
  • E2E Demo 2 (startup): financial profiles, parallel pipeline paths, independent search indexes, cross-path diffing (18 tests)
  • E2E Demo 3 (incident): merge contamination detection, provenance tracing, verify checks, incremental fix, collateral damage check (12 tests)

…d search, and full observability

Module restructure (S02):
- New package layout: core/, build/, adapters/, cli/, search/
- Old modules become thin re-export shims for backward compatibility
- Core models extracted to core/models.py, re-exported from __init__.py

Logging & observability (S01):
- SynixLogger with JSONL file logging per run (build/logs/)
- RunLog/StepLog dataclasses tracking LLM calls, tokens, cache hits
- -v/-vv verbosity flags on CLI

Multi-provider LLM (S03):
- LLMClient wrapping openai.OpenAI(base_url=...) for any OpenAI-compatible API
- LLMConfig/EmbeddingConfig dataclasses with config precedence
- Replaced anthropic SDK dependency with openai SDK

Concurrent execution (S04):
- LLMExecutor ABC with LLMRequest/LLMResult
- SequentialExecutor and ConcurrentExecutor (ThreadPool + Semaphore + backoff)
- --concurrency/-j CLI flag, wired into runner for by_conversation grouping
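
The thread pool, semaphore, and exponential backoff combination described above can be sketched roughly as follows. This is a minimal illustration, not the ConcurrentExecutor in this PR: `run_concurrently` and its parameters are hypothetical names, and the retry policy (retry on any exception, jittered doubling delay) is an assumption.

```python
import random
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(tasks, concurrency=4, max_retries=3, base_delay=0.01):
    """Run zero-argument callables in a thread pool, retrying failures.

    Hypothetical helper for illustration; not Synix's executor API.
    """
    gate = threading.Semaphore(concurrency)  # caps in-flight calls

    def attempt(task):
        for retry in range(max_retries + 1):
            with gate:
                try:
                    return task()
                except Exception:
                    if retry == max_retries:
                        raise
            # Sleep outside the semaphore so a backing-off task frees its slot.
            delay = base_delay * (2 ** retry)
            time.sleep(delay + random.uniform(0, delay))

    # The pool is larger than the semaphore so sleeping retries do not
    # starve tasks that are ready to run.
    with ThreadPoolExecutor(max_workers=concurrency * 2) as pool:
        # map() returns results in input order.
        return list(pool.map(attempt, tasks))
```

Releasing the semaphore before sleeping keeps the concurrency limit tied to requests actually in flight rather than to tasks waiting out a backoff.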

Embeddings & hybrid search (S05):
- EmbeddingProvider with content-hash caching (binary float32 files)
- HybridRetriever: keyword, semantic, and hybrid modes
- RRF score fusion (k=60), --mode/--top-k CLI flags
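
For readers unfamiliar with reciprocal rank fusion, the scoring rule with k=60 can be sketched as below. `rrf_fuse` is a hypothetical helper, not the HybridRetriever API; it assumes each retriever returns an ordered list of document ids and that ties are broken by id.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_k=10):
    """Fuse ranked id lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):  # ranks are 1-based
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ordered[:top_k]]


keyword_hits = ["a", "b", "c"]
semantic_hits = ["b", "d", "a"]
print(rrf_fuse([keyword_hits, semantic_hits]))  # ['b', 'a', 'd', 'c']
```

Because only ranks are used, the keyword and semantic scores never need to be normalized onto a common scale.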

synix plan (S06):
- BuildPlan with per-layer estimates (LLM calls, tokens, cost)
- --json/--save flags for plan output

Shadow index swap (S07):
- Build search index to search_shadow.db, atomic os.replace() on success
- Old index preserved on build failure
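
A rough sketch of the shadow swap pattern, assuming a SQLite index: only `search_shadow.db` and `os.replace()` come from this PR; the live file name `search.db`, the `docs` table, and `rebuild_index` are hypothetical.

```python
import os
import sqlite3
from pathlib import Path


def rebuild_index(build_dir, rows, live_name="search.db",
                  shadow_name="search_shadow.db"):
    """Build a fresh index beside the live one, then swap it in atomically."""
    build_dir = Path(build_dir)
    shadow = build_dir / shadow_name
    live = build_dir / live_name
    if shadow.exists():
        shadow.unlink()  # discard leftovers from an earlier failed build
    conn = sqlite3.connect(shadow)
    try:
        conn.execute("CREATE TABLE docs (id TEXT PRIMARY KEY, body TEXT)")
        conn.executemany("INSERT INTO docs VALUES (?, ?)", rows)
        conn.commit()
    finally:
        conn.close()
    # Reached only if the build succeeded; on failure the live file is untouched.
    os.replace(shadow, live)
    return live
```

Readers that open the live file see either the old index or the new one, never a partially written file, provided both paths are on the same filesystem.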

Artifact diffing (S08):
- diff_builds(), diff_artifact() with unified diff output

synix verify (S09):
- 8 integrity checks: build_exists, manifest, artifacts, provenance,
  search_index, content_hashes, no_orphans, merge_integrity
- --check flag for selective verification, --json output

Text adapter (S10):
- YAML frontmatter parsing, filename date inference, turn detection
- Adapter registry with auto-detection by file extension

Merge transform:
- Jaccard similarity clustering with union-find
- Natural language constraint parsing (e.g., "NEVER merge different customer_id")
- Threshold + constraints in cache key
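
Jaccard clustering with union-find can be sketched as follows. This is an illustration under simplifying assumptions (whitespace tokenization, no constraint handling), and `jaccard` and `cluster` are hypothetical names rather than the merge transform's own functions.

```python
def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(texts, threshold=0.5):
    """Group indices of texts whose similarity chains reach the threshold."""
    token_sets = [set(t.lower().split()) for t in texts]  # tokenize once
    parent = list(range(len(texts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(texts)):
        for j in range(i + 1, len(texts)):
            if jaccard(token_sets[i], token_sets[j]) >= threshold:
                parent[find(j)] = find(i)  # union the two components

    groups = {}
    for i in range(len(texts)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


print(cluster(["reset my password please",
               "please reset my password",
               "invoice is overdue"]))  # [[0, 1], [2]]
```

Union-find makes clusters transitive: two texts can end up together through an intermediate text even if their direct similarity is below the threshold, which is one reason explicit merge constraints matter.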

Search CLI extensions:
- --step (layer filter), --trace (provenance tree), --customer (metadata filter)

Test infrastructure:
- Mock LLM server (OpenAI-compatible HTTP, deterministic fixtures, error injection)
- 3 demo corpora: personal (30 conv), startup (50 conv, 10 customers),
  incident (100 conv, 20 customers)
- 477 tests (0 skipped, 0 failed): unit, integration, and E2E
@marklubin
Owner Author

Review Response

Accepted all P0 and P1 items. Working on fixes now.

P0 (adapter correctness):

  1. ChatGPT linearization: switching to current_node path with first-child fallback
  2. Role filtering: user/assistant only by default
  3. Claude sender normalization: human → user
  4. Claude timestamp: proper ISO parsing with fallback

P1 (perf + reliability):

  5. Merge transform: pre-tokenize to eliminate O(n²) re-tokenization
  6. Semantic search: in-memory embedding cache (load once per query session, not per query)
  7. Single root-layer constraint: relaxed to allow multi-source pipelines
  8. Atomic cache writes: temp file + fsync + rename pattern
  9. Verify output: added fix_hint field with actionable suggestions
  10. Thread safety: deep-copy config before passing to concurrent transform workers

Note on anthropic dep: reviewer was incorrect — anthropic IS used in llm_client.py for the native Anthropic provider path. Keeping as required dep for now; will consider making optional in a future PR.

Pushing fix commit shortly.

P0 — Adapter correctness:
- ChatGPT: follow current_node path for linearization, fall back to
  first-child traversal for exports without current_node
- ChatGPT: filter to user/assistant roles only (exclude system/tool/plugin)
- Claude: normalize sender labels (human → user) for cross-source consistency
- Claude: proper ISO-8601 timestamp parsing with fallback
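
The ChatGPT linearization and role filtering described above might look roughly like this. The export shape used here (a `mapping` of nodes with `parent`, `children`, and `message` fields, plus an optional `current_node`) is a simplified assumption, and `linearize` is a hypothetical name, not the adapter's function.

```python
def linearize(conversation, roles=("user", "assistant")):
    """Return (role, text) pairs along the active branch of a message tree."""
    mapping = conversation["mapping"]
    current = conversation.get("current_node")
    path = []
    if current in mapping:
        # Walk parent pointers from the active leaf up to the root.
        node_id = current
        while node_id in mapping:
            path.append(node_id)
            node_id = mapping[node_id].get("parent")
        path.reverse()
    else:
        # Fallback: start at a root and always take the first child.
        node_id = next(
            (nid for nid, node in mapping.items() if node.get("parent") is None),
            None,
        )
        while node_id in mapping:
            path.append(node_id)
            children = mapping[node_id].get("children") or []
            node_id = children[0] if children else None

    messages = []
    for node_id in path:
        msg = mapping[node_id].get("message")
        if msg and msg["author"]["role"] in roles:  # drop system/tool nodes
            parts = msg.get("content", {}).get("parts") or [""]
            messages.append((msg["author"]["role"], parts[0]))
    return messages
```

Following `current_node` matters when a reply was regenerated: the tree then has sibling assistant nodes, and first-child traversal alone would pick the discarded branch.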

P1 — Performance:
- Merge transform: pre-tokenize inputs once, eliminating O(n²) re-tokenization
  in pairwise Jaccard similarity comparisons
- Semantic search: in-memory embedding cache keyed by content hash, loaded
  once per session instead of re-embedding all rows per query

P1 — Architecture:
- Relax single root-layer constraint to allow multi-source pipelines
  (e.g., separate ChatGPT and Claude level-0 layers)
- Deep-copy config dict before passing to concurrent transform workers
  to prevent race conditions on shared mutable state

P1 — Reliability:
- Atomic cache writes (temp file + fsync + os.replace) for artifact store,
  provenance tracker, and embedding manifest
- Actionable verify output: fix_hint field on VerifyCheck with specific
  remediation commands for each failure type
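
The temp file + fsync + os.replace pattern named above can be sketched as below. `atomic_write_bytes` is a hypothetical helper, not the artifact store's own code, and it assumes the temp file lives in the same directory as the target so the rename stays on one filesystem.

```python
import os
import tempfile


def atomic_write_bytes(path, data):
    """Write data to path so readers never observe a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # force contents to disk before the rename
        os.replace(tmp_path, path)  # atomic swap on the same filesystem
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A crash before `os.replace()` leaves the previous file intact; a crash after it leaves the complete new file.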

Tests: 481 passed (4 new adapter tests, updated pipeline validation tests)

Conversation create_time only reflects when the conversation started,
not when content was last produced. Both ChatGPT and Claude adapters
now derive last_message_date from individual per-message timestamps.

Consolidate design documents under docs/: DESIGN.md, sprint-checklist.md,
demo-test-specs.md, v09-build-plan.md. Add BACKLOG.md capturing deferred
items from v0.9 PR review (episode chunking, retrospective provenance
docs, full tree export, projections as DAG nodes, etc.).
@marklubin marklubin merged commit 95da436 into main Feb 8, 2026
marklubin added a commit that referenced this pull request Mar 9, 2026
…atch race

- Remove `open` re-export from __init__.py — no longer shadows builtin
  (both reviews, v2)
- Export `SdkError` from __init__.py for user-facing error handling
- Add path traversal validation in SdkSource — rejects `../` and
  absolute paths in add_text/remove (GPT critical)
- Fix scratch release race: _get_closure reads snapshot_oid from receipt
  written by execute_release, not by re-resolving HEAD independently
  (Claude concern #3)
- Scratch close() cleans up both work/ and releases/ dirs
- Update sdk-design.md examples to use open_project (Claude question #4)
- Update sdk.md quick start to use open_project
- All tests use open_project; 3 new tests for path traversal + undeclared source
marklubin added a commit that referenced this pull request Mar 9, 2026
* feat: add Python SDK for programmatic access to synix projects

Introduces synix.open(path) and synix.init(path, pipeline=...) entry
points with Project, Release, SearchHandle, and typed error hierarchy.
Supports build, release, search, artifact inspection, and ref listing
through a stable Python API without touching CLI or internals directly.

68 e2e tests + 18 incremental cache tests covering build idempotency,
release lifecycle, search correctness, and content-addressed dedup.

* fix: address PR #93 review — API safety, error types, naming

- Rename `open()` to `open_project()` to avoid shadowing Python builtin;
  keep `open` as deprecated alias for backwards compatibility
- Use UUID-based scratch dir for `release("HEAD")` — prevents concurrent
  stomping (GPT critical finding)
- Deep-copy pipeline in `build()` to prevent caller mutation (both reviews)
- Remove `source()` fallback to undeclared sources — now raises SdkError
  with list of declared sources (GPT warning)
- Add `ProjectionNotFoundError` — `flat_file()` no longer raises
  `SearchNotAvailableError` (GPT minor, wrong error taxonomy)
- Extract `_resolve_flat_file_path()` — dedup flat_file/flat_file_path
  (Claude concern)
- Wrap `_get_closure()` receipt parsing in SdkError (Claude concern)
- Remove dead `SdkArtifact._from_snapshot_dict` (Claude nit)

* fix: address round-2 review — remove open alias, path validation, scratch race

- Remove `open` re-export from __init__.py — no longer shadows builtin
  (both reviews, v2)
- Export `SdkError` from __init__.py for user-facing error handling
- Add path traversal validation in SdkSource — rejects `../` and
  absolute paths in add_text/remove (GPT critical)
- Fix scratch release race: _get_closure reads snapshot_oid from receipt
  written by execute_release, not by re-resolving HEAD independently
  (Claude concern #3)
- Scratch close() cleans up both work/ and releases/ dirs
- Update sdk-design.md examples to use open_project (Claude question #4)
- Update sdk.md quick start to use open_project
- All tests use open_project; 3 new tests for path traversal + undeclared source
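
A path traversal check of the kind described above (rejecting `../` and absolute paths) could be sketched like this. `validate_relative_path` is a hypothetical helper that raises `ValueError`; how SdkSource actually performs the check and which error it raises are not shown in this thread.

```python
from pathlib import PurePosixPath, PureWindowsPath


def validate_relative_path(name):
    """Return a normalized relative path, or raise ValueError if unsafe."""
    if not name:
        raise ValueError("path must not be empty")
    posix = PurePosixPath(name.replace("\\", "/"))
    windows = PureWindowsPath(name)
    # Reject POSIX-absolute paths and anything carrying a Windows drive.
    if posix.is_absolute() or windows.is_absolute() or windows.drive:
        raise ValueError(f"absolute paths are not allowed: {name!r}")
    # Reject any parent-directory component, wherever it appears.
    if ".." in posix.parts:
        raise ValueError(f"path traversal is not allowed: {name!r}")
    return str(posix)
```

Checking path components rather than searching for the substring `..` avoids rejecting legitimate names such as `notes..txt`.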

* fix: round-3 review — export all error types, remove stale docs

- Export full error hierarchy from __init__.py (SynixNotFoundError,
  ReleaseNotFoundError, ArtifactNotFoundError, SearchNotAvailableError,
  EmbeddingRequiredError, PipelineRequiredError, ProjectionNotFoundError)
- Remove stale deprecated-alias note from sdk.md (open was already removed)
- Update sdk.md error import example to use `from synix import` (not sdk)
- Fix variable name `l` → `layer` in list comprehension

* docs: fix SDK documentation gaps

- Fix stale synix.open() → open_project() in sdk.md and sdk-design.md
- Fix incorrect BuildResult attributes in sdk-design.md (layers_built,
  cost_estimate → built, total_time, snapshot_oid)
- Fix incorrect release_to() return type in sdk-design.md (dict, not object)
- Document path traversal validation in SdkSource
- Document build() deep-copy behavior
- Update CLAUDE.md module comment to open_project
- Add SDK link to README Learn More table
marklubin added a commit that referenced this pull request Mar 10, 2026
Closes #62

P0 trust/correctness:
- Resolve relative source_dir/build_dir against pipeline file, not cwd (#3)
- Clear synix_dir on --build-dir override to prevent stale routing (#4)
- Propagate source load failures instead of silently succeeding (#5)
- Add Layer.level read-only property to fix info crash (#8)
- Rewrite info/status to read .synix/ snapshot store, not legacy build/ (#9)
- Diff uses RefStore run history instead of legacy versions/ dir (#11)

P1 operator consistency:
- Planner uses estimated-count placeholders for downstream cardinality (#1)
- Standardize invalid ref handling to sys.exit(1) across all inspectors (#10)
- Clean also removes refs/releases/ ref files (#12)

P2 docs/discoverability:
- Mesh commands honor SYNIX_MESH_ROOT env var via resolve_mesh_root() (#2)
- Batch planner tracks DAG cardinality instead of estimate_output_count(1) (#6)
- Fix llms.txt diff syntax to match actual CLI (#7)
- Add refs/plans to refs list prefix scan (#13)