Releases: daniloaguiarbr/sqlite-graphrag
v1.0.36
Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog,
and this project adheres to Semantic Versioning.
[Unreleased]
[1.0.36] - 2026-04-30
Fixed (Linguistic policy)
- C1 (CRITICAL): Synced `--type` enum in `skill/sqlite-graphrag-en/SKILL.md:46` and `-pt/SKILL.md:46` from 4 listed values to the full set of 9 (user, feedback, project, reference, decision, incident, skill, document, note). Agents using SKILL.md as a contract had been silently losing five memory types since v1.0.30. Source of truth: `src/cli.rs:364-374` (`MemoryType` enum) and `src/commands/remember.rs:26` long-help.
- H1+H2+H3 (HIGH): Translated three Portuguese-without-accent strings in `tracing::warn!` macros that escaped the audit gate `rg '[áéíóúâêôãõç]' src/` documented in v1.0.33: `src/extraction.rs:1204` (`"NER falhou..."` → `"NER failed..."`), `src/extraction.rs:964` (`"batch NER falhou (chunk de N janelas)..."` → `"batch NER failed (chunk of N windows)..."`), `src/commands/remember.rs:345` (`"auto-extraction falhou..."` → `"auto-extraction failed..."`). Bonus: also translated `src/storage/urls.rs:37` (`"falha ao persistir url..."` → `"failed to persist url..."`) and the production error in `src/commands/remember.rs:367` (`"limite de N namespaces ativos excedido..."` → `"active namespace limit of N reached..."`).
- M1 (MEDIUM): Added a complementary CI gate in the `.github/workflows/ci.yml` `language-check` job that scans `tracing::*!`, `#[error(...)]`, doc comments, and `panic!`/`assert!`/`expect`/`bail!`/`ensure!` macros for Portuguese words without diacritical marks (`falhou`, `janelas`, `usando apenas`, `nao foi`, `ja existe`, `obrigatorio`, `memoria`, etc.). Plain string literals are intentionally not scanned because they hold legitimate PT test fixtures for multilingual extraction.
- M3 (MEDIUM): Renamed 33 Portuguese test function names to English across `tests/integration.rs`, `tests/exit_codes_integration.rs`, `tests/concurrency_limit_integration.rs`, `tests/recall_integration.rs`, `tests/prd_compliance.rs`, `tests/loom_lock_slots.rs`, `tests/vacuum_integration.rs`, `src/commands/optimize.rs`, `list.rs`, `health.rs`, `debug_schema.rs`, `unlink.rs`. Examples: `test_link_idempotente_retorna_already_exists` → `test_link_idempotent_returns_already_exists`; `prd_optimize_executa_e_retorna_status_ok` → `prd_optimize_runs_and_returns_status_ok`; `optimize_response_serializa_campos_obrigatorios` → `optimize_response_serializes_required_fields`. Plus ~80 `.expect("X falhou")` test helpers translated to `.expect("X failed")`, doc comments and assert messages cleaned in `src/graph.rs`, `src/memory_guard.rs`, `src/cli.rs`, `src/storage/entities.rs`, and several `tests/*.rs` files. Test fixture STRINGS that exercise PT-BR ingestion (e.g. multilingual NER inputs) remain intentionally in PT-BR.
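The M1 gate can be pictured with a small Rust sketch. This is not the CI job itself, whose implementation the changelog does not show. The function names `contains_phrase` and `find_unaccented_pt`, the whole-word matching, and the exact surface markers are assumptions made for illustration. The word list is limited to the examples named in the entry.

```rust
/// Surfaces the gate is described as scanning (the marker spellings are assumptions).
const SCANNED_SURFACES: &[&str] = &[
    "tracing::", "#[error(", "///", "//!", "panic!", "assert!", ".expect(", "bail!", "ensure!",
];

/// Unaccented Portuguese words listed in the changelog entry (the real list is longer).
const UNACCENTED_PT: &[&str] = &[
    "falhou", "janelas", "usando apenas", "nao foi", "ja existe", "obrigatorio", "memoria",
];

/// True when `phrase` occurs in `haystack` as a whole word or phrase,
/// so that "memoria" does not match inside "memorial".
fn contains_phrase(haystack: &str, phrase: &str) -> bool {
    haystack.match_indices(phrase).any(|(start, found)| {
        let before = haystack[..start].chars().next_back();
        let after = haystack[start + found.len()..].chars().next();
        !before.map_or(false, char::is_alphanumeric)
            && !after.map_or(false, char::is_alphanumeric)
    })
}

/// Returns the first offending word when `line` is a scanned surface.
/// Plain string literals are not a scanned surface, so they return `None`.
fn find_unaccented_pt(line: &str) -> Option<&'static str> {
    let lower = line.to_lowercase();
    if !SCANNED_SURFACES.iter().any(|marker| lower.contains(*marker)) {
        return None;
    }
    UNACCENTED_PT
        .iter()
        .copied()
        .find(|phrase| contains_phrase(&lower, phrase))
}
```

Under these assumptions, a `tracing::warn!("NER falhou ...")` line is flagged, while a test fixture such as `let input = "a memoria falhou";` is left alone because it sits on no scanned surface.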
Fixed (Code logic)
- H5 (HIGH): Extended `regex_section_marker()` in `src/extraction.rs:210-218` to include `Camada` alongside `Etapa`, `Fase`, `Passo`, `Seção`, `Capítulo`. Audit on a 50-file PT-BR corpus showed `Camada 1` through `Camada 5` leaking through to `entities` with degree 3 each, polluting the graph. The filter now strips them at both the regex prefilter and the BERT NER post-merge stages.
- M7 (MEDIUM): Expanded `ALL_CAPS_STOPWORDS` in `src/extraction.rs:60-165` with `ADICIONADA`, `ADICIONADAS`, `ADICIONADO`, `ADICIONADOS`, `CLARO`, `CONFIRMARAM`, `CONFIRMEI`, `CONFIRMOU` (alphabetically merged into the list). The earlier audit found these PT-BR adjective/verb forms being captured as `concept` entities by `regex_all_caps()` in `apply_regex_prefilter`.
- L2 (LOW): Daemon spawn backoff in `src/daemon.rs:record_spawn_failure` now applies half jitter (`base/2 + rand([0, base/2))`) instead of pure exponential backoff. Avoids a retry herd if multiple CLI instances detect daemon failure simultaneously. Uses `SystemTime::now().subsec_nanos()` as a dependency-free entropy source, which is sufficient for low-frequency spawn coordination.
- L5+L6 (LOW): `src/i18n.rs::Language::from_env_or_locale` now treats empty `SQLITE_GRAPHRAG_LANG=""` as unset (no `tracing::warn!` emitted), matching POSIX convention. `src/i18n.rs::init` short-circuits when the OnceLock is already populated, preventing the env-resolver from running a second time and emitting the warning twice.
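The half-jitter formula in L2 can be sketched as follows. This is a minimal illustration, not the code in `src/daemon.rs`. The function names, the doubling schedule, and the scaling of the clock's sub-second nanoseconds onto `[0, base/2)` are all assumptions.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Exponential base delay: `initial * 2^attempt`, capped at `max`.
/// The doubling schedule and the cap are illustrative assumptions.
fn base_delay(attempt: u32, initial: Duration, max: Duration) -> Duration {
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    initial.checked_mul(factor).unwrap_or(max).min(max)
}

/// Half jitter: keep half of the base delay and randomise the other half,
/// so the result lies in `[base/2, base)`.
/// `subsec_nanos` is treated as a fraction of one second and scaled onto `[0, base/2)`.
fn half_jitter(base: Duration, subsec_nanos: u32) -> Duration {
    let half = base / 2;
    let fraction = u128::from(subsec_nanos % 1_000_000_000);
    let offset = half.as_nanos() * fraction / 1_000_000_000;
    // `offset` is below `half`, which fits in u64 for any realistic backoff.
    half + Duration::from_nanos(offset as u64)
}

/// Dependency-free entropy: the sub-second part of the wall clock.
fn clock_entropy() -> u32 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|elapsed| elapsed.subsec_nanos())
        .unwrap_or(0)
}

/// Delay to wait before the next spawn attempt.
fn next_spawn_delay(attempt: u32) -> Duration {
    let base = base_delay(attempt, Duration::from_millis(100), Duration::from_secs(5));
    half_jitter(base, clock_entropy())
}
```

Keeping the fixed `base/2` floor means a retry never fires immediately, while the randomised half spreads out instances that failed at the same moment.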
Improved
- M2 (MEDIUM): Added a "JSON Schemas" section to `README.md`, `README.pt-BR.md`, `docs/AGENT_PROTOCOL.md`, and `docs/AGENT_PROTOCOL.pt-BR.md` linking to the 30 canonical JSON Schema files in `docs/schemas/`. These contracts existed since v1.0.33 but were undiscoverable from the public docs.
- M4 (MEDIUM): `src/i18n.rs::tr` no longer leaks one allocation per call. The signature now requires `&'static str` inputs (which all in-tree callers already pass, since they are string literals) and returns one of them directly. The previous `Box::leak(en.to_string().into_boxed_str())` pattern accumulated allocations in long-running pipelines.
- L3 (LOW): Added an MSRV (Rust 1.88) callout to the `README.md` and `README.pt-BR.md` Installation sections. Previously documented only as a footnote in the Mac Intel notes.
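The M4 change can be illustrated by contrasting the two shapes of `tr`. The sketch below is an assumption-laden reduction: the `Language` enum, the explicit `lang` parameter, and the `tr_leaky` name are hypothetical, and the real function presumably resolves the language from global state.

```rust
/// Hypothetical two-language selector used only for this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Language {
    English,
    Portuguese,
}

/// New shape: both inputs are `&'static str`, so one of them can be
/// returned directly without allocating.
fn tr(lang: Language, en: &'static str, pt: &'static str) -> &'static str {
    match lang {
        Language::English => en,
        Language::Portuguese => pt,
    }
}

/// Old shape, shown for contrast: accepting short-lived `&str` inputs forces
/// a copy, and `Box::leak` never frees it, so every call leaks one allocation.
fn tr_leaky(lang: Language, en: &str, pt: &str) -> &'static str {
    let chosen = match lang {
        Language::English => en,
        Language::Portuguese => pt,
    };
    Box::leak(chosen.to_string().into_boxed_str())
}
```

Because in-tree callers pass string literals, tightening the parameter types to `&'static str` costs them nothing and removes the leak entirely.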
Notes
- M6 was reclassified as a documentation/test artefact: `related --json` was reported to return `graph_depth: null`, but the field is named `hop_distance` (`src/commands/related.rs:77` and serialised key). The audit query used `.graph_depth` which did not exist. The field has always been populated correctly. No code change required.
- L1 (sys_locale) was deferred: the manual `LC_ALL`/`LANG` parsing in `src/i18n.rs:34-57` works correctly across the targets used in CI. Adding `sys_locale` would introduce a dependency for marginal benefit (macOS CFLocale APIs and Windows GetUserDefaultLocaleName) without a confirmed reproducer.
- L4 (BERT NER misclassifications) is out of scope: `Tokio=location`, `Borda=person`, `Campos=location`, and `AdapterRun=organization` are limitations of `Davlan/bert-base-multilingual-cased-ner-hrl`. Filtering would require either a different model or a curated whitelist; both deferred until they cause concrete user impact.
- All 427 lib tests pass with the new test names and translated assertions. `cargo fmt --check`, `cargo clippy -- -D warnings`, `cargo doc`, `cargo audit`, and `cargo deny check advisories licenses bans sources` are clean.
- The new `language-check` gate in CI now blocks any PR re-introducing PT in tracing/error/doc/assert surfaces.
v1.0.35
[1.0.35] - 2026-04-30
Fixed
- WAL-AUTO-INIT (HIGH): Auto-init path (`remember`, `ingest`, `recall`, `list`, ...; every command that goes through `ensure_db_ready()`) now activates `journal_mode=wal` consistently. Before v1.0.35 only the explicit `init` command flipped journal mode to WAL; databases created on-demand by other commands stayed in `journal_mode=delete`, breaking `sync-safe-copy` checkpoint semantics, the documented concurrency guarantees, and the troubleshooting advice that referenced WAL. Fix moves `PRAGMA journal_mode = WAL` into `apply_connection_pragmas` (called by every `open_rw`) and adds a defensive re-assertion (`ensure_wal_mode`) after migrations to neutralise refinery's internal handle reuse. Regression coverage: `tests/wal_auto_init_regression.rs`.
- JSON-SCHEMA-VERSION (MEDIUM-HIGH): `init --json`, `stats --json` and `migrate --json` now emit `schema_version` as a JSON number instead of a string, aligning with `health --json` (which already used number). Fixes parsing inconsistency for clients that consumed both shapes. JSON Schemas (`docs/schemas/stats.schema.json`, `docs/schemas/migrate.schema.json`, `docs/schemas/debug-schema.schema.json`) updated to reflect the canonical type. Breaking for clients that explicitly compared as string; clients using numeric comparisons are unaffected.
- DAEMON-SOCKET-FALLBACK (LOW): Unix socket fallback path in `to_local_socket_name()` now respects `XDG_RUNTIME_DIR` then `SQLITE_GRAPHRAG_HOME` before falling back to `/tmp`. Reduces collision risk on multi-tenant hosts. Path is only used when abstract namespace sockets fail to bind (rare).
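The lookup order described in DAEMON-SOCKET-FALLBACK can be sketched as a pure function. The names `fallback_socket_dir` and `fallback_socket_dir_from_env` are hypothetical, and treating an empty variable as unset is an assumption. The changelog confirms only the order `XDG_RUNTIME_DIR`, then `SQLITE_GRAPHRAG_HOME`, then `/tmp`.

```rust
use std::path::PathBuf;

/// Picks the directory for the fallback Unix socket.
/// Order: XDG_RUNTIME_DIR, then SQLITE_GRAPHRAG_HOME, then /tmp.
/// Empty values are treated as unset (an assumption of this sketch).
fn fallback_socket_dir(xdg_runtime_dir: Option<&str>, graphrag_home: Option<&str>) -> PathBuf {
    xdg_runtime_dir
        .filter(|value| !value.is_empty())
        .or(graphrag_home.filter(|value| !value.is_empty()))
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from("/tmp"))
}

/// Convenience wrapper that reads the two variables from the process environment.
fn fallback_socket_dir_from_env() -> PathBuf {
    let xdg = std::env::var("XDG_RUNTIME_DIR").ok();
    let home = std::env::var("SQLITE_GRAPHRAG_HOME").ok();
    fallback_socket_dir(xdg.as_deref(), home.as_deref())
}
```

Passing the values in as parameters keeps the precedence testable without mutating the process environment.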
Added
- CLI-LIMIT-ALIAS (UX): `recall` and `hybrid-search` now accept `--limit` as alias of `-k`/`--k`. Aligns with `list`/`related` which already used `--limit`. Non-breaking, additive.
- CLI-RENAME-FROM-TO (UX): `rename` now accepts `--from`/`--to` as aliases of `--name`/`--new-name`. Non-breaking, additive.
- JSON-RELATED-INPUT-ECHO (UX): `related --json` response now includes `name` and `max_hops` echo fields for input transparency. Non-breaking, additive.
Changed
- GRAPH-NODE-KIND-DEPRECATED: `graph --format json` still emits both `kind` and `type` fields per node, but `kind` is now formally documented as deprecated (kept for pre-v1.0.35 backward compat). New consumers MUST read `type`. The duplicate field will be removed in a future major release.
Documentation
- PRAGMA-USER-VERSION-49: Added doc comment in `src/constants.rs` explaining why `SCHEMA_USER_VERSION = 49` (project signature for external diagnostic tools) versus `CURRENT_SCHEMA_VERSION = 9` (application-level migration count). They are intentionally different and serve distinct purposes.
- README: Expanded the Memory content lifecycle table with `--body-file`/`--body-stdin`/`--entities-file`/`--relationships-file`/`--graph-stdin` flags for `remember`, the new aliases for `recall`/`rename`, and a callout about kebab-case ASCII memory name validation. Added explicit rows for `ingest` and `cache clear-models`.
Notes
- Audit findings #4 (structured truncation flags in JSON output) and #6 (progress/ETA in ingest summary) are deferred to v1.0.36 because they require schema design beyond a patch release. Truncation is currently surfaced via `tracing::warn!` only; pipeline consumers should monitor stderr.
- All 427 lib tests pass. Regression test `wal_auto_init_regression.rs` added (uses `assert_cmd` + `tempfile`, same pattern as existing integration tests).
v1.0.34
[1.0.34] - 2026-04-30
Added
- JS7 (LOW): `vacuum --json` response now includes `reclaimed_bytes: u64` derived field, computed as `size_before_bytes.saturating_sub(size_after_bytes)`. Callers no longer need to compute the delta themselves. Schema in `src/commands/vacuum.rs:32-41`. Existing fields `size_before_bytes` and `size_after_bytes` preserved unchanged.
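A minimal sketch of the derived field follows. The struct name `VacuumResponse` and its constructor are assumptions, and the serde derive mentioned in the release notes is omitted so the block needs no external crates. Only the three field names and the `saturating_sub` computation come from the entry above.

```rust
/// Hypothetical reduction of the `vacuum --json` payload to its size fields.
#[derive(Debug, PartialEq, Eq)]
struct VacuumResponse {
    size_before_bytes: u64,
    size_after_bytes: u64,
    /// Derived: never negative, even if the file grew during the operation.
    reclaimed_bytes: u64,
}

impl VacuumResponse {
    fn new(size_before_bytes: u64, size_after_bytes: u64) -> Self {
        Self {
            size_before_bytes,
            size_after_bytes,
            // `saturating_sub` clamps at zero instead of wrapping or panicking.
            reclaimed_bytes: size_before_bytes.saturating_sub(size_after_bytes),
        }
    }
}
```

With unsigned sizes, a plain subtraction would panic in debug builds or wrap in release builds whenever the file is larger afterwards. The saturating form reports `0` in that case.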
Documentation
- PRD-sync (LOW): Updated `docs_rules/prd.md` (excluded from published crate via `Cargo.toml exclude`) to reflect schema reality after V008 (v1.0.25) and V009 (v1.0.30) migrations:
  - MemoryType enum: 7 → 9 (added `document`, `note` per V009 CHECK constraint and `MemoryType` enum in `src/cli.rs`).
  - EntityType enum: 10 → 13 (added `organization`, `location`, `date` per V008 CHECK constraint and BERT NER types).
Notes
- Audit dimension `unwrap/expect` reaffirmed clean by `audit-team-v1033/diagnostician`: ZERO production unwraps; 12 production expects all carry English-language documented invariants (regex literal compilation, BERT NER no-NaN logits, OnceLock just-set get, const compile-time invariants); all fall under CLAUDE.md's "casos impossíveis" ("impossible cases") exception.
- Unsafe blocks audit reaffirmed clean: all ~14 `unsafe { }` blocks across `main.rs` (4×), `embedder.rs` (1×), `storage/connection.rs` (1×), `commands/optimize.rs` (2×), and `paths.rs` (6× tests) carry SAFETY comments. The earlier finding flagging missing SAFETY comments was a false positive (the comments precede the `unsafe` keyword, outside `-B3` grep context).
- Bumped patch (1.0.33 → 1.0.34) because the new `reclaimed_bytes` field is purely additive (`#[derive(Serialize)]` adds the key) and PRD changes are doc-only (file is in `Cargo.toml exclude`). No API removed; no behavior changed.
[1.0.32] - 2026-04-30
Fixed (Critical — Audit findings from v1.0.31)
- C1 (CRITICAL): Auto-init unified across all CRUD handlers via new `ensure_db_ready` helper in `src/storage/connection.rs`. Previously `remember` silently auto-created the DB while `recall`, `list`, etc. returned `NotFound`, breaking the implicit "if it works for one, it works for all" contract. Now every CRUD subcommand creates the database on first use with a single `tracing::info!("creating database (auto-init) at <path> schema_version=9")` log entry. Resolves the 23 inconsistent `paths.db.exists()` checks across `forget`, `related`, `optimize`, `edit`, `health`, `hybrid_search`, `cleanup_orphans`, `rename`, `recall`, `read`, `vacuum`, `graph_export` (×4), `purge`, `list`, `history`, `unlink`, `link`, `stats`, `sync_safe_copy`, `debug_schema`.
- C2 (CRITICAL): Documented the deliberate orphan-daemon detach in `src/daemon.rs:487`. The `Child` handle is now intentionally dropped with a `// SAFETY:` comment explaining lifecycle ownership via spawn lock + ready file + idle-timeout shutdown, plus a `tracing::debug!` log capturing the daemon PID. `Stdio::null()` already covered the I/O detach.
- C3 (CRITICAL): New integration test `tests/readme_examples_executable.rs` parses every `bash` fenced block from `README.md` and `README.pt-BR.md` at compile time and executes each `sqlite-graphrag` invocation against a real binary in an isolated `TempDir`. Blocks containing pipes/redirects or marked `<!-- skip-test -->` are skipped. 22 commands per README are now CI-validated, eliminating the drift uncovered in v1.0.31 (8+ broken examples: `--query` vs positional `<QUERY>`, `--top-k` vs `-k`, `--dir` vs positional `<DIR>`, etc.).
Fixed (High)
- A1 (HIGH): Translated 8 Portuguese runtime strings to English in `src/lock.rs:36`, `src/daemon.rs:113,131,154,307,419` (including the `daemon.rs:307` IPC payload that leaked PT into JSON `message` fields). Added `Message::EmptyQueryValidation` and `Message::EmptyBodyValidation` (as `validation::empty_query()`/`validation::empty_body()`) in `src/i18n.rs` so user-visible validation messages remain bilingual; internal errors are EN-only. Audit gate `rg '[áéíóúâêôãõç]' src/ -g '!i18n.rs'` now returns ZERO matches.
- A2 (HIGH): Refactored `src/commands/ingest.rs` from per-file fork-spawn (`Command::new(current_exe).args(["remember", ...]).output()`) to in-process pipeline. Loads the embedder once and reuses it across all files via `crate::daemon::embed_passage_or_local`. Measured speedup: 50 files in 21 seconds vs ~14 minutes previously (≈40× faster, well under the 60s target). Per-file NDJSON event schema unchanged (`{file, name, status, memory_id, action}`).
- A3 (HIGH): Replaced `.expect("OnceLock populated by set() above")` in `src/embedder.rs:56` with `.ok_or_else(|| AppError::Embedding(...))?` propagating a real error variant. Eliminates the only remaining production `.expect()` outside documented invariants.
- A4 (HIGH): Added `#[command(after_long_help = "EXAMPLES: ...")]` with 2-4 realistic invocations to 21 subcommands previously missing it (`init`, `daemon`, `read`, `list`, `forget`, `purge`, `rename`, `edit`, `history`, `restore`, `health`, `migrate`, `namespace-detect`, `optimize`, `stats`, `sync-safe-copy`, `vacuum`, `related`, `cleanup-orphans`, `cache`, `__debug_schema`, plus enrichment of `hybrid-search`/`ingest`).
- A5 (HIGH): Auto-migrate transparency. `ensure_db_ready` now compares `PRAGMA user_version` against `SCHEMA_USER_VERSION` and runs the remaining migrations automatically when an older DB (e.g. v1.0.27 schema 7) is opened by a newer binary. Logs `tracing::warn!(from, to, path, "auto-migrating database schema")` so operators are not surprised. Eliminates the silent failure mode where stale DBs caused indeterminate runtime errors.
- A6 (HIGH): Renamed 23 Portuguese identifiers to English across `tests/property_based.rs`, `tests/i18n_bilingual_integration.rs`, `tests/integration.rs`, `tests/vacuum_integration.rs`, `tests/exit_codes_integration.rs`, `tests/regression_v2_0_4.rs`, `tests/schema_contract_strict.rs`, `src/errors.rs`, `src/commands/health.rs`. Plus residual PT comments and assert messages in `src/storage/entities.rs`, `src/commands/remember.rs`, `src/chunking.rs`, `src/graph.rs`, `src/embedder.rs`, `src/output.rs`, `src/tz.rs`, `src/memory_guard.rs`, `src/daemon.rs`, `src/lock.rs` translated to English.
Fixed (Medium)
- M1 (MEDIUM): `recall -k` and `hybrid-search -k` now use `value_parser = parse_k_range` validating the inclusive range `1..=4096` (matches `sqlite-vec`'s knn limit) at parse time. Out-of-range values surface a clean Clap error instead of leaking the engine's `"k value in knn query too large"` message. Added unit tests in `src/parsers/mod.rs`.
- M2 (MEDIUM): `purge` UX clarified. Added alias `--max-age-days` for the existing `--retention-days`. When `purged_count == 0`, the JSON response now includes a `message` field (`"no soft-de...
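A parse-time range check like the one M1 describes could look as follows. This is a sketch only: the changelog names the function `parse_k_range` and the range `1..=4096`, while the `usize` return type, the `String` error type, and the message wording are assumptions. A function of the shape `fn(&str) -> Result<T, E>` is the form Clap accepts as a `value_parser`.

```rust
/// Inclusive bounds stated in the changelog entry.
const K_MIN: usize = 1;
const K_MAX: usize = 4096;

/// Validates `-k` while arguments are parsed, before any query reaches the engine.
/// The error wording here is illustrative, not the tool's actual message.
fn parse_k_range(raw: &str) -> Result<usize, String> {
    let k: usize = raw
        .trim()
        .parse()
        .map_err(|err| format!("`{}` is not a valid integer: {}", raw, err))?;
    if (K_MIN..=K_MAX).contains(&k) {
        Ok(k)
    } else {
        Err(format!("k must be between {} and {}, got {}", K_MIN, K_MAX, k))
    }
}
```

Rejecting the value at the CLI boundary means the user sees one consistent usage error rather than a message from deeper in the query path.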
v1.0.33
[1.0.33] - 2026-04-30
Fixed (Linguistic Policy)
- C3-residual (HIGH): Translated remaining Portuguese string in `src/daemon.rs:183` (Drop impl `tracing::debug!` for spawn lock removal). v1.0.32 A1 covered lines 113/131/154/307/419 but missed line 183 inside `impl Drop for DaemonSpawnGuard`. Audit gate `rg '[áéíóúâêôãõç]' src/ -g '!i18n.rs'` now returns ZERO matches.
- PT-V007 (HIGH): Translated 5-line Portuguese SQL header comment in `migrations/V007__memory_urls.sql` to English. The file is part of the published crate (not in `Cargo.toml exclude`), so docs.rs and crates.io tarball previously shipped Portuguese SQL comments.
- AS-PT (MEDIUM): Translated 20 Portuguese `assert!` messages to English across `src/commands/hybrid_search.rs` (19 occurrences) and `src/commands/list.rs` (1 occurrence). All `mem-* deveria existir` assertion messages in `src/storage/memories.rs` (9 occurrences) translated to `mem-* should exist`. Per CLAUDE.md, "NUNCA `assert!` com mensagem em português" ("NEVER `assert!` with a message in Portuguese"); even test code is EN-only.
Fixed (Documentation)
- D3 (MEDIUM): Synchronized `--type` doc-comment in `src/commands/recall.rs:33`, `src/commands/list.rs:30`, `src/commands/hybrid_search.rs:35` to list all 13 graph entity types (`project`/`tool`/`person`/`file`/`concept`/`incident`/`decision`/`memory`/`dashboard`/`issue_tracker`/`organization`/`location`/`date`). Previously listed only 10, omitting `organization`/`location`/`date` added by `migrations/V008__expand_entity_types.sql` (BERT NER types). Aligns CLI help with PRD `docs_rules/prd.md` and the V008 CHECK constraint.
Notes
- Validated against real-world ingest of 50 representative `.md` files (~6.6 MB corpus): 50/50 indexed in 56.9s with `--skip-extraction`; 5/5 indexed with full BERT NER extraction in 57.3s. All 12 functional CLI scenarios (init, ingest, recall, hybrid-search, list, related, graph, health, stats, lifecycle, vacuum, sync-safe-copy) returned exit 0 with valid JSON. Auto-create of `graphrag.sqlite` in CWD (without prior `init`) confirmed working with mode 0600.
- Backwards-compatible duplicate fields in `stats --json` (`memories`/`memories_total`, `entities`/`entities_total`, `relationships`/`relationships_total`, `db_size_bytes`/`db_bytes`, `edges`/`relationships`) and `list --json` (`id`/`memory_id`) are intentional per existing test assertions in `src/commands/stats.rs:244-248` and `src/commands/list.rs:190`. They are deliberately preserved for backwards compatibility with existing JSON parsers.
- `schema_version` type asymmetry between `stats --json` (String) and `health --json` (u32) is documented as a known issue. Normalization to `u32` everywhere would be a breaking change deferred to v2.0.
- `kill_on_drop(true)` for the daemon child process remains N/A (the orphan detach is deliberate, documented in `src/daemon.rs:491-499` and v1.0.32 M4 / C2). The CLI must return immediately while the daemon stays warm.
[1.0.32] - 2026-04-30
Fixed (Critical — Audit findings from v1.0.31)
- C1 (CRITICAL): Auto-init unified across all CRUD handlers via new `ensure_db_ready` helper in `src/storage/connection.rs`. Previously `remember` silently auto-created the DB while `recall`, `list`, etc. returned `NotFound`, breaking the implicit "if it works for one, it works for all" contract. Now every CRUD subcommand creates the database on first use with a single `tracing::info!("creating database (auto-init) at <path> schema_version=9")` log entry. Resolves the 23 inconsistent `paths.db.exists()` checks across `forget`, `related`, `optimize`, `edit`, `health`, `hybrid_search`, `cleanup_orphans`, `rename`, `recall`, `read`, `vacuum`, `graph_export` (×4), `purge`, `list`, `history`, `unlink`, `link`, `stats`, `sync_safe_copy`, `debug_schema`.
- C2 (CRITICAL): Documented the deliberate orphan-daemon detach in `src/daemon.rs:487`. The `Child` handle is now intentionally dropped with a `// SAFETY:` comment explaining lifecycle ownership via spawn lock + ready file + idle-timeout shutdown, plus a `tracing::debug!` log capturing the daemon PID. `Stdio::null()` already covered the I/O detach.
- C3 (CRITICAL): New integration test `tests/readme_examples_executable.rs` parses every `bash` fenced block from `README.md` and `README.pt-BR.md` at compile time and executes each `sqlite-graphrag` invocation against a real binary in an isolated `TempDir`. Blocks containing pipes/redirects or marked `<!-- skip-test -->` are skipped. 22 commands per README are now CI-validated, eliminating the drift uncovered in v1.0.31 (8+ broken examples: `--query` vs positional `<QUERY>`, `--top-k` vs `-k`, `--dir` vs positional `<DIR>`, etc.).
Fixed (High)
- A1 (HIGH): Translated 8 Portuguese runtime strings to English in
src/lock.rs:36,src/daemon.rs:113,131,154,307,419(including thedaemon.rs:307IPC payload that leaked PT into JSONmessagefields). AddedMessage::EmptyQueryValidationandMessage::EmptyBodyValidation(asvalidation::empty_query()/validation::empty_body()) insrc/i18n.rsso user-visible validation messages remain bilingual; internal errors are EN-only. Audit gaterg '[áéíóúâêôãõç]' src/ -g '!i18n.rs'now returns ZERO matches. - A2 (HIGH): Refactored
src/commands/ingest.rsfrom per-file fork-spawn (Command::new(current_exe).args(["remember", ...]).output()) to in-process pipeline. Loads the embedder once and reuses it across all files viacrate::daemon::embed_passage_or_local. Measured speedup: 50 files in 21 seconds vs ~14 minutes previously (≈40× faster, well under the 60s target). Per-file NDJSON event schema unchanged ({file, name, status, memory_id, action}). - A3 (HIGH): Replaced
.expect("OnceLock populated by set() above")insrc/embedder.rs:56with.ok_or_else(|| AppError::Embedding(...))?propagating a real error variant. Eliminates the only remaining production.expect()outside documented invariants. - A4 (HIGH): Added
#[command(after_long_help = "EXAMPLES: ...")]with 2-4 realistic invocations to 21 subcommands previously missing it (init,daemon,read,list,forget,purge,rename,edit,history,restore,health,migrate,namespace-detect,optimize,stats,sync-safe-copy,vacuum,related,cleanup-orphans,cache,__debug_schema, plus enrichment ofhybrid-search/ingest). - A5 (HIGH): Auto-migrate transparency.
ensure_db_readynow comparesPRAGMA user_versionagainstSCHEMA_USER_VERSIONand runs the remaining migrations automatically when an older DB (e.g. v1.0.27 schema 7) is opened by a newer binary. Logstracing::warn!(from, to, path, "auto-migrating database schema")so operators are not surprised. Eliminates the silent failure mode where stale DBs caused indeterminate runtime errors. - A6 (HIGH): Renamed 23 Portuguese identifiers to English across
tests/property_based.rs,tests/i18n_bilingual_integration.rs,tests/integration.rs,tests/vacuum_integration.rs,tests/exit_codes_integration.rs,tests/regression_v2_0_4.rs,tests/schema_contract_strict.rs,src/errors.rs,src/commands/health.rs. Plus residual PT comments and assert messages insrc/storage/entities.rs,src/commands/remember.rs,src/chunking.rs,src/graph.rs,src/embedder.rs,src/output.rs,src/tz.rs,src/memory_guard.rs,src/daemon.rs,src/lock.rstranslated to English.
Fixed (Medium)
- M1 (MEDIUM):
recall -kandhybrid-search -know usevalue_parser = parse_k_rangevalidating the inclusive range1..=4096(matchessqlite-vec's knn limit) at parse time. Out-of-range values surface a clean Clap error instead of leaking the engine's"k value in knn query too large"message. Added unit tests insrc/parsers/mod.rs. - M2 (MEDIUM):
purgeUX clarified. Added alias--max-age-daysfor the existing--retention-days. Whenpurged_count == 0, the JSON response now includes amessagefield ("no soft-deleted memories older than {N} day(s); use --retention-days 0 to purge all soft-deleted memories regardless of age"). Help text on--yesrewritten to clarify it confirms intent but does NOT override--retention-days. - M3 (MEDIUM): Added
#[arg(help = "...")]to 9 positional arguments previously bare in--helpoutput:recall <QUERY>,hybrid-search <QUERY>,ingest <DIR>,read <NAME>,forget <NAME>,rename <NAME>,edit <NAME>,history <NAME>,related <NAME>. - M4 (MEDIUM): Verified
daemon --stopalready exists (dispatches tocrate::daemon::try_shutdown) and that the autostart spawn path usesstd::process::Commandwith intentional orphan detach (documented under C2).tokio::process::Commandkill_on_drop(true)was N/A — code path uses std spawn — so no change needed; the C2 safety comment now explains the design rationale. - M5 (MEDIUM): Audit finding "duplicate v1.0.29 entries with date 2026-04-29" was a false positive (v1.0.29 and v1.0.30 are distinct entries that legitimately share
2026-04-29as their release date). No CHANGELOG change required.
Fixed (Low)
- B_1 (LOW): README structure (split
README.md+README.pt-BR.md) preserved; the bilingual policy is documented elsewhere. ADR not required since the split is a deliberate product decision predating the audit. - B_2 (LOW): Added GitHub Actions CI badge (
[](...)) to bothREADME.mdandREADME.pt-BR.md. Final badge order: crates.io → docs.rs → CI → license → Contributor Covenant. - B_3 (LOW): Added bash example blocks for 16 subcommands previously without one in either README:
daemon,ingest,rename,edit,restore,migrate,namespace-detect,optimize,vacuum,link,unlink,related, `graph...
v1.0.32
[1.0.32] - 2026-04-30
Fixed (Critical — Audit findings from v1.0.31)
- C1 (CRITICAL): Auto-init unified across all CRUD handlers via new `ensure_db_ready` helper in `src/storage/connection.rs`. Previously `remember` silently auto-created the DB while `recall`, `list`, etc. returned `NotFound`, breaking the implicit "if it works for one, it works for all" contract. Now every CRUD subcommand creates the database on first use with a single `tracing::info!("creating database (auto-init) at <path> schema_version=9")` log entry. Resolves the 23 inconsistent `paths.db.exists()` checks across `forget`, `related`, `optimize`, `edit`, `health`, `hybrid_search`, `cleanup_orphans`, `rename`, `recall`, `read`, `vacuum`, `graph_export` (×4), `purge`, `list`, `history`, `unlink`, `link`, `stats`, `sync_safe_copy`, `debug_schema`.
- C2 (CRITICAL): Documented the deliberate orphan-daemon detach in `src/daemon.rs:487`. The `Child` handle is now intentionally dropped with a `// SAFETY:` comment explaining lifecycle ownership via spawn lock + ready file + idle-timeout shutdown, plus a `tracing::debug!` log capturing the daemon PID. `Stdio::null()` already covered the I/O detach.
- C3 (CRITICAL): New integration test `tests/readme_examples_executable.rs` parses every `bash` fenced block from `README.md` and `README.pt-BR.md` at compile time and executes each `sqlite-graphrag` invocation against a real binary in an isolated `TempDir`. Blocks containing pipes/redirects or marked `<!-- skip-test -->` are skipped. 22 commands per README are now CI-validated, eliminating the drift uncovered in v1.0.31 (8+ broken examples: `--query` vs positional `<QUERY>`, `--top-k` vs `-k`, `--dir` vs positional `<DIR>`, etc.).
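The block-selection rule described in C3 can be sketched roughly as follows. This is a minimal illustration, not the code in `tests/readme_examples_executable.rs`: the function name `extract_commands`, the "marker directly precedes the block" interpretation, and the use of `|` and `>` as the pipe/redirect test are all assumptions.

```rust
/// Hypothetical helper: collects `sqlite-graphrag` invocations from bash
/// fenced blocks in a README. A block is dropped entirely when it contains
/// a pipe or redirect, or when a skip marker precedes it.
pub fn extract_commands(markdown: &str) -> Vec<String> {
    // Build the fence strings at runtime so this example can itself live
    // inside a fenced block.
    let fence = "`".repeat(3);
    let bash_open = format!("{fence}bash");

    let mut commands = Vec::new();
    let mut current: Vec<String> = Vec::new(); // lines of the open block
    let mut in_block = false;
    let mut skip_block = false;
    let mut skip_marker_seen = false;

    for line in markdown.lines() {
        let trimmed = line.trim();
        if !in_block {
            if trimmed == "<!-- skip-test -->" {
                skip_marker_seen = true;
            } else if trimmed == bash_open.as_str() {
                in_block = true;
                skip_block = skip_marker_seen;
                skip_marker_seen = false;
                current.clear();
            } else if !trimmed.is_empty() {
                // Assumption: the marker only applies to the very next block.
                skip_marker_seen = false;
            }
        } else if trimmed == fence.as_str() {
            in_block = false;
            let has_shell_operators = current
                .iter()
                .any(|l| l.contains('|') || l.contains('>'));
            if !skip_block && !has_shell_operators {
                commands.extend(
                    current
                        .drain(..)
                        .filter(|l| l.starts_with("sqlite-graphrag")),
                );
            }
        } else {
            current.push(trimmed.to_string());
        }
    }
    commands
}
```

Each returned string would then be split into arguments and run against the built binary in a temporary directory; that execution step is omitted here.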
Fixed (High)
- A1 (HIGH): Translated 8 Portuguese runtime strings to English in `src/lock.rs:36`, `src/daemon.rs:113,131,154,307,419` (including the `daemon.rs:307` IPC payload that leaked PT into JSON `message` fields). Added `Message::EmptyQueryValidation` and `Message::EmptyBodyValidation` (as `validation::empty_query()` / `validation::empty_body()`) in `src/i18n.rs` so user-visible validation messages remain bilingual; internal errors are EN-only. Audit gate `rg '[áéíóúâêôãõç]' src/ -g '!i18n.rs'` now returns ZERO matches.
- A2 (HIGH): Refactored `src/commands/ingest.rs` from per-file fork-spawn (`Command::new(current_exe).args(["remember", ...]).output()`) to in-process pipeline. Loads the embedder once and reuses it across all files via `crate::daemon::embed_passage_or_local`. Measured speedup: 50 files in 21 seconds vs ~14 minutes previously (≈40× faster, well under the 60s target). Per-file NDJSON event schema unchanged (`{file, name, status, memory_id, action}`).
- A3 (HIGH): Replaced `.expect("OnceLock populated by set() above")` in `src/embedder.rs:56` with `.ok_or_else(|| AppError::Embedding(...))?` propagating a real error variant. Eliminates the only remaining production `.expect()` outside documented invariants.
- A4 (HIGH): Added `#[command(after_long_help = "EXAMPLES: ...")]` with 2-4 realistic invocations to 21 subcommands previously missing it (`init`, `daemon`, `read`, `list`, `forget`, `purge`, `rename`, `edit`, `history`, `restore`, `health`, `migrate`, `namespace-detect`, `optimize`, `stats`, `sync-safe-copy`, `vacuum`, `related`, `cleanup-orphans`, `cache`, `__debug_schema`, plus enrichment of `hybrid-search`/`ingest`).
- A5 (HIGH): Auto-migrate transparency. `ensure_db_ready` now compares `PRAGMA user_version` against `SCHEMA_USER_VERSION` and runs the remaining migrations automatically when an older DB (e.g. v1.0.27 schema 7) is opened by a newer binary. Logs `tracing::warn!(from, to, path, "auto-migrating database schema")` so operators are not surprised. Eliminates the silent failure mode where stale DBs caused indeterminate runtime errors.
- A6 (HIGH): Renamed 23 Portuguese identifiers to English across `tests/property_based.rs`, `tests/i18n_bilingual_integration.rs`, `tests/integration.rs`, `tests/vacuum_integration.rs`, `tests/exit_codes_integration.rs`, `tests/regression_v2_0_4.rs`, `tests/schema_contract_strict.rs`, `src/errors.rs`, `src/commands/health.rs`. Plus residual PT comments and assert messages in `src/storage/entities.rs`, `src/commands/remember.rs`, `src/chunking.rs`, `src/graph.rs`, `src/embedder.rs`, `src/output.rs`, `src/tz.rs`, `src/memory_guard.rs`, `src/daemon.rs`, `src/lock.rs` translated to English.
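The decision that C1 and A5 describe for `ensure_db_ready` (create on first use, migrate when stale, otherwise do nothing) can be sketched as a pure function. The enum `DbAction`, the function `plan_db_action`, and the error returned for a database newer than the binary are illustrative assumptions; only the constant name `SCHEMA_USER_VERSION` and the value 9 come from the entries above.

```rust
/// Stand-in for the crate's `SCHEMA_USER_VERSION`; 9 matches the
/// `schema_version=9` shown in the auto-init log line.
pub const SCHEMA_USER_VERSION: u32 = 9;

/// Hypothetical outcome of inspecting a database before a CRUD command.
#[derive(Debug, PartialEq)]
pub enum DbAction {
    /// No database file yet: create it at the current schema (auto-init).
    Create,
    /// Older schema on disk: run the remaining migrations.
    Migrate { from: u32, to: u32 },
    /// Schema already matches the binary.
    UpToDate,
}

/// `user_version` is the value read via `PRAGMA user_version`.
pub fn plan_db_action(db_exists: bool, user_version: u32) -> Result<DbAction, String> {
    use std::cmp::Ordering;

    if !db_exists {
        return Ok(DbAction::Create);
    }
    match user_version.cmp(&SCHEMA_USER_VERSION) {
        Ordering::Less => Ok(DbAction::Migrate {
            from: user_version,
            to: SCHEMA_USER_VERSION,
        }),
        Ordering::Equal => Ok(DbAction::UpToDate),
        // Assumption: a database written by a newer binary is rejected.
        Ordering::Greater => Err(format!(
            "database schema {user_version} is newer than supported {SCHEMA_USER_VERSION}"
        )),
    }
}
```

Keeping the decision separate from the I/O makes the `Migrate { from, to }` case easy to log (as the `tracing::warn!` in A5 does) and easy to unit-test without a real database.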
Fixed (Medium)
- M1 (MEDIUM): `recall -k` and `hybrid-search -k` now use `value_parser = parse_k_range` validating the inclusive range `1..=4096` (matches `sqlite-vec`'s knn limit) at parse time. Out-of-range values surface a clean Clap error instead of leaking the engine's `"k value in knn query too large"` message. Added unit tests in `src/parsers/mod.rs`.
- M2 (MEDIUM): `purge` UX clarified. Added alias `--max-age-days` for the existing `--retention-days`. When `purged_count == 0`, the JSON response now includes a `message` field (`"no soft-deleted memories older than {N} day(s); use --retention-days 0 to purge all soft-deleted memories regardless of age"`). Help text on `--yes` rewritten to clarify it confirms intent but does NOT override `--retention-days`.
- M3 (MEDIUM): Added `#[arg(help = "...")]` to 9 positional arguments previously bare in `--help` output: `recall <QUERY>`, `hybrid-search <QUERY>`, `ingest <DIR>`, `read <NAME>`, `forget <NAME>`, `rename <NAME>`, `edit <NAME>`, `history <NAME>`, `related <NAME>`.
- M4 (MEDIUM): Verified `daemon --stop` already exists (dispatches to `crate::daemon::try_shutdown`) and that the autostart spawn path uses `std::process::Command` with intentional orphan detach (documented under C2). `tokio::process::Command` `kill_on_drop(true)` was N/A (the code path uses std spawn), so no change was needed; the C2 safety comment now explains the design rationale.
- M5 (MEDIUM): Audit finding "duplicate v1.0.29 entries with date 2026-04-29" was a false positive (v1.0.29 and v1.0.30 are distinct entries that legitimately share `2026-04-29` as their release date). No CHANGELOG change required.
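A parse-time range check like the one M1 describes can be sketched as below. The real `parse_k_range` in `src/parsers/mod.rs` is not shown in this changelog, so the signature, return type, and error wording here are assumptions; only the name and the `1..=4096` range come from the entry.

```rust
use std::ops::RangeInclusive;

/// Inclusive bounds taken from the M1 entry above.
const K_RANGE: RangeInclusive<usize> = 1..=4096;

/// Sketch of a value parser for `-k`: rejects non-integers and values
/// outside `K_RANGE` before any query reaches the vector engine.
pub fn parse_k_range(raw: &str) -> Result<usize, String> {
    let k: usize = raw
        .trim()
        .parse()
        .map_err(|_| format!("`{raw}` is not a non-negative integer"))?;
    if K_RANGE.contains(&k) {
        Ok(k)
    } else {
        Err(format!(
            "k must be between {} and {}, got {k}",
            K_RANGE.start(),
            K_RANGE.end()
        ))
    }
}
```

A function with this shape (`&str` in, `Result<T, E>` out) is the general form that Clap accepts as a `value_parser`, which is what lets the error surface as a normal argument error.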
Fixed (Low)
- B_1 (LOW): README structure (split `README.md` + `README.pt-BR.md`) preserved; the bilingual policy is documented elsewhere. ADR not required since the split is a deliberate product decision predating the audit.
- B_2 (LOW): Added GitHub Actions CI badge (`[](...)`) to both `README.md` and `README.pt-BR.md`. Final badge order: crates.io → docs.rs → CI → license → Contributor Covenant.
- B_3 (LOW): Added bash example blocks for 16 subcommands previously without one in either README: `daemon`, `ingest`, `rename`, `edit`, `restore`, `migrate`, `namespace-detect`, `optimize`, `vacuum`, `link`, `unlink`, `related`, `graph` (with `stats`/`traverse`/`entities` subcommands), `cleanup-orphans`, `cache`, `history`. All 16 examples are validated by the new `tests/readme_examples_executable.rs`.
- B_4 (LOW): `remember` JSON output now includes `name_was_normalized: bool` and `original_name: Option<String>` (the latter elided via `#[serde(skip_serializing_if = "Option::is_none")]` when normalization was a no-op). Closes the UX gap where users passing `--name "Hello World"` saw only `"name": "hello-world"` with no indication that normalization had happened.
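The relationship between the two new fields in B_4 can be sketched without any serialization crate. The struct `NameReport` and function `report_name` are hypothetical names; the actual response type in `remember` is not shown here.

```rust
/// Hypothetical view of the name-related fields in the `remember` response.
#[derive(Debug, PartialEq)]
pub struct NameReport {
    pub name: String,
    pub name_was_normalized: bool,
    /// `None` when normalization was a no-op; the changelog says the real
    /// field is then omitted from the JSON output.
    pub original_name: Option<String>,
}

/// Builds the report from the user-supplied name and its normalized form.
pub fn report_name(original: &str, normalized: &str) -> NameReport {
    let changed = original != normalized;
    NameReport {
        name: normalized.to_string(),
        name_was_normalized: changed,
        original_name: changed.then(|| original.to_string()),
    }
}
```

Deriving both fields from a single `changed` flag keeps them from disagreeing: `original_name` is present exactly when `name_was_normalized` is true.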
Added
- `tests/readme_examples_executable.rs`: 442-line integration test (8 unit + 2 integration tests) validating every README bash example.
- `parse_k_range` value parser in `src/parsers/mod.rs` with full unit-test coverage of edge cases (zero, above-limit, non-integer, negative).
- `validation::empty_query()` and `validation::empty_body()` bilingual messages in `src/i18n.rs`.
- `ensure_db_ready(&AppPaths)` helper in `src/storage/connection.rs` (also makes `register_vec_extension` idempotent via `OnceLock`).
- `insert_default_schema_meta` helper extracted to ensure auto-init populates `schema_version`, `model`, `dim`, `created_at`, `sqlite-graphrag_version` consistently with explicit `init`.
Changed
- `src/commands/ingest.rs` grew from 565 to ~959 lines as the in-process pipeline replicates `remember::run`'s validation + chunking + embedding + persistence transaction. The previous version offloaded that work to a child process per file.
- `register_vec_extension` is now idempotent (guarded by `OnceLock`); safe to invoke from both `main.rs` and library helpers (unblocks unit tests touching CRUD handlers).
- Optimize test `optimize_returns_not_found_when_db_missing` renamed to `optimize_auto_inits_when_db_missing` and inverted to assert success (the new auto-init contract).
- CI-aligned `clippy::uninlined_format_args` cleanup on the new `ensure_db_ready` log line.
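Guarding a one-time registration with `OnceLock`, as the `register_vec_extension` entry describes, generally looks like the sketch below. The counter `REGISTRATIONS` and the stub `register_with_sqlite` are hypothetical stand-ins for the real extension call, added only so the idempotence is observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Set once the (stubbed) registration has run.
static VEC_EXTENSION: OnceLock<()> = OnceLock::new();

/// Hypothetical counter standing in for the side effect of registering.
pub static REGISTRATIONS: AtomicUsize = AtomicUsize::new(0);

/// Stub for the real call that registers the vector extension.
fn register_with_sqlite() {
    REGISTRATIONS.fetch_add(1, Ordering::SeqCst);
}

/// Safe to call from `main` and from library helpers alike: the closure
/// passed to `get_or_init` runs at most once per process.
pub fn register_vec_extension() {
    VEC_EXTENSION.get_or_init(register_with_sqlite);
}
```

`OnceLock::get_or_init` also blocks concurrent callers until the first initialization finishes, so no caller can observe a half-registered state.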
Notes
- Validation pipeline summary: `cargo fmt --check` ✓, `cargo clippy -- -D warnings` ✓, `cargo test --lib` 427/427 ✓, `cargo doc --no-deps` ✓ zero warnings, `cargo audit` ✓ (2 pre-allowed advisories per `deny.toml`), `cargo deny check advisories licenses bans sources` ✓.
- Language gate audit: `rg '[áéíóúâêôãõç]' src/ -g '!i18n.rs'` returns ZERO matches.
- Performance baseline: 50 files ingest in 21s wall-clock (≈40× faster than v1.0.31).
v1.0.31
[1.0.31] - 2026-04-30
Fixed
- A2 (P1-CRITICAL): `ingest` subcommand now emits proper NDJSON (one JSON object per line). Previously emitted pretty-printed multiline JSON, breaking line-by-line consumers. Switched 5 calls in `src/commands/ingest.rs` from `output::emit_json` to `output::emit_json_compact`.
- A3 (P1-MEDIUM): `stats --json` now reports correct `schema_version` value (e.g., "9") read from `refinery_schema_history` table. Previously returned "unknown" because empty `schema_meta` table was queried.
- A4 (P1-MEDIUM): `forget` command now populates `action` and `deleted_at` fields in JSON output. Three explicit states: `soft_deleted`, `already_deleted`, `not_found`. Race-safe via re-SELECT after soft-delete.
- A1 (P0-CRITICAL): Extraction pipeline no longer hangs on documents larger than ~50 KB. Added `EXTRACTION_MAX_TOKENS=5000` cap (env override `SQLITE_GRAPHRAG_EXTRACTION_MAX_TOKENS`). Body exceeding cap is truncated for NER but full body still goes through regex. Empirical impact: 68 KB document went from >5 minutes to ~37 seconds (88% reduction) while preserving `extraction_method=bert+regex-batch`.
- A9 (P2-MEDIUM): Relationship fan-out reduced: only entities co-occurring in the same sentence/paragraph now generate edges; previously the pipeline generated C(N,2) "mentions" edges between all entities in a memory.
- A10 (P2-MEDIUM): Name truncation at 60 chars now logs `tracing::warn` and handles collisions with numeric suffix (-1, -2, ...).
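The truncate-then-disambiguate behaviour in A10 can be sketched as follows. The function name `unique_truncated_name`, the `HashSet` of taken names, and the choice to append the suffix after the 60-character base (so the result can exceed 60 characters) are assumptions; the changelog only states the 60-char cap and the `-1`, `-2`, ... suffixes.

```rust
use std::collections::HashSet;

/// Cap taken from the A10 entry above.
pub const MAX_NAME_LEN: usize = 60;

/// Truncates `name` to `MAX_NAME_LEN` characters and, if that result is
/// already taken, appends the first free numeric suffix (-1, -2, ...).
pub fn unique_truncated_name(name: &str, taken: &HashSet<String>) -> String {
    // Count characters rather than bytes so multi-byte text is not split.
    let base: String = name.chars().take(MAX_NAME_LEN).collect();
    if !taken.contains(&base) {
        return base;
    }
    let mut n = 1u32;
    loop {
        let candidate = format!("{base}-{n}");
        if !taken.contains(&candidate) {
            return candidate;
        }
        n += 1;
    }
}
```

In a bulk ingest, two long file names that share their first 60 characters would otherwise map to the same memory name; the suffix keeps both.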
Added
- A6: New integration test suite `tests/ingest_integration.rs` covering NDJSON contract, fail-fast, max-files, name truncation, --skip-extraction, --pattern variants, recursive walk.
- A7: V009 end-to-end migration tests in `tests/schema_migration_integration.rs`: `v009_document_type_lifecycle_e2e`, `v009_note_type_lifecycle_e2e`, `v009_invalid_type_rejected`.
- A11: PT-BR uppercase stoplist for NER false-positive filter (ADAPTER, PROJETO, PASSIVA, SOMENTE, LEITURA, etc.). Improves entity extraction quality for Portuguese-language corpora.
Improved
- A5 (P1-MEDIUM): Renamed 210 test functions in `src/*` across 35 files from Portuguese to English identifiers (also covered helper functions like `nova_memoria` → `new_memory`, `cria_node` → `make_node`, `resposta_vazia` → `empty_response`). Brings codebase into full compliance with project's English-exclusive language policy for identifiers.
- A8 (P1-MEDIUM): Refined production-only `.unwrap()`/`.expect()` calls. Original audit count of 167 was inflated; most matches were inside `#[cfg(test)] mod tests` blocks (acceptable per CLAUDE.md). The actual production-path inventory was 13 occurrences. Improvements: 1 `.expect()` in `src/embedder.rs` got a more precise invariant message; 10 `Regex::new(LITERAL).unwrap()` in `src/extraction.rs` static `OnceLock` initializers replaced with `.expect("compile-time validated <kind> regex literal")`; 2 `.max_by(...).unwrap()` over BERT NER logits replaced with `.expect("BERT NER logits invariant: no NaN in classifier output")`; 1 `.expect()` in `src/chunking.rs` translated from PT to EN. The 4 `.unwrap()` calls in `src/graph.rs`, 3 in `src/namespace.rs`, and 2 in `src/output.rs` are inside `///` doctests (idiomatic per Rust API Guidelines C-EXAMPLE).
- A12+A13: Translated ~38 PT comments in `tests/signal_handling_integration.rs`, `tests/lock_integration.rs`, and `deny.toml`. Removed 2 obsolete `[advisories.ignore]` entries (RUSTSEC-2024-0436, RUSTSEC-2025-0119); `cargo deny check` now reports zero advisory-not-detected warnings.
- A14: Translated ~150 additional PT comments in `tests/prd_compliance.rs`, `tests/integration.rs`, `tests/concurrency_hardened.rs`, `tests/security_hardening.rs`, and other test files.
Audit Methodology
- 13 gaps identified empirically via plan-mode audit on installed v1.0.30 binary against real-file corpus (20 markdown PT-BR docs).
- All fixes validated via PDCA + Agent Teams orchestration: 11 tasks, 9 teammates spawned in parallel, each with Rule Zero compliance and per-task validation.
- Validation passed: cargo fmt, cargo clippy --all-targets -- -D warnings, cargo audit, cargo deny check, cargo doc -D warnings, cargo nextest run.
v1.0.30
[1.0.30] - 2026-04-29
Added (New Subcommand — Bulk Ingestion)
- `sqlite-graphrag ingest <DIR> --type <TYPE>` subcommand for bulk-indexing every file in a directory as a separate memory. Supports `--pattern` (default `*.md`), `--recursive`, `--skip-extraction`, `--fail-fast`, `--max-files` (safety cap default 10000), `--namespace`, `--db`. Output is line-delimited JSON: one event per file (`{file, name, status, memory_id, action}`) followed by a final summary (`{summary: true, files_total, files_succeeded, files_failed, files_skipped, elapsed_ms}`). Names are derived from file basenames in kebab-case. Each file is processed by spawning a child `remember --body-file` invocation, so concurrency slots, lock semantics, and error semantics match standalone `remember`. Resolves the long-standing UX gap where users had to shell-script over `for f in *.md; do remember ...; done` to ingest a corpus.
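Deriving a kebab-case memory name from a file basename, as `ingest` is described as doing, might look like the sketch below. The function name `memory_name_from_path` and the exact character rules (ASCII alphanumerics kept and lowercased, every other run of characters collapsed to one hyphen) are assumptions rather than the crate's documented behaviour.

```rust
use std::path::Path;

/// Sketch: `docs/My Design_Notes.md` becomes `my-design-notes`.
/// Returns `None` when the path has no UTF-8 stem or nothing usable remains.
pub fn memory_name_from_path(path: &Path) -> Option<String> {
    // `file_stem` drops the directory part and the final extension.
    let stem = path.file_stem()?.to_str()?;
    let mut name = String::with_capacity(stem.len());
    for ch in stem.chars() {
        if ch.is_ascii_alphanumeric() {
            name.push(ch.to_ascii_lowercase());
        } else if !name.is_empty() && !name.ends_with('-') {
            // Collapse any run of separators into a single hyphen.
            name.push('-');
        }
    }
    let name = name.trim_end_matches('-').to_string();
    (!name.is_empty()).then_some(name)
}
```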
Changed (Help Text Clarity — link / unlink)
- `link --help` and `unlink --help` now make explicit that `--from` and `--to` accept ENTITY names (graph nodes auto-extracted by BERT NER, or created implicitly by prior `link` calls), NOT memory names. Includes an `EXAMPLES:` block and a `NOTES:` block in `after_long_help`. Previously the bare doc-comment "Source entity" was easily misread as "memory name" by new users; the resulting `Erro: entidade '<name>' não existe` was confusing because the user thought they were passing a valid memory name. Field doc comments now mention `graph --format json | jaq '.nodes[].name'` as the canonical way to list eligible entity names.
Changed (Dependencies — rusqlite/refinery upgrade)
- `rusqlite` bumped from `0.32` to `0.37` and `refinery` bumped from `0.8` to `0.9`. Cargo.lock now resolves `rusqlite v0.37.0`, `refinery v0.9.1`, `refinery-core v0.9.1`, `refinery-macros v0.9.1`, and `libsqlite3-sys v0.35.0`. Zero source code changes were required; both crates kept the public APIs we use stable across these versions. An upgrade to rusqlite 0.39 was blocked by `refinery-core 0.9.0` capping `rusqlite = ">=0.23, <=0.37"`; revisit when refinery raises that ceiling.
Fixed (Critical — Schema/CLI Contract Mismatch)
- `migrations/V009__expand_memory_types.sql`: new migration that recreates the `memories` table (and its FK children: `memory_versions`, `memory_chunks`, `memory_entities`, `memory_relationships`, `memory_urls`) to expand the `type` CHECK constraint from 7 to 9 values, adding `'document'` and `'note'`. Without this migration, `--type document` and `--type note` (added to the CLI enum in v1.0.29) were always rejected at runtime with `exit 10` (`CHECK constraint failed: type IN ('user','feedback','project','reference','decision','incident','skill')`). The CLI Clap layer accepted nine values while the database enforced seven, breaking every README example that used `--type document`.
- `tests/schema_migration_integration.rs` updated to assert exactly 9 migrations applied (previously expected 6) and `schema_version = "9"`.
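The mismatch fixed by V009 is the kind of drift a small contract check can catch. The sketch below compares the nine CLI values against the seven values in the pre-V009 CHECK constraint quoted above; the constant and function names are hypothetical, and the project's real tests may verify this differently.

```rust
/// The nine `--type` values accepted by the CLI.
pub const CLI_TYPES: [&str; 9] = [
    "user", "feedback", "project", "reference", "decision",
    "incident", "skill", "document", "note",
];

/// The seven values allowed by the CHECK constraint before V009.
pub const SCHEMA_TYPES_BEFORE_V009: [&str; 7] = [
    "user", "feedback", "project", "reference", "decision",
    "incident", "skill",
];

/// Returns every CLI value the schema would reject, in CLI order.
pub fn rejected_by_schema<'a>(cli: &[&'a str], schema: &[&'a str]) -> Vec<&'a str> {
    cli.iter().copied().filter(|t| !schema.contains(t)).collect()
}
```

Asserting that this list is empty for the current schema would fail as soon as the CLI enum gains a value the database does not accept.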
Fixed (Critical — Language Policy Violations Missed by v1.0.28 Audit)
The v1.0.28 audit used a single-line regex (rg "tracing::(info|warn|error|debug)!.*[áéíóúâêôãõç]") and reported zero violations. Multi-line macro invocations and identifiers without diacritics escaped detection. Fixed in this release:
- `src/extraction.rs:749`: Portuguese `tracing::warn!("relacionamentos truncados em {max_rels} (com {n} entidades, máx teórico era ~{}× combinações)", ...)` translated to `"relationships truncated to {max_rels} (with {n} entities, theoretical max was ~{}x combinations)"`.
- `src/extraction.rs:1025`: Portuguese `tracing::warn!("extração truncada em {MAX_ENTS} entidades (entrada tinha {total_input} candidatos antes da deduplicação)")` translated to `"extraction truncated at {MAX_ENTS} entities (input had {total_input} candidates before deduplication)"`.
- `src/extraction.rs`: Eight `.context("...")`, `.with_context(|| format!("..."))` and `anyhow::anyhow!("...")` calls translated from Portuguese to English: `"forward pass do BertModel"` → `"BertModel forward pass"`, `"forward pass do classificador"` → `"classifier forward pass"`, `"removendo dimensão batch"` → `"removing batch dimension"`, `"criando tensor de ids para batch"` → `"creating id tensor for batch"`, `"padding tensor de ids"` → `"padding id tensor"`, `"criando tensor de máscara para batch"` → `"creating mask tensor for batch"`, `"criando token_type_ids batch"` → `"creating token_type_ids tensor for batch"`, `"forward pass batch BertModel"` → `"BertModel batch forward pass"`, `"criando diretório do modelo"` → `"creating model directory"`, `"carregando tokenizer NER"` → `"loading NER tokenizer"`, `"encoding NER"` → `"encoding NER input"`.
- `src/daemon.rs`: Two `tracing::*!` strings translated: `"falha ao remover lock file de spawn ao encerrar daemon"` → `"failed to remove spawn lock file while shutting down daemon"`; `"daemon encerrado graciosamente; socket será limpo pelo OS ou pelo próximo daemon via try_overwrite"` → `"daemon shut down gracefully; socket will be cleaned up by OS or by the next daemon via try_overwrite"`.
- `src/commands/restore.rs`: `tracing::info!("restore --version omitido; usando última versão não-restore: {}", v)` translated to `"restore --version omitted; using latest non-restore version: {}"`.
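The multi-line gap in the v1.0.28 audit can be illustrated with a scanner that follows each `tracing::*!` invocation to its closing parenthesis instead of matching line by line. This is a deliberately naive sketch with a hypothetical name, not the project's CI gate: it does not understand parentheses inside string literals, and it only detects non-ASCII characters, so Portuguese words without diacritics would still slip through.

```rust
/// Returns the 1-based line numbers of `tracing::<level>!(...)` calls whose
/// argument list, possibly spanning several lines, contains a non-ASCII
/// character.
pub fn find_non_ascii_tracing_calls(source: &str) -> Vec<usize> {
    let mut hits = Vec::new();
    let mut cursor = 0;
    while let Some(found) = source[cursor..].find("tracing::") {
        let start = cursor + found;
        let name_start = start + "tracing::".len();
        let tail = &source[name_start..];
        // The macro name (info, warn, ...) must be followed by `!(`.
        let name_len = tail.chars().take_while(|c| c.is_ascii_alphabetic()).count();
        cursor = name_start + name_len;
        if !tail[name_len..].starts_with("!(") {
            continue;
        }
        // Walk to the matching `)`, crossing line breaks as needed.
        let mut depth = 0usize;
        let mut non_ascii = false;
        for (i, ch) in tail[name_len + 1..].char_indices() {
            match ch {
                '(' => depth += 1,
                ')' => {
                    depth -= 1;
                    if depth == 0 {
                        cursor = name_start + name_len + 1 + i + 1;
                        break;
                    }
                }
                c if !c.is_ascii() => non_ascii = true,
                _ => {}
            }
        }
        if non_ascii {
            // Line number of the macro start = preceding newlines + 1.
            hits.push(source[..start].matches('\n').count() + 1);
        }
    }
    hits
}
```

On a call whose message sits on the line after `tracing::warn!(`, a per-line check sees no line containing both the macro name and an accented character, while this scanner reports the line where the call starts.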
Fixed (Test Identifiers — English-only Policy)
~80 test identifiers (function names, helper names, mod names, type aliases) renamed from Portuguese to English. Phase 1 audit only flagged the diacritic subset (*ção, *á); identifiers without accents (*_aceita_, *_rejeita, *_funciona, *_retorna, etc.) were missed. Touched files:
- `src/cli.rs`: `mod testes_concorrencia_pesada` → `mod heavy_concurrency_tests`; `mod testes_formato_json_only` → `mod json_only_format_tests`; 3 inner test fns renamed.
- `src/paths.rs`: `limpar_env_paths` helper + 5 test fns renamed (`home_env_resolve_db_em_subdir`, `home_env_traversal_rejeitado`, `db_path_vence_home`, `flag_vence_home`, `home_env_vazio_cai_para_cwd`, `parent_or_err_aceita/rejeita_*`).
- `src/errors.rs`: 11 test fns renamed (the `_em_portugues` suffix family + 3 others).
- `src/commands/init.rs`: 5 test fns renamed (`init_response_serializa_*`, `latest_schema_version_retorna_*`, `init_response_dim/namespace_alinhado_*`).
- `src/commands/migrate.rs`: 5 test fns + 2 helper fns renamed.
- `src/extraction.rs`: 11 internal test fns renamed (the `iob_mapeia_*`, `regex_*_aceita_*`, `build_relationships_sem_duplicatas`, etc.).
- `src/output.rs`, `src/memory_guard.rs`, `src/commands/{sync_safe_copy, cleanup_orphans, list, vacuum}.rs`: 7 test fns renamed.
- `src/storage/{urls, memories, entities}.rs`: `type Resultado` → `type TestResult` (3 modules, ~70 occurrences).
- `tests/security_hardening.rs`: 16 test fns renamed (`test_path_traversal_rejeitado_*`, `test_chmod_*_apos_init_*`, `test_blake3_*_diferente_*`, `test_sql_injection_em_*`, etc.).
- `tests/integration.rs`: ~28 test fns renamed (the `test_remember_cria/rejeita/aceita_*`, `test_link_cria_relacao_*`, `test_graph_stdin_aceita_*`, etc.).
- `tests/prd_compliance.rs`: ~15 test fns renamed.
- `tests/concurrency_*.rs`, `tests/i18n_bilingual_integration.rs`, `tests/signal_handling_integration.rs`, `tests/v2_breaking_integration.rs`, `tests/lock_integration.rs`, `tests/property_based.rs`, `tests/loom_lock_slots.rs`, `tests/regression_positional_args.rs`, `tests/recall_integration.rs`, `tests/daemon_integration.rs`, `tests/schema_migration_integration.rs`: remaining test fns and helpers translated.
Notes
- `errors::to_string_pt()` and `main::emit_progress_i18n(en, pt)` continue to hold legitimate Portuguese strings; these are the i18n branch invoked when `--lang pt` (or detected locale) is active. They are not violations.
- Default behaviour `./graphrag.sqlite` in CWD (resolved via `paths.rs:35-41`) confirmed empirically against the v1.0.29 audit corpus (29 of 30 flowaiper Markdown documents indexed end-to-end; recall p50 ~50ms, hybrid-search p50 ~52ms; one stress-test failure was an external 60s timeout, not a tool defect).
- Empirical evidence: the bug was reproducible with one CLI invocation: `sqlite-graphrag remember --type document --name x --description y --body z` returned exit 10 with the schema CHECK error message in v1.0.29.
v1.0.29
[1.0.29] - 2026-04-29
Fixed (Critical — Language Policy Violations in Production Code)
- `src/paths.rs:21`: Portuguese error message `"não foi possível determinar o diretório home"` in `AppError::Io` translated to `"could not determine home directory"`. Was emitted in `tracing::error!` and CLI stderr regardless of `--lang` flag.
- `src/paths.rs:85-89`: Portuguese error message `"caminho '{}' não possui componente pai válido"` in `AppError::Validation` translated to `"path '{}' has no valid parent component"`.
- `src/main.rs:227`: Portuguese `tracing::warn!("recebido sinal de shutdown...")` translated to `"shutdown signal received; waiting for current command to finish gracefully"`. Tracing logs are required to be English regardless of locale.
- `src/commands/purge.rs:21`: Portuguese doc comment `"[DEPRECATED em v2.0.0]"` translated to `"[DEPRECATED in v2.0.0]"`.
- `src/commands/purge.rs:70-71`: Portuguese warning string `"--older-than-seconds está deprecado..."` (emitted in JSON `warnings` field) translated to `"--older-than-seconds is deprecated; use --retention-days in v2.0.0+"`. JSON output must be language-neutral.
- `src/commands/purge.rs:123`: Portuguese `anyhow!("erro de relógio do sistema: {err}")` translated to `"system clock error: {err}"`.
- `src/commands/purge.rs:192-193`: Portuguese warning `"falha ao limpar vec_chunks..."` (in JSON `warnings`) translated to `"failed to clean vec_chunks for memory_id {memory_id}: {err}"`.
- `src/commands/purge.rs:198-201`: Portuguese warning `"falha ao limpar vec_memories..."` (in JSON `warnings`) translated to `"failed to clean vec_memories for memory_id {memory_id}: {err}"`.
- `src/main.rs:265`: Removed duplicate `tracing::error!(error = %e)` that emitted localized error string into structured logs (line 266 `emit_error(&e.localized_message())` already handles user-visible output). Eliminates the i18n→tracing leakage where Portuguese error payloads were polluting EN-only log channels.
Fixed (Security — Path Traversal & Unsafe Audit)
- `src/paths.rs:60`: `validate_path` now uses `Path::components().any(|c| c == Component::ParentDir)` instead of substring `.contains("..")`, preventing both false positives on filenames containing `..` (e.g., `..config`) and potential bypass via non-standard path encodings.
- `src/extraction.rs:271`: Added comprehensive `SAFETY:` comment to `unsafe { VarBuilder::from_mmaped_safetensors(...) }` documenting the three soundness invariants (file not concurrently modified, mmaped region lifetime tracking, safetensors format validation).
- `src/storage/connection.rs:14-21`: Added `SAFETY:` comment to `unsafe { rusqlite::ffi::sqlite3_auto_extension(...) }` documenting FFI ABI compatibility, transmute layout invariants, and single-call invocation guarantee.
- `src/paths.rs` (6 SAFETY comments in tests): Translated from Portuguese (`"SAFETY: testes marcados com #[serial] garantem ausência de concorrência."`) to English (`"SAFETY: tests are annotated with #[serial], guaranteeing single-threaded execution."`).
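
The difference between the component-based check and the old substring check can be illustrated with a minimal sketch. The function names `has_parent_dir` and `contains_dot_dot` are hypothetical and are not taken from `src/paths.rs`; only the `Path::components()` expression comes from the entry above.

```rust
use std::path::{Component, Path};

/// Hypothetical helper: true when any path component is a literal `..`.
fn has_parent_dir(path: &Path) -> bool {
    path.components().any(|c| c == Component::ParentDir)
}

/// The older substring-style check, shown only for comparison.
fn contains_dot_dot(path: &str) -> bool {
    path.contains("..")
}
```

Under these assumptions, a file named `data/..config` passes the component check but would have been rejected by the substring check, while `a/../b` is rejected by both.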
Added (UX Improvements)
- `list --include-deleted` flag to surface soft-deleted memories. Without this flag, `forget` followed by `list` would create a workflow dead-end where soft-deleted entries became invisible.
- `history --no-body` flag to omit version body content from the JSON response. Useful for memories with large body content where only metadata/version sequence is needed.
- `MemoryType::Document` and `MemoryType::Note` variants added to the `--type` enum (`remember`, `list`, `recall`). Documentation-style content no longer needs to abuse the `Reference` type.
- `help =` text added to ~10 previously bare flags (`--namespace`, `--limit`, `--offset`, `--format`, `--db`, `--include-deleted`, `--no-body`) across `list`, `history`, and other subcommands.
- README Quick Start now explicitly documents that `sqlite-graphrag init` is the first required command and that `graphrag.sqlite` is created in the current working directory by default.
Changed (Schema & UX)
- `--json` flag is now hidden in 21 subcommands via `#[arg(long, hide = true)]`. The flag was a no-op (JSON is the default output format) but appeared in `--help` causing confusion. The flag remains accepted for backward compatibility with tools that pass it explicitly.
- `history` JSON response: `metadata` field type changed from `String` (raw JSON-encoded) to `serde_json::Value` (parsed object), aligning with `read` which already exposed it as `Value`. Consumers parsing `metadata` as a JSON string must now read it as an object directly. Empty/invalid metadata defaults to `{}`.
- `history` JSON response: `body` field is now `Option<String>` (omitted when `--no-body` is set). When the field is present (default), the existing schema is unchanged.
- `Cargo.toml` `exclude` list: `/CLAUDE.md`, `/AGENTS.md`, `/MEMORY.md` rewritten without leading `/` for idiomatic relative-path semantics matching cargo conventions.
Notes
- This is a patch release focused on policy compliance and UX fixes detected in the v1.0.28 audit (`/tmp/sqlite-graphrag-audit/reports/audit-v1.0.28.md`).
- One JSON schema change: `history.metadata` from string to object. Consumers that parsed `metadata` as a string must now read it as an object. All other JSON contracts (commands, fields, exit codes) remain unchanged.
- Empirically validated against real Markdown documents from a 495-file corpus during the v1.0.28 audit. CRUD cycle (init → remember → recall → read → edit → forget → purge) verified end-to-end.
v1.0.28
[1.0.28] - 2026-04-28
Changed
- Enforces the English-only Language Policy across the entire codebase. All `///` and `//!` doc comments, all `tracing::*!` log strings, and all identifiers (functions, statics, modules, enum variants, test names) outside `src/i18n.rs` translation tables are now in English. PT-BR strings remain only in `Language::Portuguese` branches inside `i18n::errors_msg`, `i18n::validation`, and `errors::to_string_pt()`.
- `Language::Portugues` enum variant renamed to `Language::Portuguese` (CLI aliases `pt`, `pt-br`, `pt-BR`, `portugues`, `portuguese` preserved for backward compatibility).
- `IDIOMA_GLOBAL` static renamed to `GLOBAL_LANGUAGE` (`src/i18n.rs`).
- `FUSO_GLOBAL` static renamed to `GLOBAL_TZ` (`src/tz.rs`).
- ~30 PT-named functions renamed to English equivalents in `src/i18n.rs` and `src/tz.rs` (e.g., `formatar_iso` → `format_iso`, `epoch_para_iso` → `epoch_to_iso`, `memoria_nao_encontrada` → `memory_not_found`, `nome_kebab` → `name_kebab`, `validacao` module → `validation`, `erros` module → `errors_msg`).
- 32 internal `mod testes` test modules renamed to `mod tests` for consistency with Rust convention.
- All call-sites in `src/commands/*.rs` and tests propagated to use the renamed identifiers.
Added
- `//!` crate-level documentation in 37 modules that previously lacked it: `src/cli.rs`, `src/main.rs`, `src/extraction.rs`, `src/embedder.rs`, `src/daemon.rs`, `src/output.rs`, `src/paths.rs`, `src/chunking.rs`, `src/graph.rs`, `src/namespace.rs`, `src/parsers/mod.rs`, `src/tokenizer.rs`, `src/storage/{connection,urls,chunks,versions,mod}.rs`, `src/pragmas.rs`, and 22 handlers in `src/commands/`.
- `language-check` CI job in `.github/workflows/ci.yml` that fails the build when Portuguese diacritics are detected in `///`, `//!`, `tracing::*!` calls, or `#[error(...)]` attributes, an automated guardrail against regression.
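
The kind of scan the `language-check` job performs can be sketched in Rust. This is only an illustration under stated assumptions: the actual job lives in a CI workflow and its implementation is not shown in this changelog. The character set is the audit pattern `[áéíóúâêôãõç]` quoted elsewhere in this file; `is_checked_line` and `violations` are hypothetical names, and the line classification is deliberately naive.

```rust
/// Characters from the audit pattern `[áéíóúâêôãõç]` (case-sensitive, like the pattern).
const PT_DIACRITICS: &[char] = &['á', 'é', 'í', 'ó', 'ú', 'â', 'ê', 'ô', 'ã', 'õ', 'ç'];

/// Hypothetical filter: doc comments, tracing macros, and `#[error(...)]` attributes.
/// Plain string literals are deliberately not matched.
fn is_checked_line(line: &str) -> bool {
    let trimmed = line.trim_start();
    trimmed.starts_with("///")
        || trimmed.starts_with("//!")
        || trimmed.contains("tracing::")
        || trimmed.contains("#[error(")
}

/// Returns the 1-based numbers of checked lines that contain a listed diacritic.
fn violations(source: &str) -> Vec<usize> {
    source
        .lines()
        .enumerate()
        .filter(|(_, line)| is_checked_line(line) && line.contains(PT_DIACRITICS))
        .map(|(index, _)| index + 1)
        .collect()
}
```

A gate built this way would fail the build whenever `violations` returns a non-empty list for any source file.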
Documentation
- Two broken intra-doc links (`[Cli]`, `[TextEmbedding]`) fixed in `src/lib.rs` and `src/embedder.rs` (surfaced when `cargo doc -D warnings` was first run with the new doc coverage).
Notes
- This is a non-breaking change for the CLI and JSON contracts: subcommand names, flags, env vars, exit codes, and JSON field names remain unchanged. Internal Rust identifiers were renamed but the crate is a binary, not a library consumed via `pub use`.
- 65 files changed, +872/-715 lines. All 9 cargo gates pass (fmt, clippy, test, doc, audit, deny, publish dry-run, package list, llvm-cov).
v1.0.27
[1.0.27] - 2026-04-28
Added
- `CURRENT_SCHEMA_VERSION: u32 = 8` constant in `src/constants.rs` with unit test that asserts equality with the count of `V*.sql` migration files.
- `output::emit_error` and `output::emit_error_i18n` functions centralizing stderr error output (Pattern 5: single point of I/O in `output.rs`).
- `nextest` test-groups configuration in `.config/nextest.toml` to serialize cross-binary tests sharing the daemon socket and model cache. Eliminates `contract_15_link` flake observed since v1.0.24.
Changed
- README EN+PT (`Graph Schema` section) now lists `entity_type` as exactly 13 values (was 10): adds `organization`, `location`, `date` introduced in V008 schema migration of v1.0.25.
- `init --help` docstring documents path resolution precedence (`--db` > `SQLITE_GRAPHRAG_DB_PATH` > `SQLITE_GRAPHRAG_HOME` > cwd).
- `src/commands/recall.rs` graph-distance comment clarified: it remains a hop-count proxy (`1.0 - 1.0/(hop+1)`), real cosine distance is reserved for v1.0.28 (forward-dated reference fixed).
- All 6 `eprintln!` calls in `src/main.rs` migrated to `output::emit_error*` to enforce Pattern 5.
Documentation
- `SQLITE_GRAPHRAG_LOG_FORMAT` now documented in the env-var table of README EN+PT (was implemented since v1.0.x but undocumented).
- README `unlink` row corrected from the non-existent `--relationship-id` flag to the actual `--from --to --relation` flags. The previous documentation could mislead agents into rejecting valid invocations.
- `docs/MIGRATION.md` and `docs/MIGRATION.pt-BR.md` version reference updated from v1.0.17 to v1.0.27 (3 occurrences each).
- `docs/HOW_TO_USE.md` and `docs/HOW_TO_USE.pt-BR.md` `link` recipe examples corrected to use `--from`/`--to` instead of the non-existent `--source`/`--target` flags.
Fixed
- Formatting drift in `tests/doc_contract_integration.rs:669` resolved via `cargo fmt --all` (multi-line array → single-line as expected by rustfmt).
Notes
- Investigation of the audit P1 finding `tokenizer.rs:101-103 std::fs::read in async path` concluded false positive: `get_tokenizer` and `get_model_max_length` are called only from `src/commands/remember.rs:389-391` inside `pub fn run()` which is synchronous. No `spawn_blocking` wrap is required. The blocking I/O is appropriate for the synchronous CLI command path.
- Two `advisory-not-detected` warnings from `cargo deny` for ignored advisories `RUSTSEC-2024-0436` (paste) and `RUSTSEC-2025-0119` (number_prefix) were observed but kept in `deny.toml`: they protect against re-introduction via fastembed's transitive deps if upstream regresses. A scheduled cleanup is deferred to v1.0.28 after explicit verification of `cargo tree` confirming the deps are no longer present.
[1.0.26] - 2026-04-28
Added
- `SQLITE_GRAPHRAG_HOME` env var for setting the base directory for `graphrag.sqlite` (precedence: `--db` > `SQLITE_GRAPHRAG_DB_PATH` > `SQLITE_GRAPHRAG_HOME` > cwd).
- README sample JSON output for `remember` showing `extracted_entities`, `extracted_relationships`, and `urls_persisted` fields.
- Expanded exit-code table with sub-causes for exit 1 (Validation error or runtime failure).
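
The precedence `--db` > `SQLITE_GRAPHRAG_DB_PATH` > `SQLITE_GRAPHRAG_HOME` > cwd can be read as a simple fallback chain. The sketch below is illustrative only: `resolve_db_path` is a hypothetical function, not the one in `src/paths.rs`, and it assumes that `--db` and `SQLITE_GRAPHRAG_DB_PATH` name the database file itself while `SQLITE_GRAPHRAG_HOME` and the working directory name a directory that contains `graphrag.sqlite`. Inputs are passed as arguments so the example does not read the process environment.

```rust
use std::path::PathBuf;

/// Hypothetical resolver following the documented precedence.
/// `db_flag` and `env_db_path` are assumed to be full file paths;
/// `env_home` and `cwd` are assumed to be directories.
fn resolve_db_path(
    db_flag: Option<&str>,
    env_db_path: Option<&str>,
    env_home: Option<&str>,
    cwd: &str,
) -> PathBuf {
    if let Some(path) = db_flag {
        return PathBuf::from(path);
    }
    if let Some(path) = env_db_path {
        return PathBuf::from(path);
    }
    // Fall back to the home override, then to the current working directory.
    let base = env_home.unwrap_or(cwd);
    PathBuf::from(base).join("graphrag.sqlite")
}
```

Handling of empty environment values and path validation is omitted here; the real resolver may treat those cases differently.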
Changed
- README clarifies that GraphRAG entity extraction runs by default in `remember` (use `--skip-extraction` to disable per call).
- Renamed reference to "automatic ingestion" in README to disambiguate "daemon autostart" from "automatic entity extraction".
Fixed
- Daemon `handled_embed_requests` counter now correctly reports the cumulative count after `init` autospawn (was returning 0 since v1.0.24 due to a per-connection local counter shadowing the shared accumulator).
- Test `contract_15_link` aligned with the actual `link --json` output keys (`action`, `from`, `to`, `relation`, `weight`, `namespace`); the obsolete expectations of `source`/`target` numeric IDs were stale since v1.0.24.
[1.0.25] - 2026-04-28
Added
- `recall --all-namespaces` flag searches across all namespaces in a single query (P0-1).
- BERT NER now emits `organization` (B-ORG), `location` (B-LOC), and `date` (B-DATE) entity types aligned with V008 schema migration. Previous releases mapped ORG→`project`, LOC→`concept`, and discarded DATE entirely (P0-2 + V008 alignment).
- Schema migration V008: `entities.type` CHECK constraint expanded to include `organization`, `location`, `date`. Additive migration; existing rows are preserved unchanged.
- BRAND_NAME_REGEX captures CamelCase organization names such as "OpenAI", "PostgreSQL", "ChatGPT" that BERT NER frequently misclassifies (P0-2).
- Portuguese monosyllabic verb false-positive filter ("Lê", "Vê", "Cá", etc.) for BERT outputs below confidence threshold 0.85 (P0-2).
- SECTION_MARKER_REGEX filters text fragments like "Etapa 3", "Fase 1", "Passo 2", "Seção 4", "Capítulo 1" from entity extraction (P0-4).
- 12 new ALL_CAPS_STOPWORDS: `API`, `CAPÍTULO`, `CLI`, `ETAPA`, `FASE`, `HTTP`, `HTTPS`, `JWT`, `LLM`, `PASSO`, `REST`, `UI`, `URL` (P0-4).
- README documents `graph traverse|stats|entities` subcommands with flags table (P1-A).
Changed
- `recall.graph_matches[].distance` now reflects graph hop count via proxy `1.0 - 1.0 / (hop + 1)`. Previous releases used `0.0` placeholder. Real cosine distance is reserved for v1.0.26 (P1-M).
- `merge_and_deduplicate` longest-wins logic rewritten with composite key `entity_type + name_lc` and bidirectional substring containment. Resolves "Sonne"/"Sonnet" duplication and "Open"/"Paper" truncation issues (P0-3).
- `Cargo.toml` version bumped from `1.0.24` to `1.0.25`.
Fixed
- `is_valid_entity_type` now accepts new V008 types `organization`, `location`, `date` (P0-A); without this fix, `remember` would reject any entity emitted by the V008-aligned IOB mapping with exit 1.
- `augment_versioned_model_names` regex no longer captures Portuguese section markers like "Etapa 3" or "Fase 1" (P0-B); a defense-in-depth filter is applied after augmentation and inside `iob_to_entities.flush()`.
- `remember --name` longer than 80 bytes now returns exit code 6 (`LimitExceeded`) instead of exit 1 (`Validation`). Restores the exit code contract used by orchestrating agents (P1-J).
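
The `--name` limit is stated in bytes, not characters, which matters for non-ASCII names. A minimal sketch of such a check follows; `NameError`, `check_name`, and `exit_code` are hypothetical names, and mapping an empty name to exit 1 is an assumption made for illustration. Only the 80-byte limit and exit codes 6 (`LimitExceeded`) and 1 (`Validation`) come from the entry above.

```rust
/// Byte limit taken from the changelog entry.
const MAX_NAME_BYTES: usize = 80;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum NameError {
    Empty,
    TooLong { len: usize },
}

/// `str::len` counts UTF-8 bytes, so multi-byte characters use up the budget faster.
fn check_name(name: &str) -> Result<(), NameError> {
    if name.is_empty() {
        return Err(NameError::Empty);
    }
    if name.len() > MAX_NAME_BYTES {
        return Err(NameError::TooLong { len: name.len() });
    }
    Ok(())
}

/// Maps the sketch's errors to exit codes: 6 for the limit, 1 for validation (assumed).
fn exit_code(err: &NameError) -> i32 {
    match err {
        NameError::Empty => 1,
        NameError::TooLong { .. } => 6,
    }
}
```

With this check, a name of 41 `ç` characters is rejected because it occupies 82 bytes, even though it has fewer than 80 characters.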
Notes
- `recall.graph_matches[].distance` is approximate; semantic cosine distance reserved for v1.0.26.
- Entity and relationship caps (30 and 50 respectively) remain silent in v1.0.25; explicit `--limit-entities` / `--limit-relations` flags planned for v1.0.26.
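
For reference, the hop-count proxy `1.0 - 1.0 / (hop + 1)` used for `recall.graph_matches[].distance` behaves as sketched below. The function name `hop_distance` and the convention that `hop` starts at 0 for a direct match are assumptions; the changelog does not state how hops are numbered.

```rust
/// Hop-count proxy from the changelog: 1.0 - 1.0 / (hop + 1).
/// Assumes `hop = 0` denotes the starting node itself.
fn hop_distance(hop: u32) -> f64 {
    1.0 - 1.0 / (f64::from(hop) + 1.0)
}
```

Under that assumption the value is 0.0 at hop 0, 0.5 at hop 1, 0.75 at hop 3, and approaches 1.0 as the hop count grows, so it orders results by graph proximity without measuring semantic similarity.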
[1.0.24] - 2026-04-27
Added
- BERT NER batch inference via `predict_batch` reduces per-document latency on multi-doc workloads (Phase 3 perf).
- SQLITE_BUSY and SQLITE_LOCKED retry with exponential backoff in `with_busy_retry`; avoids spurious exit 10 on WAL-mode contention (Phase 3).
- `spawn_blocking` warm-up for daemon BERT model init prevents blocking the async executor during startup (Phase 3).
- Schema migration V007: `memory_urls` table with indexes; URLs extracted from BERT NER are now persisted separately instead of leaking into the entity graph (Phase 2).
- `src/storage/urls.rs` CRUD module providing `upsert_urls`, `get_urls_for_memory` and `delete_urls_for_memory` (Phase 2).
- `RememberResponse.urls_persisted: usize` field reporting how many URL entries landed in `memory_urls` (Phase 2).
- `RememberResponse.relationships_truncated: bool` field indicating whether the relationships payload was capped at `max_relationships_per_memory` (Phase 4).
- `namespace_initial` persisted in `schema_meta` on `init`; `purge` resolves contextually via `SQLITE_GRAPHRAG_NAMESPACE` (Phase 4 P1-A/P1-C).
- Positional and flag arguments in `read`, `forget`, `history`, `edit`, `rename`; e.g. `sqlite-graphrag read my-note` is equivalent to `sqlite-graphrag read --name my-note` (Phase 4 P1-B).
- Stopwords list expanded with 17 new entries: `ACEITE`, `ACK`, `ACL`, `BORDA`, `CHECKLIST`, `COMPLETED`, `CONFIRME`, `DEVEMOS`, `DONE`, `FIXED`, `NEGUE`, `PENDING`, `PLAN`, `PODEMOS`, `RECUSE`, `TOKEN`, `VAMOS` (Phase 2 P0-3).
- NFKC unicode normalization in `merge_and_deduplicate` prevents near-duplicate entities caused by composed vs decomposed Unicode forms (Phase 2 P1-E).
- Regression tests for `graph` traverse exit 4 when the database is absent (Phase 1 P0-7).
- Regression tests for positional-plus-flag argument equivalence in `read`, `forget`, `history`, `edit`, `rename` (Phase 4 P1-B).
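
Retry with exponential backoff on busy or locked errors, as described for `with_busy_retry`, can be sketched as follows. Only the function name comes from the changelog; the signature, the `DbError` type, the attempt limit, and the doubling factor are assumptions made so the example is self-contained and does not depend on `rusqlite`.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical stand-in for the database error type.
#[derive(Debug, PartialEq)]
enum DbError {
    Busy,
    Locked,
    Other(String),
}

/// Runs `op`, retrying on `Busy`/`Locked` with a delay that doubles after each attempt.
/// Any other error, or exhausting `max_attempts`, returns the last result unchanged.
fn with_busy_retry<T, F>(max_attempts: u32, base_delay: Duration, mut op: F) -> Result<T, DbError>
where
    F: FnMut() -> Result<T, DbError>,
{
    let mut delay = base_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Err(DbError::Busy) | Err(DbError::Locked) if attempt < max_attempts => {
                sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
            other => return other,
        }
    }
}
```

A production version would likely add jitter and a cap on the delay; those are omitted here to keep the control flow visible.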
Changed
- `ReadResponse.metadata` is now `serde_json::Value` instead of `String`; agents receive a structured object directly without a second `JSON.parse` call (Phase 5 P2-A).
- `LinkResponse` simplified: redundant `source` and `target` fields removed; `LinkArgs` no longer accepts `--source`/`--target` flag aliases (Phase 4 P1-O).
- `purge` no longer defaults namespace to `"global"`; resolves via `SQLITE_GRAPHRAG_NAMESPACE` or explicit `--namespace` (Phase 4 P1-C).
- `recall --precise` behavior is now documented and internally uses `effective_k = 100000` for exhaustive KNN (Phase 1 P0-6).
- `init --model` now uses the typed `EmbeddingModelChoice` enum validated at parse time (Phase 1 P0-8).
- `main.rs` RAM measurement uses `Result` propagation instead of `expect` (Phase 1 P1-G).
- Daemon warm-up model load moved into `spawn_blocking` to avoid blocking the Tokio executor (Phase 3 P1-I).
- `augment_versioned_model_names` regex extended to recognize `GPT-4o`, `Claude 4 Sonnet`, `Llama 3 Pro`, `Mixtral 8x7B` patterns (Phase 5 P2-D).
- `extend_with_numeric_suffix` now accepts alphanumeric suffixes (e.g. `v2`, `3b`, `7B`) in addition to purely numeric ones (Phase 5 P2-E).
- Graph entity serialization uses `Vec::new()` instead of `Option<Vec>` so the `entities` field is always an array, never `null` (Phase 5 P2-C).
- `--type` argu...