Skip to content

chore(v5): phase 1+2 — safety-gate scaffolding (T001-T007)#178

Merged
thebtf merged 6 commits into
mainfrom
feat/v5-cleanup
Apr 17, 2026
Merged

chore(v5): phase 1+2 — safety-gate scaffolding (T001-T007)#178
thebtf merged 6 commits into
mainfrom
feat/v5-cleanup

Conversation

@thebtf
Copy link
Copy Markdown
Owner

@thebtf thebtf commented Apr 17, 2026

Phase 1+2 of engram v5.0.0 BREAKING cleanup

Safety-gate scaffolding for the 14-PR v5.0.0 cleanup sequence.
No production code changes. No schema changes. Scripts + CI step only.

Commits

  • 9499403 T006a: scripts/safety-gate.sh (307 LOC) + mock tests (360 LOC) + baseline.json + README
  • d658b8f bugfix: absent mismatch_count field (real vault API only emits when > 0) treated as 0
  • 5599381 T006b: scripts/safety-gate.ps1 (180 LOC) + Pester v5 tests (216 LOC) + README append
  • df66dd4 T007: .github/workflows/test.yml safety-gate self-check step (+40 lines)

Manual verification

All three local smokes on safety-gate.ps1 against live production vault:

  1. Real vault → exit 0, SAFETY-GATE: OK (fingerprint=aa78e55cf896508c, credentials=13, mismatches=0)
  2. Missing ENGRAM_API_TOKEN → exit 2, config error
  3. Unreachable server (http://127.0.0.1:1) → exit 2, connection error

Baseline recorded (cleanup-v5-decisions.md §S16)

  • Vault fingerprint: aa78e55cf896508c + 13 credentials
  • Entity counts: issues=78, api_tokens=3, documents=0, content=0, versioned_documents=1
  • pg_dump + FalkorDB snapshots stored offsite with SHAs

Invariant preservation (NFR-2)

Vault + content table UNCHANGED in this PR — pure tooling, no migrations, no store/handler changes.

Scope deliberately tight

This PR lands ONLY Phase 1 (safety baseline) + Phase 2 (safety automation). First invariant-touching PR is v5-1 (drop orphan join tables) which follows this one.

Next

After merge: PR-v5-1 (drop session_observation_injections + injection_log). See .agent/specs/engram-v5-baseline/changes/CR-001-initial-scope/tasks.md Phase 3.


Spec: .agent/specs/engram-v5-baseline/spec.md · Plan: .agent/specs/engram-v5-baseline/plan.md · Decisions: .agent/reports/cleanup-v5-decisions.md S1–S16

Summary by CodeRabbit

  • Новые возможности

    • Добавлена новая ступень CI для само-проверки целостности сценариев безопасности.
  • Документация

    • Добавлена подробная документация по запуску и параметрам проверки безопасности и базовым данным.
  • Тестирование

    • Добавлены кроссплатформенные тесты и тестовый каркас для эмуляции ответов сервисов.
  • Служебное

    • Внедрена поддержка выполнения проверок на Windows, Linux и macOS; добавлены эталонные данные для сравнения.

@thebtf
Copy link
Copy Markdown
Owner Author

thebtf commented Apr 17, 2026

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 17, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a safety gate mechanism to verify system invariants during the engram v5 cleanup process. It includes a Bash script for comprehensive checks, a PowerShell script for Windows environments, a baseline configuration file, and corresponding test suites. The review feedback highlights several areas for improvement in the Bash script, specifically regarding robust JSON parsing with jq, ensuring exit code consistency for network and format errors to match documentation, and correctly handling multi-character string joining in the final report.

Comment thread scripts/safety-gate.sh Outdated
Comment thread scripts/safety-gate.sh Outdated
Comment thread scripts/safety-gate.sh Outdated
Comment thread scripts/safety-gate.sh Outdated
Comment thread scripts/safety-gate.sh Outdated
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 17, 2026

Warning

Rate limit exceeded

@thebtf has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 0 minutes and 45 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 0 minutes and 45 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 23646294-5512-4d83-a963-69e1480345e1

📥 Commits

Reviewing files that changed from the base of the PR and between 011e1b8 and 495eea9.

📒 Files selected for processing (1)
  • scripts/safety-gate.sh

Walkthrough

Добавлен кросс‑платформенный "safety-gate": Bash и PowerShell скрипты, baseline JSON, тесты (Bash harness + Pester) и CI‑шаг для синтаксической и платформо‑специфичной проверки скриптов; реализованы проверки Vault API и опциональные Postgres‑проверки через Docker.

Changes

Cohort / File(s) Summary
CI Integration
.github/workflows/test.yml
Добавлен CI шаг "Safety gate self-check (scripts integrity)" который валидирует scripts/safety-gate-baseline.json (jq/Python) и запускает платформо‑специфичные проверочные скрипты (pwsh -NoProfile -File scripts/safety-gate.ps1 -Help на Windows, bash scripts/safety-gate_test.sh на Linux/macOS).
Documentation
scripts/README.md
Новая документация по workflow safety‑gate: запуск, флаги, переменные окружения (включая SAFETY_GATE_SKIP_PG), поведение фаз, exit‑коды, требования и примечания об отличиях PowerShell vs Bash.
Baseline Data
scripts/safety-gate-baseline.json
Новый baseline JSON с captured_at, server_url, fingerprint, ожидаемыми/пороговыми значениями счётчиков и Postgres‑настройками (контейнер/пользователь/БД).
PowerShell (Windows)
scripts/safety-gate.ps1, scripts/safety-gate.Tests.ps1
Новый PowerShell 7+ скрипт с параметрами (Baseline, Phase, Help) для проверки Vault API invariants (на Windows всегда пропускаются Postgres‑проверки); сопутствующий Pester v5 тестовый набор с моками Invoke‑RestMethod и сценариями ошибок/совпадений.
Bash (Unix)
scripts/safety-gate.sh, scripts/safety-gate_test.sh
Новый Bash скрипт с CLI (--baseline, --phase, --skip-pg) реализует Gate1 (Vault API) и опционально Gate2 (Postgres через Docker) с проверкой exact/min bounds; тест‑хarness с моками curl/docker и набором unit‑тестов.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Gate as safety-gate.sh
    participant Baseline as baseline.json
    participant Vault as Vault API
    participant Docker as Docker/Postgres

    User->>Gate: Запуск (ENGRAM_API_TOKEN, --baseline, --phase)
    Gate->>Baseline: Загрузить baseline
    Baseline-->>Gate: expected fingerprint, counts, thresholds

    rect rgba(200,150,255,0.5)
    Note over Gate,Vault: Gate 1 — проверка Vault API
    Gate->>Vault: GET /api/vault/status (Bearer token)
    Vault-->>Gate: fingerprint, credential_count, mismatch_count
    Gate->>Gate: Сравнить с baseline (фаза: pre/post)
    end

    rect rgba(100,200,150,0.5)
    Note over Gate,Docker: Gate 2 — опциональные проверки БД
    alt --skip-pg не указан
        Gate->>Docker: docker exec psql -> проверить наличие таблицы credentials
        Docker-->>Gate: существование таблицы / counts
        Gate->>Docker: SELECT COUNT(*) для каждой сущности
        Docker-->>Gate: фактические счётчики
        Gate->>Gate: Сравнить exact/min с baseline
    else --skip-pg указан
        Gate->>Gate: Пропустить Gate 2
    end
    end

    Gate->>User: Вывести результат и exit код (0/1/2)
Loading
sequenceDiagram
    participant User
    participant GatePS as safety-gate.ps1
    participant Baseline as baseline.json
    participant Vault as Vault API

    User->>GatePS: Запуск (ENGRAM_API_TOKEN, -Phase)
    GatePS->>Baseline: Загрузить baseline
    Baseline-->>GatePS: ожидаемые значения

    rect rgba(200,150,255,0.5)
    Note over GatePS,Vault: Проверка invariants (Windows — только Vault)
    GatePS->>Vault: Invoke‑RestMethod /api/vault/status
    alt Успешно
        Vault-->>GatePS: fingerprint, credential_count, mismatch_count
        GatePS->>GatePS: Сравнить с baseline, проверить mismatch_count ≤ max
    else Сеть/ошибка
        GatePS->>GatePS: exit 2 (unable to reach)
    end
    end

    GatePS->>User: Печать результата + exit код (0/1/2)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 В поле кода я прыгаю в такт,

baseline держу под лапкой в пакт,
Скрипт бдит, Vault шлёт ответ,
Инвариант храню — и тихо свет.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 36.84% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly references the safety-gate scaffolding and specific phases (1+2) and task numbers (T001-T007), clearly summarizing the main changes of adding tooling and scripts for engram v5 cleanup.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/v5-cleanup

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/safety-gate.sh`:
- Around line 164-170: The failure branch for the VAULT_RESPONSE curl call
currently exits with code 1 but must use exit code 2 to match the
documented/PowerShell semantics for configuration/precondition errors; update
the error handler in the curl block that sets VAULT_RESPONSE (the curl against
"${SERVER_URL}/api/vault/status" using ENGRAM_API_TOKEN) to call exit 2 instead
of exit 1 and keep the existing stderr messages intact.
- Around line 222-229: The use of an associative array via "declare -A
ENTITY_TABLES" is incompatible with macOS's default Bash 3.2; replace that
construct with a pair of indexed arrays (e.g., ENTITY_KEYS and ENTITY_VALUES) or
a single flat indexed array (key,value,key,value) and update all code that
references ENTITY_TABLES to iterate by index (or by stepping two at a time) to
map keys to values; specifically, replace the ENTITY_TABLES
declaration/population with the new indexed arrays and modify any loops or
lookups that use ENTITY_TABLES[...] (search for "ENTITY_TABLES" and the loop
that processes it) to use the indexed access pattern so the script works on Bash
3.2 without associative arrays.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0a171b2e-87c9-4976-9eb1-f41c5bf683dc

📥 Commits

Reviewing files that changed from the base of the PR and between 648ae09 and df66dd4.

📒 Files selected for processing (7)
  • .github/workflows/test.yml
  • scripts/README.md
  • scripts/safety-gate-baseline.json
  • scripts/safety-gate.Tests.ps1
  • scripts/safety-gate.ps1
  • scripts/safety-gate.sh
  • scripts/safety-gate_test.sh

Comment thread scripts/safety-gate.sh
Comment thread scripts/safety-gate.sh Outdated
- Use // empty / // 0 guards on all jq baseline loads (null-string risk)
- Change curl network-error exit 1 -> 2 (config/prereq error per contract)
- Add 2>/dev/null || true to jq vault-response parses (non-JSON safety)
- Change missing-fields response exit 1 -> 2 (prereq error, not invariant)
- Replace declare -A with parallel indexed arrays (Bash 3.2 / macOS compat)
- Update iteration loop to indexed access on ENTITY_KEYS/ENTITY_TABLES
- Replace IFS-based array join with printf (multi-char separator correctness)

Local smokes re-verified: safety-gate.ps1 against live vault exits 0.
bash -n syntax check: clean.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/safety-gate.sh`:
- Around line 75-77: Текущее присвоение PHASE="${1#--phase=}" принимает любое
значение; добавьте проверку после установки PHASE (и аналогично в блоке на
123-126), сравнивая PHASE с набором допустимых значений (например список фаз,
используемых в case/if в скрипте), и при несоответствии вывести понятную ошибку
и завершить выполнение (exit 1); используйте имя переменной PHASE и форму
аргумента --phase= для поиска места правки и убедитесь, что поведение по
умолчанию не продолжается при недопустимом значении.
- Around line 110-118: Validate the baseline JSON before extracting fields and
ensure required fields are present: first run a safe jq check to test JSON
validity (e.g., use jq -e '.' on "${BASELINE_FILE}" and handle non-zero exit
without breaking set -euo pipefail), then extract values for SERVER_URL,
VAULT_FP, VAULT_CRED_COUNT, VAULT_MISMATCH_MAX, PG_CONTAINER, PG_USER, PG_DB,
and POST_US3_CRED_COUNT with fallbacks (use "|| true" or test for empty) and
explicitly validate that mandatory fields like SERVER_URL, PG_CONTAINER,
PG_USER, and PG_DB are non-empty, exiting with a clear error if they are
missing; mirror the existing pattern used for vault responses (lines 178–184) to
avoid jq returning code 4 from invalid JSON.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0ce53531-0482-45cf-a035-bc49cbc9a911

📥 Commits

Reviewing files that changed from the base of the PR and between df66dd4 and 011e1b8.

📒 Files selected for processing (1)
  • scripts/safety-gate.sh

Comment thread scripts/safety-gate.sh
Comment thread scripts/safety-gate.sh
CodeRabbit Major findings (07:38 UTC post-initial-fix review):

1. --phase=X accepted any value silently. Added case/esac validator
   for allowed set {pre-us3|post-us3|auto}; unknown → die_config.

2. Baseline JSON not validated before jq -r extraction. Corrupt JSON
   makes jq return exit 4, which under set -euo pipefail bypasses our
   0/1/2 exit-code contract. Added jq -e preflight check.

3. Required baseline fields not checked for emptiness after load.
   Added [[ -n ]] guards for server_url, vault.fingerprint,
   vault.credential_count, and (when --skip-pg not set) postgres.{container,user,database}.

All 3 fixes stay within die_config (exit 2 = config error) per contract.
@thebtf thebtf merged commit e9c275d into main Apr 17, 2026
8 checks passed
thebtf added a commit that referenced this pull request Apr 17, 2026
… US1)

Migrations:
- 083_drop_session_observation_injections (was 081 in spec; IDs 081/082 taken)
- 084_drop_injection_log (was 082 in spec)
Both: DROP TABLE IF EXISTS with Rollback re-CREATE to original schema.

Go code removal (10 files, -625/+83 = net -542 LOC):
- internal/db/gorm/injection_log_store.go: all 7 methods + struct gone
- internal/db/gorm/scoring_store.go: sessionObservationInjection + 2 methods gone
- internal/worker/handlers_context.go: async LogInjections goroutine gone
- internal/worker/handlers_maintenance.go: cleanupInjectionLog + "injection_log" from derivedTables gone
- internal/worker/handlers_scoring.go: RecordSessionInjections call + handleGetSessionInjectedObservations + handleMarkCited gone
- internal/worker/service.go: /mark-cited and /injected-observations routes gone
- internal/worker/retrieval_helpers.go: lookupDiversityScores production path returns nil; test hook preserved
- internal/maintenance/hit_rate.go: analyzeHitRate stubbed to (0, nil) — full maintenance drop in US7
- internal/maintenance/service.go: adjustAdaptiveThresholds stubbed to (0, nil) — full maintenance drop in US7

Known stubs (remove in US7 when maintenance subsystem drops):
- Service.analyzeHitRate(ctx) → (0, nil)
- Service.adjustAdaptiveThresholds(ctx) → (0, nil)
Both have explanatory comments; callers in maintenance scheduler are in
US7 scope. Cleaner than dangling references for the window between
US1 and US7.

Verification (run locally in worktree off e9c275d):
- rg 'session_observation_injections|injection_log' --type go: only historical
  migration entries + CHANGELOG references; zero live-code hits
- go build ./...: clean
- go vet ./...: clean
- go test -short ./...: 40+ packages, all ok, 0 failures

Invariants preserved (not touched by this PR):
- Vault: internal/crypto/*, handlers_vault.go, tools_credential.go, VaultView.vue
- content table, credentials observation rows, all 6 static entities
- scripts/safety-gate.* (just landed in #178)
- spec/plan/tasks artifacts under .agent/specs/engram-v5-baseline/
thebtf added a commit that referenced this pull request Apr 17, 2026
… US1) (#179)

* chore(v5): drop orphan join tables + citation-tracking code (PR-v5-1, US1)

Migrations:
- 083_drop_session_observation_injections (was 081 in spec; IDs 081/082 taken)
- 084_drop_injection_log (was 082 in spec)
Both: DROP TABLE IF EXISTS with Rollback re-CREATE to original schema.

Go code removal (10 files, -625/+83 = net -542 LOC):
- internal/db/gorm/injection_log_store.go: all 7 methods + struct gone
- internal/db/gorm/scoring_store.go: sessionObservationInjection + 2 methods gone
- internal/worker/handlers_context.go: async LogInjections goroutine gone
- internal/worker/handlers_maintenance.go: cleanupInjectionLog + "injection_log" from derivedTables gone
- internal/worker/handlers_scoring.go: RecordSessionInjections call + handleGetSessionInjectedObservations + handleMarkCited gone
- internal/worker/service.go: /mark-cited and /injected-observations routes gone
- internal/worker/retrieval_helpers.go: lookupDiversityScores production path returns nil; test hook preserved
- internal/maintenance/hit_rate.go: analyzeHitRate stubbed to (0, nil) — full maintenance drop in US7
- internal/maintenance/service.go: adjustAdaptiveThresholds stubbed to (0, nil) — full maintenance drop in US7

Known stubs (remove in US7 when maintenance subsystem drops):
- Service.analyzeHitRate(ctx) → (0, nil)
- Service.adjustAdaptiveThresholds(ctx) → (0, nil)
Both have explanatory comments; callers in maintenance scheduler are in
US7 scope. Cleaner than dangling references for the window between
US1 and US7.

Verification (run locally in worktree off e9c275d):
- rg 'session_observation_injections|injection_log' --type go: only historical
  migration entries + CHANGELOG references; zero live-code hits
- go build ./...: clean
- go vet ./...: clean
- go test -short ./...: 40+ packages, all ok, 0 failures

Invariants preserved (not touched by this PR):
- Vault: internal/crypto/*, handlers_vault.go, tools_credential.go, VaultView.vue
- content table, credentials observation rows, all 6 static entities
- scripts/safety-gate.* (just landed in #178)
- spec/plan/tasks artifacts under .agent/specs/engram-v5-baseline/

* fix(pr-179): complete migration 084 rollback (Gemini finding)

Rollback was missing schema added by migration 066 (learning memory v3):
- cited BOOLEAN DEFAULT false column
- idx_injection_log_session_cited index on (session_id, cited)

Without these, a post-rollback database would lack schema that existed
immediately before DROP, breaking downstream migration replay.

* fix(pr-179): address CodeRabbit nitpicks + stale references

CodeRabbit-flagged:
- Delete injection_log_store.go tombstone (was 3-line package stub referencing dropped table; G1 no-stubs compliance)
- internal/maintenance/service.go: Task 20 + Task 24 comments updated to reflect no-op state + US7 handoff pointer

Additional stale references found during review sweep:
- internal/mcp/tools_recall.go: user-facing 'no hit rate analytics data' message replaced (injection_log no longer exists)
- internal/worker/handlers_data.go: same constant message replaced
- internal/worker/reaper/reaper.go: FK audit comment updated to note v5 US1 drop

---------

Co-authored-by: Kirill Turanskiy <thebtf@users.noreply.github.com>
@thebtf thebtf deleted the feat/v5-cleanup branch May 7, 2026 06:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant