Add web4-governance plugin for AI governance with R6 workflow #20448
dp-web4 wants to merge 5 commits into anthropics:main
7a344de to 1408cf8
Comment for PR #20448
Clarification: Scope, Foundations, and Positioning
Thanks to everyone reviewing this PR. Based on feedback from external reviewers, I wanted to clarify a few points about what this plugin is (and isn't), and where it fits in the broader landscape.
What This Is
The core contribution isn't any single element (audit logs, policy gates, trust metrics), but the combination of:
...implemented as a developer-portable, hook-based plugin rather than a platform-locked or enterprise-only system.
What This Isn't
To be explicit about scope:
We're building governance infrastructure, not claiming to solve alignment.
Foundational Research
This plugin implements concepts from the Web4 trust-native architecture. For deeper context on trust tensors, entity witnessing, coherence metrics, and the broader theoretical framework, see:
Web4 Whitepaper: https://dp-web4.github.io/web4/
The whitepaper covers:
How This Fits the Big Picture
Web4 Architecture provides the theoretical foundation: trust-native societies for humans and AI.
Governance Tiers define implementation depth:
Runtime Implementations demonstrate portability:
Competitive Context
For reviewers familiar with the space:
Our lane: lightweight, open, agent-native, intent-aware.
Summary
This is missing infrastructure, not speculative architecture. Happy to address specific questions or concerns.
Related: A parallel implementation exists for Moltbot using the same R6 framework, demonstrating portability across runtimes.
b0e3d68 to b590d13
…1-4)
Web4 governance plugin for Claude Code hooks: structured audit trails, trust tensors, entity witnessing, policy gating, and event streaming.
Tiers: observational audit (T1), policy presets and rate limiting (T1.5), signing and persistent witnesses (T2), multi-target extraction (T3), event stream monitoring (T4).
See plugins/web4-governance/README.md for full documentation.
PR: anthropics#20448
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
5709a7c to 8fb33f6
A corrupt session file (e.g. invalid control characters from an interrupted write) caused json.load() to raise JSONDecodeError on every subsequent tool call. Claude Code reported each as 'PreToolUse:Bash hook error / Failed with non-blocking status code: Traceback...', which was noisy and wasted tokens.
Patch: catch JSONDecodeError + OSError, rename the bad file to *.json.corrupt for forensics, and fall through to lazy session re-init. Future corruptions self-heal in one tool call instead of polluting every subsequent invocation.
Also removed the deprecated 'warn-git-push-no-pat' rule from ~/.web4/policies/: PAT auth is deprecated, and all dp-web4 remotes are SSH. The warning fired on every git push command, so it carried no signal. Kept the other 5 safety rules intact.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…bust too
Earlier patch only covered pre_tool_use.py's load path. Two more leaks:
1. post_tool_use.py had the same unprotected json.load(session_file), which crashed and reported 'PostToolUse:Bash hook error' on every tool call when the session file was bad.
2. Both hooks did a non-atomic 'with open(f, w)' + json.dump. If hook A is reading while hook B is writing, the reader can see a half-written file -> JSONDecodeError -> bad-file quarantine. THIS WAS THE ACTUAL ROOT CAUSE OF THE ORIGINAL CORRUPTION. Fixing the read path stopped the crash, but the corruption-on-write race kept generating new bad files.
Patch: add try/except (json.JSONDecodeError, OSError) -> quarantine to post_tool_use.load_session. Make all session writes atomic (write .tmp, os.replace -> session.json) in both hooks. The race is closed at the source, and any pre-existing corrupt files self-heal via quarantine.
Quarantined two more pre-existing corrupt session files manually.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
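The two fixes described in these commits (quarantine on a bad read, atomic replace on write) can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the function names `save_session` and `load_session`, their signatures, and the use of `tempfile.mkstemp` for a unique temp name are assumptions made for the example.

```python
import json
import os
import tempfile
from pathlib import Path


def save_session(path: Path, session: dict) -> None:
    """Write the session atomically: temp file in the same directory, then os.replace."""
    # A unique temp name (assumption) avoids two writers sharing one .tmp file.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(session, f)
            f.flush()
            os.fsync(f.fileno())
        # Readers see either the old file or the new one, never a partial write.
        os.replace(tmp_name, path)
    except BaseException:
        # Do not leave a stray temp file behind if the write failed.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise


def load_session(path: Path):
    """Return the parsed session, or None if it is missing or was quarantined."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return None  # no session yet; caller re-initializes lazily
    except (json.JSONDecodeError, OSError):
        # Keep the bad file for forensics as <name>.json.corrupt, then re-init.
        try:
            os.replace(path, path.with_name(path.name + ".corrupt"))
        except OSError:
            pass
        return None
```

Because the temp file lives in the same directory as the target, `os.replace` stays on one filesystem, which is what makes the rename atomic on POSIX systems.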
When a Web4 host LCT exists at ~/.web4/{hostname}/lct.json (bootstrapped
via web4_fleet_bootstrap.py), each session records a host_lct_witness
entry on session-start. This is the reverse direction of the fleet
bootstrap's own scan — the bootstrap records sibling identity systems
present on the host; sessions record the host LCT they started under.
Bidirectional witness graph for multi-factor identity.
Witness != vouch. The session is recording observation, not endorsement.
Cross-system convergence is the trust signal; divergence between
sessions on the same host is diagnostic.
Changes:
- WitnessRecord gains optional host_lct_fingerprint + salience_axis
  fields. Old records load without them (backward-compat).
- PolicyRegistry.witness_host_lct(session_id, host_lct_id, fingerprint,
  salience_axis): new method. Persists host_lct_witness records to
  ~/.web4/witnesses.jsonl alongside the existing session_witness and
  decision_witness types.
- session_start hook discovers ~/.web4/{hostname}/lct.json. If found,
  calls witness_host_lct and embeds {lct_id, fingerprint, machine,
  entity_type, observed_at} into the session JSON's host_lct_witness
  field. If not found (most installs), the field is None and sessions
  proceed normally.
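The backward-compatible record and the discovery step listed above might look roughly like this. It is a sketch under stated assumptions: only `host_lct_fingerprint`, `salience_axis`, and the `~/.web4/{hostname}/lct.json` layout come from the commit message; the other `WitnessRecord` fields, the helper names `record_from_dict` and `discover_host_lct`, and the use of `socket.gethostname()` are hypothetical.

```python
import json
import socket
from dataclasses import dataclass, fields
from pathlib import Path
from typing import Optional


@dataclass
class WitnessRecord:
    # Hypothetical pre-existing fields, for illustration only.
    witness_type: str
    session_id: str
    observed_at: str
    # New optional fields default to None, so older records still load.
    host_lct_fingerprint: Optional[str] = None
    salience_axis: Optional[str] = None


def record_from_dict(raw: dict) -> WitnessRecord:
    """Build a record from one parsed JSONL line, ignoring unknown keys."""
    known = {f.name for f in fields(WitnessRecord)}
    return WitnessRecord(**{k: v for k, v in raw.items() if k in known})


def discover_host_lct(web4_home: Path, hostname: Optional[str] = None) -> Optional[dict]:
    """Return the host LCT document if <web4_home>/<hostname>/lct.json exists, else None."""
    hostname = hostname or socket.gethostname()
    lct_path = web4_home / hostname / "lct.json"
    try:
        with open(lct_path, encoding="utf-8") as f:
            return json.load(f)
    except (OSError, json.JSONDecodeError):
        # Most installs have no host LCT; the session proceeds without a witness.
        return None
```

A session_start hook following this shape would call `discover_host_lct(Path.home() / ".web4")` and leave `host_lct_witness` as None when nothing is found.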
Salience-aware fingerprinting: the host LCT fingerprint is computed
salience-side by web4_fleet_bootstrap.py over the host LCT's
identity-stable fields. Routine ticks don't drift; only real identity
changes do. Each witness record carries salience_axis documenting what
the fingerprint hashes over, so the witness graph is self-describing.
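One way to picture a fingerprint that ignores routine ticks is to hash only a fixed set of identity-stable fields and to store the names of those fields as the salience axis. The sketch below is an assumption throughout: the commit message does not say which fields web4_fleet_bootstrap.py hashes, which hash function it uses, or how `salience_axis` is encoded, so `STABLE_FIELDS`, SHA-256, and the comma-joined axis string are illustrative choices.

```python
import hashlib
import json

# Hypothetical set of identity-stable fields; the real set is not given in the source.
STABLE_FIELDS = ("lct_id", "entity_type", "machine")


def fingerprint_host_lct(host_lct: dict, stable_fields=STABLE_FIELDS):
    """Return (fingerprint, salience_axis) computed over the stable fields only."""
    subset = {name: host_lct.get(name) for name in stable_fields}
    # Canonical JSON (sorted keys, fixed separators) gives a deterministic hash input.
    canonical = json.dumps(subset, sort_keys=True, separators=(",", ":"))
    fingerprint = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # The axis names what was hashed, so a reader of the witness record can tell.
    salience_axis = ",".join(sorted(stable_fields))
    return fingerprint, salience_axis
```

With this shape, changing a volatile field such as a tick counter leaves the fingerprint unchanged, while changing any field named in the axis produces a new one.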
Tests: 27/27 pass (25 existing + 2 new — test_witness_host_lct and
test_witness_host_lct_multiple_sessions). Validated end-to-end against
CBP's real host LCT 83810b44-2289-4c14-854f-ae5114f747cf.
Plugin manifest 1.0.0 → 1.1.0 (additive feature, semver minor bump).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Author email normalized to dp@metalinxx.io (single canonical contact)
- README: replace "no external dependencies / no network calls" with the accurate version (cryptography for Ed25519 signing; opt-in git fetch during git-push divergence checks)
- Add requirements.txt declaring cryptography>=41.0 (plugin still gracefully degrades if missing, but the dependency is now explicit)
- Remove meta-process markdown files that don't belong in the plugin:
  - web4-governance-issue.md (was the original issue body, in repo root)
  - plugins/web4-governance/PR_DESCRIPTION.md
  - plugins/web4-governance/FEATURE_REQUEST.md
  - plugins/web4-governance/HOOK_STDERR_NOTE.md
- Plugin documentation files retained: README, EVENT_STREAM_API, PRESETS, docs/RUST_CORE_PROPOSAL (those are real reference docs)
Test status: 91 passed, 0 failed against fresh ledger.db (pre-existing schema drift in long-lived local ~/.web4/ledger.db can cause 4 failures on legacy installs; not affected by this cleanup).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
6e41d8a to 36b8983
Refresh: rebased + cleaned for review
Rebased against current main and cleaned up for review. The diff is now plugin-only: 44 files, 13,064 additions, 0 deletions. The earlier 15K-line / 65-file appearance was a stale-base artifact from upstream workflow refactors that landed since this PR was opened in January; the rebase resolves it.
Cleanup in
Test status: 91 passed, 0 failed against a fresh
Two contextual data points worth reading alongside this PR
1. ARC-AGI-3 result. The same Claude Opus 4.6 anyone can rent today, given a Web4-style governance frame (identity + scoped authority + real-time policy evaluation + cryptographic audit), scored 94.85% on ARC-AGI-3, a benchmark where the same model in default context scores 0%. Public scorecard. No fine-tuning, no weight changes. The structure made the difference. This PR is the developer-portable surface of that same structure: the part that lets any Claude Code session capture R6 records and apply policy gates without hardware bindings.
2. Framing relative to the Microsoft Agent Governance Toolkit (April 2026). That toolkit is runtime policy enforcement: governance for what agents do. This PR is the upstream identity and accountability ontology: governance for what agents are. They are complementary, not competing. A runtime governance toolkit consumes an identity ontology underneath; that's the layer this PR contributes to Claude Code specifically.
Happy to decompose this into smaller PRs by capability (R6 audit logging, policy hooks, hash-chain provenance, MCP witnessing) if that helps reviewability. Say the word and I'll split it.
Web4 Governance Plugin for Claude Code
Lightweight AI governance with T3 trust tensors, entity witnessing, and R6 audit trails.
Note: "web4" = trust-native internet infrastructure for the AI agent era (cryptographic provenance, verifiable
accountability). Generic descriptor, not a trademark claim.
R6 = Rules + Role + Request + Reference + Resource → Result (structured audit record format)
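As a rough illustration of the R6 format with hash-linked provenance, each record can carry the six R6 fields plus the hash of the record before it, so that editing any earlier record breaks the chain. This is a sketch, not the plugin's schema: the `R6Record` class, the string-typed fields, the `GENESIS` marker, and the choice of SHA-256 over canonical JSON are all assumptions.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from typing import List, Optional

GENESIS = "0" * 64  # hypothetical prev_hash for the first record in a chain


@dataclass(frozen=True)
class R6Record:
    rules: str
    role: str
    request: str
    reference: str
    resource: str
    result: Optional[str]
    prev_hash: str  # digest of the preceding record, or GENESIS

    def digest(self) -> str:
        # Canonical JSON keeps the hash stable regardless of field ordering.
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append(chain: List[R6Record], **r6_fields) -> R6Record:
    """Append a record linked to the current tail of the chain."""
    prev = chain[-1].digest() if chain else GENESIS
    record = R6Record(prev_hash=prev, **r6_fields)
    chain.append(record)
    return record


def verify(chain: List[R6Record]) -> bool:
    """Check that every record points at the digest of its predecessor."""
    expected = GENESIS
    for record in chain:
        if record.prev_hash != expected:
            return False
        expected = record.digest()
    return True
```

A pre-tool hook would append a record with `result=None` when the request is made, and a post-tool hook would append the completed record, so the audit trail shows both intent and outcome.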
Features
- Entity Trust - T3/V3 tensors (6D each) for MCP servers, agents, references
- Witnessing - Bidirectional trust flow through observation
- R6 Workflow - Formal intent→action→result with hash-linked provenance
- Rust Backend - (auto Python fallback)
- Trust Decay - Unused entities decay toward neutral over time
Components
- governance/ - Trust tensors, witnessing, R6 ledger, session management
- hooks/ - session_start, pre/post_tool_use, heartbeat
- web4-trust-core/ - Rust crate with PyO3 + WASM bindings
Test Plan
- Entity trust + witnessing (12 tests passing)
- Rust backend verification + Python fallback
- Real session integration
See README.md for full documentation.