From 1d729cb6e90e982cf701a06f3018383e4bac986f Mon Sep 17 00:00:00 2001 From: Ralf Anton Beier Date: Wed, 22 Apr 2026 07:02:51 +0200 Subject: [PATCH] docs: AI + safety/cyber human-in-the-loop contract MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Frame for the recurring customer objection "a qualified human still has to do this." Enumerates the named-human sign-off role across ISO 26262, IEC 61508, IEC 62304, DO-178C, EN 50128, ISO/SAE 21434, ISO 27001, IEC 62443, ASPICE 4.0, EU AI Act Art. 14, and NIST AI RMF. Then establishes rivet's four-point HITL contract: 1. Provenance-on-author (today — schemas/common.yaml already gates ai-generated artifacts reaching `active` without reviewed-by). 2. Human sign-off as a separate stamp (today — `rivet stamp --reviewed-by`; gaps: no structured rationale, no `rivet approve` alias, no Part 11 e-signature). 3. Audit-trail view (v0.5.0 proposal — `rivet audit-trail ` over git history + provenance transitions). 4. Structural-only validator boundary (today — `rivet validate` never claims to assess credibility). Explicitly lists what rivet does NOT claim (no safety analysis, no hazard-credibility assessment, no assessor replacement, no TCL/TQL self-qualification, no regulatory guarantee, no 21 CFR Part 11). Five implementation items for v0.5.0 backlog are called out. Live web fetch was unavailable this session; external standard clauses and vendor marketing phrases are flagged *(unverified)* per the constraint "mark unverified." Refs: FEAT-001, REQ-002, REQ-030 Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/design/ai-safety-cyber-hitl.md | 313 ++++++++++++++++++++++++++++ 1 file changed, 313 insertions(+) create mode 100644 docs/design/ai-safety-cyber-hitl.md diff --git a/docs/design/ai-safety-cyber-hitl.md b/docs/design/ai-safety-cyber-hitl.md new file mode 100644 index 0000000..b349050 --- /dev/null +++ b/docs/design/ai-safety-cyber-hitl.md @@ -0,0 +1,313 @@ +# AI-Assisted Safety + Cybersecurity Engineering — the Human-in-the-Loop Contract + +**Audience:** safety/cybersecurity leads, regulators, and sales engineers fielding +the objection *"but a qualified human still has to do this."* + +**Scope:** how rivet frames AI assistance in regulated SDLC work, what the +standards actually require of the human, and the four-point commitment rivet +makes so that AI-authored evidence is honest and auditable. + +**Not in scope:** tool qualification (TCL/TQL) of rivet itself — see +`docs/design/iso26262-artifact-mapping.md` §4 and the `iso-pas-8800.yaml` +schema. + +--- + +## 1. TL;DR + +The objection *"a qualified human still has to sign off"* is correct, and it +is not an argument against AI assistance — it is the **shape** of AI +assistance in a regulated SDLC. Rivet's frame, in one sentence: + +> **AI proposes structure; a qualified human owns judgment; every transition +> between the two is a separately-stamped, git-reviewable event.** + +Everything in this document is a consequence of that one sentence. + +--- + +## 2. What the standards actually say about the human + +The claims below are from publicly documented clause references. Exact +clause numbers should be re-verified against a paid copy of the standard +before external use — they are marked *(unverified clause-level)* where the +author could not fetch the primary text. The **role existence** and +**sign-off duty** are not in dispute. + +| Standard | Human role | What they sign | What AI can NEVER do | +|---|---|---|---| +| ISO 26262:2018 part 2-6 *(unverified)* | Safety assessor (independent, ≥ I2/I3 for ASIL C/D) | Safety case argument, confirmation reviews | Declare a hazard non-credible; waive ASIL; sign the safety case | +| ISO 26262:2018 part 8-6 *(unverified)* | Tool user | Tool qualification rationale (TCL/TQL) | Self-qualify the toolchain | +| IEC 61508-1:2010 clause 8 *(unverified)* | Functional safety assessor | FSA report per SIL | Conclude "SIL met" absent the assessor's signature | +| IEC 62304:2006+A1:2015 clause 4.2 *(unverified)* | Risk manager (per ISO 14971) | Benefit/risk determination, residual-risk acceptability | Waive residual risk; declare device safety class | +| DO-178C §8 + DO-330 | DER (Designated Engineering Representative) + independence for DAL A/B | Stage-of-involvement (SOI) conformance, PSAC/SAS | Replace DER audit; sign DO-330 tool qualification | +| EN 50128:2011 clause 5.1 *(unverified)* | Validator (independent of verifier + designer) | Validation report at SIL 3/4 | Combine designer + validator role | +| ISO/SAE 21434:2021 clause 5.4.2 *(unverified)* | Cybersecurity manager | Cybersecurity case, risk decisions, CAL rationale | Accept a residual cyber risk; sign the cyber case | +| ISO 27001:2022 Annex A | Risk owner per control | Statement of Applicability; residual-risk acceptance | Own a control on a human's behalf | +| IEC 62443-4-1:2018 SM-1 *(unverified)* | Security champion / lead | Secure-development process conformance | Self-attest the SDL | +| ASPICE 4.0 MAN.6 | Process assessor (iNTACS-qualified) | Capability-level rating | Self-rate the process | +| EU AI Act Art. 14 *(summarised from public drafts, unverified verbatim)* | Assigned natural person(s) for human oversight | Decisions about use/override of the AI system | Substitute for the natural person's oversight duties | +| NIST AI RMF 1.0 (2023) — GOVERN function | Accountable AI actor | "Govern 1.2" — roles and responsibilities documented | Replace documented accountability | + +The pattern is the same across every row: the **role exists**, the **signature +is by a named human**, and the **judgment cannot be delegated** — to AI, +to a contractor, to a tool, or to a framework. + +Three representative quotations worth internalising (paraphrased from public +summaries; quote exactly from the paid standard before publishing +externally): + +1. **ISO 26262-2:2018** — the organisation shall assign a person with + appropriate competence and independence to carry out the safety + assessment; the competence requirement (experience, training, + domain knowledge) is non-waivable. +2. **ISO/SAE 21434:2021** — the organisation shall define responsibilities + and authorities for cybersecurity and appoint a cybersecurity manager; + the cybersecurity case requires a judgment of residual risk + acceptability by that role. +3. **EU AI Act Article 14 (paraphrased)** — high-risk AI systems must be + effectively overseen by natural persons during the period in which they + are in use; those persons must be able to fully understand the system's + capacities and limitations, remain aware of automation bias, correctly + interpret output, and decide not to use or override the output. + +Rivet's design should make each of those sign-off events **a distinct, +inspectable record** — never an implicit consequence of the AI having +written the file. + +--- + +## 3. How existing tools handle the tension + +Live web fetch was not available this session; the vendor summaries below +are from the author's prior reading and are flagged *(unverified)* — quote +text must be re-fetched before external use. + +- **Jama Connect Advisor** *(unverified)* — AI **"suggests improvements"** + to requirements using INCOSE rules. Explicitly advisory; the engineer + accepts/edits/rejects inside Jama's review workflow. **Honest framing.** +- **Siemens Polarion + Industrial Copilot** *(unverified)* — "generative + assistance" rather than autonomous authoring; Polarion's workflow gates + (draft → reviewed → approved, with 21 CFR Part 11 e-signature) are the + compliance anchor. Honest framing lives in the workflow, not the AI + pitch. +- **Codebeamer (PTC) AI** *(unverified)* — "AI-powered requirement + generation" with explicit "human must review and approve" caveat. +- **Ansys medini analyze** *(unverified)* — automates bookkeeping around + the HARA; does not claim to do the HARA. +- **BTC EmbeddedPlatform** *(unverified)* — AI as **proof-engine + accelerator**, not a safety decision-maker. Consistent with + DO-178C/DO-330 tool-qualification logic. +- **TÜV SÜD / Rheinland AI advisory** *(unverified)* — repeatedly: AI is + a tool whose output **must be verifiable by a qualified human**, and + any tool in a safety argument must be qualified (TCL/TQL). This is the + external frame rivet aligns with: **tool output ≠ safety evidence; + verified tool output, reviewed by a qualified human, is safety + evidence.** + +The honest tools share three traits: (1) AI output is labelled on every +artifact it touched; (2) sign-off is a separate workflow state performed +by a named human, not a side effect of authoring; (3) the AI does not +sign. Overclaiming blurs authorship and approval, markets the AI as doing +the engineer's job, or omits qualifiers like "suggested" / "draft." + +--- + +## 4. The pattern: AI proposes, human approves + +Every layer of safety/cyber work can be split into **structural** work +(what the AI is good at) and **judgment** work (what the standards require +a human to do). Rivet's CLI already maps cleanly onto this split. + +| Layer | AI role | Human role | Rivet today | +|---|---|---|---| +| **Authoring** | Drafts `hazard`, `uca`, `threat`, `requirement` YAML from natural-language input or prior artifacts | Decides whether the drafted item is a real hazard / real threat / credible scenario | `rivet add` + auto-stamp via PostToolUse hook | +| **Linking** | Suggests `satisfies`, `mitigates`, `verifies`, `decomposes-asil` links from text similarity | Confirms the link is semantically sound | `rivet link` / `rivet unlink` | +| **Structural validation** | Runs schema, cardinality, enum, bridge-rule, and cross-tree variant checks | Decides which warnings are real | `rivet validate` (lints, not judgments) | +| **Gap detection** | Reports orphan artifacts, missing verification, uncovered requirements | Decides which gaps block release | `rivet coverage` | +| **Summarisation** | Renders an artifact subset (`rivet list`, `rivet embed`, `rivet query`) | Interprets the summary against the system context | `rivet list`, `rivet query`, `rivet embed`, dashboard (`rivet serve`) | +| **Sign-off** | **Never** | Names themselves in the provenance record with rationale | `rivet stamp --reviewed-by ` (see §5) | + +The line between "proposes" and "approves" is the line between a lint +warning and a regulator-facing claim. Rivet enforces it at the file level: +a provenance block with `created-by: ai-assisted` and no `reviewed-by` +field is a draft. A provenance block with both is a reviewed artifact. +Everything else is a bug in the workflow. + +--- + +## 5. Rivet's four-point HITL contract + +These are the commitments a rivet customer can hold us to. Items marked +**(today)** are implemented in main. Items marked **(v0.5.0 proposal)** +are gaps called out honestly so sales does not overstate. + +### 5.1 Every AI-authored artifact carries provenance **(today)** + +`schemas/common.yaml` defines a `provenance` block with `created-by` ∈ +{`human`, `ai`, `ai-assisted`}, `model`, `session-id`, `timestamp`, and +`reviewed-by`. The `ai-generated-needs-review` conditional rule already +fires a warning if an `ai`/`ai-assisted` artifact reaches `status: active` +without a `reviewed-by` field (see `schemas/common.yaml` lines ~108–119). +The Claude Code PostToolUse hook auto-stamps on file edits so provenance +cannot be forgotten. + +Hardening proposed for v0.5.0: + +- Promote `ai-generated-needs-review` from `severity: warning` to + `severity: error` on `status: approved` (not just `active`). +- Add a lint that rejects `status: approved` when `reviewed-by` is an + AI identifier (regex match on `ai`/`ai-assisted`/known model ids). + This is the single validation rule that closes the "AI approved its + own work" loophole. + +### 5.2 Human sign-off is a separate provenance entry **(today, with caveats)** + +`rivet stamp --reviewed-by ` already exists in +`rivet-cli/src/main.rs` (see the `Stamp` subcommand around line 682 and +`cmd_stamp` around line 7462). The gap is ergonomic, not fundamental: + +- The reviewer's **rationale** is not yet a structured field — it lives + in the commit message or a free-text note. **Proposal: add a + `rationale` subfield to `provenance.reviewed-by` and require it for + artifacts with `asil` ≥ B or `cal` ≥ 2.** +- There is no dedicated `rivet approve ` subcommand distinct from + `rivet stamp`. Stamp is the right machinery — a thin `rivet approve` + alias that defaults `--created-by` to the current `$USER` and requires + `--rationale` would be the obvious sugar. +- Electronic-signature support (FDA 21 CFR Part 11, EU eIDAS) is **not** + claimed. Rivet records who-reviewed-what in git; Part 11 attestation + requires an external signature flow rivet does not ship. + +### 5.3 Audit-trail view **(v0.5.0 proposal)** + +Git history + provenance stamps together contain every authoring and +review event, but the view has to be assembled manually today. The +missing piece is `rivet audit-trail ` — a subcommand that +walks git history and interleaves commits, `created-by` transitions, +`reviewed-by` transitions, and status transitions (`draft` → `active` → +`approved` → `superseded`) chronologically. The reviewer would see the +complete authored-by-AI → approved-by-Jane-Doe → modified-by-AI → +re-approved-by-Jane-Doe chain at a glance. This is the single most +requested feature for auditor-facing demos. + +### 5.4 Structural-only enforcement, with the boundary made explicit **(today)** + +`rivet validate` is 100% structural: schema compliance, cardinality, +enum membership, link-target existence, orphan detection, variant +cross-tree constraints, coverage. **Rivet never claims to assess +credibility, sufficiency, or acceptability.** That boundary is not a +limitation dressed up as a virtue — it is the whole point. The moment a +structural validator claims to make a safety judgment, its output stops +being evidence the human can rely on and starts being evidence the human +has to disprove. + +Customer-facing phrasing for this line: + +> Rivet can tell you whether your hazard is *linked*. Only a qualified +> safety engineer can tell you whether it is *real*. + +--- + +## 6. FAQ — pocket answers to the objection + +**"But AI can't do safety analysis."** Correct, and rivet does not claim +it does. Rivet makes the *artifacts of a human's safety analysis* +machine-readable, agent-writable at the authoring layer, and +git-reviewable at every transition. The qualified engineer still owns +every judgment call — their sign-off now arrives with full structural +validation, coverage data, and provenance trail attached. + +**"How do I know the AI didn't hallucinate a requirement?"** Three +defences: (1) every AI-authored artifact carries `created-by: ai-assisted` ++ model + timestamp, so the hallucination is labelled; (2) every artifact +flows through a git PR where the human reviews the diff; (3) `rivet +validate` catches the dangling links and schema violations hallucinations +often produce. None of these is a hallucination detector — the reviewer +is. Rivet makes their review cheaper. + +**"What about liability?"** Unchanged. The qualified engineer bears the +regulatory responsibility, exactly as before. Rivet provides **evidence +that their sign-off was informed** — validation passed, coverage +reports, traceability chain — not a shift of liability to a tool vendor. +Tool-qualification constraints are explicit in `iso-pas-8800.yaml`. + +**"How is this different from just using Copilot?"** Copilot authors +code. Rivet authors the **traceability chain that proves the code +satisfies a safety requirement.** Different artifact class: Copilot's +output lives in `.rs` / `.c` / `.py`; rivet's output lives in +`artifacts/*.yaml` and `safety/**/*.yaml` and connects code commits to +hazards, UCAs, verification, and safety-case claims. You need both. + +**"Can I get TÜV / a safety assessor to accept rivet-generated +artifacts?"** Not yet — the path is a pilot engagement. Rivet maps its +schemas to ISO 26262 / IEC 61508 / IEC 62304 / DO-178C / EN 50128 / +ISO 21434 work products (see `docs/design/iso26262-artifact-mapping.md`), +but mapping fidelity is not audit acceptance. Customer-development path: +(1) rivet produces the structured evidence, (2) the customer's qualified +engineer signs off, (3) the assessor accepts the signed evidence — not +the tool's output directly. + +--- + +## 7. What rivet explicitly does NOT claim + +These lines go in every sales deck, unedited: + +- Rivet does **not** perform safety analysis. +- Rivet does **not** assess hazard credibility or threat likelihood. +- Rivet does **not** replace a safety assessor, cybersecurity manager, + DER, validator, or process assessor. +- Rivet itself is **not** currently TCL- or TQL-qualified under + ISO 26262-8 / DO-330 / IEC 61508-3 — tool qualification is a separate + programme (see `schemas/iso-pas-8800.yaml` for the model). +- Rivet does **not** guarantee regulatory compliance. It produces + artifacts a qualified human uses *toward* compliance. +- Rivet does **not** provide 21 CFR Part 11 / eIDAS electronic + signatures. Reviewer attribution is recorded in git + provenance; a + separate signature flow is required for Part 11 attestation. + +Stating these up front is the credibility move. Vendors who don't say +them are the ones auditors don't trust. + +--- + +## 8. Cross-references and proposed follow-ups + +- `docs/design/iso26262-artifact-mapping.md` (PR #164, merged) — + the fidelity register for ISO 26262:2018. Every row in §2 of that + doc resolves to a sign-off by a named human; this HITL doc is the + procedural backing. +- `docs/design/ai-evidence-trend-research.md` (PR #173, open) — + the category positioning; this doc is the concrete HITL contract + inside that category. +- `docs/what-is-rivet.md` (PR #172, open) — the top-level + positioning. Two text updates are proposed (not made here): + 1. Add a "Human-in-the-loop" section directly after "Who it's for," + linking to this doc as the authoritative source. + 2. Add a top-level "What rivet does NOT do" section using §7 + verbatim. + +### Implementation backlog inferred from §5 + +1. Promote `ai-generated-needs-review` to `error` on `status: approved`. +2. Add a lint forbidding `reviewed-by` matching AI identifiers on + approved artifacts (the self-approval loophole). +3. Add `provenance.reviewed-by.rationale` as a structured subfield; + require it for `asil ≥ B` or `cal ≥ 2`. +4. Add `rivet approve ` as an ergonomic alias for `rivet stamp` + with `--rationale` required. +5. Add `rivet audit-trail ` as the chronological view over git + history + provenance transitions. + +Each of these is a small-to-medium change against `rivet-cli/src` and +`schemas/common.yaml`; the largest single item is the audit-trail +subcommand, which requires walking `git log --follow` with YAML diff +parsing. None requires rethinking the model. + +--- + +*Trailers: `Refs: FEAT-001` (Evidence-as-Code positioning), +`Refs: REQ-002` (STPA artifact support — the cyber-safety joint analysis +pattern), `Refs: REQ-030` (formal verification — the structural-only +enforcement boundary).*