docs: ADR-005 through ADR-013 (feedback, flags, knowledge, DoD, security, recovery, conflicts, knowledge stack, validation pyramid) #144
Merged
11 commits
- c2f3b71 docs: ADR-005 feedback loop, ADR-008 definition of done, ADR-009 secu… (scottschreckengaust)
- 567b020 docs: ADR-006 feature flags, ADR-007 knowledge acquisition, ADR-010 e… (scottschreckengaust)
- d7b318d docs(ADR-003): GraphQL dependency graph as authoritative, assignments… (scottschreckengaust)
- c0fc069 docs(AGENTS.md): add governance directive — issue required before imp… (scottschreckengaust)
- 2115040 Merge branch 'main' into feat/adr-005-008-009 (scottschreckengaust)
- 4d33352 docs: recover ADR-012, ADR-013, and ADR-003 enforcement (lost in rebase) (scottschreckengaust)
- 89d402f docs: address PR #144 review feedback (ADR-006, -007, -009) (scottschreckengaust)
- 07b4e9e Merge branch 'main' into feat/adr-005-008-009 (scottschreckengaust)
- e937898 Merge branch 'main' into feat/adr-005-008-009 (scottschreckengaust)
- 5904ae3 docs: address reviewer feedback — status, decoupling, transition clause (scottschreckengaust)
- eb9f7fa Merge branch 'main' into feat/adr-005-008-009 (krokoko)
@@ -0,0 +1,68 @@
# ADR-005: Feedback loop — PR reviews propagate to issues and ADRs

**Status:** proposed
**Date:** 2026-05-19

## Context

PR review comments are addressed locally (fix the code) but systemic issues they reveal are not propagated upstream. A reviewer says "this approach is wrong" but the issue still says "use this approach." ADRs are treated as immutable when they should be living decisions that evolve with implementation experience.

Without a feedback protocol, review insights are lost, issue bodies rot, and architectural mistakes persist across stacked PR chains.

## Decision

### Review comment classification

| Type | Action | Propagates to |
|------|--------|---------------|
| Nit (style, naming) | Fix in PR | Nothing |
| Bug (logic error) | Fix in PR | Nothing (unless systemic) |
| Design concern | Pause PR; evaluate | Issue body |
| Architecture challenge | Pause PR; escalate | ADR (supersede? amend?) |
| Scope question | Clarify | Issue body |
| Blocker (won't approve as-is) | Pause PR | Issue body |
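
The classification table can be read as a lookup from comment type to action and propagation target. The following is a minimal sketch of that reading, not part of the ADR: the names `CommentType`, `Routing`, and `route` are hypothetical, and sending a systemic bug to the issue body is an assumption, since the table only says "unless systemic" without naming a target.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CommentType(Enum):
    NIT = "nit"
    BUG = "bug"
    DESIGN_CONCERN = "design concern"
    ARCHITECTURE_CHALLENGE = "architecture challenge"
    SCOPE_QUESTION = "scope question"
    BLOCKER = "blocker"


@dataclass(frozen=True)
class Routing:
    action: str
    propagates_to: Optional[str]  # None means nothing is propagated upstream
    pauses_pr: bool


# One entry per row of the classification table.
ROUTING = {
    CommentType.NIT: Routing("fix in PR", None, False),
    CommentType.BUG: Routing("fix in PR", None, False),
    CommentType.DESIGN_CONCERN: Routing("pause PR; evaluate", "issue body", True),
    CommentType.ARCHITECTURE_CHALLENGE: Routing("pause PR; escalate", "ADR", True),
    CommentType.SCOPE_QUESTION: Routing("clarify", "issue body", False),
    CommentType.BLOCKER: Routing("pause PR", "issue body", True),
}


def route(comment_type: CommentType, systemic: bool = False) -> Routing:
    """Look up the table row for an already classified review comment."""
    routing = ROUTING[comment_type]
    # Assumption: a systemic bug is propagated to the issue body. The table
    # does not say where it goes, only that it is an exception to "Nothing".
    if comment_type is CommentType.BUG and systemic:
        return Routing(routing.action, "issue body", routing.pauses_pr)
    return routing
```

Classifying the comment in the first place remains a human (or agent) judgment; the sketch only covers what happens after a type has been chosen.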

### Upstream propagation

When a review surfaces a design concern or architecture challenge:

1. **Pause** — Do not force-merge. Do not continue stacked PRs above this one.
2. **Assess** — Does this invalidate the issue's approach? The ADR's decision?
3. **Propagate** — Update the relevant upstream document (issue body, ADR, stacked PR dependents).
4. **Resolve** — Revise the approach, defend with evidence, or cancel the work.
5. **Resume** — Once resolved, unblock the PR and dependents.

### ADR evolution

| Trigger | Response |
|---------|----------|
| Implementation reveals the decision doesn't work | New RFC proposing a successor ADR |
| Reviewer challenges the architectural premise | `**UNRESOLVED**` on the issue; pause |
| New information makes the decision obsolete | Successor ADR with `Supersedes: ADR-NNN` |
| Decision works but needs refinement | Amend via PR (minor, no new ADR) |

Never silently ignore a challenged decision.

### Stacked PR chain revision

When feedback on PR N invalidates PRs N+1 through N+M:
1. Comment on all affected PRs
2. Do not rebase dependent PRs until the base is stable
3. If architectural: re-evaluate whether the remaining stack is valid
4. If redesign needed: close dependent PRs, revise issue, re-plan
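
The chain revision steps above can be sketched as a small planning helper. This is an illustration under stated assumptions: the stack is modeled as a plain list of PR numbers ordered from base to tip, the function names are hypothetical, and the returned strings stand in for actions a person or agent would carry out.

```python
from typing import List


def affected_dependents(stack: List[int], invalidated_pr: int) -> List[int]:
    """Return the PRs stacked above `invalidated_pr`, nearest first.

    `stack` lists PR numbers from the base of the chain to the tip.
    """
    position = stack.index(invalidated_pr)  # ValueError if the PR is not in the stack
    return stack[position + 1:]


def revision_plan(
    stack: List[int],
    invalidated_pr: int,
    architectural: bool = False,
    redesign_needed: bool = False,
) -> List[str]:
    """Turn the four chain-revision steps into an ordered list of actions."""
    dependents = affected_dependents(stack, invalidated_pr)
    steps = [f"comment on PR #{pr}" for pr in dependents]
    steps.append("hold rebases of dependent PRs until the base is stable")
    if architectural:
        steps.append("re-evaluate whether the remaining stack is valid")
    if redesign_needed:
        steps += [f"close PR #{pr}" for pr in dependents]
        steps += ["revise issue", "re-plan"]
    return steps
```

For a hypothetical stack `[101, 102, 103]` with feedback on PR 101, `revision_plan([101, 102, 103], 101, redesign_needed=True)` lists comments on 102 and 103 first, then the rebase hold, then the closures and re-planning.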

## Consequences

- (+) Review insights propagate to architectural decisions
- (+) Issue bodies stay current with implementation learnings
- (+) ADRs evolve rather than silently becoming outdated
- (+) Stacked PR chains have a defined recovery protocol
- (-) Adds process overhead to reviews (classification step)
- (-) Pausing stacked chains delays delivery
- (!) Requires discipline to actually propagate feedback upstream

## References

- Issue #136 — full RFC with open questions
- ADR-003 — governance (issue body as source of truth)
- ADR-001 — stacked PRs (chain revision protocol)
@@ -0,0 +1,82 @@
# ADR-006: Feature flags for concurrent development

**Status:** proposed
**Date:** 2026-05-19

## Context

Multiple agents working on related features in the same area must serialize — one waits for the other to merge. Incomplete features either block the main branch or require long-lived branches that diverge. SRE needs kill switches without reverting commits.

Feature flags enable trunk-based development where incomplete work merges safely behind toggles, and concurrent contributors avoid blocking each other.

## Decision

### When to use flags

| Situation | Use a flag? |
|-----------|-------------|
| Feature spans multiple PRs, incomplete state is unsafe | Yes |
| Two contributors touch the same module for different purposes | Yes |
| SRE needs a kill switch for a new capability | Yes |
| Simple refactor with no behavioral change | No |
| Bug fix | No |
| One-PR feature, complete on merge | No |

### Flag ownership

- Every flag has an owner (the issue that introduced it)
- Every flag has an expiration (the issue/PR that removes it)
- Flags without a removal plan are rejected in review

### Separation of concerns

- **Planners** decide which features get flags (issue/RFC level)
- **Implementors** add/use flags in code (PR level)
- **SRE/operators** toggle flags in production (runtime level)
- **No self-approval** — the person who introduces a flag cannot approve its removal

### Flag lifecycle

1. **Proposed** — issue identifies the need for a flag
2. **Introduced** — PR adds the flag (default: off)
3. **Active** — feature behind flag is in development
4. **Verified** — feature complete, flag toggled on in testing
5. **Permanent** — flag removed, feature is always-on (or removed entirely)
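
One way to picture the five lifecycle stages is as a forward-only state machine. This is a sketch, not something the ADR prescribes: `FlagState` and `advance` are hypothetical names, and the rule that a flag never moves backward is an assumption, since the list gives an order but does not say whether a verified flag may return to active.

```python
from enum import Enum


class FlagState(Enum):
    PROPOSED = 1    # issue identifies the need for a flag
    INTRODUCED = 2  # PR adds the flag, default off
    ACTIVE = 3      # feature behind the flag is in development
    VERIFIED = 4    # feature complete, flag toggled on in testing
    PERMANENT = 5   # flag removed from the code


def advance(state: FlagState) -> FlagState:
    """Move a flag to the next lifecycle stage; PERMANENT is terminal."""
    if state is FlagState.PERMANENT:
        raise ValueError("flag already removed; no further transitions")
    # Assumption: stages are strictly sequential, so the next stage is value + 1.
    return FlagState(state.value + 1)
```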

### Lifecycle metadata

Each flag must track:

| Field | Required | Source |
|-------|----------|--------|
| Flag name | Yes | Code constant |
| Purpose / linked issue | Yes | Issue reference |
| First merge date | Yes | Auto from git log |
| Max lifetime | Yes | Declared at creation (default: 4 weeks) |
| Expected removal date | Yes | first_merge + max_lifetime |
| Actual removal date | — | Auto when flag deleted |
| Days active | — | Computed |
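
The metadata table maps naturally onto a small record type with two derived fields. The sketch below is illustrative only: `FlagRecord` and its attribute names are hypothetical, the flag name and issue reference in the usage note are placeholders, and the first merge date is passed in directly rather than read from git log. It follows the table's formula (`first_merge + max_lifetime`) for the expected removal date; the "Maximum lifetime" section words the window relative to verification, and the sketch does not try to reconcile the two.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

DEFAULT_MAX_LIFETIME = timedelta(weeks=4)  # default stated in the table


@dataclass
class FlagRecord:
    name: str                     # code constant
    linked_issue: str             # purpose / linked issue
    first_merge: date             # in practice derived from git log
    max_lifetime: timedelta = DEFAULT_MAX_LIFETIME
    actual_removal: Optional[date] = None  # set once the flag is deleted

    @property
    def expected_removal(self) -> date:
        """Expected removal date = first_merge + max_lifetime."""
        return self.first_merge + self.max_lifetime

    def days_active(self, today: date) -> int:
        """Days from first merge to removal, or to `today` if still present."""
        end = self.actual_removal or today
        return (end - self.first_merge).days

    def is_stale(self, today: date) -> bool:
        """True when the flag still exists past its expected removal date."""
        return self.actual_removal is None and today > self.expected_removal
```

For example, `FlagRecord("ENABLE_EXAMPLE_FEATURE", "#000", date(2026, 5, 19))` has an expected removal date of 2026-06-16 under the four-week default.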

### Maximum lifetime

Flags must be removed within the declared max lifetime (default: 4 weeks) of the feature being verified. The max lifetime can be overridden per-flag with justification in the issue. Stale flags are treated as technical debt and surfaced in periodic reviews.

### Mechanism constraint

Flags MUST be resolvable at synth time for infrastructure flags and at runtime for behavior flags. The specific storage mechanism (CDK context, DynamoDB, SSM Parameter Store, env vars) is context-dependent and follows from this split — it is not prescribed by this ADR.
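
To make the synth-time versus runtime split concrete, here is a minimal sketch with two resolvers. It deliberately avoids any real CDK, DynamoDB, or SSM calls: the synth-time context is modeled as a plain mapping, the runtime source is the process environment, and the function and flag names are hypothetical. Treating a missing flag as off mirrors the "default: off" rule from the lifecycle; the accepted truthy strings are an assumption.

```python
import os
from typing import Mapping


def _as_bool(raw: str) -> bool:
    # Assumption: these spellings count as "on"; anything else is "off".
    return raw.strip().lower() in {"1", "true", "on", "yes"}


def resolve_infra_flag(name: str, synth_context: Mapping[str, str]) -> bool:
    """Synth-time lookup: the result is baked into the generated template."""
    return _as_bool(synth_context.get(name, "false"))


def resolve_behavior_flag(name: str, environ: Mapping[str, str] = os.environ) -> bool:
    """Runtime lookup: read on each call instead of being fixed at synth time."""
    return _as_bool(environ.get(name, "false"))
```

The point of keeping two entry points is that callers must state which kind of flag they are reading, so an infrastructure flag cannot accidentally be evaluated at request time.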

## Consequences

- (+) Concurrent work proceeds without blocking
- (+) Trunk-based development: main stays deployable
- (+) SRE can disable features without code changes
- (+) Partial features merge safely
- (-) Flag management overhead
- (-) Combinatorial testing complexity if many flags exist simultaneously
- (!) Maximum lifetime must be enforced or flags accumulate indefinitely

## References

- Issue #137 — full RFC with open questions on mechanism (CDK context vs. DynamoDB vs. env vars)
- ADR-003 — governance (flag introduction requires approval)
- ADR-005 — feedback loop (reviewer may flag-gate a feature during review)
@@ -0,0 +1,79 @@
# ADR-007: Knowledge acquisition through progressive failure

**Status:** proposed
**Date:** 2026-05-19

## Context

Agents with fresh context (tabula rasa) attempt to follow documentation and hit gaps they cannot resolve. These gaps are silently worked around (agent asks a human) rather than systematically fixed. The system cannot self-improve its onboarding because failures are not captured.

Knowledge acquisition starts from zero. Each iteration creates the roadmap to better knowledge by discovering gaps through actual failures.

## Decision

### Zero-context execution attempts

Periodically, an agent with no project memory attempts to follow guides end-to-end. The agent follows ONLY what is written — no inference, no training data knowledge, no asking colleagues.

### Failure capture protocol

At each failure point, the agent:
1. **Stops** — does not attempt to work around or guess
2. **Documents** — creates an issue: which document, which step, what was missing
3. **Continues** — attempts the next step (if possible) to find additional gaps
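
The stop, document, continue sequence above can be sketched as a loop over guide steps. This is an assumption-laden illustration: `GapReport` and `run_guide` are hypothetical names, each step is modeled as a callable that raises when the guide is insufficient, and the reports are returned rather than filed, because the ADR does not specify how issues are created.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class GapReport:
    document: str  # which document
    step: str      # which step
    missing: str   # what was missing


def run_guide(
    document: str, steps: List[Tuple[str, Callable[[], None]]]
) -> List[GapReport]:
    """Attempt every step in order and collect one report per failure."""
    reports: List[GapReport] = []
    for label, action in steps:
        try:
            action()
        except Exception as exc:
            # Stop: no retry and no workaround for this step.
            # Document: record the gap so it can become an issue.
            reports.append(GapReport(document, label, str(exc)))
            # Continue: fall through to the next step to find further gaps.
    return reports
```

A real run would also need to decide whether a later step is still attemptable after an earlier one failed (the "if possible" in step 3); the sketch simply tries every step.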

### Retrospectives

After completing a task, project milestone, or sprint, agents produce a retrospective artifact:
- What worked well (patterns to repeat)
- What failed or caused friction (patterns to avoid)
- Actionable experiments for future workflows

Retrospectives are a first-class knowledge artifact — they feed into documentation improvements, inform ADR amendments, and surface systemic issues that individual task failures cannot.

### Knowledge artifacts (interim)

Until documentation meets ADR-004, agents may create ephemeral artifacts:
- Semantic indices of the codebase (call graphs, dependency maps)
- Annotated walkthroughs of successful executions
- "What I learned" summaries after completing a task
- Retrospectives (see above)

These are scaffolding that informs documentation improvements, not documentation in their own right.

### Maturity model

| Level | State | Agent capability |
|-------|-------|-----------------|
| 0 | No docs | Cannot start; files issue for missing docs |
| 1 | Partial docs | Follows docs, stops at gaps, files issues |
| 2 | Complete docs (ADR-004) | Completes end-to-end without help |
| 3 | Self-improving | Detects drift between docs and code, auto-files issues |

### The self-improvement loop

```
Agent starts fresh → follows docs → hits failure →
files issue → issue gets fixed → next agent goes further →
hits next failure → files issue → ...
until end-to-end works from zero context
```

This runs continuously because code changes outpace documentation and different agent implementations fail at different points.
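
As a toy model of the loop, the sketch below counts how many zero-context runs it takes until a guide works end-to-end. It rests on simplifying assumptions that the ADR does not make: a guide is a fixed list of step names, a step succeeds exactly when it is in the documented set, each run files one issue for the first gap it hits, and that issue is fixed before the next run. All names are hypothetical.

```python
from typing import List, Optional, Set, Tuple


def first_gap(steps: List[str], documented: Set[str]) -> Optional[str]:
    """Return the first step a fresh agent cannot complete, or None."""
    for step in steps:
        if step not in documented:
            return step
    return None


def runs_until_end_to_end(
    steps: List[str], documented: Set[str], max_runs: int = 100
) -> Tuple[int, List[str]]:
    """Simulate repeated fresh runs; return (run count, issues filed in order)."""
    known = set(documented)  # copy so the caller's set is not mutated
    filed: List[str] = []
    for run in range(1, max_runs + 1):
        gap = first_gap(steps, known)
        if gap is None:
            return run, filed  # this run completed end-to-end
        filed.append(gap)
        known.add(gap)  # assumption: the filed issue is fixed before the next run
    raise RuntimeError("docs did not converge within max_runs")
```

In this model the loop terminates after at most one run per gap plus a final clean run. The ADR's point that code changes outpace documentation corresponds to `documented` shrinking between runs, which the toy model leaves out.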

## Consequences

- (+) Documentation gaps become bugs with reproduction steps
- (+) Priority ordering emerges naturally (most common failures surface first)
- (+) The system self-improves without human identification of gaps
- (+) Creates a natural definition of "docs are done" (Level 2 achieved)
- (-) Generates issue volume that needs triage
- (-) Requires periodic investment in zero-context test runs
- (!) The gap between Level 1 and Level 2 may be large — patience required

## References

- Issue #138 — full RFC with open questions
- ADR-004 — defines the quality target (tabula rasa test)
- ADR-003 — governance for issues filed by failing agents
- ADR-008 — Level 4 Definition of Done depends on this protocol