Expert Panel Review: .dip Format Support — Consolidated Report

# Expert Panel Review: .dip Format Support in Tracker

**Date:** 2026-03-20
**Reviewers:** 11 expert subagents (Architecture, API Design, Documentation, Security, Error Handling, Compatibility, DSL Design, Test Quality, OpenAI Researcher, Grad Student AI Researcher, Martin Fowler)

---

## Bugs Fixed This Session

Four silent data-loss bugs were found and fixed in `pipeline/dippin_adapter.go`:

| What was lost | Adapter wrote | Engine/Handler reads | Status |
|---|---|---|---|
| Model selection | `model` | `llm_model` | **FIXED** |
| Provider selection | `provider` | `llm_provider` | **FIXED** |
| Fallback on retry exhaustion | `fallback_target` | `fallback_retry_target` | **FIXED** |
| Pipeline goal ($goal, fidelity, summarization) | *not mapped at all* | `graph.goal` via `g.Attrs["goal"]` | **FIXED** |

Also fixed: indentation bug in `extractRetryAttrs`.

All tests pass after fixes.
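
One way to keep this class of bug from recurring is to have the adapter and the engine share key constants instead of repeating string literals. The sketch below is illustrative only: the constant names, the `irNode` type, and `nodeAttrs` are hypothetical and not taken from `pipeline/dippin_adapter.go`. Only the key strings come from the table above.

```go
package main

import "fmt"

// Attribute keys the engine and handlers read, per the table above.
// The constant names are hypothetical; only the string values come from the report.
const (
	AttrLLMModel            = "llm_model"
	AttrLLMProvider         = "llm_provider"
	AttrFallbackRetryTarget = "fallback_retry_target"
)

// irNode is a hypothetical stand-in for a Dippin IR node.
type irNode struct {
	Model          string
	Provider       string
	FallbackTarget string
}

// nodeAttrs converts a node into the string map using the shared constants.
// A misspelled constant name fails to compile, whereas a misspelled
// string literal is silently ignored by the reader.
func nodeAttrs(n irNode) map[string]string {
	attrs := map[string]string{}
	if n.Model != "" {
		attrs[AttrLLMModel] = n.Model
	}
	if n.Provider != "" {
		attrs[AttrLLMProvider] = n.Provider
	}
	if n.FallbackTarget != "" {
		attrs[AttrFallbackRetryTarget] = n.FallbackTarget
	}
	return attrs
}

func main() {
	attrs := nodeAttrs(irNode{Model: "m1", Provider: "p1", FallbackTarget: "cleanup"})
	fmt.Println(attrs[AttrLLMModel], attrs[AttrLLMProvider], attrs[AttrFallbackRetryTarget])
}
```

If the engine reads through the same constants, a key rename becomes a compile-time change on both sides rather than a silent mismatch.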

---

## CRITICAL (must fix)

| # | Issue | Sources | Details |
|---|-------|---------|---------|
| 1 | ~~`model`/`provider` key mismatch~~ | Test Quality, Compatibility | **FIXED** — adapter now writes `llm_model`/`llm_provider` (#13) |
| 2 | ~~`fallback_target` key mismatch~~ | Martin Fowler | **FIXED** — adapter now writes `fallback_retry_target` (#13) |
| 3 | ~~`workflow.Goal` not mapped~~ | DSL Design | **FIXED** — adapter now maps `workflow.Goal` to `g.Attrs["goal"]` (#14) |
| 4 | **`DippinValidated` flag bypasses 5 structural validation checks** | Architecture, Security, Martin Fowler, Grad Student, Error Handling, API Design | Any code can set `g.DippinValidated = true` and skip cycle detection, reachability, and start/exit verification. Dippin validates the IR, but tracker then skips validating the *converted Graph*. (#4) |
| 5 | **Command injection via `tool_command`** | Security, OpenAI Researcher | `sh -c <command>` with no sanitization, sandboxing, or allowlist. Full process privileges. (#16) |
| 6 | **Library writes to `os.Stderr`** | API Design, Martin Fowler, Error Handling | `tracker.go` prints deprecation warnings and diagnostics directly to stderr. Library consumers cannot capture/suppress. (#7) |
| 7 | **Goal-gate retry has no termination bound** | Grad Student | No goal-gate-specific counter; `clearDownstream` doesn't reset retry counts on this path. (#15) |
| 8 | **Nested `retry` block not supported by parser** | DSL Design, Scenario Tests | `complex.dip` uses block syntax the parser can't handle. (#3) |
| 9 | **Edge bracket syntax `[label: ...]` silently misparsed** | DSL Design | Parser has no bracket token; annotations are silently swallowed. (#18) |
| 10 | **No pipeline-level token budget or cost ceiling** | OpenAI Researcher | Misconfigured pipelines can consume unbounded API spend. (#17) |
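
For issue 4, the fix direction is to validate the converted graph unconditionally instead of trusting a flag. The following is a minimal sketch of a start/exit and reachability check. The `Graph` shape and the `Validate` function here are simplified assumptions for illustration and do not reflect tracker's actual types.

```go
package main

import (
	"errors"
	"fmt"
)

// Graph is a hypothetical, simplified stand-in for the converted pipeline graph.
type Graph struct {
	Start string
	Exit  string
	Edges map[string][]string // adjacency list: node ID -> successor IDs
}

// Validate runs on every graph, with no opt-out flag. It checks that start
// and exit are set and that exit is reachable from start (breadth-first search).
func Validate(g Graph) error {
	if g.Start == "" || g.Exit == "" {
		return errors.New("graph needs both a start and an exit node")
	}
	seen := map[string]bool{g.Start: true}
	queue := []string{g.Start}
	for len(queue) > 0 {
		n := queue[0]
		queue = queue[1:]
		for _, next := range g.Edges[n] {
			if !seen[next] {
				seen[next] = true
				queue = append(queue, next)
			}
		}
	}
	if !seen[g.Exit] {
		return fmt.Errorf("exit node %q is not reachable from start node %q", g.Exit, g.Start)
	}
	return nil
}

func main() {
	g := Graph{
		Start: "start",
		Exit:  "exit",
		Edges: map[string][]string{"start": {"work"}, "work": {"exit"}},
	}
	fmt.Println(Validate(g))
}
```

The design point is that the check lives on the converted graph itself, so a caller cannot skip it by setting a field.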

## IMPORTANT (should fix soon)

| # | Issue | Sources | Details |
|---|-------|---------|---------|
| 11 | **Graph model is anemic (Primitive Obsession)** — all node config in `map[string]string` | Martin Fowler, OpenAI Researcher, Architecture, Grad Student | Typed IR gets downgraded to string map; root cause of bugs #1-#3. (#19) |
| 12 | **Code duplication between CLI and library** — parse chain and LLM client builder | Martin Fowler, Architecture, Compatibility | `main.go` and `tracker.go` duplicate the parse-validate-convert chain and `buildClient`. (#6) |
| 13 | **`--format` flag not parsed for validate/simulate** | Compatibility, Scenario Tests | `parseFlags` returns early for these modes, skipping flag registration. (#1) |
| 14 | **No migration tooling** — deprecation warning says "migrate to .dip" but no converter exists | Compatibility, Documentation | 13 DOT examples with no automated path forward. (#12) |
| 15 | **E2E comparison test doesn't verify attribute equivalence** | Test Quality | Only checks node count, shapes, handlers; misses attribute keys/values. (#11) |
| 16 | **Parallel fan-out: no concurrency limit, no configurable aggregation** | OpenAI Researcher, Grad Student | All branches launch simultaneously. The only aggregation strategy is "any success". (#27) |
| 17 | **Parallel branch context updates lost** — never merged back into parent context | Grad Student, OpenAI Researcher | Silent data loss. Fan-in nodes don't receive branch outputs via context. (#20) |
| 18 | **Consensus pipelines run sequentially instead of in parallel** | OpenAI Researcher | Multi-model nodes see each other's outputs instead of reasoning independently. (#26) |
| 19 | **Condition parser is fragile** — naive `strings.Split` on `||`/`&&`, no quoting | Grad Student, DSL Design | Values containing operators misinterpreted; `=` vs `==` ambiguity. (#21) |
| 20 | **No formal .dip grammar/spec** | Documentation, DSL Design | Language defined only by Go implementation. (#12) |
| 21 | **Error messages lack file/line context** | Error Handling, Documentation | Adapter errors don't include `irNode.Source` location. (#33) |
| 22 | **No CLI tests for .dip files** | Test Quality | validate/simulate test files use only DOT. (#11) |
| 23 | **`complex.dip` fixture never used in any test** | Test Quality | Rich test data wasted. (#11) |
| 24 | **No diamond (conditional) node kind in .dip** | DSL Design, Architecture | Can't express pure conditional branch without invoking LLM. (#22) |
| 25 | **`NodeIO` (reads/writes) defined in IR but dropped by adapter** | DSL Design | Useful for validation, documentation. (#25) |
| 26 | **`auto_status` parsing is brittle** — regex for `STATUS:success/fail/retry` in freeform text | OpenAI Researcher, Grad Student | LLMs can hallucinate STATUS lines; no structured output mode. (#23) |
| 27 | **`last_response` single-slot relay loses intermediate node outputs** | OpenAI Researcher, Grad Student | "Telephone game" problem; full chain of reasoning lost. (#24) |
| 28 | **Windows build failure** — `Setpgid`/`syscall.Kill` POSIX-only, no build constraints | Compatibility | Pre-existing, not .dip-specific. (#28) |

## MINOR (backlog)

| # | Issue | Sources | Details |
|---|-------|---------|---------|
| 29 | Subgraph handler declared but not implemented | Grad Student, DSL Design | (#35) |
| 30 | No jitter on retry backoff | OpenAI Researcher | (#29) |
| 31 | Subgraph params serialization non-deterministic | Error Handling | (#8) |
| 32 | Fidelity truncation is character-based, not token-based | OpenAI Researcher | (#34) |
| 33 | Tab vs space indentation conflated in parser | DSL Design | (#3) |
| 34 | `knownModelProviders` linter catalog is stale | DSL Design | (#36) |
| 35 | Human gates have no input timeout | OpenAI Researcher | (#30) |
| 36 | O(E) edge lookup on every call | OpenAI Researcher, Martin Fowler | (#31) |
| 37 | Flat context store with no scoping | Grad Student, OpenAI Researcher | (#32) |
| 38 | Errors use `fmt.Errorf` without `%w` | Error Handling | (#33) |
| 39 | Parallel/fan_in edge boilerplate redundant with node declarations | DSL Design | (#37) |
| 40 | Nil pointer risk on IR Node/Edge slice elements | Error Handling | (#38) |

## DEFERRED (noted, not blocking)

| # | Issue | Sources | Reason |
|---|-------|---------|--------|
| 41 | No distributed execution | OpenAI Researcher | Architectural scope |
| 42 | No persistent agent memory across runs | OpenAI Researcher | Feature request |
| 43 | No dynamic graph modification at runtime | OpenAI Researcher | Design choice |
| 44 | No prompt optimization/evaluation harness | Grad Student | Research feature |
| 45 | No structured I/O typing (Pydantic-like) | Grad Student | Research feature |

---

## Cross-Reference: Issues Flagged by 3+ Experts

| Issue | Count | Experts |
|-------|:---:|---------|
| `DippinValidated` bypass | 6 | Architecture, Security, Martin Fowler, Grad Student, Error Handling, API Design |
| Stringly-typed Graph model / Primitive Obsession | 4 | Martin Fowler (x2), OpenAI Researcher, Architecture |
| Code duplication CLI/library | 3 | Martin Fowler, Architecture, Compatibility |
| Library writes to stderr | 3 | API Design, Martin Fowler, Error Handling |
| Parallel context loss | 3 | Grad Student (x2), OpenAI Researcher |

---

## Strengths Noted by Reviewers

- **Fidelity degradation chain** (full -> summary:high -> ... -> truncate) — unique among agent frameworks (OpenAI Researcher, Grad Student)
- **CSS-like model stylesheets** — elegant approach to batch model configuration (Grad Student)
- **Handler registry pattern** — clean decoupling of execution from traversal (Martin Fowler)
- **Checkpoint/resume** — well-designed for operational resilience (Martin Fowler, Grad Student)
- **Human gate abstraction** — clean `Interviewer` interface with multiple implementations (OpenAI Researcher)
- **Multi-provider support** — native routing to Anthropic, OpenAI, Gemini per-node (OpenAI Researcher)
- **Retry/backoff system** — named policies with composable overrides (OpenAI Researcher)

---

## Competitive Positioning

| Capability | Tracker/.dip | LangGraph | CrewAI | AutoGen | DSPy |
|---|---|---|---|---|---|
| Graph-based orchestration | Yes (DAG) | Yes (cyclic) | Linear | Message-passing | Linear |
| Multi-provider | Native | Via adapters | Via adapters | Via adapters | Via LM classes |
| Parallel execution | Yes | Yes | Yes | Yes | No |
| Human-in-the-loop | Yes | Yes | No | Yes | No |
| Checkpoint/resume | Yes | Yes | No | No | No |
| Typed state | No | Yes | Partial | Partial | Yes |
| Context compression | Yes (unique) | No | No | No | No |
| Prompt optimization | No | No | No | No | Yes |

---

## Recommended Priority Order

1. ~~Fix attribute key mismatches~~ **DONE** (#13, #14)
2. Remove `DippinValidated` flag — always run tracker's own validation (#4)
3. Extract shared parse chain into `pipeline` package (#6)
4. Extract shared LLM client builder (#6)
5. Add CLI tests for .dip files (#11)
6. Parse `--format` flag for validate/simulate (#1)
7. Plan migration from `map[string]string` to typed node configs (#19)
8. Add pipeline-level cost/token budget (#17)
9. Fix parallel branch context merge (#20)
10. Add migration tooling (DOT -> DIP converter) (#12)
