Expert Panel Review: .dip Format Support — Consolidated Report #39

@harperreed

Expert Panel Review: .dip Format Support in Tracker

Date: 2026-03-20
Reviewers: 11 expert subagents (Architecture, API Design, Documentation, Security, Error Handling, Compatibility, DSL Design, Test Quality, OpenAI Researcher, Grad Student AI Researcher, Martin Fowler)


Bugs Fixed This Session

Four silent data-loss bugs were found and fixed in pipeline/dippin_adapter.go:

| What was lost | Adapter wrote | Engine/Handler reads | Status |
|---|---|---|---|
| Model selection | model | llm_model | FIXED |
| Provider selection | provider | llm_provider | FIXED |
| Fallback on retry exhaustion | fallback_target | fallback_retry_target | FIXED |
| Pipeline goal ($goal, fidelity, summarization) | not mapped at all | graph.goal via g.Attrs["goal"] | FIXED |

Also fixed: indentation bug in extractRetryAttrs.

All tests pass after fixes.
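
The first three rows of the table share one shape: the writer and the reader each spelled the attribute key as its own string literal, so a mismatch compiled cleanly and failed silently. A minimal sketch of one guard is to route both sides through shared constants. Only the key strings below come from the table; `Node`, `writeNodeAttrs`, `lookup`, and the constant names are hypothetical and are not taken from tracker's code.

```go
package main

import "fmt"

// Attribute keys shared by the writer (adapter) and the reader (handler).
// The string values come from the report's table; the constant names are
// hypothetical.
const (
	AttrModel          = "llm_model"
	AttrProvider       = "llm_provider"
	AttrFallbackTarget = "fallback_retry_target"
)

// Node is a stand-in for a graph node that stores its config as strings.
type Node struct {
	ID    string
	Attrs map[string]string
}

// writeNodeAttrs plays the adapter's role: it writes through the shared
// constants rather than through string literals.
func writeNodeAttrs(id, model, provider, fallback string) Node {
	return Node{
		ID: id,
		Attrs: map[string]string{
			AttrModel:          model,
			AttrProvider:       provider,
			AttrFallbackTarget: fallback,
		},
	}
}

// lookup plays the handler's role. It reports whether the key was present,
// so a missing attribute is distinguishable from an empty one.
func lookup(n Node, key string) (string, bool) {
	v, ok := n.Attrs[key]
	return v, ok
}

func main() {
	n := writeNodeAttrs("draft", "example-model", "example-provider", "cleanup")

	model, ok := lookup(n, AttrModel)
	fmt.Println(model, ok)

	// The pre-fix key is absent, which is how the value used to get lost.
	_, ok = lookup(n, "model")
	fmt.Println(ok)
}
```

Constants narrow the gap but do not close it; a typed config struct would remove the string lookup entirely.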


CRITICAL (must fix)

| # | Issue | Sources | Details |
|---|---|---|---|
| 1 | model/provider key mismatch | Test Quality, Compatibility | FIXED: adapter now writes llm_model/llm_provider (#13) |
| 2 | fallback_target key mismatch | Martin Fowler | FIXED: adapter now writes fallback_retry_target (#13) |
| 3 | workflow.Goal not mapped | DSL Design | FIXED: adapter now maps workflow.Goal to g.Attrs["goal"] (#14) |
| 4 | DippinValidated flag bypasses 5 structural validation checks | Architecture, Security, Martin Fowler, Grad Student, Error Handling, API Design | Any code can set g.DippinValidated = true and skip cycle detection, reachability, start/exit verification. Dippin validates the IR but tracker skips validating the converted Graph. (#4) |
| 5 | Command injection via tool_command | Security, OpenAI Researcher | sh -c <command> with no sanitization, sandboxing, or allowlist. Full process privileges. (#16) |
| 6 | Library writes to os.Stderr | API Design, Martin Fowler, Error Handling | tracker.go prints deprecation warnings and diagnostics directly to stderr. Library consumers cannot capture/suppress. (#7) |
| 7 | Goal-gate retry has no termination bound | Grad Student | No goal-gate-specific counter; clearDownstream doesn't reset retry counts on this path. (#15) |
| 8 | Nested retry block not supported by parser | DSL Design, Scenario Tests | complex.dip uses block syntax the parser can't handle. (#3) |
| 9 | Edge bracket syntax [label: ...] silently misparsed | DSL Design | Parser has no bracket token; annotations are silently swallowed. (#18) |
| 10 | No pipeline-level token budget or cost ceiling | OpenAI Researcher | Misconfigured pipelines can consume unbounded API spend. (#17) |
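
Item 10 asks for a pipeline-level ceiling on spend. A minimal sketch of what such a guard could look like follows. `Budget`, `NewBudget`, `Charge`, `Remaining`, and `ErrBudgetExceeded` are hypothetical names, and counting in tokens rather than currency is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrBudgetExceeded is returned when a charge would pass the ceiling.
var ErrBudgetExceeded = errors.New("pipeline token budget exceeded")

// Budget tracks tokens spent across one pipeline run (hypothetical type).
// The mutex makes it safe to share between parallel branches.
type Budget struct {
	mu    sync.Mutex
	limit int
	used  int
}

// NewBudget returns a budget with the given token ceiling.
func NewBudget(limit int) *Budget { return &Budget{limit: limit} }

// Charge records tokens used by one node. If the ceiling would be passed,
// the charge is rejected and the running total is left unchanged.
func (b *Budget) Charge(tokens int) error {
	if tokens < 0 {
		return fmt.Errorf("negative token count %d", tokens)
	}
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.used+tokens > b.limit {
		return fmt.Errorf("%w: used %d + %d > limit %d",
			ErrBudgetExceeded, b.used, tokens, b.limit)
	}
	b.used += tokens
	return nil
}

// Remaining reports how many tokens are left under the ceiling.
func (b *Budget) Remaining() int {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.limit - b.used
}

func main() {
	b := NewBudget(1000)
	for _, cost := range []int{400, 400, 400} {
		if err := b.Charge(cost); err != nil {
			fmt.Println("stopping pipeline:", err)
			break
		}
	}
	fmt.Println("remaining:", b.Remaining())
}
```

Charging after a call only stops the next call. An engine that must not overshoot would also need a pre-dispatch check against an estimate, which this sketch leaves out.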

IMPORTANT (should fix soon)

| # | Issue | Sources | Details |
|---|---|---|---|
| 11 | Graph model is anemic (Primitive Obsession): all node config in map[string]string | Martin Fowler, OpenAI Researcher, Architecture, Grad Student | Typed IR gets downgraded to string map; root cause of bugs #1-#3. (#19) |
| 12 | Code duplication between CLI and library: parse chain and LLM client builder | Martin Fowler, Architecture, Compatibility | main.go and tracker.go duplicate the parse-validate-convert chain and buildClient. (#6) |
| 13 | --format flag not parsed for validate/simulate | Compatibility, Scenario Tests | parseFlags returns early for these modes, skipping flag registration. (#1) |
| 14 | No migration tooling: deprecation warning says "migrate to .dip" but no converter exists | Compatibility, Documentation | 13 DOT examples with no automated path forward. (#12) |
| 15 | E2E comparison test doesn't verify attribute equivalence | Test Quality | Only checks node count, shapes, handlers; misses attribute keys/values. (#11) |
| 16 | Parallel fan-out: no concurrency limit, no configurable aggregation | OpenAI Researcher, Grad Student | All branches launch simultaneously. Only "any success" strategy. (#27) |
| 17 | Parallel branch context updates lost; never merged back into parent context | Grad Student, OpenAI Researcher | Silent data loss. Fan-in nodes don't receive branch outputs via context. (#20) |
| 18 | Consensus pipelines run sequentially instead of in parallel | OpenAI Researcher | Multi-model nodes see each other's outputs instead of reasoning independently. (#26) |
| 19 | Condition parser is fragile: naive strings.Split on ` /&&`, no quoting | | |
| 20 | No formal .dip grammar/spec | Documentation, DSL Design | Language defined only by Go implementation. (#12) |
| 21 | Error messages lack file/line context | Error Handling, Documentation | Adapter errors don't include irNode.Source location. (#33) |
| 22 | No CLI tests for .dip files | Test Quality | validate/simulate test files use only DOT. (#11) |
| 23 | complex.dip fixture never used in any test | Test Quality | Rich test data wasted. (#11) |
| 24 | No diamond (conditional) node kind in .dip | DSL Design, Architecture | Can't express pure conditional branch without invoking LLM. (#22) |
| 25 | NodeIO (reads/writes) defined in IR but dropped by adapter | DSL Design | Useful for validation, documentation. (#25) |
| 26 | auto_status parsing is brittle: regex for STATUS:success/fail/retry in freeform text | OpenAI Researcher, Grad Student | LLMs can hallucinate STATUS lines; no structured output mode. (#23) |
| 27 | last_response single-slot relay loses intermediate node outputs | OpenAI Researcher, Grad Student | "Telephone game" problem; full chain of reasoning lost. (#24) |
| 28 | Windows build failure: Setpgid/syscall.Kill are POSIX-only, no build constraints | Compatibility | Pre-existing, not .dip-specific. (#28) |
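
Items 16 and 17 describe two related gaps in parallel fan-out: nothing bounds how many branches run at once, and branch context updates never reach the parent. The sketch below shows one possible shape for a fix, assuming a hypothetical `Branch` function type and `runBranches` helper. A buffered channel caps concurrency, and each branch's writes are merged back in branch order so the result is deterministic.

```go
package main

import (
	"fmt"
	"sync"
)

// Branch is a hypothetical parallel branch. It receives its own snapshot of
// the parent context and returns the keys it wants to write back.
type Branch func(snapshot map[string]string) map[string]string

// copyMap returns a shallow copy so callers never share a map.
func copyMap(m map[string]string) map[string]string {
	out := make(map[string]string, len(m))
	for k, v := range m {
		out[k] = v
	}
	return out
}

// runBranches runs the branches with at most limit executing at once, then
// merges every branch's updates into a copy of parent. Updates are applied
// in branch order, so on a key conflict the later branch wins.
func runBranches(parent map[string]string, branches []Branch, limit int) map[string]string {
	if limit < 1 {
		limit = 1
	}
	results := make([]map[string]string, len(branches))
	sem := make(chan struct{}, limit) // counting semaphore
	var wg sync.WaitGroup

	for i, br := range branches {
		wg.Add(1)
		go func(i int, br Branch) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it
			results[i] = br(copyMap(parent))
		}(i, br)
	}
	wg.Wait()

	merged := copyMap(parent)
	for _, updates := range results {
		for k, v := range updates {
			merged[k] = v
		}
	}
	return merged
}

func main() {
	parent := map[string]string{"goal": "summarize"}
	branches := []Branch{
		func(s map[string]string) map[string]string {
			return map[string]string{"review.a": "ok: " + s["goal"]}
		},
		func(s map[string]string) map[string]string {
			return map[string]string{"review.b": "needs work"}
		},
	}
	merged := runBranches(parent, branches, 1)
	fmt.Println(len(merged), merged["review.a"], merged["review.b"])
}
```

"Later branch wins" is only one merge policy. Prefixing keys with the branch name, or rejecting conflicts outright, may suit fan-in nodes better.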

MINOR (backlog)

| # | Issue | Sources | Details |
|---|---|---|---|
| 29 | Subgraph handler declared but not implemented | Grad Student, DSL Design | (#35) |
| 30 | No jitter on retry backoff | OpenAI Researcher | (#29) |
| 31 | Subgraph params serialization non-deterministic | Error Handling | (#8) |
| 32 | Fidelity truncation is character-based, not token-based | OpenAI Researcher | (#34) |
| 33 | Tab vs space indentation conflated in parser | DSL Design | (#3) |
| 34 | knownModelProviders linter catalog is stale | DSL Design | (#36) |
| 35 | Human gates have no input timeout | OpenAI Researcher | (#30) |
| 36 | O(E) edge lookup on every call | OpenAI Researcher, Martin Fowler | (#31) |
| 37 | Flat context store with no scoping | Grad Student, OpenAI Researcher | (#32) |
| 38 | Errors use fmt.Errorf without %w | Error Handling | (#33) |
| 39 | Parallel/fan_in edge boilerplate redundant with node declarations | DSL Design | (#37) |
| 40 | Nil pointer risk on IR Node/Edge slice elements | Error Handling | (#38) |
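
Item 31 is most likely a consequence of Go's unspecified map iteration order: ranging over a `map[string]string` and concatenating yields a different string from run to run. A minimal sketch of the usual remedy is to sort the keys first. `serializeParams` and its `k=v` comma-joined output are hypothetical; the report does not describe the format tracker actually uses.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// serializeParams renders params as comma-joined "k=v" pairs with the keys
// in sorted order, so equal maps always produce the same string. The output
// format is an assumption for illustration, and values are not escaped.
func serializeParams(params map[string]string) string {
	keys := make([]string, 0, len(params))
	for k := range params {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+"="+params[k])
	}
	return strings.Join(parts, ",")
}

func main() {
	fmt.Println(serializeParams(map[string]string{"topic": "go", "depth": "2"}))
}
```

If the serialized form only needs to be stable rather than human-friendly, `encoding/json` is an alternative, since `json.Marshal` emits map keys in sorted order.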

DEFERRED (noted, not blocking)

# Issue Sources Reason
41 No distributed execution OpenAI Researcher Architectural scope
42 No persistent agent memory across runs OpenAI Researcher Feature request
43 No dynamic graph modification at runtime OpenAI Researcher Design choice
44 No prompt optimization/evaluation harness Grad Student Research feature
45 No structured I/O typing (Pydantic-like) Grad Student Research feature

Cross-Reference: Issues Flagged by 3+ Experts

| Issue | Count | Experts |
|---|---|---|
| DippinValidated bypass | 6 | Architecture, Security, Martin Fowler, Grad Student, Error Handling, API Design |
| Stringly-typed Graph model / Primitive Obsession | 4 | Martin Fowler (x2), OpenAI Researcher, Architecture |
| Code duplication CLI/library | 3 | Martin Fowler, Architecture, Compatibility |
| Library writes to stderr | 3 | API Design, Martin Fowler, Error Handling |
| Parallel context loss | 3 | Grad Student (x2), OpenAI Researcher |
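
The most widely flagged item, the DippinValidated bypass, comes down to a settable boolean standing between the engine and its structural checks. The sketch below shows the alternative the reviewers point toward: validation that always runs at the engine's entry point, with no field to switch it off. `Graph`, `Validate`, and `Run` here are simplified, hypothetical stand-ins. Only start/exit presence, edge endpoints, and reachability are checked, and cycle detection is left out for brevity.

```go
package main

import "fmt"

// Graph is a stripped-down, hypothetical stand-in for the converted graph.
// Note that it has no "already validated" field to set.
type Graph struct {
	Nodes []string
	Start string
	Exit  string
	Edges map[string][]string // node ID -> successor IDs
}

// Validate checks that start and exit are declared, that every edge connects
// declared nodes, and that every node is reachable from start.
func (g *Graph) Validate() error {
	known := make(map[string]bool, len(g.Nodes))
	for _, n := range g.Nodes {
		known[n] = true
	}
	if !known[g.Start] {
		return fmt.Errorf("start node %q is not declared", g.Start)
	}
	if !known[g.Exit] {
		return fmt.Errorf("exit node %q is not declared", g.Exit)
	}
	for from, tos := range g.Edges {
		if !known[from] {
			return fmt.Errorf("edge leaves undeclared node %q", from)
		}
		for _, to := range tos {
			if !known[to] {
				return fmt.Errorf("edge %q -> %q targets an undeclared node", from, to)
			}
		}
	}

	// Breadth-first search from the start node.
	seen := map[string]bool{g.Start: true}
	queue := []string{g.Start}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for _, next := range g.Edges[cur] {
			if !seen[next] {
				seen[next] = true
				queue = append(queue, next)
			}
		}
	}
	for _, n := range g.Nodes {
		if !seen[n] {
			return fmt.Errorf("node %q is unreachable from start %q", n, g.Start)
		}
	}
	return nil
}

// Run is the engine entry point. It validates unconditionally, regardless of
// which front end (.dip or DOT) produced the graph.
func Run(g *Graph) error {
	if err := g.Validate(); err != nil {
		return fmt.Errorf("invalid graph: %w", err)
	}
	// Traversal would start here.
	return nil
}

func main() {
	g := &Graph{
		Nodes: []string{"start", "work", "orphan", "exit"},
		Start: "start",
		Exit:  "exit",
		Edges: map[string][]string{"start": {"work"}, "work": {"exit"}},
	}
	fmt.Println(Run(g))
}
```

Re-running these checks on a graph Dippin has already validated costs one traversal, which seems a small price for not trusting a flag any caller can set.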

Strengths Noted by Reviewers

  • Fidelity degradation chain (full -> summary:high -> ... -> truncate) — unique among agent frameworks (OpenAI Researcher, Grad Student)
  • CSS-like model stylesheets — elegant approach to batch model configuration (Grad Student)
  • Handler registry pattern — clean decoupling of execution from traversal (Martin Fowler)
  • Checkpoint/resume — well-designed for operational resilience (Martin Fowler, Grad Student)
  • Human gate abstraction — clean Interviewer interface with multiple implementations (OpenAI Researcher)
  • Multi-provider support — native routing to Anthropic, OpenAI, Gemini per-node (OpenAI Researcher)
  • Retry/backoff system — named policies with composable overrides (OpenAI Researcher)

Competitive Positioning

| Capability | Tracker/.dip | LangGraph | CrewAI | AutoGen | DSPy |
|---|---|---|---|---|---|
| Graph-based orchestration | Yes (DAG) | Yes (cyclic) | Linear | Message-passing | Linear |
| Multi-provider | Native | Via adapters | Via adapters | Via adapters | Via LM classes |
| Parallel execution | Yes | Yes | Yes | Yes | No |
| Human-in-the-loop | Yes | Yes | No | Yes | No |
| Checkpoint/resume | Yes | Yes | No | No | No |
| Typed state | No | Yes | Partial | Partial | Yes |
| Context compression | Yes (unique) | No | No | No | No |
| Prompt optimization | No | No | No | No | Yes |

Recommended Priority Order

  1. Fix attribute key mismatches DONE (fix(pipeline): attribute key mismatch — model/provider/fallback_target silently ignored for .dip files #13, fix(pipeline): workflow.Goal silently dropped by dippin adapter #14)
  2. Remove DippinValidated flag — always run tracker's own validation (fix(pipeline): DippinValidated bypasses tracker validation without safety contract #4)
  3. Extract shared parse chain into pipeline package (refactor: extract shared dip parse/validate logic and LLM client builder #6)
  4. Extract shared LLM client builder (refactor: extract shared dip parse/validate logic and LLM client builder #6)
  5. Add CLI tests for .dip files (test(pipeline): strengthen .dip test coverage — comparison test, edge cases, CLI fixtures #11)
  6. Parse --format flag for validate/simulate (fix(cli): validate/simulate subcommands ignore --format flag #1)
  7. Plan migration from map[string]string to typed node configs (refactor(pipeline): Graph model uses map[string]string for all config (Primitive Obsession) #19)
  8. Add pipeline-level cost/token budget (fix(pipeline): no pipeline-level token budget or cost ceiling #17)
  9. Fix parallel branch context merge (fix(pipeline): parallel branch context updates lost on fan-in #20)
  10. Add migration tooling (DOT -> DIP converter) (docs: no deprecation timeline for DOT, no .dip format spec, no migration tooling #12)

Labels: P2: medium (important but not blocking); area/adapter (dippin_adapter.go, IR to Graph bridge); documentation (improvements or additions to documentation)
