diff --git a/CLAUDE.md b/CLAUDE.md index 81583389..759ef703 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -13,7 +13,7 @@ commands/flow-code/*.md → Slash command definitions (user-invocable entry poi skills/*/SKILL.md → Skill implementations (loaded by Skill tool, never Read directly) agents/*.md → Subagent definitions (research scouts, worker, plan-sync, etc.) bin/flowctl → Rust binary (built from flowctl/ workspace) -flowctl/ → Rust Cargo workspace (4 crates: core, db, service, cli) +flowctl/ → Rust Cargo workspace (2 crates: core, cli) hooks/hooks.json → Ralph workflow guards (active when FLOW_RALPH=1) docs/ → Architecture docs, CI examples ``` @@ -52,7 +52,7 @@ bash scripts/ralph_e2e_short_rp_test.sh All tests create temp directories and clean up after themselves. They must NOT be run from the plugin repo root (safety check enforced). -**Storage runtime**: State is stored in JSON/JSONL files in the `.flow/` directory, readable by any tool. The `flowctl-db` crate provides synchronous file-based storage with no external database dependencies. +**Storage runtime**: All state is JSON/JSONL files in `.flow/`, readable by any tool (MCP, Read, Grep). No database, no async runtime. The `json_store` module in `flowctl-core` handles all file I/O. ## Code Quality diff --git a/docs/adr-guide.md b/docs/adr-guide.md index 9fd29fa6..1accdc2f 100644 --- a/docs/adr-guide.md +++ b/docs/adr-guide.md @@ -19,7 +19,7 @@ Use `docs/decisions/` with sequential numbering: ``` docs/decisions/ - ADR-001-use-libsql-for-storage.md + ADR-001-json-file-based-storage.md ADR-002-wave-checkpoint-execution-model.md ADR-003-teams-file-locking-protocol.md ``` diff --git a/docs/references/performance-checklist.md b/docs/references/performance-checklist.md deleted file mode 100644 index 936141a5..00000000 --- a/docs/references/performance-checklist.md +++ /dev/null @@ -1,152 +0,0 @@ -# Performance Checklist - -Quick reference for performance in flow-code. Rust backend with libSQL, async runtime. - -## Measurement Commands - -```bash -# Rust benchmarks (add benches/ directory with criterion) -cargo bench - -# Criterion (statistical benchmarking) -cargo bench --bench my_benchmark - -# Flamegraph (CPU profiling) -cargo install flamegraph -cargo flamegraph --bin flowctl -- status --json - -# Memory profiling (Linux) -valgrind --tool=massif target/release/flowctl status --json - -# Binary size -ls -lh target/release/flowctl -cargo bloat --release --crates - -# Compile time -cargo build --release --timings -``` - -| Tool | What It Measures | When to Use | -|---|---|---| -| `cargo bench` (criterion) | Function-level latency | Before/after optimization | -| `cargo flamegraph` | CPU hotspots | Investigating slow commands | -| `cargo bloat` | Binary size by crate | When binary grows unexpectedly | -| `valgrind --tool=massif` | Heap allocations | Investigating memory usage | -| `hyperfine` | CLI command latency | Comparing command performance | - -## Database Anti-Patterns (libSQL) - -| Anti-Pattern | Impact | Fix | -|---|---|---| -| N+1 queries | Linear DB round-trips | Use JOINs or batch queries | -| Unbounded queries | Memory exhaustion, timeouts | Always use LIMIT, paginate | -| Missing indexes | Slow reads as data grows | Add indexes on filtered/sorted columns | -| No connection pooling | Connection overhead per query | Reuse connections | -| SELECT * | Fetches unused columns | Select only needed columns | -| String concatenation in SQL | Injection risk + no query cache | Use parameterized queries | -| No transactions for batch ops | Partial failures, inconsistency | Wrap in BEGIN/COMMIT | - -## Backend CLI Checklist - -| Area | Target | How to Verify | -|---|---|---| -| Command startup | < 50ms | `hyperfine 'flowctl status --json'` | -| Task list (100 tasks) | < 200ms | Benchmark with test data | -| Database open | < 20ms | Profile with `tracing` spans | -| JSON serialization | < 5ms for typical output | `cargo bench` | -| Binary size | < 20MB release | `ls -lh target/release/flowctl` | -| Memory usage (idle) | < 10MB | `valgrind` or `/usr/bin/time -v` | - -## Rust-Specific Performance - -### Async/Await Patterns - -| Do | Don't | -|---|---| -| Use `tokio::spawn` for concurrent I/O | Block the async runtime with sync I/O | -| Use `tokio::task::spawn_blocking` for CPU work | Run CPU-heavy code in async context | -| Use `tokio::join!` for parallel awaits | Await sequentially when independent | -| Use bounded channels for backpressure | Use unbounded channels (memory leak risk) | - -```rust -// Good: parallel async operations -let (tasks, epics) = tokio::join!( - db.list_tasks(), - db.list_epics() -); - -// Bad: sequential when independent -let tasks = db.list_tasks().await?; -let epics = db.list_epics().await?; // Waits unnecessarily -``` - -### Memory Patterns - -| Do | Don't | -|---|---| -| Use `&str` over `String` where possible | Clone strings unnecessarily | -| Use `Cow` for maybe-owned data | Allocate when borrowing suffices | -| Pre-allocate with `Vec::with_capacity` | Push to vec without size hint | -| Use iterators over collecting into vecs | `collect()` when you can chain | -| Use `Arc` for shared read-only data | Clone large structs across threads | - -```rust -// Good: pre-allocate -let mut results = Vec::with_capacity(tasks.len()); - -// Bad: repeated reallocation -let mut results = Vec::new(); // Grows 1, 2, 4, 8... -``` - -### String Performance - -| Do | Don't | -|---|---| -| Use `write!` or `format!` once | Repeated `push_str` / `+` concatenation | -| Use `String::with_capacity` for building | Build strings without size hint | -| Return `impl Display` for lazy formatting | Format eagerly when output may not be used | -| Use `serde_json::to_writer` for streaming | Serialize to String then write | - -## Tokio Runtime - -| Setting | Recommendation | -|---|---| -| Runtime flavor | `current_thread` for CLI (lower overhead) | -| Worker threads | Default for server, 1 for CLI tools | -| Blocking threads | Default (512), reduce for CLI | -| Task budget | Let tokio manage; avoid manual yielding | - -```rust -// CLI tool: single-threaded runtime (faster startup) -#[tokio::main(flavor = "current_thread")] -async fn main() { ... } - -// Server: multi-threaded (default) -#[tokio::main] -async fn main() { ... } -``` - -## Performance Anti-Patterns - -| Anti-Pattern | Impact | Fix | -|---|---|---| -| Blocking async runtime | Starves other tasks | Use `spawn_blocking` | -| N+1 database queries | Linear latency growth | Batch or JOIN queries | -| Cloning large structs | Memory + CPU waste | Use references or `Arc` | -| Unbounded collections | Memory exhaustion | Add limits, use streaming | -| Sync file I/O in async | Blocks executor thread | Use `tokio::fs` | -| Excessive logging in hot path | I/O overhead | Use log levels, sample | -| Compiling in debug mode for perf tests | 10-100x slower | Always `--release` | -| Not reusing DB connections | Connection overhead | Use connection pool | -| Serializing then parsing JSON | Double work | Pass typed structs | -| Formatting strings nobody reads | CPU waste | Lazy formatting with `Display` | - -## Profiling Workflow - -| Step | Command | Purpose | -|---|---|---| -| 1. Baseline | `hyperfine 'flowctl status --json'` | Measure current perf | -| 2. Profile | `cargo flamegraph --bin flowctl -- status` | Find hotspots | -| 3. Optimize | Edit code targeting hotspot | Reduce cost | -| 4. Verify | Re-run baseline command | Confirm improvement | -| 5. Benchmark | `cargo bench` | Prevent regression | diff --git a/docs/references/security-checklist.md b/docs/references/security-checklist.md deleted file mode 100644 index a44bff1a..00000000 --- a/docs/references/security-checklist.md +++ /dev/null @@ -1,141 +0,0 @@ -# Security Checklist - -Quick reference for security in flow-code. CLI tool handling local state, no web server. - -## Pre-Commit Secret Scan - -```bash -# Check staged files for secrets before committing -git diff --cached | grep -iE \ - "password\s*=|secret\s*=|api_key\s*=|token\s*=|private_key|BEGIN RSA|BEGIN EC" - -# Patterns to catch -grep -rn "AKIA[0-9A-Z]{16}" # AWS access keys -grep -rn "sk-[a-zA-Z0-9]{48}" # OpenAI API keys -grep -rn "ghp_[a-zA-Z0-9]{36}" # GitHub personal tokens -grep -rn "xoxb-[0-9]{10,}" # Slack bot tokens -``` - -## Files to Never Commit - -| File/Pattern | Reason | Mitigation | -|---|---|---| -| `.env` | Contains secrets | Add to `.gitignore` | -| `*.pem`, `*.key` | Private keys | Add to `.gitignore` | -| `.flow/` | Runtime state (may contain tokens) | Add to `.gitignore` | -| `credentials.json` | Service account keys | Add to `.gitignore` | -| `*.upstream` | Backup files with potential secrets | Add to `.gitignore` | - -Commit `.env.example` with placeholder values instead of `.env`. - -## Secrets Management - -| Do | Don't | -|---|---| -| Use environment variables for secrets | Hardcode secrets in source | -| Commit `.env.example` with placeholders | Commit `.env` with real values | -| Use `--json` flag for machine parsing | Parse human-readable output with secrets | -| Rotate secrets after exposure | Ignore leaked credentials | -| Scope API keys to minimum permissions | Use admin keys for all operations | - -## OWASP Top 10 Quick Reference - -| # | Vulnerability | Prevention | -|---|---|---| -| 1 | Broken Access Control | Auth checks on every endpoint, ownership verification | -| 2 | Cryptographic Failures | HTTPS, strong hashing, no secrets in code | -| 3 | Injection | Parameterized queries, input validation | -| 4 | Insecure Design | Threat modeling, spec-driven development | -| 5 | Security Misconfiguration | Minimal permissions, audit deps, secure defaults | -| 6 | Vulnerable Components | `cargo audit`, `cargo deny`, minimal deps | -| 7 | Auth Failures | Strong passwords, rate limiting, session management | -| 8 | Data Integrity Failures | Verify updates/dependencies, signed artifacts | -| 9 | Logging Failures | Log security events, don't log secrets | -| 10 | SSRF | Validate/allowlist URLs, restrict outbound requests | - -## Dependency Audit (Rust) - -```bash -# Audit for known vulnerabilities -cargo audit - -# Deny specific licenses or advisories -cargo deny check - -# Check for outdated dependencies -cargo outdated - -# Minimal dependency tree (fewer deps = smaller attack surface) -cargo tree | wc -l -``` - -| Tool | Purpose | When to Run | -|---|---|---| -| `cargo audit` | CVE database check | Pre-commit, CI | -| `cargo deny` | License + advisory policy | CI pipeline | -| `cargo outdated` | Stale dependency check | Weekly/monthly | -| `cargo tree` | Dependency graph audit | When adding deps | - -## Input Validation - -| Rule | Example | -|---|---| -| Validate at boundaries only | CLI args, JSON input, file reads | -| Use allowlists, not denylists | Accept known-good, reject everything else | -| Constrain string lengths | Task titles: 1-500 chars | -| Validate file paths | No `..` traversal, stay within `.flow/` | -| Sanitize output for shells | Escape special chars in generated commands | -| Parse, don't validate | Deserialize into typed structs, not raw strings | - -```rust -// Good: parse into typed struct at boundary -let task: TaskInput = serde_json::from_str(&input)?; - -// Bad: pass raw string around and validate later -fn process(raw: &str) { /* who validates this? */ } -``` - -## Error Handling - -| Context | Approach | -|---|---| -| CLI output (user-facing) | Clear error message, no stack traces | -| JSON output (`--json`) | Structured error: `{"error": "message"}` | -| Internal logging | Full context, file/line, but no secrets | -| Panics | Catch at boundaries with `catch_unwind` or `?` | - -```rust -// Good: generic error to user, detail in logs -eprintln!("Error: failed to open database"); -tracing::error!("DB open failed: {:?} at {:?}", err, path); - -// Bad: expose internals -eprintln!("Error: {}", err); // May leak path, SQL, etc. -``` - -## SQL Safety (libSQL) - -| Do | Don't | -|---|---| -| Use parameterized queries | Concatenate user input into SQL | -| Use `?` placeholders | Use `format!()` for query building | -| Validate types before query | Trust raw input as safe | -| Use transactions for multi-step ops | Execute multiple queries without atomicity | - -```rust -// Good: parameterized -conn.execute("INSERT INTO tasks (title) VALUES (?1)", [&title]).await?; - -// Bad: string concatenation -conn.execute(&format!("INSERT INTO tasks (title) VALUES ('{}')", title), ()).await?; -``` - -## File System Security - -| Rule | Rationale | -|---|---| -| Validate paths stay within `.flow/` | Prevent directory traversal | -| Use temp dirs for scratch work | Don't pollute user directories | -| Set restrictive permissions on state files | Protect `.flow/` data | -| Clean up temp files on exit | No stale sensitive data | -| Don't follow symlinks blindly | Potential symlink attacks | diff --git a/docs/references/testing-patterns.md b/docs/references/testing-patterns.md deleted file mode 100644 index 8b192a85..00000000 --- a/docs/references/testing-patterns.md +++ /dev/null @@ -1,167 +0,0 @@ -# Testing Patterns Reference - -Quick reference for testing patterns in flow-code. Rust backend, Bash scripts, Markdown skills. - -## Test Pyramid - -| Layer | Coverage | What to Test | Tools | -|-------|----------|--------------|-------| -| Unit | ~80% | Pure functions, logic, parsing, serialization | `cargo test`, `#[test]` | -| Integration | ~15% | DB queries, CLI commands, cross-crate calls | `cargo test --test`, temp dirs | -| E2E | ~5% | Full workflows (smoke tests, ralph loops) | `bash scripts/smoke_test.sh` | - -## Test Structure (Arrange-Act-Assert) - -```rust -#[test] -fn creates_task_with_default_status() { - // Arrange: set up test data - let input = TaskInput { title: "Test".into(), ..Default::default() }; - - // Act: perform the action - let result = create_task(input); - - // Assert: verify outcome - assert_eq!(result.status, Status::Todo); - assert!(!result.id.is_empty()); -} -``` - -## Rust Test Patterns - -| Pattern | Usage | Example | -|---------|-------|---------| -| `#[test]` | Basic unit test | `fn test_parse() { assert_eq!(parse("x"), Ok("x")); }` | -| `#[should_panic]` | Expected panic | `#[should_panic(expected = "empty")] fn test_empty()` | -| `#[tokio::test]` | Async test | `async fn test_db_query() { ... }` | -| `assert_eq!` | Equality check | `assert_eq!(actual, expected)` | -| `assert!` | Boolean check | `assert!(result.is_ok())` | -| `assert_ne!` | Inequality check | `assert_ne!(id1, id2)` | -| Test modules | Organize tests | `#[cfg(test)] mod tests { use super::*; }` | -| `#[ignore]` | Skip slow tests | Run with `cargo test -- --ignored` | - -## Running Tests - -```bash -# All tests -cd flowctl && cargo test --all - -# Specific crate -cargo test -p flowctl-core -cargo test -p flowctl-db -cargo test -p flowctl-cli - -# Single test by name -cargo test test_create_task - -# With output -cargo test -- --nocapture - -# Bash smoke/e2e tests -bash scripts/smoke_test.sh -bash scripts/ci_test.sh -bash scripts/teams_e2e_test.sh -``` - -## Mock Boundaries - -| Mock These | Don't Mock These | -|------------|-----------------| -| Database calls (libSQL) | Internal utility functions | -| HTTP/network requests | Business logic | -| File system operations | Data transformations | -| External API calls | Validation functions | -| Time/Date (`SystemTime`) | Pure functions | -| Environment variables | Parsing/serialization | - -## Prove-It Pattern (Bug Fixes) - -| Step | Action | Gate | -|------|--------|------| -| 1. Red | Write a test that reproduces the bug | Test MUST fail | -| 2. Confirm | Run test suite, verify only new test fails | No other regressions | -| 3. Fix | Implement the minimal fix | Keep scope tight | -| 4. Green | Run test suite, verify all tests pass | New test now passes | -| 5. Commit | Include test + fix in same commit | Evidence-based fix | - -## Test Naming Conventions - -```rust -#[cfg(test)] -mod tests { - use super::*; - - // Pattern: action_condition_expected_result - #[test] - fn parse_valid_input_returns_ok() { ... } - - #[test] - fn parse_empty_string_returns_error() { ... } - - #[test] - fn create_task_assigns_unique_id() { ... } -} -``` - -## Integration Test Structure - -```rust -// tests/integration_test.rs (separate file in tests/ directory) -use tempfile::TempDir; - -#[tokio::test] -async fn full_task_lifecycle() { - // Setup: temp directory for .flow/ state - let tmp = TempDir::new().unwrap(); - let db = Database::open(tmp.path()).await.unwrap(); - - // Create - let task = db.create_task("Test task").await.unwrap(); - assert_eq!(task.status, "todo"); - - // Transition - db.start_task(&task.id).await.unwrap(); - let updated = db.get_task(&task.id).await.unwrap(); - assert_eq!(updated.status, "in_progress"); - - // Complete - db.complete_task(&task.id, "Done").await.unwrap(); - let done = db.get_task(&task.id).await.unwrap(); - assert_eq!(done.status, "done"); -} -``` - -## Bash Test Patterns - -```bash -# Temp directory isolation -TMPDIR=$(mktemp -d) -trap "rm -rf $TMPDIR" EXIT - -# Assert command succeeds -$FLOWCTL create-epic "test" --json || { echo "FAIL: create-epic"; exit 1; } - -# Assert output contains expected string -OUTPUT=$($FLOWCTL show fn-1 --json) -echo "$OUTPUT" | grep -q '"status":"todo"' || { echo "FAIL: status"; exit 1; } - -# Assert command fails (expected) -if $FLOWCTL done nonexistent 2>/dev/null; then - echo "FAIL: should have failed"; exit 1 -fi -``` - -## Test Anti-Patterns - -| Anti-Pattern | Problem | Better Approach | -|---|---|---| -| Testing implementation details | Breaks on refactor | Test inputs/outputs only | -| Shared mutable state between tests | Tests pollute each other | Use temp dirs, fresh DB per test | -| No assertions (test runs = passes) | False confidence | Every test needs `assert!` | -| Hardcoded paths in tests | Breaks on other machines | Use `TempDir`, relative paths | -| Skipping tests to pass CI | Hides real bugs | Fix or delete the test | -| Testing third-party code | Wastes time | Mock the boundary | -| Overly broad assertions (`is_ok()`) | Doesn't catch regressions | Assert specific values | -| Giant test functions (50+ lines) | Hard to diagnose failures | One assertion per concept | -| Test depends on execution order | Flaky in parallel | Each test self-contained | -| Sleeping for async (`thread::sleep`) | Slow and flaky | Use proper async awaiting | diff --git a/docs/token-optimization-audit.md b/docs/token-optimization-audit.md index 79fc4f61..d83cd5fe 100644 --- a/docs/token-optimization-audit.md +++ b/docs/token-optimization-audit.md @@ -200,7 +200,7 @@ Per-spec waste: ~250 chars of pure duplication. 8-task epic → ~2KB redundancy. | # | Optimization | File | Priority | |---|---|---|:---:| -| 21 | Remove `file_path` YAML field (derivable from file's own path) | `flowctl-db/spec_writer.rs` | P0 | +| 21 | Remove `file_path` YAML field (derivable from file's own path) | `flowctl-core/json_store.rs` | P0 | | 22 | Remove `schema_version: 1` from every spec (only needed during migration) | spec_writer | P0 | | 23 | Timestamps: drop microsecond precision (`…313450Z` → `…Z`) | spec_writer | P0 | | 24 | H1 heading should not repeat full ID (`# Title` not `# fn-15-….3 Title`) | spec_writer | P0 | @@ -521,7 +521,7 @@ TaskCompleted: task-completed - `OutputOpts` global flags: `flowctl/crates/flowctl-cli/src/output.rs` - `is_terminal()` detection: `flowctl/crates/flowctl-cli/src/output.rs:37` - `init_compact()` called once: `flowctl/crates/flowctl-cli/src/main.rs:451` -- Spec frontmatter schema: `flowctl/crates/flowctl-db/src/spec_writer.rs` +- Spec frontmatter schema: `flowctl/crates/flowctl-core/src/json_store.rs` - ID format: `flowctl/crates/flowctl-core/src/` (no dedicated resolver yet) ### E. Epic Stub