From 1f7cdea8a79f16d715cdd75df5108562477237b9 Mon Sep 17 00:00:00 2001 From: GiggleLiu Date: Thu, 5 Mar 2026 09:59:49 +0800 Subject: [PATCH 1/9] update refs --- .claude/CLAUDE.md | 1 + .claude/skills/check-issue/SKILL.md | 424 ++++++++++++++++++ .claude/skills/check-issue/references.md | 287 ++++++++++++ .../2026-03-05-check-issue-skill-design.md | 60 +++ references/rules-karp.md | 138 ++++++ 5 files changed, 910 insertions(+) create mode 100644 .claude/skills/check-issue/SKILL.md create mode 100644 .claude/skills/check-issue/references.md create mode 100644 docs/plans/2026-03-05-check-issue-skill-design.md create mode 100644 references/rules-karp.md diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md index dbd7a3aed..955610862 100644 --- a/.claude/CLAUDE.md +++ b/.claude/CLAUDE.md @@ -12,6 +12,7 @@ Rust library for NP-hard problem reductions. Implements computational problems w - [write-model-in-paper](skills/write-model-in-paper/SKILL.md) -- Write or improve a problem-def entry in the Typst paper. Covers formal definition, background, example with visualization, and algorithm list. - [write-rule-in-paper](skills/write-rule-in-paper/SKILL.md) -- Write or improve a reduction-rule entry in the Typst paper. Covers complexity citation, self-contained proof, detailed example, and verification. - [release](skills/release/SKILL.md) -- Create a new crate release. Determines version bump from diff, verifies tests/clippy, then runs `make release`. +- [check-issue](skills/check-issue/SKILL.md) -- Quality gate for `[Rule]` and `[Model]` issues. Checks usefulness, non-triviality, correctness of literature, and writing quality. Posts structured report and adds failure labels. - [meta-power](skills/meta-power/SKILL.md) -- Batch-resolve all open `[Model]` and `[Rule]` issues autonomously: plan, implement, review, fix CI, merge — in dependency order (models first). 
## Commands
diff --git a/.claude/skills/check-issue/SKILL.md b/.claude/skills/check-issue/SKILL.md
new file mode 100644
index 000000000..b9656fbec
--- /dev/null
+++ b/.claude/skills/check-issue/SKILL.md
@@ -0,0 +1,424 @@
+---
+name: check-issue
+description: Use when reviewing a [Rule] or [Model] GitHub issue for quality before implementation — checks usefulness, non-triviality, correctness of literature claims, and writing quality
+---
+
+# Check Issue
+
+Quality gate for `[Rule]` and `[Model]` GitHub issues. Runs 4 checks (adapted per issue type) and posts a structured report as a GitHub comment. Adds labels for failures but does NOT close issues.
+
+## Invocation
+
+```
+/check-issue <issue-number>
+```
+
+## Process
+
+```dot
+digraph check_issue {
+    rankdir=TB;
+    "Fetch issue" [shape=box];
+    "Detect issue type" [shape=diamond];
+    "Run Rule checks" [shape=box];
+    "Run Model checks" [shape=box];
+    "Unknown type: stop" [shape=box, style=filled, fillcolor="#ffcccc"];
+    "Compose report" [shape=box];
+    "Post comment + add labels" [shape=box];
+
+    "Fetch issue" -> "Detect issue type";
+    "Detect issue type" -> "Run Rule checks" [label="[Rule]"];
+    "Detect issue type" -> "Run Model checks" [label="[Model]"];
+    "Detect issue type" -> "Unknown type: stop" [label="other"];
+    "Run Rule checks" -> "Compose report";
+    "Run Model checks" -> "Compose report";
+    "Compose report" -> "Post comment + add labels";
+}
+```
+
+### Step 0: Fetch and Parse Issue
+
+```bash
+gh issue view <issue-number> --json title,body,labels
+```
+
+- Detect issue type from title: `[Rule]` or `[Model]`
+- If neither, stop with message: "This skill only checks [Rule] and [Model] issues."
+
+---
+
+# Part A: Rule Issue Checks
+
+Applies when the title contains `[Rule]`.
+
+## Rule Check 1: Usefulness (fail label: `Useless`)
+
+**Goal:** Is this reduction novel or does it improve on an existing one?
+
+1. Parse **Source** and **Target** problem names from the issue body.
+
+2. 
Resolve problem aliases (the issue may say "MIS" but `pred` needs "MaximumIndependentSet"):
+   ```bash
+   pred show <source> --json 2>/dev/null
+   pred show <target> --json 2>/dev/null
+   ```
+   If either fails, try the full name. If that still fails, **Warn** (unknown problem — may be a new model).
+
+3. Check existing path:
+   ```bash
+   pred path <source> <target> --json
+   ```
+
+4. Decision:
+   - **No path exists** → **Pass** (novel reduction)
+   - **Multi-hop path only** (2+ hops) → **Pass** with note: "Currently reachable in N hops; a direct rule adds value"
+   - **Direct path exists** (1 hop) → compare overhead:
+     - Parse the proposed overhead from the issue's "Size Overhead" table
+     - Get existing overhead from `pred show <target> --json` → `reduces_from[]` where source matches
+     - If proposed overhead is **strictly lower** on at least one dimension (and not higher on any) → **Pass** ("improves existing reduction")
+     - If overhead is **equal or higher** on all dimensions → **Fail**
+     - If overhead comparison is ambiguous (different dimensions, incomparable expressions) → **Warn** with explanation
+
+5. Check **Motivation** field: if empty, placeholder, or just "enables X" without explaining *why this path matters* → **Warn**
+
+---
+
+## Rule Check 2: Non-trivial (fail label: `Trivial`)
+
+**Goal:** Does the reduction involve genuine structural transformation?
+
+Read the "Reduction Algorithm" section and flag as **Fail** if:
+
+- **Variable substitution only:** The mapping is a 1-to-1 relabeling (e.g., `x_i → 1 - x_i` for complement problems). A valid reduction must construct new constraints, objectives, or graph structure.
+- **Subtype coercion:** The reduction merely casts to a more general type (e.g., SimpleGraph → HyperGraph) with no structural change to the problem instance.
+- **Same-problem identity:** Reducing between variants of the same problem with no insight (e.g., `MIS` → `MIS` by setting all weights to 1). 
+- **Insufficient detail:** The algorithm is a hand-wave ("map variables accordingly", "follows from the definition") — not a step-by-step procedure a programmer could implement. This is also a **Fail**. + +If the construction involves gadgets, penalty terms, auxiliary variables, or non-trivial structural transformation → **Pass**. + +--- + +## Rule Check 3: Correctness (fail label: `Wrong`) + +**Goal:** Are the cited references real and do they support the claims? + +### 3a: Extract References + +Parse all literature citations from the issue body — paper titles, author names, years, DOIs, arxiv IDs, textbook references. + +### 3b: Cross-check Against Project Knowledge Base + +First, read `references.md` (in this skill's directory) for quick fact-checking — it contains known complexity bounds, key results, and established reductions with BibTeX keys. If the issue's claims contradict known facts in this file → **Fail** immediately. + +Then read `docs/paper/references.bib` for the full bibliography. If a cited paper is already in the bibliography: +- Verify the claim in the issue matches what the paper actually says +- If the issue cites a result from a known paper but misquotes it → **Fail** + +### 3c: External Verification (best-effort fallback chain) + +For each reference NOT in the bibliography: + +1. **Try arxiv MCP** (if available): search by title/author/arxiv ID +2. **Try Semantic Scholar MCP** (if available): search by title/DOI +3. **Fall back to WebSearch + WebFetch**: search for the paper, fetch abstract + +For each reference, verify: +- Paper exists with matching title, authors, and year +- The specific theorem/result cited actually appears in the paper +- Cross-check the claim against at least one other source (survey paper, textbook, or independent reference) + +If any cited fact **cannot be verified** (paper not found, claim not in paper) → **Fail** with specifics. 
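When none of the MCP tools are available, the WebFetch fallback can lean on arXiv's public Atom export API. A minimal sketch in Python (the endpoint and `search_query` syntax are arXiv's documented export API; the helper names and the containment-matching heuristic are illustrative, not part of this repo):

```python
# Sketch: check that a paper exists on arXiv by title, via the public Atom
# export API. Fetch the URL (urllib, WebFetch, ...), then parse the feed.
import urllib.parse
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace used by the feed

def arxiv_query_url(title: str, max_results: int = 5) -> str:
    """Build an export-API query URL for a title search."""
    q = urllib.parse.urlencode({
        "search_query": f'ti:"{title}"',
        "max_results": max_results,
    })
    return f"http://export.arxiv.org/api/query?{q}"

def titles_from_feed(atom_xml: str) -> list[str]:
    """Extract entry titles from an Atom feed, collapsing whitespace."""
    root = ET.fromstring(atom_xml)
    return [" ".join(e.findtext(f"{ATOM}title", "").split())
            for e in root.findall(f"{ATOM}entry")]

def title_found(atom_xml: str, expected: str) -> bool:
    """Case-insensitive containment check against returned titles."""
    want = expected.lower()
    return any(want in t.lower() for t in titles_from_feed(atom_xml))
```

If `title_found` is false after trying reasonable title variants, report "not found" rather than guessing what the paper contains.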
+ +### 3d: Better Algorithm Discovery + +While searching, if you find: +- A **more recent paper** that supersedes the cited reference +- A **lower overhead** construction for the same reduction +- A **different approach** that achieves better bounds + +→ Include in the report as a **Recommendation** (not a failure). Example: "Note: Smith et al. (2024) improve on this with O(n) overhead instead of O(n^2)." + +--- + +## Rule Check 4: Well-written (fail label: `PoorWritten`) + +**Goal:** Is the issue clear, complete, and implementable? + +### 4a: Information Completeness + +Check all template sections are present and substantive (not placeholder text): + +| Section | Required content | +|---------|-----------------| +| Source | Valid problem name | +| Target | Valid problem name | +| Motivation | Why this reduction matters | +| Reference | At least one literature citation | +| Reduction Algorithm | Complete step-by-step procedure | +| Size Overhead | Table with code metric names and formulas | +| Validation Method | How to verify correctness | +| Example | Concrete worked instance | + +Missing or placeholder sections → list them as **Fail** items. + +### 4b: Algorithm Completeness + +The reduction algorithm must be a **complete, step-by-step procedure** detailed enough for a programmer to implement: +- Every step numbered and unambiguous +- All intermediate values defined +- No gaps ("similarly for the remaining variables") +- Solution extraction must be clear from the variable mapping + +If the algorithm is a high-level sketch rather than an implementable procedure → **Fail**. + +### 4c: Symbol and Notation Consistency + +- **All symbols must be defined before first use.** E.g., if the algorithm references `n`, there must be a prior line "let n = |V|". +- **Symbols must be consistent** between sections. If the algorithm defines `G = (V, E)` but the overhead table uses `N` for vertex count without defining it → **Fail**. 
+- **Code metric names** in the overhead table must match actual getter methods on the target problem (e.g., `num_vertices`, `num_edges`, `num_vars`, `num_clauses`). Check with `pred show <target> --json` → `size_fields`.
+
+### 4d: Example Quality
+
+- **Non-trivial**: Must have enough structure to exercise the reduction meaningfully (not just 2 vertices)
+- **Brute-force solvable**: Small enough to verify by hand or with `pred solve`
+- **Fully worked**: Shows the source instance, the reduction construction step by step, and the target instance — not just "apply the reduction to get..."
+
+---
+
+# Part B: Model Issue Checks
+
+Applies when the title contains `[Model]`.
+
+## Model Check 1: Usefulness (fail label: `Useless`)
+
+**Goal:** Does this problem add value to the reduction graph?
+
+1. Parse the **problem name** from the issue body ("Name" field under Definition).
+
+2. Check if the problem already exists:
+   ```bash
+   pred show <name> --json 2>/dev/null
+   ```
+   If it succeeds, the problem **already exists** → **Fail** ("Problem already implemented").
+
+3. Check **Motivation** field:
+   - Is there a concrete use case? (quantum computing, network design, scheduling, etc.)
+   - Does it mention what reductions this problem enables? A problem without any planned reduction rules is an orphan node.
+   - If motivation is empty, placeholder, or vague → **Warn**
+
+4. Check **How to solve** section:
+   - At least one solver method must be checked (brute-force, ILP reduction, or other)
+   - If no solver path is identified → **Warn** ("No solver means reduction rules can't be verified")
+
+---
+
+## Model Check 2: Non-trivial (fail label: `Trivial`)
+
+**Goal:** Is this genuinely a distinct problem, not a repackaging of an existing one?
+ +Flag as **Fail** if: + +- **Isomorphic to existing problem:** The definition is mathematically equivalent to a problem already in the codebase under a different name (e.g., proposing "Maximum Weight Clique" when `MaximumClique` with weights already exists). +- **Trivial variant:** The proposed problem is just an existing problem restricted to a specific graph type or weight type that could be handled by adding a variant to the existing model (e.g., "MIS on bipartite graphs" is a variant, not a new problem). +- **Trivial renaming:** Same feasibility constraints and objective, different name. + +Check against existing problems: +```bash +pred list --json +``` + +If the problem has a genuinely different feasibility constraint or objective function from all existing problems → **Pass**. + +--- + +## Model Check 3: Correctness (fail label: `Wrong`) + +**Goal:** Are the definition, complexity claims, and references accurate? + +### 3a: Definition Correctness + +- Verify the formal definition is mathematically well-formed +- Check that feasibility constraints and objective are clearly separated +- Verify the variable domain matches the problem semantics (binary for selection, k-ary for coloring, etc.) + +### 3b: Complexity Verification + +The issue claims a best-known exact algorithm with a specific time bound. Verify: +- The cited paper/algorithm actually exists (use same fallback chain as Rule Check 3) +- The time bound matches what the paper claims +- For polynomial-time problems: verify they are indeed polynomial (not NP-hard) +- For NP-hard problems: verify the exponential base is correct (e.g., 1.1996^n for MIS, not 2^n) +- **If a better algorithm is found** during search → add as a **Recommendation** in the comment + +### 3c: Cross-check Against Project Knowledge Base + +Same process as Rule Check 3b — first read `references.md` (in this skill's directory) for quick fact-checking against known complexity bounds and results. 
Then read `docs/paper/references.bib` and verify claims against known papers. + +### 3d: External Verification + +Same fallback chain as Rule Check 3c: +1. arxiv MCP → 2. Semantic Scholar MCP → 3. WebSearch + WebFetch + +Verify each reference exists and supports the claims made in the issue. + +--- + +## Model Check 4: Well-written (fail label: `PoorWritten`) + +**Goal:** Is the issue clear, complete, and implementable? + +### 4a: Information Completeness + +Check all template sections are present and substantive: + +| Section | Required content | +|---------|-----------------| +| Motivation | Concrete use case and graph connectivity | +| Name | Valid problem name following naming conventions | +| Reference | At least one literature citation | +| Definition | Formal: input, feasibility constraints, objective | +| Variables | Count, per-variable domain, semantic meaning | +| Schema | Type name, variants, field table | +| Complexity | Best known algorithm with citation | +| How to solve | At least one solver method checked | +| Example Instance | Concrete instance with known solution | + +Missing or placeholder sections → list them as **Fail** items. + +### 4b: Definition Completeness + +The formal definition must be **precise and implementable**: +- Input structure clearly specified (graph, formula, matrix, etc.) +- Feasibility constraints stated as mathematical conditions +- Objective (if optimization) stated as what to maximize/minimize +- All quantifiers explicit ("for all edges (u,v) in E" not "adjacent vertices don't share colors") + +### 4c: Symbol and Notation Consistency + +- **All symbols defined before first use.** If the definition uses `G = (V, E)`, the Variables section should reference `V` consistently. +- **Symbols consistent across sections.** The Schema field descriptions must match symbols in the Definition. +- **Naming conventions:** optimization problems must use `Maximum`/`Minimum` prefix. Check against CLAUDE.md naming rules. 
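Parts of these notation checks can be mechanized. A rough sketch, assuming the issue sections are available as plain strings; the regex heuristic and function names are illustrative, not repository tooling:

```python
# Sketch: mechanical helpers for Check 4c. The backtick/dollar regex is a
# heuristic for single-letter math symbols; real issues may need more care.
import re

def has_opt_prefix(name: str) -> bool:
    """Naming convention: optimization problems start with Maximum/Minimum."""
    return name.startswith(("Maximum", "Minimum"))

def undefined_symbols(definition: str, later_sections: str) -> set[str]:
    """Single-letter symbols used in later sections but never introduced
    in the definition text."""
    sym = re.compile(r"[`$]([A-Za-z])[`$]")
    defined = set(sym.findall(definition))
    used = set(sym.findall(later_sections))
    return used - defined
```

A nonempty result from `undefined_symbols` is exactly the "symbol `K` undefined" class of finding reported under Well-written.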
+ +### 4d: Example Quality + +- **Non-trivial**: Enough vertices/variables to exercise constraints meaningfully (not just a triangle) +- **Known optimal solution provided**: Must state the optimal value, not just the instance +- **Detailed enough for paper**: This example will appear in the paper — it needs to be illustrative + +--- + +# Step 2: Compose and Post Report + +### Report Format + +Post a single GitHub comment. The table adapts to the issue type: + +**For [Rule] issues:** + +````markdown +## Issue Quality Check — Rule + +| Check | Result | Details | +|-------|--------|---------| +| Usefulness | ✅ Pass | No existing direct reduction Source → Target | +| Non-trivial | ✅ Pass | Gadget construction with penalty terms | +| Correctness | ❌ Fail | Paper "Smith 2020" not found on arxiv or Semantic Scholar | +| Well-written | ⚠️ Warn | Symbol `m` used in overhead table but not defined in algorithm | + +**Overall: 2 passed, 1 failed, 1 warning** + +--- + +### Usefulness +[Detailed explanation] + +### Non-trivial +[Detailed explanation] + +### Correctness +[Per-reference verification results, any better algorithms found] + +### Well-written +[Specific items to fix] + +#### Recommendations +- [Better algorithms or papers discovered] +- [Suggestions for improving the issue] +```` + +**For [Model] issues:** + +````markdown +## Issue Quality Check — Model + +| Check | Result | Details | +|-------|--------|---------| +| Usefulness | ✅ Pass | Novel problem not yet in reduction graph | +| Non-trivial | ✅ Pass | Distinct feasibility constraints from existing problems | +| Correctness | ⚠️ Warn | Complexity bound not independently verified | +| Well-written | ❌ Fail | Missing Variables section; symbol `K` undefined | + +**Overall: 2 passed, 1 warning, 1 failed** + +--- + +### Usefulness +[Detailed explanation] + +### Non-trivial +[Detailed explanation] + +### Correctness +[Per-reference verification, complexity check results, better algorithms found] + +### Well-written 
+[Specific items to fix]
+
+#### Recommendations
+- [Better complexity bounds discovered]
+- [Suggestions for improving the issue]
+````
+
+### Label Application
+
+```bash
+# Only add labels for FAILED checks (not warnings)
+gh issue edit <issue-number> --add-label "Useless"      # if Check 1 failed
+gh issue edit <issue-number> --add-label "Trivial"      # if Check 2 failed
+gh issue edit <issue-number> --add-label "Wrong"        # if Check 3 failed
+gh issue edit <issue-number> --add-label "PoorWritten"  # if Check 4 failed
+```
+
+**Never close the issue.** Labels and comments only.
+
+### Comment Posting
+
+```bash
+gh issue comment <issue-number> --body "$(cat <<'EOF'
+<report markdown>
+EOF
+)"
+```
+
+---
+
+## Tool Fallback Chain for Literature
+
+| Priority | Tool | Use for |
+|----------|------|---------|
+| 1 | arxiv MCP | arxiv papers (search by ID, title, author) |
+| 2 | Semantic Scholar MCP | DOI lookup, citation graphs, abstracts |
+| 3 | WebSearch | General paper search, cross-referencing |
+| 4 | WebFetch | Fetch specific paper pages for claim verification |
+
+If an MCP tool is not available, skip to the next in the chain. All checks should be possible with just WebSearch + WebFetch as a baseline.
+
+---
+
+## Common Mistakes
+
+- **Don't fail on warnings.** Only add labels for definitive failures. Ambiguous cases get warnings.
+- **Don't close issues.** This skill labels and comments only.
+- **Don't hallucinate paper content.** If you can't find a paper, say "not found" — don't guess what it might contain.
+- **Match problem names carefully.** Issues may use aliases (MIS, MVC, SAT) that need resolution via `pred show`.
+- **Check the right template.** `[Rule]` and `[Model]` issues have different sections — don't check for "Reduction Algorithm" on a Model issue.
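The label-application policy above (failed checks only, never close) can be kept in one place as a pure function. A sketch; the check-result dict shape and helper name are hypothetical, while `gh issue edit --add-label` is the real CLI flag:

```python
# Sketch: map check outcomes to `gh issue edit` commands. Only FAILED
# checks produce a label; warnings and passes produce nothing, and the
# issue is never closed.
FAIL_LABELS = {
    "usefulness": "Useless",
    "non_trivial": "Trivial",
    "correctness": "Wrong",
    "well_written": "PoorWritten",
}

def label_commands(issue: int, results: dict[str, str]) -> list[str]:
    """results maps check name -> 'pass' | 'warn' | 'fail'."""
    return [
        f'gh issue edit {issue} --add-label "{FAIL_LABELS[check]}"'
        for check, outcome in results.items()
        if outcome == "fail"
    ]
```

Emitting the commands as strings (rather than invoking `gh` directly) makes the policy easy to test without touching GitHub.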
diff --git a/.claude/skills/check-issue/references.md b/.claude/skills/check-issue/references.md
new file mode 100644
index 000000000..1b551170b
--- /dev/null
+++ b/.claude/skills/check-issue/references.md
@@ -0,0 +1,287 @@
+# Quick Reference: Known Facts for Issue Fact-Checking
+
+Use this file to cross-check claims in `[Rule]` and `[Model]` issues against established results. Built from the Related Projects in README.md.
+
+---
+
+## 1. Karp DSL (PLDI 2022)
+
+**Source:** [REA1/karp](https://github.com/REA1/karp) — A Racket DSL for writing and testing Karp reductions between NP-complete problems.
+**Paper:** Zhang, Hartline & Dimoulas, "Karp: A Language for NP Reductions", PLDI 2022.
+
+### Karp's 21 NP-Complete Problems
+
+All 21 problems from Karp (1972), which the DSL supports:
+
+| # | Problem | Category |
+|---|---------|----------|
+| 1 | Satisfiability (SAT) | Logic |
+| 2 | 0-1 Integer Programming | Mathematical Programming |
+| 3 | Clique | Graph Theory |
+| 4 | Set Packing | Sets and Partitions |
+| 5 | Vertex Cover | Graph Theory |
+| 6 | Set Covering | Sets and Partitions |
+| 7 | Feedback Node Set | Graph Theory |
+| 8 | Feedback Arc Set | Graph Theory |
+| 9 | Directed Hamiltonian Circuit | Graph Theory |
+| 10 | Undirected Hamiltonian Circuit | Graph Theory |
+| 11 | 3-SAT | Logic |
+| 12 | Chromatic Number (Graph Coloring) | Graph Theory |
+| 13 | Clique Cover | Graph Theory |
+| 14 | Exact Cover | Sets and Partitions |
+| 15 | Hitting Set | Sets and Partitions |
+| 16 | Steiner Tree | Network Design |
+| 17 | 3-Dimensional Matching | Sets and Partitions |
+| 18 | Knapsack | Mathematical Programming |
+| 19 | Job Sequencing | Scheduling |
+| 20 | Partition | Sets and Partitions |
+| 21 | Max Cut | Graph Theory |
+
+### Karp's Reduction Tree (source → target)
+
+Per Figure 1 of Karp (1972):
+
+```
+SAT
+├── 0-1 Integer Programming
+├── Clique
+│   ├── Set Packing
+│   └── Vertex Cover
+│       ├── Set Covering
+│       ├── Feedback Node Set
+│       ├── Feedback Arc Set
+│       └── Directed Hamiltonian Circuit
+│           └── Undirected Hamiltonian Circuit
+└── 3-SAT
+    └── Chromatic Number
+        ├── Clique Cover
+        └── Exact Cover
+            ├── Hitting Set
+            ├── Steiner Tree
+            ├── 3-Dimensional Matching
+            └── Knapsack
+                ├── Job Sequencing
+                └── Partition
+                    └── Max Cut
+```
+
+**Key reductions:**
+- SAT → 3-SAT (clause splitting with auxiliary variables)
+- SAT → Clique (one vertex per literal occurrence; edges join compatible literals from different clauses)
+- Clique → Vertex Cover (complement graph: G has a k-clique iff complement(G) has a vertex cover of size n - k)
+- Clique → Set Packing (clique edges as sets)
+- Vertex Cover → Set Covering (edges as universe, vertices as sets)
+- Vertex Cover → Feedback Node Set
+- Exact Cover → 3-Dimensional Matching
+- Exact Cover → Knapsack
+- Partition → Max Cut
+
+---
+
+## 2. Complexity Zoo
+
+**Source:** [complexityzoo.net](https://complexityzoo.net/) — Comprehensive catalog of 550+ complexity classes (Scott Aaronson).
+
+### Key Complexity Classes
+
+| Class | Description |
+|-------|-------------|
+| **P** | Deterministic polynomial time |
+| **NP** | Nondeterministic polynomial time; "yes" certificates verifiable in poly time |
+| **co-NP** | Complements of NP problems |
+| **PSPACE** | Polynomial space (contains NP) |
+| **EXP** | Exponential time |
+| **BPP** | Bounded-error probabilistic polynomial time |
+| **BQP** | Bounded-error quantum polynomial time |
+| **PH** | Polynomial hierarchy |
+| **APX** | Problems with constant-factor approximation |
+| **MAX SNP** | Syntactically defined optimization class |
+
+### Canonical NP-Complete Problems (from Complexity Zoo)
+
+- **SAT** (Boolean satisfiability) — the first NP-complete problem (Cook 1971)
+- **3-Colorability** — Can vertices be colored with 3 colors, no adjacent same color?
+- **Hamiltonian Cycle** — Does a cycle visiting each vertex exactly once exist?
+- **Traveling Salesperson** — Is there a tour within distance T?
+- **Maximum Clique** — Do k mutually-adjacent vertices exist?
+- **Subset Sum** — Does a subset sum to exactly x?
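The complement relations behind the Clique/Vertex Cover reductions above can be confirmed by brute force on a small instance. A self-contained sketch (names illustrative), checking on a 5-cycle that max-IS + min-VC = n and that independent sets in G are cliques in the complement:

```python
# Brute-force check of complement identities on C5: the 5-cycle has
# max independent set 2, min vertex cover 3, and its complement (also a
# 5-cycle) has max clique 2.
from itertools import combinations

def is_independent(S, edges):
    return all(not (u in S and v in S) for u, v in edges)

def is_cover(S, edges):
    return all(u in S or v in S for u, v in edges)

def is_clique(S, edges):
    es = {frozenset(e) for e in edges}
    return all(frozenset((u, v)) in es for u, v in combinations(S, 2))

n = 5
edges = [(i, (i + 1) % n) for i in range(n)]  # the cycle C5
comp = [(u, v) for u, v in combinations(range(n), 2)
        if (u, v) not in edges and (v, u) not in edges]

subsets = [set(s) for k in range(n + 1) for s in combinations(range(n), k)]
mis = max(len(s) for s in subsets if is_independent(s, edges))
mvc = min(len(s) for s in subsets if is_cover(s, edges))
clq = max(len(s) for s in subsets if is_clique(s, comp))

assert mis + mvc == n and mis == clq  # 2 + 3 == 5, clique size 2
```

The same exhaustive pattern is how tiny issue examples can be validated before they go in the paper.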
+ +### Key Class Relationships + +- P vs NP: Open problem; unequal relative to random oracles +- NP = co-NP iff PH collapses +- NP ⊆ PSPACE (Savitch's theorem) +- If NP ⊆ P/poly then PH collapses to Σ₂P + +--- + +## 3. Compendium of NP Optimization Problems + +**Source:** [csc.kth.se/tcs/compendium](https://www.csc.kth.se/tcs/compendium/) — Online catalog of NP optimization problems with approximability results (Crescenzi & Kann). + +### Problem Categories + +| Code | Category | Problem Count | +|------|----------|---------------| +| GT | Graph Theory | ~60 | +| ND | Network Design | ~30 | +| SP | Sets and Partitions | ~20 | +| SR | Storage and Retrieval | ~10 | +| SS | Sequencing and Scheduling | ~20 | +| MP | Mathematical Programming | ~15 | +| AN | Algebra and Number Theory | ~10 | +| GP | Games and Puzzles | ~5 | +| LO | Logic | ~10 | +| AL | Automata and Language Theory | ~5 | +| PO | Program Optimization | ~5 | +| MS | Miscellaneous | ~10 | + +### Graph Theory: Covering and Partitioning + +| Problem | Type | +|---------|------| +| Minimum Vertex Cover | Min | +| Minimum Dominating Set | Min | +| Maximum Domatic Partition | Max | +| Minimum Edge Dominating Set | Min | +| Minimum Independent Dominating Set | Min | +| Minimum Graph Coloring (Chromatic Number) | Min | +| Minimum Color Sum | Min | +| Maximum Achromatic Number | Max | +| Minimum Edge Coloring | Min | +| Minimum Feedback Vertex Set | Min | +| Minimum Feedback Arc Set | Min | +| Minimum Maximal Matching | Min | +| Maximum Triangle Packing | Max | +| Maximum H-Matching | Max | +| Minimum Clique Partition | Min | +| Minimum Clique Cover | Min | +| Minimum Complete Bipartite Subgraph Cover | Min | + +### Graph Theory: Subgraph Problems + +| Problem | Type | +|---------|------| +| Maximum Clique | Max | +| Maximum Independent Set | Max | +| Maximum Independent Sequence | Max | +| Maximum Induced Subgraph with Property P | Max | +| Minimum Vertex Deletion (Subgraph Property) | Min | +| Minimum Edge 
Deletion (Subgraph Property) | Min | +| Maximum Degree-Bounded Connected Subgraph | Max | +| Maximum Planar Subgraph | Max | +| Maximum K-Colorable Subgraph | Max | +| Maximum Subforest | Max | +| Minimum Interval Graph Completion | Min | +| Minimum Chordal Graph Completion | Min | + +### Graph Theory: Vertex Ordering + +| Problem | Type | +|---------|------| +| Minimum Bandwidth | Min | +| Minimum Directed Bandwidth | Min | +| Minimum Linear Arrangement | Min | +| Minimum Cut Linear Arrangement | Min | + +### Network Design + +| Problem | Type | +|---------|------| +| Minimum Steiner Tree | Min | +| Minimum Biconnectivity Augmentation | Min | +| Minimum k-Connectivity Augmentation | Min | +| Minimum k-Vertex Connected Subgraph | Min | +| Traveling Salesman Problem | Min | +| Maximum Priority Flow | Max | + +### Approximability Classes + +| Class | Meaning | +|-------|---------| +| **PO** | Polynomial-time solvable exactly | +| **FPTAS** | Fully polynomial-time approximation scheme | +| **PTAS** | Polynomial-time approximation scheme | +| **APX** | Constant-factor approximation | +| **poly-APX** | Polynomial-factor approximation | +| **NPO** | General NP optimization (may have no good approximation) | + +--- + +## 4. Computers and Intractability (Garey & Johnson, 1979) + +**Source:** The classic reference cataloging 300+ NP-complete problems with reductions. The most cited book in computer science. + +### Problem Classification (Garey-Johnson numbering) + +Uses same category codes as the Compendium (GT, ND, SP, SS, MP, AN, GP, LO, AL, PO, MS). 
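The edge-based Vertex Cover → Set Cover construction catalogued in these sources (edges as the universe, each vertex contributing the set of edges it touches) can be verified end-to-end by brute force on a toy instance. A sketch with illustrative helper names:

```python
# Sketch: Vertex Cover -> Set Cover, then confirm by exhaustive search
# that the two optimal values coincide on a small graph.
from itertools import combinations

def reduce_vc_to_sc(n, edges):
    """Universe = edge indices; set for vertex v = edges incident to v."""
    universe = set(range(len(edges)))
    sets = {v: {i for i, e in enumerate(edges) if v in e} for v in range(n)}
    return universe, sets

def min_vertex_cover(n, edges):
    for k in range(n + 1):
        for S in combinations(range(n), k):
            if all(u in S or v in S for u, v in edges):
                return k

def min_set_cover(universe, sets):
    keys = list(sets)
    for k in range(len(keys) + 1):
        for chosen in combinations(keys, k):
            covered = set()
            for c in chosen:
                covered |= sets[c]
            if covered >= universe:
                return k

# 4-cycle with one diagonal: min vertex cover is {0, 2}, size 2.
n, edges = 4, [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
universe, sets = reduce_vc_to_sc(n, edges)
assert min_vertex_cover(n, edges) == min_set_cover(universe, sets) == 2
```

This is the shape of "Validation Method" expected in `[Rule]` issues: construct, solve both sides by brute force, compare optima.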
+ +### Key Problems and Their GJ Numbers + +| GJ # | Problem | Our Name | +|------|---------|----------| +| GT1 | Minimum Vertex Cover | MinimumVertexCover | +| GT2 | Minimum Independent Dominating Set | MinimumDominatingSet | +| GT4 | Graph Coloring (Chromatic Number) | KColoring | +| GT5 | Clique | MaximumClique | +| GT20 | Maximum Independent Set | MaximumIndependentSet | +| GT21 | Maximum Clique | MaximumClique | +| GT24 | Maximum Cut | MaxCut | +| GT34 | Hamiltonian Circuit | — | +| GT39 | Feedback Vertex Set | — | +| GT46 | Traveling Salesman | TravelingSalesman | +| ND5 | Steiner Tree in Graphs | — | +| SP1 | 3-Dimensional Matching | — | +| SP2 | Partition | — | +| SP5 | Set Covering | MinimumSetCovering | +| SP3 | Set Packing | MaximumSetPacking | +| SP13 | Bin Packing | BinPacking | +| SS1 | Multiprocessor Scheduling | — | +| MP1 | Integer Programming | ILP | +| LO1 | Satisfiability (SAT) | Satisfiability | +| LO2 | 3-Satisfiability | KSatisfiability | + +### Classic Reductions from Garey & Johnson + +| Source | Target | Method | +|--------|--------|--------| +| SAT → 3-SAT | Clause splitting | Auxiliary variables for long clauses | +| 3-SAT → 3-Coloring | Variable + palette gadget | | +| 3-SAT → MIS | Clause-variable gadget | Triangle per clause, conflict edges | +| MIS ↔ Vertex Cover | Complement | IS + VC = n | +| MIS ↔ Clique | Complement graph | IS in G = Clique in complement(G) | +| Vertex Cover → Set Cover | Edge-based | Edges as universe, vertex neighborhoods as sets | +| SAT → Min Dominating Set | Triangle gadget | Variable triangle + clause vertices | +| Vertex Cover → Feedback Vertex Set | | | +| Exact Cover → 3D Matching | | | +| Partition → Bin Packing | | | +| SAT → 0-1 Integer Programming | Direct encoding | Variables → integers, clauses → constraints | +| SAT → Max Cut | | | + +--- + +## Cross-Reference: Our Problems vs External Sources + +| Our Problem Name | Karp 21 | GJ # | Compendium | Complexity Zoo | 
+|-----------------|---------|------|------------|----------------| +| MaximumIndependentSet | via Clique complement | GT20 | GT: Subgraph | | +| MinimumVertexCover | #5 Vertex Cover | GT1 | GT: Covering | | +| MaximumClique | #3 Clique | GT5/GT21 | GT: Subgraph | NP-complete | +| MaxCut | #21 Max Cut | GT24 | GT: Subgraph | MAX SNP | +| KColoring | #12 Chromatic Number | GT4 | GT: Covering | NP-complete (k>=3) | +| MinimumDominatingSet | | GT2 | GT: Covering | | +| MaximumMatching | | | GT: Covering | **P** (polynomial) | +| TravelingSalesman | | GT46 | ND | NP-complete | +| Satisfiability | #1 SAT | LO1 | LO | NP-complete | +| KSatisfiability | #11 3-SAT | LO2 | LO | NP-complete | +| MinimumSetCovering | #6 Set Covering | SP5 | SP | | +| MaximumSetPacking | #4 Set Packing | SP3 | SP | | +| BinPacking | | SP13 | SP | | +| ILP | #2 0-1 Integer Prog | MP1 | MP | | +| SpinGlass | | | | | +| QUBO | | | | | +| CircuitSAT | | | | NP-complete | +| Factoring | | | | Not known NP-complete | +| PaintShop | | | | | +| BMF | | | | | +| BicliqueCover | | | GT: Covering | | +| MaximalIS | | | | | +| CVP | | | | | diff --git a/docs/plans/2026-03-05-check-issue-skill-design.md b/docs/plans/2026-03-05-check-issue-skill-design.md new file mode 100644 index 000000000..7f8855996 --- /dev/null +++ b/docs/plans/2026-03-05-check-issue-skill-design.md @@ -0,0 +1,60 @@ +# check-issue Skill Design + +## Purpose +Manual quality gate for `[Rule]` and `[Model]` GitHub issues. Invoked via `/check-issue `. Runs 4 checks (adapted per issue type) and posts a structured report as a GitHub comment. Adds failure labels but does NOT close issues. + +## Rule Checks + +### 1. 
Usefulness (label: `Useless`) +- Parse Source/Target from issue body +- `pred path --json` to check existing reductions +- Direct reduction exists → compare proposed overhead vs existing (`pred show --json` → `reduces_from[]`); fail if not strictly lower +- Multi-hop only → pass with note +- No path → pass +- Check Motivation field is substantive + +### 2. Non-trivial (label: `Trivial`) +- Flag: pure variable substitution, subtype coercion, same-problem identity +- Algorithm must be detailed enough for a programmer to implement + +### 3. Correctness (label: `Wrong`) +- Cross-check references against `references.bib` first +- External references: arxiv MCP → Semantic Scholar MCP → WebSearch (fallback chain) +- Verify paper exists, claim matches, cross-reference with other sources +- If better algorithm found → recommend in comment + +### 4. Well-written (label: `PoorWritten`) +- Information completeness (all template sections filled) +- Algorithm is complete step-by-step procedure +- Symbol/notation consistency (defined before use, match between sections) +- Code metric names match actual getter methods +- Example quality (non-trivial, brute-force solvable, fully worked) + +## Model Checks + +### 1. Usefulness (label: `Useless`) +- Check if problem already exists via `pred show --json` +- Motivation must be concrete with use cases +- Must identify at least one solver method +- Should mention planned reduction rules + +### 2. Non-trivial (label: `Trivial`) +- Not isomorphic to existing problem under different name +- Not a trivial variant (graph/weight restriction) of existing model +- Must have genuinely different feasibility constraints or objective + +### 3. Correctness (label: `Wrong`) +- Definition mathematically well-formed +- Complexity claims verified against literature +- If better algorithm found → recommend in comment +- Same fallback chain as rules + +### 4. 
Well-written (label: `PoorWritten`) +- All template sections present (Motivation, Definition, Variables, Schema, Complexity, How to solve, Example) +- Definition precise and implementable +- Symbol/notation consistency across sections +- Naming conventions followed (Maximum/Minimum prefix) +- Example with known optimal solution + +## Output +Structured GitHub comment with pass/fail/warn table, per-check details, and recommendations. Labels added only for failed checks. Never closes issues. diff --git a/references/rules-karp.md b/references/rules-karp.md new file mode 100644 index 000000000..f4f0172d0 --- /dev/null +++ b/references/rules-karp.md @@ -0,0 +1,138 @@ +# Rules in Karp (PLDI 2022) + +**Source:** [Karp: A Language for NP Reductions](https://doi.org/10.1145/3519939.3523732), Chenhao Zhang, Jason D. Hartline, Christos Dimoulas, PLDI 2022. + +**Repository:** `pkgref/karp/` + +Karp is a Racket DSL for programming and random-testing Karp (many-one) reductions between NP decision problems. The paper evaluates expressiveness by implementing 25 reductions from Kleinberg & Tardos's algorithms textbook (5 in-text examples + 20 end-of-chapter exercises). Only one reduction (3-SAT → Independent Set) is included in the open-source repository; the rest are described in the paper's Table 1. + +## Implemented Reductions (Table 1) + +| # | Source | Target | Key Features | +|---|--------|--------|-------------| +| 1 | 3-SAT | 3D-Matching | Set | +| 2 | 3-SAT | Directed-Hamiltonian-Cycle | Graph, Connectivity | +| 3 | ? | Directed-Edge-Disjoint-Paths | Graph, Connectivity | +| 4 | ? | Fully-Compatible-Configuration | Graph, Mapping | +| 5 | 3-SAT | Graph-Coloring | Graph, Mapping | +| 6 | 3-SAT | Independent-Set | Graph | +| 7 | ? | Low-Diameter-Clustering | Mapping | +| 8 | ? | Plot-Fulfillment | Graph, Path | +| 9 | ? | Winner-Determination-for-Combinatorial-Auctions | Set, Mapping | +| 10 | ? | Diverse-Subset | Set, Mapping | +| 11 | ? 
| Resource-Reservation | Set | +| 12 | ? | Strongly-Independent-Set | Graph | +| 13 | Independent-Set | Vertex-Cover | Graph | +| 14 | ? | Independent-Set | Graph, Mapping | +| 15 | ? | (a,b)-Skeleton | Set, Graph | +| 16 | ? | 2-Partition | Set | +| 17 | ? | Galactic-Shortest-Path | Mapping, Path | +| 18 | ? | Dominating-Set | Graph | +| 19 | ? | Nearby-Electromagnetic-Observation | Set, Mapping | +| 20 | ? | Feedback-Vertex-Set | Graph, Acyclicity | +| 21 | ? | Hitting-Set | Set | +| 22 | ? | Monotone-Satisfiability | CNF | +| 23 | Vertex-Cover | Set-Cover | Set | +| 24 | ? | Graphical-Steiner-Tree | Graph, Connectivity | +| 25 | ? | Strategic-Advertising | Set | + +**Note:** `?` means the source problem is not specified in the extracted table — these are textbook exercises where students choose the source problem. The "Key Features" column indicates which Karp language features (set operations, graph library, mappings, connectivity/acyclicity predicates, CNF library) are needed. + +## Detailed Reduction Algorithms + +### 1. 3-SAT → Independent Set + +**Source:** `pkgref/karp/example/3sat-to-iset.karp` (the only reduction with full source code in the repository) + +**Algorithm:** + +Given a 3-CNF formula φ with clauses C₁, C₂, ..., Cₘ: + +1. **Vertices:** For each clause Cᵢ and each literal l in Cᵢ, create a vertex (l, i). This gives 3m vertices total (3 per clause). + +2. **Conflict edges (E₁):** For every pair of vertices (l₁, i) and (l₂, j) where l₁ and l₂ are complementary literals (one is the negation of the other), add an edge. These edges prevent the independent set from containing contradictory assignments. + +3. **Clause edges (E₂):** For every clause Cᵢ, add edges between all pairs of vertices created from that clause's literals. This forms a triangle within each clause's 3 vertices, forcing the independent set to pick at most one literal per clause. + +4. **Threshold:** Set k = m (the number of clauses). + +5. 
**Result:** The graph G = (V, E₁ ∪ E₂) with threshold k forms the Independent Set instance. + +**Correctness argument (via certificate constructions):** + +- **Forward (SAT assignment → independent set):** For each clause, pick the vertex corresponding to one satisfied literal. Since one literal per clause is selected and no two complementary literals are chosen, this gives an independent set of size m = k. + +- **Backward (independent set → SAT assignment):** For each variable x, if the independent set contains a vertex whose first component is the positive literal of x, assign x = true; otherwise assign x = false. The independent set constraints guarantee consistency. + +### 2. 3-SAT → Directed-Hamiltonian-Cycle + +**Algorithm:** The classic gadget-based construction. For each variable, create a row of vertices that can be traversed left-to-right (true) or right-to-left (false). For each clause, create a triangle gadget. Stitch clause triangles into variable rows so that a Hamiltonian cycle exists iff the formula is satisfiable. Notably, Karp's random testing exposed a bug in a textbook version of this reduction. + +### 3. 3-SAT → Graph-Coloring (3-Coloring) + +**Algorithm:** Standard reduction using gadgets. Create a truth-setting component for each variable (two vertices connected by an edge, representing x and ¬x, plus a base vertex connected to both). Create a clause-satisfaction gadget for each clause (an OR-gadget subgraph). Connect components so that a valid 3-coloring exists iff the formula is satisfiable. + +### 4. Independent-Set → Vertex-Cover + +**Algorithm:** The complement reduction. Given a graph G = (V, E) and threshold k for Independent Set, create a Vertex Cover instance with the same graph G and threshold |V| − k. A set S is an independent set of size ≥ k iff V \ S is a vertex cover of size ≤ |V| − k. + +### 5.
Vertex-Cover → Set-Cover + +**Algorithm:** Given a graph G = (V, E) and threshold k: +- Universe E = the set of edges +- For each vertex v, create a set Sᵥ = {edges incident to v} +- Family F = {Sᵥ | v ∈ V} +- Threshold K = k + +A vertex cover of size k selects k vertices whose incident edges cover all edges, which directly corresponds to selecting k sets from F that cover the universe E. + +### 6. ? → 2-Partition + +**Algorithm:** Typically reduced from Subset-Sum. Given a set of integers and a target, construct a set where a partition into two equal-sum halves exists iff the original instance has a solution. The key challenge (noted in the paper's error analysis) is padding values to ensure all numbers are nonnegative. + +### 7. ? → Hitting-Set + +**Algorithm:** Typically reduced from Vertex-Cover or Set-Cover. Given a collection of sets and a budget k, find a set of ≤ k elements that "hits" (intersects) every set in the collection. This is the dual of Set-Cover. + +### 8. ? → Dominating-Set + +**Algorithm:** Typically reduced from Vertex-Cover. Given a graph G and threshold k, find a set of ≤ k vertices such that every vertex is either in the set or adjacent to a vertex in the set. + +### 9. ? → Feedback-Vertex-Set + +**Algorithm:** Typically reduced from Vertex-Cover or 3-SAT. Find a minimum set of vertices whose removal makes the graph acyclic. The paper notes this requires Karp's acyclicity predicates from the graph library. + +### 10. ? → Strongly-Independent-Set + +**Algorithm:** A variant of Independent Set with additional constraints. The paper's error analysis notes the common mistake of "no consideration for isolated vertices." + +### 11. ? → Monotone-Satisfiability + +**Algorithm:** Reduced from general SAT or 3-SAT. Monotone SAT restricts clauses to contain only positive or only negative literals. The reduction uses Karp's CNF library features. + +### 12. 3-SAT → 3D-Matching + +**Algorithm:** The classic reduction. 
Create element triples encoding variable assignments and clause satisfaction. A perfect 3D matching exists iff the formula is satisfiable. Uses Karp's set operations. + +### 13. ? → Graphical-Steiner-Tree + +**Algorithm:** Given a graph with weighted edges and a subset of required vertices (terminals), find a minimum-weight tree connecting all terminals. Typically reduced from 3-SAT or Vertex-Cover. Requires graph connectivity predicates. + +## Problem Categories (from the paper) + +The paper categorizes the 31 NP-hard problems from Kleinberg & Tardos into: + +| Category | Count | Examples | +|----------|-------|---------| +| Set Problems | 10 | Set-Cover, Hitting-Set, 3D-Matching | +| Labeled Object Problems | 5 | Interval-Scheduling, Graph-3-Coloring | +| Basic Graph Problems | 6 | Independent-Set, Vertex-Cover, Dominating-Set | +| Path/Connectivity Problems | 7 | Hamiltonian-Cycle, Steiner-Tree, Edge-Disjoint-Paths | +| Beyond Connectivity | 3 | Graph partitioning problems (not expressible) | +| Unsupported Data Types | 4 | String/real-number problems (not expressible) | + +## Notes + +- The `?` sources in Table 1 are textbook exercises where students choose an appropriate NP-hard source problem +- 7 of 42 exercises cannot be expressed in Karp due to language limitations (unsupported data types, graph partition certificates with variable number of parts) +- Karp's random testing successfully found bugs in 10 intentionally incorrect reductions, including a bug in a published textbook reduction (3-SAT → Directed-Hamiltonian-Cycle) From 300bdfed5f65a71219e3751f030760fea8f0d03d Mon Sep 17 00:00:00 2001 From: GiggleLiu Date: Thu, 5 Mar 2026 12:15:07 +0800 Subject: [PATCH 2/9] update documents --- .claude/skills/check-issue/SKILL.md | 7 +- .claude/skills/check-issue/references.md | 141 +++++----- .claude/skills/review-implementation/SKILL.md | 11 +- Cargo.toml | 2 + book.toml | 4 +- docs/src/SUMMARY.md | 1 + docs/src/complexity-review.md | 52 ++++ docs/src/design.md | 
42 +-- docs/src/static/module-graph.css | 53 ++++ docs/src/static/module-graph.js | 257 ++++++++++++++++++ problemreductions-cli/src/commands/graph.rs | 17 ++ problemreductions-cli/src/mcp/tools.rs | 8 + problemreductions-cli/tests/cli_tests.rs | 126 +++++++++ scripts/gen_module_graph.py | 247 +++++++++++++++++ 14 files changed, 879 insertions(+), 89 deletions(-) create mode 100644 docs/src/complexity-review.md create mode 100644 docs/src/static/module-graph.css create mode 100644 docs/src/static/module-graph.js create mode 100644 scripts/gen_module_graph.py diff --git a/.claude/skills/check-issue/SKILL.md b/.claude/skills/check-issue/SKILL.md index b9656fbec..41f1659a9 100644 --- a/.claude/skills/check-issue/SKILL.md +++ b/.claude/skills/check-issue/SKILL.md @@ -69,12 +69,11 @@ Applies when the title contains `[Rule]`. pred path --json ``` -4. Decision: +4. Decision (principle: new rule must reduce the reduction overhead): - **No path exists** → **Pass** (novel reduction) - - **Multi-hop path only** (2+ hops) → **Pass** with note: "Currently reachable in N hops; a direct rule adds value" - - **Direct path exists** (1 hop) → compare overhead: + - **Path exists** → compare overhead: - Parse the proposed overhead from the issue's "Size Overhead" table - - Get existing overhead from `pred show --json` → `reduces_from[]` where source matches + - Parse the overhead of the path. 
- If proposed overhead is **strictly lower** on at least one dimension (and not higher on any) → **Pass** ("improves existing reduction") - If overhead is **equal or higher** on all dimensions → **Fail** - If overhead comparison is ambiguous (different dimensions, incomparable expressions) → **Warn** with explanation diff --git a/.claude/skills/check-issue/references.md b/.claude/skills/check-issue/references.md index 1b551170b..bc42d77c1 100644 --- a/.claude/skills/check-issue/references.md +++ b/.claude/skills/check-issue/references.md @@ -1,17 +1,19 @@ # Quick Reference: Known Facts for Issue Fact-Checking -Use this file to cross-check claims in `[Rule]` and `[Model]` issues against established results. Built from the Related Projects in README.md. +Use this file to cross-check claims in `[Rule]` and `[Model]` issues against established results. Built from the Related Projects in README.md. Each entry includes source URLs for traceability. --- ## 1. Karp DSL (PLDI 2022) **Source:** [REA1/karp](https://github.com/REA1/karp) — A Racket DSL for writing and testing Karp reductions between NP-complete problems. -**Paper:** Zhang, Hartline & Dimoulas, "Karp: A Language for NP Reductions", PLDI 2022. +**Paper:** Zhang, Hartline & Dimoulas, ["Karp: A Language for NP Reductions"](https://dl.acm.org/doi/abs/10.1145/3519939.3523732), PLDI 2022. 
([PDF](https://users.cs.northwestern.edu/~chrdimo/pubs/pldi22-zhd.pdf)) +**Problem definitions:** [karp/problem-definition/](https://github.com/REA1/karp/tree/main/problem-definition) +**Reduction implementations:** [karp/reduction/](https://github.com/REA1/karp/tree/main/reduction) ### Karp's 21 NP-Complete Problems -All 21 problems from Karp (1972), which the DSL supports: +All 21 problems from [Karp (1972)](https://en.wikipedia.org/wiki/Karp%27s_21_NP-complete_problems), which the DSL supports: | # | Problem | Category | |---|---------|----------| @@ -39,6 +41,8 @@ All 21 problems from Karp (1972), which the DSL supports: ### Karp's Reduction Tree (source → target) +Source: [Wikipedia: Karp's 21 NP-complete problems](https://en.wikipedia.org/wiki/Karp%27s_21_NP-complete_problems) + ``` SAT ├── 3-SAT @@ -82,21 +86,23 @@ SAT ### Key Complexity Classes -| Class | Description | -|-------|-------------| -| **P** | Deterministic polynomial time | -| **NP** | Nondeterministic polynomial time; "yes" certificates verifiable in poly time | -| **co-NP** | Complements of NP problems | -| **PSPACE** | Polynomial space (contains NP) | -| **EXP** | Exponential time | -| **BPP** | Bounded-error probabilistic polynomial time | -| **BQP** | Bounded-error quantum polynomial time | -| **PH** | Polynomial hierarchy | -| **APX** | Problems with constant-factor approximation | -| **MAX SNP** | Syntactically defined optimization class | +| Class | Description | Source | +|-------|-------------|--------| +| **P** | Deterministic polynomial time | [P](https://complexityzoo.net/Complexity_Zoo:P#p) | +| **NP** | Nondeterministic polynomial time; "yes" certificates verifiable in poly time | [NP](https://complexityzoo.net/Complexity_Zoo:N#np) | +| **co-NP** | Complements of NP problems | [co-NP](https://complexityzoo.net/Complexity_Zoo:C#conp) | +| **PSPACE** | Polynomial space (contains NP) | [PSPACE](https://complexityzoo.net/Complexity_Zoo:P#pspace) | +| **EXP** | Exponential time | 
[EXP](https://complexityzoo.net/Complexity_Zoo:E#exp) | +| **BPP** | Bounded-error probabilistic polynomial time | [BPP](https://complexityzoo.net/Complexity_Zoo:B#bpp) | +| **BQP** | Bounded-error quantum polynomial time | [BQP](https://complexityzoo.net/Complexity_Zoo:B#bqp) | +| **PH** | Polynomial hierarchy | [PH](https://complexityzoo.net/Complexity_Zoo:P#ph) | +| **APX** | Problems with constant-factor approximation | [APX](https://complexityzoo.net/Complexity_Zoo:A#apx) | +| **MAX SNP** | Syntactically defined optimization class | [MAX SNP](https://complexityzoo.net/Complexity_Zoo:M#maxsnp) | ### Canonical NP-Complete Problems (from Complexity Zoo) +Source: [Complexity Zoo: NP](https://complexityzoo.net/Complexity_Zoo:N#np) + - **SAT** (Boolean satisfiability) — the first NP-complete problem (Cook 1971) - **3-Colorability** — Can vertices be colored with 3 colors, no adjacent same color? - **Hamiltonian Cycle** — Does a cycle visiting each vertex exactly once exist? @@ -116,26 +122,29 @@ SAT ## 3. Compendium of NP Optimization Problems **Source:** [csc.kth.se/tcs/compendium](https://www.csc.kth.se/tcs/compendium/) — Online catalog of NP optimization problems with approximability results (Crescenzi & Kann). 
+**Problem list index:** [node6.html](https://www.csc.kth.se/tcs/compendium/node6.html) ### Problem Categories -| Code | Category | Problem Count | -|------|----------|---------------| -| GT | Graph Theory | ~60 | -| ND | Network Design | ~30 | -| SP | Sets and Partitions | ~20 | -| SR | Storage and Retrieval | ~10 | -| SS | Sequencing and Scheduling | ~20 | -| MP | Mathematical Programming | ~15 | -| AN | Algebra and Number Theory | ~10 | -| GP | Games and Puzzles | ~5 | -| LO | Logic | ~10 | -| AL | Automata and Language Theory | ~5 | -| PO | Program Optimization | ~5 | -| MS | Miscellaneous | ~10 | +| Code | Category | Source | +|------|----------|--------| +| GT | Graph Theory | [Covering](https://www.csc.kth.se/tcs/compendium/node9.html), [Subgraph](https://www.csc.kth.se/tcs/compendium/node32.html), [Ordering](https://www.csc.kth.se/tcs/compendium/node52.html), [Iso/Homo](https://www.csc.kth.se/tcs/compendium/node56.html), [Connectivity](https://www.csc.kth.se/tcs/compendium/node100.html) | +| ND | Network Design | [node](https://www.csc.kth.se/tcs/compendium/node104.html) | +| SP | Sets and Partitions | [node](https://www.csc.kth.se/tcs/compendium/node120.html) | +| SR | Storage and Retrieval | [node](https://www.csc.kth.se/tcs/compendium/node135.html) | +| SS | Sequencing and Scheduling | [node](https://www.csc.kth.se/tcs/compendium/node140.html) | +| MP | Mathematical Programming | [node](https://www.csc.kth.se/tcs/compendium/node155.html) | +| AN | Algebra and Number Theory | [node](https://www.csc.kth.se/tcs/compendium/node163.html) | +| GP | Games and Puzzles | [node](https://www.csc.kth.se/tcs/compendium/node167.html) | +| LO | Logic | [node](https://www.csc.kth.se/tcs/compendium/node170.html) | +| AL | Automata and Language Theory | [node](https://www.csc.kth.se/tcs/compendium/node175.html) | +| PO | Program Optimization | [node](https://www.csc.kth.se/tcs/compendium/node178.html) | +| MS | Miscellaneous | 
[node](https://www.csc.kth.se/tcs/compendium/node180.html) | ### Graph Theory: Covering and Partitioning +Source: [node9.html](https://www.csc.kth.se/tcs/compendium/node9.html) + | Problem | Type | |---------|------| | Minimum Vertex Cover | Min | @@ -158,6 +167,8 @@ SAT ### Graph Theory: Subgraph Problems +Source: [node32.html](https://www.csc.kth.se/tcs/compendium/node32.html) + | Problem | Type | |---------|------| | Maximum Clique | Max | @@ -175,6 +186,8 @@ SAT ### Graph Theory: Vertex Ordering +Source: [node52.html](https://www.csc.kth.se/tcs/compendium/node52.html) + | Problem | Type | |---------|------| | Minimum Bandwidth | Min | @@ -184,6 +197,8 @@ SAT ### Network Design +Source: [node104.html](https://www.csc.kth.se/tcs/compendium/node104.html) + | Problem | Type | |---------|------| | Minimum Steiner Tree | Min | @@ -195,6 +210,8 @@ SAT ### Approximability Classes +Source: [node3.html](https://www.csc.kth.se/tcs/compendium/node3.html) + | Class | Meaning | |-------|---------| | **PO** | Polynomial-time solvable exactly | @@ -208,11 +225,11 @@ SAT ## 4. Computers and Intractability (Garey & Johnson, 1979) -**Source:** The classic reference cataloging 300+ NP-complete problems with reductions. The most cited book in computer science. +**Source:** Garey & Johnson, *Computers and Intractability: A Guide to the Theory of NP-Completeness*, W. H. Freeman, 1979. ISBN 0-7167-1045-5. No online version; see [WorldCat](https://www.worldcat.org/title/4527557). ### Problem Classification (Garey-Johnson numbering) -Uses same category codes as the Compendium (GT, ND, SP, SS, MP, AN, GP, LO, AL, PO, MS). +Uses same category codes as the Compendium (GT, ND, SP, SS, MP, AN, GP, LO, AL, PO, MS). Problems listed in Appendix A (pp. 187-213). 
### Key Problems and Their GJ Numbers @@ -241,20 +258,22 @@ Uses same category codes as the Compendium (GT, ND, SP, SS, MP, AN, GP, LO, AL, ### Classic Reductions from Garey & Johnson -| Source | Target | Method | -|--------|--------|--------| -| SAT → 3-SAT | Clause splitting | Auxiliary variables for long clauses | -| 3-SAT → 3-Coloring | Variable + palette gadget | | -| 3-SAT → MIS | Clause-variable gadget | Triangle per clause, conflict edges | -| MIS ↔ Vertex Cover | Complement | IS + VC = n | -| MIS ↔ Clique | Complement graph | IS in G = Clique in complement(G) | -| Vertex Cover → Set Cover | Edge-based | Edges as universe, vertex neighborhoods as sets | -| SAT → Min Dominating Set | Triangle gadget | Variable triangle + clause vertices | -| Vertex Cover → Feedback Vertex Set | | | -| Exact Cover → 3D Matching | | | -| Partition → Bin Packing | | | -| SAT → 0-1 Integer Programming | Direct encoding | Variables → integers, clauses → constraints | -| SAT → Max Cut | | | +Source: Chapter 3, "Proving NP-Completeness Results" (pp. 45-89). 
+ +| Source → Target | Method | +|-----------------|--------| +| SAT → 3-SAT | Auxiliary variables for long clauses | +| 3-SAT → 3-Coloring | Variable + palette gadget | +| 3-SAT → MIS | Triangle per clause, conflict edges | +| MIS ↔ Vertex Cover | Complement: IS + VC = n | +| MIS ↔ Clique | IS in G = Clique in complement(G) | +| Vertex Cover → Set Cover | Edges as universe, vertex neighborhoods as sets | +| SAT → Min Dominating Set | Variable triangle + clause vertices | +| Vertex Cover → Feedback Vertex Set | | +| Exact Cover → 3D Matching | | +| Partition → Bin Packing | | +| SAT → 0-1 Integer Programming | Variables → integers, clauses → constraints | +| SAT → Max Cut | | --- @@ -262,26 +281,26 @@ Uses same category codes as the Compendium (GT, ND, SP, SS, MP, AN, GP, LO, AL, | Our Problem Name | Karp 21 | GJ # | Compendium | Complexity Zoo | |-----------------|---------|------|------------|----------------| -| MaximumIndependentSet | via Clique complement | GT20 | GT: Subgraph | | -| MinimumVertexCover | #5 Vertex Cover | GT1 | GT: Covering | | -| MaximumClique | #3 Clique | GT5/GT21 | GT: Subgraph | NP-complete | -| MaxCut | #21 Max Cut | GT24 | GT: Subgraph | MAX SNP | -| KColoring | #12 Chromatic Number | GT4 | GT: Covering | NP-complete (k>=3) | -| MinimumDominatingSet | | GT2 | GT: Covering | | -| MaximumMatching | | | GT: Covering | **P** (polynomial) | -| TravelingSalesman | | GT46 | ND | NP-complete | -| Satisfiability | #1 SAT | LO1 | LO | NP-complete | -| KSatisfiability | #11 3-SAT | LO2 | LO | NP-complete | -| MinimumSetCovering | #6 Set Covering | SP5 | SP | | -| MaximumSetPacking | #4 Set Packing | SP3 | SP | | -| BinPacking | | SP13 | SP | | -| ILP | #2 0-1 Integer Prog | MP1 | MP | | +| MaximumIndependentSet | via Clique complement | GT20 | [GT: Subgraph](https://www.csc.kth.se/tcs/compendium/node32.html) | | +| MinimumVertexCover | #5 Vertex Cover | GT1 | [GT: Covering](https://www.csc.kth.se/tcs/compendium/node9.html) | | +| MaximumClique | 
#3 Clique | GT5/GT21 | [GT: Subgraph](https://www.csc.kth.se/tcs/compendium/node32.html) | [NP-complete](https://complexityzoo.net/Complexity_Zoo:N#np) | +| MaxCut | #21 Max Cut | GT24 | [GT: Subgraph](https://www.csc.kth.se/tcs/compendium/node32.html) | [MAX SNP](https://complexityzoo.net/Complexity_Zoo:M#maxsnp) | +| KColoring | #12 Chromatic Number | GT4 | [GT: Covering](https://www.csc.kth.se/tcs/compendium/node9.html) | NP-complete (k>=3) | +| MinimumDominatingSet | | GT2 | [GT: Covering](https://www.csc.kth.se/tcs/compendium/node9.html) | | +| MaximumMatching | | | [GT: Covering](https://www.csc.kth.se/tcs/compendium/node9.html) | **P** (polynomial) | +| TravelingSalesman | | GT46 | [ND](https://www.csc.kth.se/tcs/compendium/node104.html) | [NP-complete](https://complexityzoo.net/Complexity_Zoo:N#np) | +| Satisfiability | #1 SAT | LO1 | [LO](https://www.csc.kth.se/tcs/compendium/node170.html) | [NP-complete](https://complexityzoo.net/Complexity_Zoo:N#np) | +| KSatisfiability | #11 3-SAT | LO2 | [LO](https://www.csc.kth.se/tcs/compendium/node170.html) | [NP-complete](https://complexityzoo.net/Complexity_Zoo:N#np) | +| MinimumSetCovering | #6 Set Covering | SP5 | [SP](https://www.csc.kth.se/tcs/compendium/node120.html) | | +| MaximumSetPacking | #4 Set Packing | SP3 | [SP](https://www.csc.kth.se/tcs/compendium/node120.html) | | +| BinPacking | | SP13 | [SP](https://www.csc.kth.se/tcs/compendium/node120.html) | | +| ILP | #2 0-1 Integer Prog | MP1 | [MP](https://www.csc.kth.se/tcs/compendium/node155.html) | | | SpinGlass | | | | | | QUBO | | | | | -| CircuitSAT | | | | NP-complete | +| CircuitSAT | | | | [NP-complete](https://complexityzoo.net/Complexity_Zoo:N#np) | | Factoring | | | | Not known NP-complete | | PaintShop | | | | | | BMF | | | | | -| BicliqueCover | | | GT: Covering | | +| BicliqueCover | | | [GT: Covering](https://www.csc.kth.se/tcs/compendium/node9.html) | | | MaximalIS | | | | | | CVP | | | | | diff --git 
a/.claude/skills/review-implementation/SKILL.md b/.claude/skills/review-implementation/SKILL.md index 9ceae2c3d..e12a42707 100644 --- a/.claude/skills/review-implementation/SKILL.md +++ b/.claude/skills/review-implementation/SKILL.md @@ -107,8 +107,9 @@ When both subagents return: 1. **Parse results** -- identify FAIL/ISSUE items from both reports 2. **Fix automatically** -- structural FAILs (missing registration, missing file), clear semantic issues, Important+ quality issues -3. **Report to user** -- ambiguous semantic issues, Minor quality items, anything you're unsure about -4. **Present consolidated report** combining both reviews +3. **Update complexity review table** -- for any new/modified model or rule, add a row to `docs/src/complexity-review.md` with result PASS/FAIL/UNVERIFIED and review date +4. **Report to user** -- ambiguous semantic issues, Minor quality items, anything you're unsure about +5. **Present consolidated report** combining both reviews ## Step 5: Present Consolidated Report @@ -143,6 +144,12 @@ Merge both subagent outputs into a single report: ### Test Quality (from quality reviewer) ... 
+### Complexity & Overhead Review +- Update `docs/src/complexity-review.md` with review results for any new or modified models/rules +- Models table: verify `declare_variants!` complexity against best known algorithm +- Rules table: verify `#[reduction(overhead)]` against actual `reduce_to()` code +- Result: PASS / FAIL / UNVERIFIED + ### Fixes Applied - [list of issues automatically fixed by main agent] diff --git a/Cargo.toml b/Cargo.toml index 1dfc0a503..938ff968b 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -45,3 +45,5 @@ harness = false [profile.release] lto = true +codegen-units = 1 +strip = true diff --git a/book.toml b/book.toml index d49b7a9bd..bda4b2cf9 100644 --- a/book.toml +++ b/book.toml @@ -9,8 +9,8 @@ src = "docs/src" default-theme = "navy" git-repository-url = "https://github.com/CodingThrust/problem-reductions" edit-url-template = "https://github.com/CodingThrust/problem-reductions/edit/main/{path}" -additional-css = ["docs/src/static/theme-images.css", "docs/src/static/reduction-graph.css"] -additional-js = ["docs/src/static/cytoscape.min.js", "docs/src/static/reduction-graph.js"] +additional-css = ["docs/src/static/theme-images.css", "docs/src/static/reduction-graph.css", "docs/src/static/module-graph.css"] +additional-js = ["docs/src/static/cytoscape.min.js", "docs/src/static/reduction-graph.js", "docs/src/static/module-graph.js"] no-section-label = false [output.html.fold] diff --git a/docs/src/SUMMARY.md b/docs/src/SUMMARY.md index e4fe05d62..733405ea7 100644 --- a/docs/src/SUMMARY.md +++ b/docs/src/SUMMARY.md @@ -12,3 +12,4 @@ # Developer Guide - [API Reference](./api.md) +- [Complexity & Overhead Review](./complexity-review.md) diff --git a/docs/src/complexity-review.md b/docs/src/complexity-review.md new file mode 100644 index 000000000..307c3d6fe --- /dev/null +++ b/docs/src/complexity-review.md @@ -0,0 +1,52 @@ +# Complexity & Overhead Correctness Review + +Systematic review of `declare_variants!` complexity expressions and 
`#[reduction(overhead)]` formulas. +New reviews should be appended to the appropriate table. See [Issue #107](https://github.com/CodingThrust/problem-reductions/issues/107) for methodology. + +**Review result:** PASS (correct), FAIL (wrong — needs fix), UNVERIFIED (not yet reviewed). + +## Models (Variant Complexity) + +The complexity field represents the **worst-case time complexity of the best known algorithm** for each problem variant. + +| Result | Problem | Location | Declared | Best Known | Algorithm | Authors | Fix | Source | Reviewed | +|---|---|---|---|---|---|---|---|---|---| +| FAIL | MaximumMatching\ | `maximum_matching.rs:223` | `2^num_vertices` | O(V³) poly | Blossom algorithm | Edmonds 1965; Gabow 1990 | `num_vertices^3` | [DOI:10.4153/CJM-1965-045-4](https://doi.org/10.4153/CJM-1965-045-4) | 2026-02-28 | +| FAIL | KColoring\ | `kcoloring.rs` | `2^num_vertices` | O(V+E) poly | BFS bipartiteness check | Textbook (CLRS) | `num_vertices + num_edges` | Standard textbook result | 2026-02-28 | +| FAIL | KSatisfiability\ | `ksat.rs` | `2^num_variables` | O(n+m) poly | Implication graph SCC | Aspvall, Plass, Tarjan 1979 | `num_variables + num_clauses` | [DOI:10.1016/0020-0190(79)90002-4](https://doi.org/10.1016/0020-0190(79)90002-4) | 2026-02-28 | +| FAIL | KColoring\ | `kcoloring.rs` | `3^num_vertices` | O\*(1.3289^n) | Constraint satisfaction reduction | Beigel & Eppstein 2005 | `1.3289^num_vertices` | [DOI:10.1016/j.jalgor.2004.06.008](https://doi.org/10.1016/j.jalgor.2004.06.008) | 2026-02-28 | +| FAIL | KColoring\ | `kcoloring.rs` | `4^num_vertices` | O\*(1.7159^n) | Measure and conquer | Wu, Gu, Jiang, Shao, Xu 2024 | `1.7159^num_vertices` | [DOI:10.4230/LIPIcs.ESA.2024.103](https://doi.org/10.4230/LIPIcs.ESA.2024.103) | 2026-02-28 | +| FAIL | KColoring\ | `kcoloring.rs` | `5^num_vertices` | O\*((2−ε)^n) | Breaking the 2^n barrier | Zamir 2021 | `2^num_vertices` | [DOI:10.4230/LIPIcs.ICALP.2021.113](https://doi.org/10.4230/LIPIcs.ICALP.2021.113) | 
2026-02-28 | +| FAIL | KColoring\ | `kcoloring.rs` | `k^num_vertices` | O\*(2^n) | Inclusion-exclusion / zeta transform | Björklund, Husfeldt, Koivisto 2009 | `2^num_vertices` | [DOI:10.1137/070683933](https://doi.org/10.1137/070683933) | 2026-02-28 | +| PASS | MIS\ (7 variants) | | `2^n` | O\*(1.1996^n) | Measure & Conquer | Xiao & Nagamochi 2017 | `1.1996^num_vertices` | [DOI:10.1016/j.ic.2017.06.001](https://doi.org/10.1016/j.ic.2017.06.001) | 2026-02-28 | +| PASS | MIS\ | | `2^n` | O\*(1.1893^n) | Degree-6 MIS | Xiao & Nagamochi 2017 | `1.1893^num_vertices` | Same as above | 2026-02-28 | +| PASS | MVC\ | | `2^n` | O\*(1.1996^n) | Via MIS complement | Xiao & Nagamochi 2017 | `1.1996^num_vertices` | Same as above | 2026-02-28 | +| PASS | MaxClique\ | | `2^n` | O\*(1.1892^n) exp space | Backtracking+DP | Robson 2001 | `1.1892^num_vertices` | [DOI:10.1016/0196-6774(86)90032-5](https://doi.org/10.1016/0196-6774(86)90032-5) | 2026-02-28 | +| PASS | MDS\ | | `2^n` | O\*(1.4969^n) | Measure & Conquer | van Rooij & Bodlaender 2011 | `1.4969^num_vertices` | [DOI:10.1016/j.dam.2011.07.001](https://doi.org/10.1016/j.dam.2011.07.001) | 2026-02-28 | +| PASS | MaximalIS\ | | `2^n` | O\*(1.4423^n) | MIS enumeration | Moon-Moser 1965; Tomita 2006 | `1.4423^num_vertices` | Tomita et al., TCS 363(1):28–42, 2006 | 2026-02-28 | +| PASS | TSP\ | | `n!` | O\*(2^n · n²) | Held-Karp DP | Held & Karp 1962 | `2^num_vertices` | [DOI:10.1137/0110015](https://doi.org/10.1137/0110015) | 2026-02-28 | +| PASS | MaxCut\ | | `2^n` | O\*(2^{ωn/3}) exp space | 2-CSP optimization | Williams 2005 | `2^(2.372 * num_vertices / 3)` | [DOI:10.1016/j.tcs.2005.09.023](https://doi.org/10.1016/j.tcs.2005.09.023) | 2026-02-28 | +| PASS | KSatisfiability\ | | `2^n` | O\*(1.307^n) | Biased-PPSZ | Hansen, Kaplan, Zamir & Zwick 2019 | `1.307^num_variables` | [DOI:10.1145/3313276.3316359](https://doi.org/10.1145/3313276.3316359) | 2026-02-28 | +| PASS | ILP | | `exp(n)` | exp(O(n log n)) | FPT algorithm | Dadush 
2012 | | [Thesis](https://homepages.cwi.nl/~dadush/papers/dadush-thesis.pdf) | 2026-02-28 | +| PASS | Factoring | | `exp(√n)` | exp(O(n^{1/3} (log n)^{2/3})) | GNFS | Lenstra et al. 1993 | | [Springer](https://link.springer.com/chapter/10.1007/BFb0091539) | 2026-02-28 | +| PASS | SAT (general) | | `2^num_variables` | O\*(2^n) | SETH-tight | | | No sub-2^n for unbounded width | 2026-02-28 | +| PASS | KSatisfiability\ | | `2^num_variables` | O\*(2^n) | PPSZ base → 2 as k→∞ | | | | 2026-02-28 | +| PASS | CircuitSAT | | `2^num_inputs` | O\*(2^n) | SETH-tight (Williams 2010) | | | Note: `num_inputs()` getter missing | 2026-02-28 | +| PASS | QUBO\ | | `2^num_vars` | O\*(2^n) | NP-hard | | | No known sub-2^n | 2026-02-28 | +| PASS | SpinGlass (both) | | `2^num_vertices` | O\*(2^n) | NP-hard | Barahona 1982 | | | 2026-02-28 | +| PASS | MaxSetPacking (all) | | `2^num_sets` | O\*(2^m) | Brute-force | | | No known improvement | 2026-02-28 | +| PASS | MinSetCovering | | `2^num_sets` | O\*(2^m) | Brute-force | | | No known improvement | 2026-02-28 | + +## Rules (Reduction Overhead) + +The overhead formula describes how target problem size relates to source problem size. 
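As an illustration of how such a formula can be checked, the sketch below builds one reduction and compares a declared size expression against the measured target size. This is a minimal Python stand-in, not the repo's Rust `#[reduction(overhead)]` machinery; the function names and the toy graph are invented for this example.

```python
# Hedged sketch: verify a declared overhead formula against an actual
# reduction output. Python stand-in for the repo's Rust reductions.

def mis_to_set_packing(num_vertices, edges):
    """MIS -> Maximum Set Packing: vertex v becomes the set of its incident
    edge indices; disjoint sets then correspond to independent vertices."""
    sets = [set() for _ in range(num_vertices)]
    for idx, (u, v) in enumerate(edges):
        sets[u].add(idx)
        sets[v].add(idx)
    universe = set(range(len(edges)))  # universe elements are edge indices
    return sets, universe

def check_overhead(declared, stats, actual):
    """Evaluate a declared formula (e.g. "num_edges") on instance statistics
    and compare it with the measured value."""
    return eval(declared, {}, stats) == actual  # eval is fine for a toy check

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]  # small test graph
sets, universe = mis_to_set_packing(4, edges)
stats = {"num_vertices": 4, "num_edges": len(edges)}

print(check_overhead("num_vertices", stats, len(universe)))  # False: wrong formula
print(check_overhead("num_edges", stats, len(universe)))     # True: universe = edges
```

Running several random instances instead of one fixed graph would mirror the review workflow more closely, but a single counterexample already suffices to mark a declared formula FAIL.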
+ +| Result | Reduction | Location | Declared | Actual | Fix | Source | Reviewed | +|---|---|---|---|---|---|---|---| +| FAIL | MIS → MaxSetPacking | `maximumindependentset_maximumsetpacking.rs:39` | `universe_size = "num_vertices"` | Universe elements are edge indices (0..num_edges−1) | `universe_size = "num_edges"` | Karp 1972, [DOI:10.1007/978-1-4684-2001-2_9](https://doi.org/10.1007/978-1-4684-2001-2_9) | 2026-02-28 | +| FAIL | MaxSetPacking → MIS | `maximumindependentset_maximumsetpacking.rs:90` | `num_edges = "num_sets"` | Intersection graph: worst case C(n,2) = O(n²) | `num_edges = "num_sets^2"` | Gavril 1972, [DOI:10.1137/0201013](https://doi.org/10.1137/0201013) | 2026-02-28 | +| FAIL | SAT → k-SAT | `sat_ksat.rs:114-117` | `num_clauses = "num_clauses + num_literals"` | Recursive padding underestimates; unit clause → 4 output clauses | `num_clauses = "4 * num_clauses + num_literals"`, `num_vars = "num_vars + 3 * num_clauses + num_literals"` | Sipser 2012, Theorem 7.32 | 2026-02-28 | +| FAIL | Factoring → CircuitSAT | `factoring_circuit.rs:177-180` | `num_variables = "num_bits_first * num_bits_second"` | Each cell creates 6 assignments+variables, plus m+n inputs | `"6 * num_bits_first * num_bits_second + num_bits_first + num_bits_second"` | Paar & Pelzl 2010, [DOI:10.1007/978-3-642-04101-3](https://doi.org/10.1007/978-3-642-04101-3) | 2026-02-28 | +| PASS | MIS ↔ MVC | | `num_vertices = "num_vertices", num_edges = "num_edges"` | | | Gallai 1959; Garey & Johnson 1979 | 2026-02-28 | +| PASS | k-SAT → SAT | | `num_clauses = "num_clauses", num_vars = "num_vars", num_literals = "num_literals"` | | | Trivial embedding | 2026-02-28 | +| PASS | Factoring → ILP | | `num_vars = "2m + 2n + mn"`, `num_constraints = "3mn + m + n + 1"` | | | McCormick 1976, [DOI:10.1007/BF01580665](https://doi.org/10.1007/BF01580665) | 2026-02-28 | diff --git a/docs/src/design.md b/docs/src/design.md index dd2d75937..4913b66ff 100644 --- a/docs/src/design.md +++ b/docs/src/design.md @@ 
-2,28 +2,30 @@ This guide covers the library internals for contributors. -## Module Overview - -
- -![Module Overview](static/module-overview.svg) - +## Module Architecture + + + +
+
+
+ Core
+ Graph Models
+ Formula Models
+ Set Models
+ Algebraic Models
+ Misc Models
+ Rules
+ Registry
+ Solvers
+ Utilities
+
-
- -![Module Overview](static/module-overview-dark.svg) - +
+ Click a module to expand/collapse its public items. + Double-click to open rustdoc.
- -| Module | Purpose | -|--------|---------| -| [`src/models/`](#problem-model) | Problem implementations by input structure: `graph/`, `formula/`, `set/`, `algebraic/`, `misc/` | -| [`src/rules/`](#reduction-rules) | Reduction rules with `ReduceTo` implementations | -| [`src/registry/`](#reduction-graph) | Reduction graph metadata (collected via `inventory`) | -| [`src/solvers/`](#solvers) | BruteForce and ILP solvers | -| `src/traits.rs` | Core `Problem`, `OptimizationProblem`, `SatisfactionProblem` traits (see [Problem Model](#problem-model)) | -| `src/types.rs` | Shared types: `SolutionSize`, `Direction`, `ProblemSize` (see [Problem Model](#problem-model)) | -| `src/variant.rs` | Variant parameter system (see [Variant System](#variant-system)) | +
## Problem Model diff --git a/docs/src/static/module-graph.css b/docs/src/static/module-graph.css new file mode 100644 index 000000000..6689e74c6 --- /dev/null +++ b/docs/src/static/module-graph.css @@ -0,0 +1,53 @@ +#module-graph { + width: 100%; + height: 550px; + border: 1px solid var(--sidebar-bg); + border-radius: 4px; + background: var(--bg); +} + +#mg-controls { + margin-top: 8px; + display: flex; + align-items: center; + justify-content: space-between; + flex-wrap: wrap; + gap: 8px; + font-family: sans-serif; + font-size: 13px; + color: var(--fg); +} + +#mg-legend span.swatch { + display: inline-block; + width: 14px; + height: 14px; + border: 1px solid #999; + margin-right: 3px; + vertical-align: middle; + border-radius: 2px; +} + +#mg-tooltip { + display: none; + position: absolute; + background: var(--bg); + color: var(--fg); + border: 1px solid var(--sidebar-bg); + padding: 6px 10px; + border-radius: 4px; + font-family: sans-serif; + font-size: 13px; + box-shadow: 0 2px 8px rgba(0,0,0,0.15); + pointer-events: none; + z-index: 1000; + max-width: 300px; +} + +#mg-help { + margin-top: 6px; + font-family: sans-serif; + font-size: 12px; + color: var(--fg); + opacity: 0.6; +} diff --git a/docs/src/static/module-graph.js b/docs/src/static/module-graph.js new file mode 100644 index 000000000..0b9381021 --- /dev/null +++ b/docs/src/static/module-graph.js @@ -0,0 +1,257 @@ +document.addEventListener('DOMContentLoaded', function() { + var container = document.getElementById('module-graph'); + if (!container) return; + + var categoryColors = { + core: '#c8f0c8', + model_graph: '#c8c8f0', + model_formula: '#d8c8f0', + model_set: '#f0c8c8', + model_algebraic: '#f0f0a0', + model_misc: '#f0c8e0', + rule: '#f0d8b0', + registry: '#b0e0f0', + solver: '#d0f0d0', + utility: '#e0e0e0' + }; + var categoryBorders = { + core: '#4a8c4a', + model_graph: '#4a4a8c', + model_formula: '#6a4a8c', + model_set: '#8c4a4a', + model_algebraic: '#8c8c4a', + model_misc: '#8c4a6a', + rule: 
'#8c6a4a', + registry: '#4a6a8c', + solver: '#4a8c6a', + utility: '#888888' + }; + var kindIcons = { + 'struct': 'S', 'enum': 'E', 'function': 'fn', 'trait': 'T', + 'type_alias': 'type', 'constant': 'const' + }; + + // Fixed positions — arranged in logical groups + var fixedPositions = { + // Core (left column) + 'traits': { x: 80, y: 80 }, + 'types': { x: 80, y: 230 }, + 'variant': { x: 80, y: 380 }, + 'topology': { x: 80, y: 530 }, + // Models (center, spread across) + 'models/graph': { x: 340, y: 80 }, + 'models/formula': { x: 340, y: 250 }, + 'models/set': { x: 340, y: 380 }, + 'models/algebraic':{ x: 340, y: 500 }, + 'models/misc': { x: 540, y: 350 }, + // Rules + Registry (right-center) + 'rules': { x: 600, y: 80 }, + 'registry': { x: 600, y: 230 }, + // Solvers + Utilities (far right) + 'solvers': { x: 800, y: 80 }, + 'expr': { x: 800, y: 230 }, + 'io': { x: 800, y: 380 } + }; + + fetch('static/module-graph.json') + .then(function(r) { if (!r.ok) throw new Error('HTTP ' + r.status); return r.json(); }) + .then(function(data) { + var elements = []; + + data.modules.forEach(function(mod) { + var parentId = 'mod_' + mod.name.replace(/\//g, '_'); + var pos = fixedPositions[mod.name] || { x: 500, y: 280 }; + + // Compound parent node + elements.push({ + data: { + id: parentId, + label: mod.name, + category: mod.category, + doc_path: mod.doc_path, + itemCount: mod.items.length, + isParent: true + } + }); + + // Child item nodes + mod.items.forEach(function(item, idx) { + var childId = mod.name.replace(/\//g, '_') + '::' + item.name; + var icon = kindIcons[item.kind] || item.kind; + elements.push({ + data: { + id: childId, + parent: parentId, + label: icon + ' ' + item.name, + fullLabel: mod.name + '::' + item.name, + category: mod.category, + kind: item.kind, + doc: item.doc || '', + isChild: true, + moduleName: mod.name, + itemName: item.name + }, + position: { + x: pos.x, + y: pos.y + 18 + idx * 22 + } + }); + }); + }); + + // Module-level edges + 
data.edges.forEach(function(e) { + elements.push({ + data: { + id: 'edge_' + e.source.replace(/\//g, '_') + '_' + e.target.replace(/\//g, '_'), + source: 'mod_' + e.source.replace(/\//g, '_'), + target: 'mod_' + e.target.replace(/\//g, '_') + } + }); + }); + + var cy = cytoscape({ + container: container, + elements: elements, + style: [ + // Module nodes (compound parents) + { selector: 'node[?isParent]', style: { + 'label': 'data(label)', + 'text-valign': 'center', 'text-halign': 'center', + 'font-size': '11px', 'font-family': 'monospace', 'font-weight': 'bold', + 'min-width': function(ele) { return Math.max(ele.data('label').length * 7.5 + 20, 80); }, + 'min-height': 36, + 'padding': '4px', + 'shape': 'round-rectangle', + 'background-color': function(ele) { return categoryColors[ele.data('category')] || '#f0f0f0'; }, + 'border-width': 2, + 'border-color': function(ele) { return categoryBorders[ele.data('category')] || '#999'; }, + 'compound-sizing-wrt-labels': 'include', + 'cursor': 'pointer' + }}, + // Expanded parent + { selector: 'node[?isParent].expanded', style: { + 'text-valign': 'top', + 'padding': '10px' + }}, + // Child item nodes + { selector: 'node[?isChild]', style: { + 'label': 'data(label)', + 'text-valign': 'center', 'text-halign': 'center', + 'font-size': '9px', 'font-family': 'monospace', + 'width': function(ele) { return Math.max(ele.data('label').length * 5.5 + 8, 40); }, + 'height': 18, + 'shape': 'round-rectangle', + 'background-color': function(ele) { return categoryColors[ele.data('category')] || '#f0f0f0'; }, + 'border-width': 1, + 'border-color': function(ele) { return categoryBorders[ele.data('category')] || '#999'; } + }}, + // Edges + { selector: 'edge', style: { + 'width': 1.5, 'line-color': '#999', 'target-arrow-color': '#999', + 'target-arrow-shape': 'triangle', 'curve-style': 'bezier', + 'arrow-scale': 0.8, + 'source-distance-from-node': 5, + 'target-distance-from-node': 5 + }} + ], + layout: { name: 'preset' }, + 
userZoomingEnabled: true, + userPanningEnabled: true, + boxSelectionEnabled: false + }); + + // Initial state: hide all children, position parents at fixed positions + cy.nodes('[?isChild]').style('display', 'none'); + Object.keys(fixedPositions).forEach(function(name) { + var node = cy.getElementById('mod_' + name.replace(/\//g, '_')); + if (node.length) node.position(fixedPositions[name]); + }); + cy.fit(40); + + var expandedParents = {}; + + // Click: toggle expand/collapse + cy.on('tap', 'node[?isParent]', function(evt) { + var parentNode = evt.target; + var parentId = parentNode.id(); + var children = parentNode.children(); + + if (expandedParents[parentId]) { + // Collapse + children.style('display', 'none'); + parentNode.removeClass('expanded'); + expandedParents[parentId] = false; + var name = parentNode.data('label'); + if (fixedPositions[name]) { + parentNode.position(fixedPositions[name]); + } + } else { + // Expand + children.style('display', 'element'); + parentNode.addClass('expanded'); + expandedParents[parentId] = true; + } + }); + + // Rustdoc URL prefixes by kind + var kindPrefix = { + 'function': 'fn', 'struct': 'struct', 'enum': 'enum', + 'trait': 'trait', 'type_alias': 'type', 'constant': 'constant' + }; + + // Double-click: open rustdoc + cy.on('dbltap', 'node[?isParent]', function(evt) { + var d = evt.target.data(); + if (d.doc_path) { + window.open('api/problemreductions/' + d.doc_path, '_blank'); + } + }); + cy.on('dbltap', 'node[?isChild]', function(evt) { + var d = evt.target.data(); + var prefix = kindPrefix[d.kind] || d.kind; + var modPath = d.moduleName.replace(/\//g, '/'); + window.open('api/problemreductions/' + modPath + '/' + prefix + '.' + d.itemName + '.html', '_blank'); + }); + + // Tooltip + var tooltip = document.getElementById('mg-tooltip'); + cy.on('mouseover', 'node[?isParent]', function(evt) { + var d = evt.target.data(); + tooltip.innerHTML = '' + d.label + ' (' + d.itemCount + ' items)
Click to expand, double-click for docs';
+    tooltip.style.display = 'block';
+  });
+  cy.on('mouseover', 'node[?isChild]', function(evt) {
+    var d = evt.target.data();
+    var html = '<b>' + d.fullLabel + '</b><br><i>' + d.kind + '</i>';
+    if (d.doc) html += '<br>' + d.doc;
+    tooltip.innerHTML = html;
+    tooltip.style.display = 'block';
+  });
+  cy.on('mousemove', 'node', function(evt) {
+    var pos = evt.renderedPosition || evt.position;
+    var rect = container.getBoundingClientRect();
+    tooltip.style.left = (rect.left + window.scrollX + pos.x + 15) + 'px';
+    tooltip.style.top = (rect.top + window.scrollY + pos.y - 10) + 'px';
+  });
+  cy.on('mouseout', 'node', function() { tooltip.style.display = 'none'; });
+
+  // Edge tooltip
+  cy.on('mouseover', 'edge', function(evt) {
+    var src = evt.target.source().data('label');
+    var dst = evt.target.target().data('label');
+    tooltip.innerHTML = '<b>' + src + ' \u2192 ' + dst + '</b>';
+    tooltip.style.display = 'block';
+  });
+  cy.on('mousemove', 'edge', function(evt) {
+    var pos = evt.renderedPosition || evt.position;
+    var rect = container.getBoundingClientRect();
+    tooltip.style.left = (rect.left + window.scrollX + pos.x + 15) + 'px';
+    tooltip.style.top = (rect.top + window.scrollY + pos.y - 10) + 'px';
+  });
+  cy.on('mouseout', 'edge', function() { tooltip.style.display = 'none'; });
+  })
+  .catch(function(err) {
+    container.innerHTML = '

Failed to load module graph: ' + err.message + '

'; + }); +}); diff --git a/problemreductions-cli/src/commands/graph.rs b/problemreductions-cli/src/commands/graph.rs index 05bc9f28f..9d7d8d1ad 100644 --- a/problemreductions-cli/src/commands/graph.rs +++ b/problemreductions-cli/src/commands/graph.rs @@ -304,6 +304,15 @@ fn format_path_text( } } + // Show composed overall overhead for multi-step paths + if reduction_path.len() > 1 { + let composed = graph.compose_path_overhead(reduction_path); + text.push_str(&format!("\n {}:\n", crate::output::fmt_section("Overall"))); + for (field, poly) in &composed.output_size { + text.push_str(&format!(" {field} = {poly}\n")); + } + } + text } @@ -329,9 +338,17 @@ fn format_path_json( }) .collect(); + let composed = graph.compose_path_overhead(reduction_path); + let overall: Vec = composed + .output_size + .iter() + .map(|(field, poly)| serde_json::json!({"field": field, "formula": poly.to_string()})) + .collect(); + serde_json::json!({ "steps": reduction_path.len(), "path": steps_json, + "overall_overhead": overall, }) } diff --git a/problemreductions-cli/src/mcp/tools.rs b/problemreductions-cli/src/mcp/tools.rs index f7a940267..9f478a150 100644 --- a/problemreductions-cli/src/mcp/tools.rs +++ b/problemreductions-cli/src/mcp/tools.rs @@ -1067,9 +1067,17 @@ fn format_path_json( }) .collect(); + let composed = graph.compose_path_overhead(reduction_path); + let overall: Vec = composed + .output_size + .iter() + .map(|(field, poly)| serde_json::json!({"field": field, "formula": poly.to_string()})) + .collect(); + serde_json::json!({ "steps": reduction_path.len(), "path": steps_json, + "overall_overhead": overall, }) } diff --git a/problemreductions-cli/tests/cli_tests.rs b/problemreductions-cli/tests/cli_tests.rs index 5dbcef629..28396b507 100644 --- a/problemreductions-cli/tests/cli_tests.rs +++ b/problemreductions-cli/tests/cli_tests.rs @@ -1274,6 +1274,132 @@ fn test_path_unknown_cost() { assert!(stderr.contains("Unknown cost function")); } +#[test] +fn 
test_path_overall_overhead_text() { + // Use a multi-step path so the "Overall" section appears + let output = pred() + .args(["path", "3SAT", "MIS"]) + .output() + .unwrap(); + assert!(output.status.success()); + let stdout = String::from_utf8(output.stdout).unwrap(); + assert!(stdout.contains("Overall"), "multi-step path should show Overall overhead"); +} + +#[test] +fn test_path_overall_overhead_json() { + let tmp = std::env::temp_dir().join("pred_test_path_overall.json"); + let output = pred() + .args(["path", "3SAT", "MIS", "-o", tmp.to_str().unwrap()]) + .output() + .unwrap(); + assert!(output.status.success()); + let content = std::fs::read_to_string(&tmp).unwrap(); + let json: serde_json::Value = serde_json::from_str(&content).unwrap(); + assert!(json["overall_overhead"].is_array(), "JSON should contain overall_overhead"); + let items = json["overall_overhead"].as_array().unwrap(); + assert!(!items.is_empty(), "overall_overhead should have entries"); + assert!(items[0]["field"].is_string()); + assert!(items[0]["formula"].is_string()); + std::fs::remove_file(&tmp).ok(); +} + +#[test] +fn test_path_overall_overhead_composition() { + // Verify that overall overhead is the symbolic composition of per-step overheads, + // not just the last step's overhead. For a multi-step path A→B→C, the overall + // should substitute B's output expressions into C's input expressions. 
+ let tmp = std::env::temp_dir().join("pred_test_path_composition.json"); + // 3SAT → SAT → MIS gives a 2-step path where: + // Step 1 (3SAT→SAT): num_literals = num_literals (identity) + // Step 2 (SAT→MIS): num_vertices = num_literals, num_edges = num_literals^2 + // Overall: num_vertices = num_literals, num_edges = num_literals^2 + let output = pred() + .args(["path", "3SAT", "MIS", "-o", tmp.to_str().unwrap()]) + .output() + .unwrap(); + assert!(output.status.success()); + let content = std::fs::read_to_string(&tmp).unwrap(); + let json: serde_json::Value = serde_json::from_str(&content).unwrap(); + + // Must have exactly 2 steps + assert_eq!(json["steps"].as_u64().unwrap(), 2); + + // Collect overall overhead into a map + let overall: std::collections::HashMap = json["overall_overhead"] + .as_array() + .unwrap() + .iter() + .map(|e| { + ( + e["field"].as_str().unwrap().to_string(), + e["formula"].as_str().unwrap().to_string(), + ) + }) + .collect(); + + // The composed overhead should reference source (3SAT) variables, not intermediate ones. + // num_vertices and num_edges should both be expressed in terms of num_literals. 
+ assert!( + overall.contains_key("num_vertices"), + "overall should have num_vertices" + ); + assert!( + overall.contains_key("num_edges"), + "overall should have num_edges" + ); + assert!( + overall["num_vertices"].contains("num_literals"), + "num_vertices should be in terms of source vars, got: {}", + overall["num_vertices"] + ); + assert!( + overall["num_edges"].contains("num_literals"), + "num_edges should be in terms of source vars, got: {}", + overall["num_edges"] + ); + + std::fs::remove_file(&tmp).ok(); +} + +#[test] +fn test_path_all_overall_overhead() { + // Every path in --all --json output should have overall_overhead + let output = pred() + .args(["path", "3SAT", "MIS", "--all", "--json"]) + .output() + .unwrap(); + assert!(output.status.success()); + let stdout = String::from_utf8(output.stdout).unwrap(); + let paths: Vec = serde_json::from_str(&stdout).unwrap(); + assert!(!paths.is_empty()); + for (i, p) in paths.iter().enumerate() { + assert!( + p["overall_overhead"].is_array(), + "path {} missing overall_overhead", + i + 1 + ); + let items = p["overall_overhead"].as_array().unwrap(); + assert!( + !items.is_empty(), + "path {} has empty overall_overhead", + i + 1 + ); + } +} + +#[test] +fn test_path_single_step_no_overall_text() { + // Single-step path should NOT show the Overall section + let output = pred() + .args(["path", "MIS", "QUBO"]) + .output() + .unwrap(); + assert!(output.status.success()); + let stdout = String::from_utf8(output.stdout).unwrap(); + assert!(!stdout.contains("Overall"), "single-step path should not show Overall"); +} + #[test] fn test_show_json_output() { let tmp = std::env::temp_dir().join("pred_test_show.json"); diff --git a/scripts/gen_module_graph.py b/scripts/gen_module_graph.py new file mode 100644 index 000000000..360e66f81 --- /dev/null +++ b/scripts/gen_module_graph.py @@ -0,0 +1,247 @@ +#!/usr/bin/env python3 +"""Generate module-graph.json for the mdbook interactive architecture page. 
+ +Reads the rustdoc JSON output and produces a Cytoscape-ready graph with: +- Module nodes (compound, color-coded by category) +- Public item nodes (children of modules) +- Dependency edges between modules + +Usage: + cargo +nightly rustdoc -- -Z unstable-options --output-format json + python3 scripts/gen_module_graph.py +""" +import json +import sys +from pathlib import Path + +RUSTDOC_JSON = Path("target/doc/problemreductions.json") +OUTPUT = Path("docs/src/static/module-graph.json") +CRATE_NAME = "problemreductions" + +# Category assignment by full module path +CATEGORIES = { + "traits": "core", + "types": "core", + "variant": "core", + "topology": "core", + "models/graph": "model_graph", + "models/formula": "model_formula", + "models/set": "model_set", + "models/algebraic": "model_algebraic", + "models/misc": "model_misc", + "rules": "rule", + "registry": "registry", + "solvers": "solver", + "io": "utility", + "export": "utility", + "config": "utility", + "error": "utility", +} + +# Only include these top-level and second-level modules in the graph +INCLUDED_MODULES = { + "traits", + "types", + "variant", + "topology", + "models/graph", + "models/formula", + "models/set", + "models/algebraic", + "models/misc", + "rules", + "registry", + "solvers", + "io", + "export", +} + +# Item kinds to include in the child list +INCLUDE_KINDS = {"struct", "enum", "trait", "function", "type_alias", "constant"} + + +def main(): + if not RUSTDOC_JSON.exists(): + print( + "Error: {} not found. Run:\n" + " cargo +nightly rustdoc -- -Z unstable-options --output-format json".format( + RUSTDOC_JSON + ), + file=sys.stderr, + ) + sys.exit(1) + + with open(RUSTDOC_JSON) as f: + data = json.load(f) + + index = data["index"] + + # 1. 
Build parent map to reconstruct full module paths + parent_map = {} # child_module_id → (parent_module_id, parent_name) + for item_id, item in index.items(): + inner = item.get("inner", {}) + if not isinstance(inner, dict) or "module" not in inner: + continue + for child_id in inner["module"].get("items", []): + cid = str(child_id) + if cid in index: + child_inner = index[cid].get("inner", {}) + if isinstance(child_inner, dict) and "module" in child_inner: + parent_map[cid] = (item_id, item.get("name")) + + def get_full_path(item_id): + parts = [index[item_id].get("name")] + current = item_id + while current in parent_map: + current = parent_map[current][0] + parts.append(index[current].get("name")) + parts.reverse() + return "/".join(p for p in parts if p != CRATE_NAME) + + def resolve_item(item_id): + """Follow re-exports (use) to get the actual item.""" + cid = str(item_id) + seen = set() + while cid in index and cid not in seen: + seen.add(cid) + item = index[cid] + inner = item.get("inner", {}) + if isinstance(inner, dict) and "use" in inner: + target_id = str(inner["use"].get("id", "")) + if target_id and target_id in index: + cid = target_id + else: + return cid + else: + return cid + return cid + + # 2. 
Find included modules and their public items + item_to_module = {} # item_id → module full path + modules = {} + for item_id, item in index.items(): + inner = item.get("inner", {}) + if not isinstance(inner, dict) or "module" not in inner: + continue + name = item.get("name") + if name == CRATE_NAME: + continue + full_path = get_full_path(item_id) + if full_path not in INCLUDED_MODULES: + continue + + children = [] + for child_id in inner["module"].get("items", []): + cid = str(child_id) + # Register the direct child for dependency tracking + item_to_module[cid] = full_path + + # Resolve through re-exports to find the actual item + resolved_id = resolve_item(cid) + item_to_module[resolved_id] = full_path + + if resolved_id not in index: + continue + child = index[resolved_id] + child_inner = child.get("inner", {}) + if not isinstance(child_inner, dict): + continue + kind = list(child_inner.keys())[0] if child_inner else "unknown" + if kind not in INCLUDE_KINDS: + continue + doc = child.get("docs", "") or "" + doc_summary = doc.strip().split("\n")[0][:120] if doc.strip() else "" + child_name = child.get("name") + if child_name: + children.append( + {"name": child_name, "kind": kind, "doc": doc_summary} + ) + + doc_path = full_path.replace("/", "/") + "/index.html" + modules[full_path] = { + "name": full_path, + "category": CATEGORIES.get(full_path, "utility"), + "doc_path": doc_path, + "items": children, + } + + # 3. 
Extract inter-module dependencies by scanning type references + def find_ids(obj, found=None): + if found is None: + found = set() + if isinstance(obj, dict): + for k, v in obj.items(): + if k == "id" and isinstance(v, (int, str)): + found.add(str(v)) + else: + find_ids(v, found) + elif isinstance(obj, list): + for v in obj: + find_ids(v, found) + return found + + edges = set() + for item_id, item in index.items(): + src = item_to_module.get(item_id) + if not src: + continue + inner = item.get("inner", {}) + if not isinstance(inner, dict): + continue + for ref_id in find_ids(inner): + dst = item_to_module.get(ref_id) + if dst and dst != src: + edges.add((src, dst)) + + # 4. Add well-known architectural edges that rustdoc heuristics may miss + # (re-exports at crate root obscure the actual module→module dependencies) + KNOWN_EDGES = [ + # All model categories depend on core + ("models/graph", "traits"), + ("models/graph", "types"), + ("models/graph", "topology"), + ("models/graph", "variant"), + ("models/formula", "traits"), + ("models/formula", "types"), + ("models/set", "traits"), + ("models/set", "types"), + ("models/algebraic", "traits"), + ("models/algebraic", "types"), + ("models/misc", "traits"), + ("models/misc", "types"), + # Rules depend on models and core + ("rules", "traits"), + ("rules", "types"), + # Registry depends on rules and core + ("registry", "rules"), + ("registry", "traits"), + ("registry", "types"), + # Solvers depend on core + ("solvers", "traits"), + ("solvers", "types"), + # IO depends on core + ("io", "traits"), + # Topology uses variant system + ("topology", "variant"), + ] + for src, dst in KNOWN_EDGES: + if src in modules and dst in modules: + edges.add((src, dst)) + + # 5. 
Write output + OUTPUT.parent.mkdir(parents=True, exist_ok=True) + result = { + "modules": sorted(modules.values(), key=lambda m: m["name"]), + "edges": [{"source": s, "target": t} for s, t in sorted(edges)], + } + with open(OUTPUT, "w") as f: + json.dump(result, f, indent=2) + print( + "Wrote {} ({} modules, {} edges)".format( + OUTPUT, len(result["modules"]), len(result["edges"]) + ) + ) + + +if __name__ == "__main__": + main() From 012c5d7cc31b4a49f5fe095f717ba19271ff08ed Mon Sep 17 00:00:00 2001 From: GiggleLiu Date: Fri, 6 Mar 2026 20:16:56 +0900 Subject: [PATCH 3/9] update module overview --- .claude/skills/test-feature/SKILL.md | 91 ++++++++++++++++++++++++++++ docs/paper/collaborate.md | 20 ++++++ docs/src/design.md | 6 +- docs/src/static/module-graph.js | 26 ++++---- 4 files changed, 125 insertions(+), 18 deletions(-) create mode 100644 .claude/skills/test-feature/SKILL.md create mode 100644 docs/paper/collaborate.md diff --git a/.claude/skills/test-feature/SKILL.md b/.claude/skills/test-feature/SKILL.md new file mode 100644 index 000000000..3352daaef --- /dev/null +++ b/.claude/skills/test-feature/SKILL.md @@ -0,0 +1,91 @@ +--- +name: test-feature +description: Use when testing a project feature from a user's perspective — simulates a downstream user who reads README and docs, then writes and runs code exercising the feature end-to-end, reporting on discoverability, functionality, and doc quality +--- + +## Test Feature + +Simulates a downstream user who wants to use a specific project feature. The user reads README and documentation, writes code exercising the feature, and reports whether the experience matches what the docs promise. + +**Input:** User specifies feature name(s) (e.g., `/test-feature MCP server`). If none specified, discover features from README/docs and test all. + +--- + +### Step 0 — Discover Features + +1. 
Read `README.md`, doc files, and project structure to identify user-facing features (e.g., "library API", "CLI tool", "MCP server", "plugin system"). +2. Determine scope: + - User specified feature(s): test only those + - User specified nothing: list discovered features, then test all +3. For each feature, collect the relevant doc sections — installation instructions, usage examples, API references. + +Print a brief summary of features to test. Proceed immediately. + +### Step 1 — Test Each Feature + +For each feature, dispatch a **subagent** (via Agent tool). Give the subagent: + +- **Role:** A lightweight user description relevant to the feature (e.g., "a researcher who wants to solve optimization problems via CLI", "a developer integrating this library into their web service"). Not a full persona — just enough to set expectations and domain knowledge level. +- **Docs:** The README and relevant doc excerpts for this feature. +- **Instructions:** + +``` +You are [role description]. You want to use the "[feature]" capability of this project. + +Here is the README: +[content] + +Here are the relevant docs: +[excerpts] + +Your task — act as a real user: +1. Read the docs to figure out how to use "[feature]". +2. Follow the installation/setup instructions. +3. Write and run code (or commands) that exercises the feature meaningfully. +4. Report back with: + - **Discoverability:** Could you figure out how to use this from docs alone? What was missing? + - **Setup:** Did installation/setup work as described? + - **Functionality:** Did the feature work? What succeeded, what failed? + - **Friction points:** What was confusing, misleading, or undocumented? + - **Doc suggestions:** What would you add or change in the docs? +``` + +**Parallelism:** Independent features can be tested in parallel via multiple subagents. + +### Step 2 — Report + +Gather results from all subagents. 
Save report to `docs/test-reports/test-feature-.md`: + +```markdown +# Feature Test Report: [project name] + +**Date:** [timestamp] +**Features tested:** [list] + +## Summary + +| Feature | Discoverable | Setup | Works | Doc Quality | +|---------|-------------|-------|-------|-------------| +| CLI tool | yes | yes | yes | good | +| MCP server | partial | yes | yes | missing config example | +| Library API | yes | yes | no | outdated example | + +## Per-Feature Details + +### [feature name] +- **Role:** [who the simulated user was] +- **What they tried:** [brief description] +- **Discoverability:** [could they find how to use it from docs alone?] +- **Setup:** [did installation work as described?] +- **Functionality:** [what worked, what didn't] +- **Friction points:** [what was confusing or missing] +- **Doc suggestions:** [what would help a real user] + +## Issues Found +[problems discovered, ordered by severity] + +## Suggestions +[actionable improvements ordered by impact] +``` + +Present the report path. Offer to fix documentation gaps or re-test specific features. diff --git a/docs/paper/collaborate.md b/docs/paper/collaborate.md new file mode 100644 index 000000000..7cd004aea --- /dev/null +++ b/docs/paper/collaborate.md @@ -0,0 +1,20 @@ +# Workflow +1. File an issue +2. Validate issue: + - Rule: 1. algorithm is useful, correct, detailed, and optimal, 2. doc contains correct reduction overhead, solid reference, a clear round trip test proposal. + - Model: 1. useful, classification correct, can be related to existing problems, 2. doc contains correct solving complexity, solid reference. + - Source white list - build up white list and dataset. + - Issue check record, saved in a markdown file. +3. Implement +4. Review implementation: + - Generic code quality check. + - Faithful to issue, especially the round trip test is not changed, or ignored. + + Remark: in this phase, we do not focus too much on the documentation quality. + +5. 
Agentic test:
+   - Set up scenarios, e.g.
+     1. a student reading the reduction book who wonders how to prove A is NP-complete
+     2. an algorithm developer who has an algorithm for problem X and uses pred to check whether some Y exists such that, via Y -> X, solving X beats the best known solver for Y
+     3. an engineer who wants to solve problem X and uses pred to find the best algorithm
+     Each agent provides a report about "how to improve".
diff --git a/docs/src/design.md b/docs/src/design.md
index 4913b66ff..c5a523b57 100644
--- a/docs/src/design.md
+++ b/docs/src/design.md
@@ -10,11 +10,7 @@ This guide covers the library internals for contributors.
Core - Graph Models - Formula Models - Set Models - Algebraic Models - Misc Models + Models Rules Registry Solvers diff --git a/docs/src/static/module-graph.js b/docs/src/static/module-graph.js index 0b9381021..1e131ffab 100644 --- a/docs/src/static/module-graph.js +++ b/docs/src/static/module-graph.js @@ -5,10 +5,10 @@ document.addEventListener('DOMContentLoaded', function() { var categoryColors = { core: '#c8f0c8', model_graph: '#c8c8f0', - model_formula: '#d8c8f0', - model_set: '#f0c8c8', - model_algebraic: '#f0f0a0', - model_misc: '#f0c8e0', + model_formula: '#c8c8f0', + model_set: '#c8c8f0', + model_algebraic: '#c8c8f0', + model_misc: '#c8c8f0', rule: '#f0d8b0', registry: '#b0e0f0', solver: '#d0f0d0', @@ -17,10 +17,10 @@ document.addEventListener('DOMContentLoaded', function() { var categoryBorders = { core: '#4a8c4a', model_graph: '#4a4a8c', - model_formula: '#6a4a8c', - model_set: '#8c4a4a', - model_algebraic: '#8c8c4a', - model_misc: '#8c4a6a', + model_formula: '#4a4a8c', + model_set: '#4a4a8c', + model_algebraic: '#4a4a8c', + model_misc: '#4a4a8c', rule: '#8c6a4a', registry: '#4a6a8c', solver: '#4a8c6a', @@ -38,12 +38,12 @@ document.addEventListener('DOMContentLoaded', function() { 'types': { x: 80, y: 230 }, 'variant': { x: 80, y: 380 }, 'topology': { x: 80, y: 530 }, - // Models (center, spread across) + // Models (center column) 'models/graph': { x: 340, y: 80 }, - 'models/formula': { x: 340, y: 250 }, - 'models/set': { x: 340, y: 380 }, - 'models/algebraic':{ x: 340, y: 500 }, - 'models/misc': { x: 540, y: 350 }, + 'models/formula': { x: 340, y: 230 }, + 'models/set': { x: 340, y: 350 }, + 'models/algebraic':{ x: 340, y: 450 }, + 'models/misc': { x: 340, y: 550 }, // Rules + Registry (right-center) 'rules': { x: 600, y: 80 }, 'registry': { x: 600, y: 230 }, From 68c80db2609f113a25ed489b73c215d36ddd16c2 Mon Sep 17 00:00:00 2001 From: GiggleLiu Date: Fri, 6 Mar 2026 20:25:38 +0900 Subject: [PATCH 4/9] 
=?UTF-8?q?remove=20collaborate.md=20=E2=80=94=20conte?= =?UTF-8?q?nt=20moved=20to=20issue=20#187?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The collaboration workflow draft has been reviewed and filed as GitHub issue #187 for further refinement and discussion. Co-Authored-By: Claude Opus 4.6 --- docs/paper/collaborate.md | 20 -------------------- 1 file changed, 20 deletions(-) delete mode 100644 docs/paper/collaborate.md diff --git a/docs/paper/collaborate.md b/docs/paper/collaborate.md deleted file mode 100644 index 7cd004aea..000000000 --- a/docs/paper/collaborate.md +++ /dev/null @@ -1,20 +0,0 @@ -# Workflow -1. File an issue -2. Validate issue: - - Rule: 1. algorithm is useful, correct, detailed, and optimal, 2. doc contains correct reduction overhead, solid reference, a clear round trip test proposal. - - Model: 1. useful, classification correct, can be related to existing problems, 2. doc contains correct solving complexity, solid reference. - - Source white list - build up white list and dataset. - - Issue check record, saved in a markdown file. -3. Implement -4. Review implementation: - - Generic code quality check. - - Faithful to issue, especially the round trip test is not changed, or ignored. - - Remark: in this phase, we do not focus too much on the documentation quality. - -5. Agentic test: - - Setup senarios, e.g. - 1. a student learning the reduction book, and is wondering how to prove A is NP-complete - 2. an algorithm developer develop an algorithm solving problem X, he use pred to check the existence of Y, such that by Y -> X, solving X gives better complexity against existing best known solver for Y. - 3. an engineer want to solve problem X, use pred to find the best algorithm. - Each agent provides a report about "how to improve". 
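The validation stage sketched in the removed workflow survives as the check-issue skill's label scheme. As a minimal sketch of that convention (the `issue_status` helper is illustrative, not part of the skill; only the four failure labels and `Good` come from the skill itself), deciding an issue's state from its labels might look like:

```python
# Hypothetical helper mirroring the check-issue label convention:
# failure labels mark failed checks; "Good" marks a fully passing issue.
FAILURE_LABELS = {"Useless", "Trivial", "Wrong", "PoorWritten"}

def issue_status(labels):
    """Map a list of GitHub labels to a coarse issue status."""
    failures = FAILURE_LABELS & set(labels)
    if failures:
        return "needs-work: " + ", ".join(sorted(failures))
    if "Good" in labels:
        return "ready"
    return "unchecked"

print(issue_status(["Good"]))           # ready
print(issue_status(["Wrong", "Good"]))  # needs-work: Wrong
print(issue_status([]))                 # unchecked
```

A downstream tool such as `issue-to-pr` would then gate on `issue_status(...) == "ready"` rather than re-running validation.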
From b28b07ef129a3a755e3e0831e8dd784f8e0a4887 Mon Sep 17 00:00:00 2001 From: GiggleLiu Date: Fri, 6 Mar 2026 21:58:28 +0900 Subject: [PATCH 5/9] update --- .claude/skills/check-issue/SKILL.md | 4 +- .claude/skills/test-feature/SKILL.md | 91 ---------------------------- .gitignore | 1 + Makefile | 12 ++-- 4 files changed, 12 insertions(+), 96 deletions(-) delete mode 100644 .claude/skills/test-feature/SKILL.md diff --git a/.claude/skills/check-issue/SKILL.md b/.claude/skills/check-issue/SKILL.md index 41f1659a9..2e1ecefcb 100644 --- a/.claude/skills/check-issue/SKILL.md +++ b/.claude/skills/check-issue/SKILL.md @@ -95,6 +95,8 @@ Read the "Reduction Algorithm" section and flag as **Fail** if: If the construction involves gadgets, penalty terms, auxiliary variables, or non-trivial structural transformation → **Pass**. +Note: if a trivial reduction makes originally disconnected problems connected, it also counts as **Pass**. + --- ## Rule Check 3: Correctness (fail label: `Wrong`) @@ -128,7 +130,7 @@ For each reference, verify: If any cited fact **cannot be verified** (paper not found, claim not in paper) → **Fail** with specifics. -### 3d: Better Algorithm Discovery +### 3d: Better Algorithm Discovery (not fatal) While searching, if you find: - A **more recent paper** that supersedes the cited reference diff --git a/.claude/skills/test-feature/SKILL.md b/.claude/skills/test-feature/SKILL.md deleted file mode 100644 index 3352daaef..000000000 --- a/.claude/skills/test-feature/SKILL.md +++ /dev/null @@ -1,91 +0,0 @@ ---- -name: test-feature -description: Use when testing a project feature from a user's perspective — simulates a downstream user who reads README and docs, then writes and runs code exercising the feature end-to-end, reporting on discoverability, functionality, and doc quality ---- - -## Test Feature - -Simulates a downstream user who wants to use a specific project feature.
The user reads README and documentation, writes code exercising the feature, and reports whether the experience matches what the docs promise. - -**Input:** User specifies feature name(s) (e.g., `/test-feature MCP server`). If none specified, discover features from README/docs and test all. - ---- - -### Step 0 — Discover Features - -1. Read `README.md`, doc files, and project structure to identify user-facing features (e.g., "library API", "CLI tool", "MCP server", "plugin system"). -2. Determine scope: - - User specified feature(s): test only those - - User specified nothing: list discovered features, then test all -3. For each feature, collect the relevant doc sections — installation instructions, usage examples, API references. - -Print a brief summary of features to test. Proceed immediately. - -### Step 1 — Test Each Feature - -For each feature, dispatch a **subagent** (via Agent tool). Give the subagent: - -- **Role:** A lightweight user description relevant to the feature (e.g., "a researcher who wants to solve optimization problems via CLI", "a developer integrating this library into their web service"). Not a full persona — just enough to set expectations and domain knowledge level. -- **Docs:** The README and relevant doc excerpts for this feature. -- **Instructions:** - -``` -You are [role description]. You want to use the "[feature]" capability of this project. - -Here is the README: -[content] - -Here are the relevant docs: -[excerpts] - -Your task — act as a real user: -1. Read the docs to figure out how to use "[feature]". -2. Follow the installation/setup instructions. -3. Write and run code (or commands) that exercises the feature meaningfully. -4. Report back with: - - **Discoverability:** Could you figure out how to use this from docs alone? What was missing? - - **Setup:** Did installation/setup work as described? - - **Functionality:** Did the feature work? What succeeded, what failed? 
- - **Friction points:** What was confusing, misleading, or undocumented? - - **Doc suggestions:** What would you add or change in the docs? -``` - -**Parallelism:** Independent features can be tested in parallel via multiple subagents. - -### Step 2 — Report - -Gather results from all subagents. Save report to `docs/test-reports/test-feature-.md`: - -```markdown -# Feature Test Report: [project name] - -**Date:** [timestamp] -**Features tested:** [list] - -## Summary - -| Feature | Discoverable | Setup | Works | Doc Quality | -|---------|-------------|-------|-------|-------------| -| CLI tool | yes | yes | yes | good | -| MCP server | partial | yes | yes | missing config example | -| Library API | yes | yes | no | outdated example | - -## Per-Feature Details - -### [feature name] -- **Role:** [who the simulated user was] -- **What they tried:** [brief description] -- **Discoverability:** [could they find how to use it from docs alone?] -- **Setup:** [did installation work as described?] -- **Functionality:** [what worked, what didn't] -- **Friction points:** [what was confusing or missing] -- **Doc suggestions:** [what would help a real user] - -## Issues Found -[problems discovered, ordered by severity] - -## Suggestions -[actionable improvements ordered by impact] -``` - -Present the report path. Offer to fix documentation gaps or re-test specific features. diff --git a/.gitignore b/.gitignore index 9934dbb94..963f425e3 100644 --- a/.gitignore +++ b/.gitignore @@ -85,3 +85,4 @@ claude-output.log .worktree/ *.json .claude/worktrees/ +docs/test-reports/ diff --git a/Makefile b/Makefile index 5a2fd6f95..7aca87558 100644 --- a/Makefile +++ b/Makefile @@ -76,10 +76,14 @@ diagrams: # Build and serve mdBook with API docs mdbook: - cargo run --example export_graph - cargo run --example export_schemas - RUSTDOCFLAGS="--default-theme=dark" cargo doc --features ilp-highs --no-deps - mdbook build + @echo "Exporting graph..." 
+ @cargo run --example export_graph 2>&1 | tail -1 + @echo "Exporting schemas..." + @cargo run --example export_schemas 2>&1 | tail -1 + @echo "Building API docs..." + @RUSTDOCFLAGS="--default-theme=dark" cargo doc --features ilp-highs --no-deps 2>&1 | tail -1 + @echo "Building mdBook..." + @mdbook build rm -rf book/api cp -r target/doc book/api @-lsof -ti:3001 | xargs kill 2>/dev/null || true From 7a4f762b83e0f225df91fe141d226e02b7455703 Mon Sep 17 00:00:00 2001 From: GiggleLiu Date: Fri, 6 Mar 2026 22:47:06 +0900 Subject: [PATCH 6/9] update review-implementation --- .claude/skills/review-implementation/SKILL.md | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/.claude/skills/review-implementation/SKILL.md b/.claude/skills/review-implementation/SKILL.md index e12a42707..ed8a4467c 100644 --- a/.claude/skills/review-implementation/SKILL.md +++ b/.claude/skills/review-implementation/SKILL.md @@ -107,9 +107,8 @@ When both subagents return: 1. **Parse results** -- identify FAIL/ISSUE items from both reports 2. **Fix automatically** -- structural FAILs (missing registration, missing file), clear semantic issues, Important+ quality issues -3. **Update complexity review table** -- for any new/modified model or rule, add a row to `docs/src/complexity-review.md` with result PASS/FAIL/UNVERIFIED and review date -4. **Report to user** -- ambiguous semantic issues, Minor quality items, anything you're unsure about -5. **Present consolidated report** combining both reviews +3. **Report to user** -- ambiguous semantic issues, Minor quality items, anything you're unsure about +4. **Present consolidated report** combining both reviews ## Step 5: Present Consolidated Report @@ -144,11 +143,10 @@ Merge both subagent outputs into a single report: ### Test Quality (from quality reviewer) ... 
-### Complexity & Overhead Review -- Update `docs/src/complexity-review.md` with review results for any new or modified models/rules -- Models table: verify `declare_variants!` complexity against best known algorithm -- Rules table: verify `#[reduction(overhead)]` against actual `reduce_to()` code -- Result: PASS / FAIL / UNVERIFIED +### Overhead Consistency Check +- Rules: verify `#[reduction(overhead)]` expressions match actual sizes constructed in `reduce_to()` code +- Models: verify `dims()` and getter methods are consistent with struct fields +- Result: PASS / FAIL ### Fixes Applied - [list of issues automatically fixed by main agent] From d6530b566b88109eebaf25581e23254f8dee938a Mon Sep 17 00:00:00 2001 From: GiggleLiu Date: Sat, 7 Mar 2026 00:03:32 +0900 Subject: [PATCH 7/9] update skills --- .claude/skills/add-model/SKILL.md | 2 +- .claude/skills/add-rule/SKILL.md | 2 +- .claude/skills/check-issue/SKILL.md | 9 +- .claude/skills/issue-to-pr/SKILL.md | 193 +++++++++++++++--- .claude/skills/meta-power/SKILL.md | 134 ++++-------- .../structural-reviewer-prompt.md | 28 +-- 6 files changed, 232 insertions(+), 136 deletions(-) diff --git a/.claude/skills/add-model/SKILL.md b/.claude/skills/add-model/SKILL.md index ffe1bbad7..66ae6ee6a 100644 --- a/.claude/skills/add-model/SKILL.md +++ b/.claude/skills/add-model/SKILL.md @@ -147,7 +147,7 @@ Invoke the `/write-model-in-paper` skill to write the problem-def entry in `docs make test clippy # Must pass ``` -Then run the [review-implementation](../review-implementation/SKILL.md) skill to verify all structural and semantic checks pass. +If running standalone (not inside `make run-plan`), invoke [review-implementation](../review-implementation/SKILL.md) to verify all structural and semantic checks pass. When running inside a plan, the outer orchestrator handles the review. 
## Naming Conventions diff --git a/.claude/skills/add-rule/SKILL.md b/.claude/skills/add-rule/SKILL.md index 04a1d5da6..54639e310 100644 --- a/.claude/skills/add-rule/SKILL.md +++ b/.claude/skills/add-rule/SKILL.md @@ -139,7 +139,7 @@ cargo run --example export_schemas # Update problem schemas make test clippy # Must pass ``` -Then run the [review-implementation](../review-implementation/SKILL.md) skill to verify all structural and semantic checks pass. +If running standalone (not inside `make run-plan`), invoke [review-implementation](../review-implementation/SKILL.md) to verify all structural and semantic checks pass. When running inside a plan, the outer orchestrator handles the review. ## Solver Rules diff --git a/.claude/skills/check-issue/SKILL.md b/.claude/skills/check-issue/SKILL.md index 2e1ecefcb..10e188bf0 100644 --- a/.claude/skills/check-issue/SKILL.md +++ b/.claude/skills/check-issue/SKILL.md @@ -383,11 +383,18 @@ Post a single GitHub comment. The table adapts to the issue type: ### Label Application ```bash -# Only add labels for FAILED checks (not warnings) +# Add labels for FAILED checks (not warnings) gh issue edit --add-label "Useless" # if Check 1 failed gh issue edit --add-label "Trivial" # if Check 2 failed gh issue edit --add-label "Wrong" # if Check 3 failed gh issue edit --add-label "PoorWritten" # if Check 4 failed + +# If ALL checks passed (no failures), add the "Good" label +gh issue edit --add-label "Good" + +# If re-checking after fixes, remove stale failure labels and add "Good" if now passing +gh issue edit --remove-label "Useless,Trivial,Wrong,PoorWritten" 2>/dev/null +gh issue edit --add-label "Good" ``` **Never close the issue.** Labels and comments only. 
diff --git a/.claude/skills/issue-to-pr/SKILL.md b/.claude/skills/issue-to-pr/SKILL.md index 64f8c56ed..30187f40f 100644 --- a/.claude/skills/issue-to-pr/SKILL.md +++ b/.claude/skills/issue-to-pr/SKILL.md @@ -5,25 +5,31 @@ description: Use when you have a GitHub issue and want to create a PR with an im # Issue to PR -Convert a GitHub issue into an actionable PR with a plan. This skill validates the issue, then dispatches to the appropriate implementation skill. +Convert a GitHub issue into an actionable PR with a plan. Optionally execute the plan immediately using subagent-driven-development. + +## Invocation + +- `/issue-to-pr 42` — create PR with plan only +- `/issue-to-pr 42 --execute` — create PR, then execute the plan and review ## Workflow ``` -Receive issue number +Receive issue number [+ --execute flag] -> Fetch issue with gh - -> Classify: [Model] or [Rule]? - -> Validate against checklist (from add-model or add-rule) - -> If incomplete: comment on issue, STOP - -> If complete: research references, write plan, create PR + -> Verify Good label (from check-issue) + -> If not Good: STOP + -> If Good: research references, write plan, create PR + -> If --execute: run plan via subagent-driven-development, then review-implementation ``` ## Steps ### 1. Parse Input -Extract issue number from argument: -- `123` -> issue #123 +Extract issue number and flags from arguments: +- `123` -> issue #123, plan only +- `123 --execute` -> issue #123, plan + execute - `https://github.com/owner/repo/issues/123` -> issue #123 - `owner/repo#123` -> issue #123 in owner/repo @@ -35,17 +41,18 @@ gh issue view --json title,body,labels,assignees Present issue summary to user. -### 3. Classify and Validate +### 3. 
Verify Issue Has Passed check-issue -Determine the issue type from its title/labels: -- **`[Model]`** issues -> validate against the checklist in [add-model](../add-model/SKILL.md) Step 0 -- **`[Rule]`** issues -> validate against the checklist in [add-rule](../add-rule/SKILL.md) Step 0 +The issue must have already passed the `check-issue` quality gate (Stage 1 validation). Do NOT re-validate the issue here. -Check every item in the relevant checklist against the issue body. Verify facts provided by the user -- feel free to use `WebSearch` and `WebFetch` to cross-check claims. +**Gate condition:** The issue must have the `Good` label (added by `check-issue` when all checks pass). -**If any item is missing or unclear:** comment on the issue via `gh issue comment --body "..."` listing what's missing. Then STOP -- do NOT proceed until the issue is complete. +```bash +LABELS=$(gh issue view --json labels --jq '[.labels[].name] | join(",")') +``` -**If all items are present:** continue to step 4. +- If `Good` is NOT in the labels → **STOP**: "Issue #N has not passed check-issue. Please run `/check-issue ` first." +- If `Good` is present → continue to step 4. ### 4. Research References @@ -77,9 +84,19 @@ Include the concrete details from the issue (problem definition, reduction algor - Run the example; verify JSON output against user-provided information. - Present in `docs/paper/reductions.typ` in tutorial style with clear intuition (see KColoring->QUBO section for reference). -### 6. Create PR +### 6. Create PR (or Resume Existing) + +**Check for existing PR first:** +```bash +EXISTING_PR=$(gh pr list --search "Fixes #" --state open --json number,headRefName --jq '.[0].number // empty') +``` + +**If a PR already exists** (`EXISTING_PR` is non-empty): +- Switch to its branch: `git checkout ` +- Capture `PR=$EXISTING_PR` +- Skip plan creation — jump directly to Step 7 (execute) -Create a pull request with only the plan file. 
+**If no existing PR** — create one with only the plan file: **Pre-flight checks** (before creating the branch): 1. Verify clean working tree: `git status --porcelain` must be empty. If not, STOP and ask user to stash or commit. @@ -104,9 +121,125 @@ gh pr create --title "Fix #: " --body " ## Summary <Brief description> -Closes #<number>" +Fixes #<number>" + +# Capture PR number +PR=$(gh pr view --json number --jq .number) +``` + +### 7. Execute Plan (only with `--execute`) + +Skip this step if `--execute` was not provided. + +#### 7a. Implement + +Execute the plan using `superpowers:subagent-driven-development`: + +1. **Read the plan** from `docs/plans/<plan-file>.md` +2. **Clear context** — summarize only the plan content and essential file paths, then invoke subagent-driven-development with a clean prompt. Do not carry forward research notes, issue comments, or other accumulated context from prior steps. +3. **Invoke subagent-driven-development** with the plan as input — this dispatches parallel subagents for independent tasks in the plan + +If execution fails, leave the PR open with the plan commit only — the user can run `make run-plan` manually later. Skip remaining sub-steps. + +#### 7b. Review + +Run review-implementation to verify the code: + +``` +/review-implementation +``` + +Auto-fix any issues found. If unfixable issues remain, report them to the user. + +**Commit all changes** (implementation + review fixes): +```bash +git add -A +git commit -m "Implement #<number>: <title>" +``` + +#### 7c. Push, Post Summary, and Request Copilot Review + +Post an implementation summary comment on the PR **before** pushing. 
This comment should: +- Summarize what was implemented (files added/changed) +- Highlight any **deviations from the plan** — design changes, unexpected issues, or workarounds discovered during implementation +- Note any open questions or trade-offs made + +```bash +gh pr comment $PR --body "$(cat <<'EOF' +## Implementation Summary + +### Changes +- [list of files added/modified and what they do] + +### Deviations from Plan +- [any design changes, accidents, or workarounds — or "None"] + +### Open Questions +- [any trade-offs or items needing review — or "None"] +EOF +)" + +git push +make copilot-review ``` +#### 7d. Fix Loop (max 3 retries) + +```bash +REPO=$(gh repo view --json nameWithOwner --jq .nameWithOwner) +``` + +For each retry: + +1. **Wait for CI to complete** (poll every 30s, up to 15 minutes): + ```bash + for i in $(seq 1 30); do + sleep 30 + HEAD_SHA=$(gh api repos/$REPO/pulls/$PR | python3 -c "import sys,json; print(json.load(sys.stdin)['head']['sha'])") + STATUS=$(gh api repos/$REPO/commits/$HEAD_SHA/check-runs | python3 -c " + import sys,json + runs = json.load(sys.stdin)['check_runs'] + if not runs: + print('PENDING') # CI hasn't registered yet + else: + failed = [r['name'] for r in runs if r.get('conclusion') not in ('success', 'skipped', None)] + pending = [r['name'] for r in runs if r.get('conclusion') is None and r['status'] != 'completed'] + if pending: + print('PENDING') + elif failed: + print('FAILED') + else: + print('GREEN') + ") + if [ "$STATUS" != "PENDING" ]; then break; fi + done + ``` + + - If `GREEN` on the **first** iteration (before any fix-pr): skip the fix loop, done. + - If `GREEN` after a fix-pr pass: break, done. + - If `FAILED`: continue to step 2. + - If still `PENDING` after 15 min: treat as `FAILED`. + +2. **Invoke `/fix-pr`** to address review comments, CI failures, and coverage gaps. + +3. **Push fixes:** + ```bash + git push + ``` + +4. Increment retry counter. If `< 3`, go back to step 1. If `= 3`, give up. 
+ +**After 3 failed retries:** leave PR open, report to user. + +#### 7e. Done + +Report final status: +- PR URL +- CI status (green / failed after retries) +- Any unresolved review items + +The PR is **not merged** — the user or `meta-power` decides when to merge. + ## Example ``` @@ -115,11 +248,8 @@ User: /issue-to-pr 42 Claude: Let me fetch issue #42... [Fetches issue: "[Rule] IndependentSet to QUBO"] -[Classifies as [Rule] issue] -[Validates against add-rule checklist -- all items present] - -All required info is present. I'll create the plan... - +[Verifies Good label — passed] +[Researches references] [Writes docs/plans/2026-02-09-independentset-to-qubo.md] [Creates branch, commits, pushes] [Creates PR] @@ -127,11 +257,26 @@ All required info is present. I'll create the plan... Created PR #45: Fix #42: Add IndependentSet -> QUBO reduction ``` +``` +User: /issue-to-pr 42 --execute + +Claude: [Same as above, then continues...] + +Executing plan via subagent-driven-development... +[Subagents implement the plan steps] +[Runs review-implementation — all checks pass, auto-fixes applied] +[Pushes + requests Copilot review] +[Polls CI... GREEN on first pass] + +PR #45: CI green, ready for merge. 
+``` + ## Common Mistakes | Mistake | Fix | |---------|-----| -| Issue template incomplete | Comment listing missing items, then STOP | +| Issue not checked | Run `/check-issue <N>` first — issue-to-pr requires it | +| Issue has failure labels | Fix the issue, re-run `/check-issue`, then retry | | Including implementation code in initial PR | First PR: plan only | | Generic plan | Use specifics from the issue, mapped to add-model/add-rule steps | | Skipping CLI registration in plan | add-model requires CLI dispatch updates -- include in plan | diff --git a/.claude/skills/meta-power/SKILL.md b/.claude/skills/meta-power/SKILL.md index 59cffef91..7e788d0cf 100644 --- a/.claude/skills/meta-power/SKILL.md +++ b/.claude/skills/meta-power/SKILL.md @@ -9,7 +9,7 @@ Batch-process open `[Model]` and `[Rule]` issues end-to-end: plan, implement, re ## Overview -You are the **outer orchestrator**. For each issue you invoke existing skills and shell out to subprocesses. You never implement code directly — `make run-plan` does the heavy lifting in a separate Claude session. +You are the **outer orchestrator**. For each issue you invoke `issue-to-pr --execute`, which handles the full pipeline (plan, implement, review, fix). You never implement code directly. **Batch context:** When invoking sub-skills (like `issue-to-pr`), you are running in batch mode. Auto-approve any confirmation prompts from sub-skills — do not wait for user input mid-batch. @@ -22,11 +22,21 @@ gh issue list --state open --limit 50 --json number,title Filter to issues whose title contains `[Model]` or `[Rule]`. Partition into two buckets, sort each by issue number ascending. Final order: **all Models first, then all Rules**. 
+**Filter to checked issues only:** Only include issues with the `Good` label (added by `check-issue` when all checks pass): + +```bash +# Only process issues that have the "Good" label +LABELS=$(gh issue view <number> --json labels --jq '[.labels[].name] | join(",")') +# Skip if "Good" is not in LABELS +``` + +Issues without the `Good` label are excluded from the batch with status `skipped (not checked)`. + **Check for existing PRs:** For each issue, check if a PR already exists: ```bash gh pr list --search "Fixes #<number>" --state open --json number,headRefName ``` -If a PR exists, mark the issue as `resume` — skip Step 1 (plan) and jump to Step 2 (execute) or Step 4 (fix loop) depending on whether the PR already has implementation commits. +If a PR exists, mark the issue as `resume` — `issue-to-pr --execute` will detect the existing PR and continue from where it left off. Present the ordered list to the user for confirmation before starting: @@ -46,7 +56,7 @@ Proceed? (user confirms) Initialize a results table to track status for each issue. -## Step 1: Plan (issue-to-pr) +## Step 1: Plan, Execute, Review, Fix (issue-to-pr --execute) For the current issue: @@ -63,101 +73,31 @@ if [ -n "$STALE" ]; then fi ``` -Invoke the `issue-to-pr` skill with the issue number. This creates a branch, writes a plan to `docs/plans/`, and opens a PR. +Invoke `issue-to-pr --execute` with the issue number. This single skill call handles: +- Creating branch and PR with plan +- Executing the plan via subagent-driven-development +- Running review-implementation +- Pushing and requesting Copilot review +- Fix loop (up to 3 retries of fix-pr) -**If `issue-to-pr` fails** (e.g., incomplete issue template): record status as `skipped (plan failed)`, move to next issue. 
- -Capture the PR number for later steps: -```bash -PR=$(gh pr view --json number --jq .number) ``` - -## Step 2: Execute (make run-plan) - -Run the plan in a separate Claude subprocess: - -```bash -make run-plan +/issue-to-pr <number> --execute ``` -This spawns a new Claude session (up to 500 turns) that reads the plan and implements it using `add-model` or `add-rule`. - -**If the subprocess exits non-zero:** record status as `skipped (execution failed)`, move to next issue. - -## Step 3: Review - -After execution completes, push and request Copilot review: +**If `issue-to-pr` fails** at any stage: record the failure status, skip Steps 2-3, move to next issue. +Capture the PR number for the merge step: ```bash -git push -make copilot-review +PR=$(gh pr view --json number --jq .number 2>/dev/null) +if [ -z "$PR" ]; then + # issue-to-pr failed before creating a PR + # Record status and move to next issue +fi ``` -## Step 4: Fix Loop (max 3 retries) - -```dot -digraph fix_loop { - "Poll CI until done" [shape=box]; - "Run fix-pr" [shape=box]; - "Push changes" [shape=box]; - "Poll CI until done (2)" [shape=box]; - "CI green?" [shape=diamond]; - "Retries < 3?" [shape=diamond]; - "Proceed to merge" [shape=doublecircle]; - "Give up" [shape=doublecircle]; - - "Poll CI until done" -> "Run fix-pr"; - "Run fix-pr" -> "Push changes"; - "Push changes" -> "Poll CI until done (2)"; - "Poll CI until done (2)" -> "CI green?"; - "CI green?" -> "Proceed to merge" [label="yes"]; - "CI green?" -> "Retries < 3?" [label="no"]; - "Retries < 3?" -> "Poll CI until done" [label="yes, re-run fix-pr"]; - "Retries < 3?" -> "Give up" [label="no"]; -} -``` +## Step 2: Merge (skip if Step 1 failed) -For each retry: - -1. 
**Wait for CI to complete** (poll every 30s, up to 15 minutes): - ```bash - REPO=$(gh repo view --json nameWithOwner --jq .nameWithOwner) - for i in $(seq 1 30); do - sleep 30 - HEAD_SHA=$(gh api repos/$REPO/pulls/$PR | python3 -c "import sys,json; print(json.load(sys.stdin)['head']['sha'])") - STATUS=$(gh api repos/$REPO/commits/$HEAD_SHA/check-runs | python3 -c " - import sys,json - runs = json.load(sys.stdin)['check_runs'] - failed = [r['name'] for r in runs if r.get('conclusion') not in ('success', 'skipped', None)] - pending = [r['name'] for r in runs if r.get('conclusion') is None and r['status'] != 'completed'] - if pending: - print('PENDING') - elif failed: - print('FAILED') - else: - print('GREEN') - ") - if [ "$STATUS" != "PENDING" ]; then break; fi - done - ``` - - - If `GREEN` on the **first** iteration (before any fix-pr): skip the fix loop entirely, proceed to merge. - - If `GREEN` after a fix-pr pass: break out of loop, proceed to merge. - - If `FAILED`: continue to step 2. - - If still `PENDING` after 15 min: treat as `FAILED`. - -2. **Invoke `/fix-pr`** to address review comments, CI failures, and coverage gaps. - -3. **Push fixes:** - ```bash - git push - ``` - -4. Increment retry counter. If `< 3`, go back to step 1 (poll CI). If `= 3`, give up. - -**After 3 failed retries:** record status as `fix-pr failed (3 retries)`, leave PR open, move to next issue. - -## Step 5: Merge +Only attempt merge if `issue-to-pr --execute` completed successfully (CI green or fix loop passed). ```bash gh pr merge $PR --squash --delete-branch --auto @@ -177,7 +117,7 @@ for i in $(seq 1 20); do done ``` -## Step 6: Sync +## Step 3: Sync Return to main for the next issue: @@ -187,7 +127,7 @@ git checkout main && git pull origin main This ensures the next issue (especially a Rule that depends on a just-merged Model) sees all prior work. 
-## Step 7: Report +## Step 4: Report After all issues are processed, print the summary table: @@ -210,20 +150,24 @@ Completed: 4/6 | Skipped: 1 | Failed: 1 | Name | Value | Rationale | |------|-------|-----------| -| `MAX_RETRIES` | 3 | Most issues fix in 1-2 rounds | -| `CI_POLL_INTERVAL` | 30s | Frequent enough to react quickly | -| `CI_POLL_MAX` | 15 min | Upper bound for CI completion | | `MERGE_POLL_INTERVAL` | 15s | Wait for auto-merge to land | | `MERGE_POLL_MAX` | 5 min | Upper bound for merge completion | +CI polling and retry constants are defined in `issue-to-pr` Step 7d. + +## Context Budget + +Each `issue-to-pr --execute` call consumes significant context (research, planning, implementation, review, fix loop). In a long batch, the session may hit context limits after 2-3 issues. If this happens, the current issue's status is recorded as `context exhausted` and the batch report is printed with remaining issues marked `not started`. + ## Common Failure Modes | Symptom | Cause | Mitigation | |---------|-------|------------| | `issue-to-pr` comments and stops | Issue template incomplete | Skip; user must fix the issue | -| `make run-plan` exits non-zero | Implementation too complex for 500 turns | Skip; needs manual work | +| `issue-to-pr --execute` fails | Implementation too complex or subagent error | Skip; needs manual work | | CI red after 3 retries | Deep bug or flaky test | Leave PR open for human review | | Merge conflict | Concurrent push to main | Leave PR open; manual rebase needed | | Rule fails because model missing | Model issue was skipped earlier | Expected; skip rule too | | Stale branch from previous run | Previous meta-power run failed mid-issue | Auto-cleaned in Step 1 | | PR already exists for issue | Previous partial attempt | Resumed from existing PR | +| Context exhausted mid-batch | Too many issues in one session | Print report with remaining issues as `not started` | diff --git 
a/.claude/skills/review-implementation/structural-reviewer-prompt.md b/.claude/skills/review-implementation/structural-reviewer-prompt.md index a9e01ef86..491b84038 100644 --- a/.claude/skills/review-implementation/structural-reviewer-prompt.md +++ b/.claude/skills/review-implementation/structural-reviewer-prompt.md @@ -98,24 +98,24 @@ Compare the implementation against the requirements in the original issue. The i ### For Models (check against issue): | # | Check | How to verify | |---|-------|--------------| -| 1 | Problem name matches | Compare struct name against issue item 1 | -| 2 | Mathematical definition matches | Read `evaluate()` and verify it implements the definition from issue item 2 | -| 3 | Problem type matches | Verify optimization direction or satisfaction matches issue item 3 | -| 4 | Type parameters match | Verify struct generics match issue item 4 | -| 5 | Configuration space matches | Verify `dims()` matches issue item 6 | -| 6 | Feasibility check matches | Verify `evaluate()` feasibility logic matches issue item 7 | -| 7 | Objective function matches | Verify `evaluate()` objective logic matches issue item 8 | -| 8 | Complexity matches | Verify `declare_variants!` complexity string matches issue item 9 | +| 1 | Problem name matches | Compare struct name against the issue's **Problem name** field | +| 2 | Mathematical definition matches | Read `evaluate()` and verify it implements the issue's **Mathematical definition** | +| 3 | Problem type matches | Verify optimization direction or satisfaction matches the issue's **Problem type** | +| 4 | Type parameters match | Verify struct generics match the issue's **Type parameters** | +| 5 | Configuration space matches | Verify `dims()` matches the issue's **Configuration space** | +| 6 | Feasibility check matches | Verify `evaluate()` feasibility logic matches the issue's **Feasibility check** | +| 7 | Objective function matches | Verify `evaluate()` objective logic matches the issue's **Objective 
function** | +| 8 | Complexity matches | Verify `declare_variants!` complexity string matches the issue's **Best known exact algorithm** | ### For Rules (check against issue): | # | Check | How to verify | |---|-------|--------------| -| 1 | Source/target match | Compare `ReduceTo` impl against issue items 1-2 | -| 2 | Reduction algorithm matches | Read `reduce_to()` and verify it follows the algorithm from issue item 3 | -| 3 | Solution extraction matches | Read `extract_solution()` and verify it matches issue item 4 | -| 4 | Correctness preserved | Verify the reduction logic is consistent with the correctness argument in issue item 5 | -| 5 | Overhead expressions match | Compare `#[reduction(overhead = {...})]` against issue item 6 | -| 6 | Example matches | Verify the example program uses the instance from issue item 7 | +| 1 | Source/target match | Compare `ReduceTo` impl against the issue's **Source problem** and **Target problem** | +| 2 | Reduction algorithm matches | Read `reduce_to()` and verify it follows the issue's **Reduction algorithm** | +| 3 | Solution extraction matches | Read `extract_solution()` and verify it matches the issue's **Solution extraction** | +| 4 | Correctness preserved | Verify the reduction logic is consistent with the issue's **Correctness argument** | +| 5 | Overhead expressions match | Compare `#[reduction(overhead = {...})]` against the issue's **Size overhead** | +| 6 | Example matches | Verify the example program uses the instance from the issue's **Concrete example** | Flag any deviation as ISSUE -- the implementation must match what was specified in the issue unless there's a documented reason for the change. 
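The field-based checks above all key off bold `**Field name**` headers in the issue body. A minimal sketch of extracting those fields for comparison (`extract_issue_fields` is a hypothetical helper, not part of the skill; it assumes each field section starts with a bold header):

```python
import re

def extract_issue_fields(body: str) -> dict:
    """Split an issue body into {field name: text}, treating each
    bold **Header** as the start of a new field section."""
    # The capturing group keeps the header names in the split result:
    # [preamble, name1, text1, name2, text2, ...]
    parts = re.split(r"\*\*([^*]+)\*\*[:\s]*", body)
    return {name.strip(): text.strip()
            for name, text in zip(parts[1::2], parts[2::2])}

body = "**Source problem**: Vertex Cover\n**Target problem**: Set Cover\n"
fields = extract_issue_fields(body)
print(fields["Source problem"])  # prints Vertex Cover
```

The reviewer can then diff each extracted field against the corresponding struct name, `evaluate()` logic, or `#[reduction(overhead = {...})]` expression instead of relying on positional item numbers.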
From f9b14199fae43743bd5b2f9cf154f231d586ff09 Mon Sep 17 00:00:00 2001 From: GiggleLiu <cacate0129@gmail.com> Date: Sat, 7 Mar 2026 00:06:42 +0900 Subject: [PATCH 8/9] run-issue --- .claude/CLAUDE.md | 1 + Makefile | 14 +++++++++++++- 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md index 955610862..b9a04d7af 100644 --- a/.claude/CLAUDE.md +++ b/.claude/CLAUDE.md @@ -40,6 +40,7 @@ make cli # Build the pred CLI tool (release mode) make cli-demo # Run closed-loop CLI demo (exercises all commands) make mcp-test # Run MCP server tests (unit + integration) make run-plan # Execute a plan with Claude autorun +make run-issue N=42 # Run issue-to-pr --execute for a GitHub issue make copilot-review # Request Copilot code review on current PR make release V=x.y.z # Tag and push a new release (CI publishes to crates.io) ``` diff --git a/Makefile b/Makefile index 7aca87558..8590cbf94 100644 --- a/Makefile +++ b/Makefile @@ -1,6 +1,6 @@ # Makefile for problemreductions -.PHONY: help build test mcp-test fmt clippy doc mdbook paper examples clean coverage rust-export compare qubo-testdata export-schemas release run-plan diagrams jl-testdata cli cli-demo copilot-review +.PHONY: help build test mcp-test fmt clippy doc mdbook paper examples clean coverage rust-export compare qubo-testdata export-schemas release run-plan run-issue diagrams jl-testdata cli cli-demo copilot-review # Default target help: @@ -28,6 +28,7 @@ help: @echo " cli - Build the pred CLI tool" @echo " cli-demo - Run closed-loop CLI demo (build + exercise all commands)" @echo " run-plan - Execute a plan with Claude autorun (latest plan in docs/plans/)" + @echo " run-issue N=<number> - Run issue-to-pr --execute for a GitHub issue" @echo " copilot-review - Request Copilot code review on current PR" # Build the project @@ -212,6 +213,17 @@ run-plan: --max-turns 500 \ -p "$$PROMPT" 2>&1 | tee "$(OUTPUT)" +# Run issue-to-pr --execute for a GitHub issue +# Usage: make 
run-issue N=42 +N ?= +run-issue: + @if [ -z "$(N)" ]; then echo "Usage: make run-issue N=<issue-number>"; exit 1; fi + claude --dangerously-skip-permissions \ + --model opus \ + --verbose \ + --max-turns 500 \ + -p "/issue-to-pr $(N) --execute" 2>&1 | tee "issue-$(N)-output.log" + # Closed-loop CLI demo: exercises all commands end-to-end PRED := cargo run -p problemreductions-cli --release -- CLI_DEMO_DIR := /tmp/pred-cli-demo From e953d2439e8b5b0b9d10eecaf48dbcb4e0756bd7 Mon Sep 17 00:00:00 2001 From: GiggleLiu <cacate0129@gmail.com> Date: Sat, 7 Mar 2026 00:14:40 +0900 Subject: [PATCH 9/9] fix-pr: check all review comment sources, prioritize user comments - Step 1a now prints counts for each source and labels them clearly - Triage table separates user comments (priority 2) from Copilot (priority 3) - Added "User comments always take priority" instruction - Deleted stale plan file per reviewer request Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --- .claude/skills/fix-pr/SKILL.md | 38 +++++++----- .../2026-03-05-check-issue-skill-design.md | 60 ------------------- 2 files changed, 23 insertions(+), 75 deletions(-) delete mode 100644 docs/plans/2026-03-05-check-issue-skill-design.md diff --git a/.claude/skills/fix-pr/SKILL.md b/.claude/skills/fix-pr/SKILL.md index 5e6d718ea..04cf07a37 100644 --- a/.claude/skills/fix-pr/SKILL.md +++ b/.claude/skills/fix-pr/SKILL.md @@ -26,35 +26,41 @@ HEAD_SHA=$(gh api repos/$REPO/pulls/$PR | python3 -c "import sys,json; print(jso ### 1a. Fetch Review Comments -Three sources of feedback to check: +**Check ALL three sources.** User inline comments are the most commonly missed — do not skip any. ```bash -# Copilot and user inline review comments (on code lines) +# 1. 
Inline review comments on code lines (from ALL reviewers: users AND Copilot) gh api repos/$REPO/pulls/$PR/comments | python3 -c " import sys,json -for c in json.load(sys.stdin): +comments = json.load(sys.stdin) +print(f'=== Inline comments: {len(comments)} ===') +for c in comments: line = c.get('line') or c.get('original_line') or '?' - print(f'[{c[\"user\"][\"login\"]}] {c[\"path\"]}:{line} — {c[\"body\"]}') + print(f'[{c[\"user\"][\"login\"]}] {c[\"path\"]}:{line} — {c[\"body\"][:200]}') " -# Review-level comments (top-level review body) +# 2. Review-level comments (top-level review body from formal reviews) gh api repos/$REPO/pulls/$PR/reviews | python3 -c " import sys,json -for r in json.load(sys.stdin): +reviews = json.load(sys.stdin) +print(f'=== Reviews: {len(reviews)} ===') +for r in reviews: if r.get('body'): - print(f'[{r[\"user\"][\"login\"]}] {r[\"state\"]}: {r[\"body\"]}') + print(f'[{r[\"user\"][\"login\"]}] {r[\"state\"]}: {r[\"body\"][:200]}') " -# Issue-level comments (general discussion, excluding bots) +# 3. Issue-level comments (general discussion) gh api repos/$REPO/issues/$PR/comments | python3 -c " import sys,json -for c in json.load(sys.stdin): - login = c['user']['login'] - if 'codecov' not in login and 'copilot' not in login: - print(f'[{login}] {c[\"body\"]}') +comments = [c for c in json.load(sys.stdin) if 'codecov' not in c['user']['login']] +print(f'=== Issue comments: {len(comments)} ===') +for c in comments: + print(f'[{c[\"user\"][\"login\"]}] {c[\"body\"][:200]}') " ``` +**Verify counts:** If any source returns 0, confirm it's genuinely empty — don't assume no feedback exists. + ### 1b. 
Check CI Status ```bash @@ -84,11 +90,13 @@ Categorize all findings: | Priority | Type | Action | |----------|------|--------| -| 1 | CI failures (test/clippy/build) | Fix immediately -- blocks merge | -| 2 | User review comments | Address each one -- respond on PR | -| 3 | Copilot review comments | Evaluate validity, fix if correct | +| 1 | CI failures (test/clippy/build) | Fix immediately — blocks merge | +| 2 | User inline/review comments | Address each one — highest review priority | +| 3 | Copilot inline suggestions | Evaluate validity, fix if correct | | 4 | Codecov coverage gaps | Add tests for uncovered lines | +**User comments always take priority over bot comments.** A user inline comment requesting file deletion is just as important as a user review requesting a code change. + ## Step 3: Fix CI Failures For each failing check: diff --git a/docs/plans/2026-03-05-check-issue-skill-design.md b/docs/plans/2026-03-05-check-issue-skill-design.md deleted file mode 100644 index 7f8855996..000000000 --- a/docs/plans/2026-03-05-check-issue-skill-design.md +++ /dev/null @@ -1,60 +0,0 @@ -# check-issue Skill Design - -## Purpose -Manual quality gate for `[Rule]` and `[Model]` GitHub issues. Invoked via `/check-issue <number>`. Runs 4 checks (adapted per issue type) and posts a structured report as a GitHub comment. Adds failure labels but does NOT close issues. - -## Rule Checks - -### 1. Usefulness (label: `Useless`) -- Parse Source/Target from issue body -- `pred path <source> <target> --json` to check existing reductions -- Direct reduction exists → compare proposed overhead vs existing (`pred show <target> --json` → `reduces_from[]`); fail if not strictly lower -- Multi-hop only → pass with note -- No path → pass -- Check Motivation field is substantive - -### 2. Non-trivial (label: `Trivial`) -- Flag: pure variable substitution, subtype coercion, same-problem identity -- Algorithm must be detailed enough for a programmer to implement - -### 3. 
Correctness (label: `Wrong`) -- Cross-check references against `references.bib` first -- External references: arxiv MCP → Semantic Scholar MCP → WebSearch (fallback chain) -- Verify paper exists, claim matches, cross-reference with other sources -- If better algorithm found → recommend in comment - -### 4. Well-written (label: `PoorWritten`) -- Information completeness (all template sections filled) -- Algorithm is complete step-by-step procedure -- Symbol/notation consistency (defined before use, match between sections) -- Code metric names match actual getter methods -- Example quality (non-trivial, brute-force solvable, fully worked) - -## Model Checks - -### 1. Usefulness (label: `Useless`) -- Check if problem already exists via `pred show <name> --json` -- Motivation must be concrete with use cases -- Must identify at least one solver method -- Should mention planned reduction rules - -### 2. Non-trivial (label: `Trivial`) -- Not isomorphic to existing problem under different name -- Not a trivial variant (graph/weight restriction) of existing model -- Must have genuinely different feasibility constraints or objective - -### 3. Correctness (label: `Wrong`) -- Definition mathematically well-formed -- Complexity claims verified against literature -- If better algorithm found → recommend in comment -- Same fallback chain as rules - -### 4. Well-written (label: `PoorWritten`) -- All template sections present (Motivation, Definition, Variables, Schema, Complexity, How to solve, Example) -- Definition precise and implementable -- Symbol/notation consistency across sections -- Naming conventions followed (Maximum/Minimum prefix) -- Example with known optimal solution - -## Output -Structured GitHub comment with pass/fail/warn table, per-check details, and recommendations. Labels added only for failed checks. Never closes issues.
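The report shape described in the deleted design doc survives in the skill itself: a pass/fail/warn table per check, with a failure label attached only to failed checks. A sketch of that composition step (`compose_report` is a hypothetical helper name, and the exact table columns are an assumption; label names `Useless`/`Wrong` come from the check definitions above):

```python
def compose_report(results: dict) -> tuple:
    """results: check name -> (status, failure_label, note), where
    status is 'pass', 'warn', or 'fail'.
    Returns (markdown comment body, labels to add for failed checks)."""
    icon = {"pass": "PASS", "warn": "WARN", "fail": "FAIL"}
    rows = ["| Check | Result | Notes |", "|-------|--------|-------|"]
    labels = []
    for check, (status, label, note) in results.items():
        rows.append(f"| {check} | {icon[status]} | {note} |")
        if status == "fail":
            labels.append(label)  # labels are added only for failed checks
    return "\n".join(rows), labels

body, labels = compose_report({
    "Usefulness": ("pass", "Useless", "no existing direct reduction"),
    "Correctness": ("fail", "Wrong", "cited complexity claim not found"),
})
print(labels)  # prints ['Wrong']
```

Posting would then be a matter of `gh issue comment <n>` with the body and `gh issue edit <n> --add-label <label>` per failed check, never `gh issue close`.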