From 9b3df6e81657e6ce052ccb5ffed4600edb1107c8 Mon Sep 17 00:00:00 2001 From: GiggleLiu Date: Sat, 21 Mar 2026 15:03:29 +0800 Subject: [PATCH 1/3] Add plan for #215: [Model] EnsembleComputation --- docs/plans/2026-03-21-ensemble-computation.md | 209 ++++++++++++++++++ 1 file changed, 209 insertions(+) create mode 100644 docs/plans/2026-03-21-ensemble-computation.md diff --git a/docs/plans/2026-03-21-ensemble-computation.md b/docs/plans/2026-03-21-ensemble-computation.md new file mode 100644 index 000000000..4fd8658b2 --- /dev/null +++ b/docs/plans/2026-03-21-ensemble-computation.md @@ -0,0 +1,209 @@ +# EnsembleComputation Implementation Plan + +> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task. + +**Goal:** Add the `EnsembleComputation` satisfaction model from issue #215, including registry/CLI/example-db integration, tests, and a paper entry. + +**Architecture:** Implement `EnsembleComputation` as a new `misc` satisfaction problem with fields `universe_size`, `subsets`, and `budget`. Encode each union step by two operand indices over the fixed domain `0..(universe_size + budget)`, validate operand references against the prefix of previously computed sets, and treat the issue's `j <= J` semantics as "a satisfying prefix of at most `budget` operations exists" while still using a fixed-length config for brute force. + +**Tech Stack:** Rust workspace (`problemreductions` + `problemreductions-cli`), serde, inventory registry, clap CLI, Typst paper, existing brute-force solver. 
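
The operand encoding described in the Architecture note can be sketched as follows. This is only an illustration under stated assumptions: the `Operand` enum and the `decode_operand` / `decode_config` helpers are hypothetical names chosen for this sketch, not the planned API. Unlike the planned `evaluate()`, which may accept on a satisfying prefix, the sketch decodes every slot strictly.

```rust
/// Hypothetical helper type for illustration; not part of the planned API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Operand {
    /// The singleton {a} for an element a of the universe.
    Singleton(usize),
    /// A previously computed set, identified by its 0-based step index.
    Computed(usize),
}

/// Decode one operand slot for the given 0-based step.
/// Indices below `universe_size` name singletons; larger indices name
/// earlier results. Returns `None` for a reference to a set that has
/// not been computed yet at `step`.
fn decode_operand(universe_size: usize, step: usize, index: usize) -> Option<Operand> {
    if index < universe_size {
        Some(Operand::Singleton(index))
    } else {
        let k = index - universe_size;
        if k < step {
            Some(Operand::Computed(k))
        } else {
            None
        }
    }
}

/// Split a flat config of length `2 * budget` into (left, right) pairs,
/// one pair per union step.
fn decode_config(
    universe_size: usize,
    budget: usize,
    config: &[usize],
) -> Option<Vec<(Operand, Operand)>> {
    if config.len() != 2 * budget {
        return None;
    }
    config
        .chunks(2)
        .enumerate()
        .map(|(step, pair)| {
            let left = decode_operand(universe_size, step, pair[0])?;
            let right = decode_operand(universe_size, step, pair[1])?;
            Some((left, right))
        })
        .collect()
}
```

With `universe_size = 4` and `budget = 4`, the config `[0, 1, 4, 2, 4, 3, 0, 1]` decodes to four steps, where index `4` in steps two and three refers back to the result of the first step.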
+ +--- + +## Batch 1: Model, Tests, Registry, CLI, Example DB + +### Task 1: Write the failing model tests first + +**Files:** +- Create: `src/unit_tests/models/misc/ensemble_computation.rs` +- Reference: `src/unit_tests/models/misc/multiprocessor_scheduling.rs` +- Reference: `src/unit_tests/models/misc/sequencing_within_intervals.rs` + +**Step 1: Write the failing tests** + +Add tests for: +- construction/getters/dims for `EnsembleComputation::new(4, vec![vec![0, 1, 2], vec![0, 1, 3]], 4)` +- satisfiable witness evaluation using a concrete full-length config such as `[0, 1, 4, 2, 4, 3, 0, 1]` +- invalid configs for future references, overlapping operands, out-of-range config length, and missing required subsets +- a small brute-force-solvable instance (for example `universe_size = 3`, `subsets = [[0, 1]]`, `budget = 1`) +- serde round-trip +- paper/canonical example validity without brute-force exhaustiveness if the search space is too large + +**Step 2: Run the targeted test to verify it fails** + +Run: `cargo test ensemble_computation --lib` + +Expected: FAIL because `EnsembleComputation` does not exist yet. + +**Step 3: Do not write production code until the failure is confirmed** + +Use this failing run as the RED checkpoint for the model implementation. 
+ +### Task 2: Implement the model and register it in the library + +**Files:** +- Create: `src/models/misc/ensemble_computation.rs` +- Modify: `src/models/misc/mod.rs` +- Modify: `src/models/mod.rs` +- Modify: `src/lib.rs` + +**Step 1: Write the minimal model implementation** + +Implement: +- `ProblemSchemaEntry` with display name `Ensemble Computation` +- struct fields `universe_size: usize`, `subsets: Vec<Vec<usize>>`, `budget: usize` +- getters `universe_size()`, `num_subsets()`, `budget()` +- `Problem` with `Metric = bool`, `variant_params![]`, `dims() = vec![universe_size + budget; 2 * budget]` +- `evaluate()` that: + - rejects wrong config length + - decodes operand references as either singletons or previously computed `z_k` + - rejects non-disjoint unions + - tracks computed sets in sequence order + - returns `true` once every required subset has appeared as some computed `z_i` +- `SatisfactionProblem` +- `declare_variants! { default sat EnsembleComputation => "(universe_size + budget)^(2 * budget)" }` +- `canonical_model_example_specs()` using the issue's satisfiable example instance + +**Step 2: Register exports** + +Wire the new model through: +- `src/models/misc/mod.rs` +- `src/models/mod.rs` +- `src/lib.rs` / `prelude` +- misc example-spec aggregation in `src/models/misc/mod.rs` + +**Step 3: Run the targeted tests** + +Run: `cargo test ensemble_computation --lib` + +Expected: PASS for the new model test file. + +**Step 4: Refactor only if needed** + +Keep helpers local to the model file unless another existing model clearly needs reuse. 
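
The `evaluate()` behavior listed in Step 1 of Task 2 can be sketched as a stand-alone function. This is a minimal sketch under assumptions: `evaluate_sketch` and `resolve` are hypothetical names, and the sets are held in `BTreeSet<usize>` for brevity, whereas the implementation later in this series keeps sorted `Vec`s. It mirrors the listed rules: reject a wrong config length, resolve each operand against the prefix of computed sets, reject overlapping operands, and accept as soon as every target has appeared.

```rust
use std::collections::BTreeSet;

/// Resolve an operand index to a set: a singleton for indices below
/// `universe_size`, otherwise an already computed set (if it exists).
fn resolve(
    universe_size: usize,
    computed: &[BTreeSet<usize>],
    index: usize,
) -> Option<BTreeSet<usize>> {
    if index < universe_size {
        Some(BTreeSet::from([index]))
    } else {
        computed.get(index - universe_size).cloned()
    }
}

/// Illustrative evaluator; the name and signature are assumptions of this sketch.
fn evaluate_sketch(
    universe_size: usize,
    targets: &[BTreeSet<usize>],
    budget: usize,
    config: &[usize],
) -> bool {
    if config.len() != 2 * budget {
        return false;
    }
    if targets.is_empty() {
        return true;
    }

    let mut computed: Vec<BTreeSet<usize>> = Vec::with_capacity(budget);
    for pair in config.chunks(2) {
        // A reference to a set that is not computed yet invalidates the config.
        let Some(left) = resolve(universe_size, &computed, pair[0]) else {
            return false;
        };
        let Some(right) = resolve(universe_size, &computed, pair[1]) else {
            return false;
        };
        // Only disjoint unions are allowed.
        if !left.is_disjoint(&right) {
            return false;
        }
        computed.push(&left | &right);

        // Accept on the first prefix that has produced every target.
        if targets.iter().all(|target| computed.contains(target)) {
            return true;
        }
    }
    false
}
```

Accepting on a prefix is what lets a fixed-length config stand in for "at most `budget` operations": slots after the accepting step are never inspected.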
+ +### Task 3: Add CLI creation support and example-path integration + +**Files:** +- Modify: `problemreductions-cli/src/commands/create.rs` +- Modify: `problemreductions-cli/src/cli.rs` +- Modify: `problemreductions-cli/src/problem_name.rs` only if a lowercase alias mapping is needed + +**Step 1: Write or extend a failing CLI-focused test if there is an existing pattern** + +If there is a nearby unit/integration test pattern for `pred create`, add a focused failing test for `EnsembleComputation`. +If there is no practical pattern already in the workspace, skip adding a new CLI test file and rely on `cargo test` plus manual `pred create` verification later. + +**Step 2: Implement the CLI arm** + +In `create.rs`, add a new `EnsembleComputation` arm that parses: +- `--universe` as `universe_size` +- `--sets` as required subsets +- `--budget` as the union-operation bound + +Also update: +- `example_for(...)` +- any field-to-flag mapping needed so schema-driven help shows `--universe`, `--sets`, and `--budget` +- `cli.rs` "Flags by problem type" help table + +Do not invent a short literature alias unless one is clearly standard. + +**Step 3: Verify the CLI path** + +Run: +- `cargo test -p problemreductions-cli create` +- `cargo run -p problemreductions-cli -- create EnsembleComputation --universe 4 --sets "0,1,2;0,1,3" --budget 4 --json` + +Expected: +- tests pass +- the CLI emits a valid serialized `EnsembleComputation` JSON object + +### Task 4: Run batch-1 verification before moving to paper work + +**Files:** +- No new files + +**Step 1: Run focused workspace verification** + +Run: +- `cargo test ensemble_computation` +- `cargo test -p problemreductions-cli create` +- `cargo clippy --all-targets --all-features -- -D warnings` + +Expected: all pass. 
+ +**Step 2: Commit batch-1 work** + +Run: +- `git add src/models/misc/ensemble_computation.rs src/models/misc/mod.rs src/models/mod.rs src/lib.rs src/unit_tests/models/misc/ensemble_computation.rs problemreductions-cli/src/commands/create.rs problemreductions-cli/src/cli.rs problemreductions-cli/src/problem_name.rs` +- `git commit -m "Add EnsembleComputation model"` + +Only include `problem_name.rs` in the commit if it was actually changed. + +## Batch 2: Paper Entry and Documentation-Specific Verification + +### Task 5: Add the paper entry and any missing bibliography entry + +**Files:** +- Modify: `docs/paper/reductions.typ` +- Modify: `docs/paper/references.bib` if the Järvisalo et al. 2012 citation is not already present + +**Step 1: Write the paper example after implementation is stable** + +Add: +- `display-name` entry for `EnsembleComputation` +- `problem-def("EnsembleComputation")` with a self-contained definition +- short background tying Garey & Johnson PO9 to monotone/ensemble circuit computation +- algorithm note using the brute-force search-space expression and, if cited, the SAT 2012 practical approach +- a worked example based on the issue instance, explicitly explaining the union sequence + +Keep the paper text consistent with the implemented encoding: +- the mathematical problem remains "at most `J` operations" +- the code-level config uses `2 * budget` operand slots +- the example should explain how the witness sequence maps onto that encoding + +**Step 2: Run paper verification** + +Run: `make paper` + +Expected: PASS with regenerated untracked docs artifacts only. + +**Step 3: Re-run the paper/example test if needed** + +Run: `cargo test ensemble_computation_paper_example --lib` + +Expected: PASS. 
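
The brute-force search-space expression referenced in Task 5, `(universe_size + budget)^(2 * budget)`, can be checked numerically with a small helper. The function name `search_space_size` is an assumption of this sketch and does not exist in the codebase. For the issue instance (`universe_size = 4`, `budget = 4`) the expression is 8^8 = 16,777,216 configurations.

```rust
/// Number of candidate configs, (universe_size + budget)^(2 * budget).
/// Returns `None` if the value does not fit in a `u128`.
/// Hypothetical helper for illustration only.
fn search_space_size(universe_size: usize, budget: usize) -> Option<u128> {
    // Domain size of each operand slot.
    let base = (universe_size as u128).checked_add(budget as u128)?;
    // Two operand slots per union step.
    let exponent = u32::try_from(budget).ok()?.checked_mul(2)?;
    base.checked_pow(exponent)
}
```

The count grows quickly with `budget`, which is consistent with the plan's caution about brute-force exhaustiveness for larger examples.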
+ +### Task 6: Final verification and pipeline handoff + +**Files:** +- No new files + +**Step 1: Run the full verification required before claiming completion** + +Run: +- `make test` +- `make clippy` +- `git status --short` + +Expected: +- test and clippy succeed +- only intended tracked changes remain +- ignored generated doc exports may appear but are not staged + +**Step 2: Commit the documentation batch** + +Run: +- `git add docs/paper/reductions.typ docs/paper/references.bib` +- `git commit -m "Document EnsembleComputation"` + +Only include `references.bib` if it changed. + +**Step 3: Prepare PR summary inputs** + +Collect: +- files changed +- any deviation from the issue (especially the fixed-length encoding choice for `j <= J`) +- verification commands actually run + +This summary will be posted to the PR before the final push in the pipeline skill. From 99067337beb6054c14c57da1f9ae96f70311fc15 Mon Sep 17 00:00:00 2001 From: GiggleLiu Date: Sat, 21 Mar 2026 15:26:57 +0800 Subject: [PATCH 2/3] Implement #215: [Model] EnsembleComputation --- docs/paper/reductions.typ | 51 ++++ docs/paper/references.bib | 12 + problemreductions-cli/src/cli.rs | 1 + problemreductions-cli/src/commands/create.rs | 60 ++++- problemreductions-cli/tests/cli_tests.rs | 107 ++++++++ src/lib.rs | 6 +- src/models/misc/ensemble_computation.rs | 236 ++++++++++++++++++ src/models/misc/mod.rs | 3 + src/models/mod.rs | 15 +- .../models/misc/ensemble_computation.rs | 104 ++++++++ 10 files changed, 584 insertions(+), 11 deletions(-) create mode 100644 src/models/misc/ensemble_computation.rs create mode 100644 src/unit_tests/models/misc/ensemble_computation.rs diff --git a/docs/paper/reductions.typ b/docs/paper/reductions.typ index 53188f3f8..d282b446f 100644 --- a/docs/paper/reductions.typ +++ b/docs/paper/reductions.typ @@ -96,6 +96,7 @@ "KSatisfiability": [$k$-SAT], "CircuitSAT": [CircuitSAT], "ConjunctiveQueryFoldability": [Conjunctive Query Foldability], + "EnsembleComputation": [Ensemble 
Computation], "Factoring": [Factoring], "KingsSubgraph": [King's Subgraph MIS], "TriangularSubgraph": [Triangular Subgraph MIS], @@ -2547,6 +2548,56 @@ A classical NP-complete problem from Garey and Johnson @garey1979[Ch.~3, p.~76], ) ] +#problem-def("EnsembleComputation")[ + Given a finite set $A$, a collection $C$ of subsets of $A$, and a positive integer $J$, determine whether there exists a sequence $S = (z_1 <- x_1 union y_1, z_2 <- x_2 union y_2, dots, z_j <- x_j union y_j)$ of $j <= J$ union operations such that each operand $x_i, y_i$ is either a singleton ${a}$ for some $a in A$ or a previously computed set $z_k$ with $k < i$, the two operands are disjoint for every step, and every target subset $c in C$ is equal to some computed set $z_i$. +][ + Ensemble Computation is problem PO9 in Garey and Johnson @garey1979. It can be viewed as monotone circuit synthesis over set union: each operation introduces one reusable intermediate set, and the objective is simply to realize all targets within the given budget. The implementation in this library uses $2J$ operand variables with domain size $|A| + J$ and accepts as soon as some valid prefix has produced every target set, so the original "$j <= J$" semantics are preserved under brute-force enumeration. The resulting search space yields a straightforward exact upper bound of $(|A| + J)^(2J)$. Järvisalo, Kaski, Koivisto, and Korhonen study SAT encodings for finding efficient ensemble computations in a monotone-circuit setting @jarvisalo2012. + + *Example.* Let $A = {0, 1, 2, 3}$, $C = {{0, 1, 2}, {0, 1, 3}}$, and $J = 4$. A satisfying witness uses three essential unions: + $z_1 = {0} union {1} = {0, 1}$, + $z_2 = z_1 union {2} = {0, 1, 2}$, and + $z_3 = z_1 union {3} = {0, 1, 3}$. + Thus both target subsets appear among the computed $z_i$ values while staying within the budget. 
+ + #figure( + canvas(length: 1cm, { + import draw: * + let node(pos, label, name, fill) = { + rect( + (pos.at(0) - 0.45, pos.at(1) - 0.18), + (pos.at(0) + 0.45, pos.at(1) + 0.18), + radius: 0.08, + fill: fill, + stroke: 0.5pt + luma(140), + name: name, + ) + content(name, text(7pt, label)) + } + + let base = rgb("#4e79a7").transparentize(78%) + let target = rgb("#59a14f").transparentize(72%) + let aux = rgb("#f28e2b").transparentize(74%) + + node((0.0, 1.4), [\{0\}], "a0", base) + node((1.2, 1.4), [\{1\}], "a1", base) + node((2.4, 1.4), [\{2\}], "a2", base) + node((3.6, 1.4), [\{3\}], "a3", base) + + node((0.6, 0.6), [$z_1 = \{0,1\}$], "z1", aux) + node((1.8, -0.2), [$z_2 = \{0,1,2\}$], "z2", target) + node((3.0, -0.2), [$z_3 = \{0,1,3\}$], "z3", target) + + line("a0.south", "z1.north-west", stroke: 0.5pt + luma(120), mark: (end: "straight", scale: 0.4)) + line("a1.south", "z1.north-east", stroke: 0.5pt + luma(120), mark: (end: "straight", scale: 0.4)) + line("z1.south-west", "z2.north-west", stroke: 0.5pt + luma(120), mark: (end: "straight", scale: 0.4)) + line("a2.south", "z2.north-east", stroke: 0.5pt + luma(120), mark: (end: "straight", scale: 0.4)) + line("z1.south-east", "z3.north-west", stroke: 0.5pt + luma(120), mark: (end: "straight", scale: 0.4)) + line("a3.south", "z3.north-east", stroke: 0.5pt + luma(120), mark: (end: "straight", scale: 0.4)) + }), + caption: [An ensemble computation for $A = {0,1,2,3}$ and $C = {{0,1,2}, {0,1,3}}$. The intermediate set $z_1 = {0,1}$ is reused to produce both target subsets.], + ) +] + #{ let x = load-model-example("Factoring") let N = x.instance.target diff --git a/docs/paper/references.bib b/docs/paper/references.bib index bce4d90d8..fd8106120 100644 --- a/docs/paper/references.bib +++ b/docs/paper/references.bib @@ -504,6 +504,18 @@ @article{bjorklund2009 doi = {10.1137/070683933} } +@incollection{jarvisalo2012, + author = {Matti J\"{a}rvisalo and Petteri Kaski and Mikko Koivisto and Janne H. 
Korhonen}, + title = {Finding Efficient Circuits for Ensemble Computation}, + booktitle = {Theory and Applications of Satisfiability Testing -- SAT 2012}, + series = {Lecture Notes in Computer Science}, + volume = {7317}, + pages = {369--382}, + year = {2012}, + publisher = {Springer}, + doi = {10.1007/978-3-642-31612-8_28} +} + @article{aspvall1979, author = {Bengt Aspvall and Michael F. Plass and Robert Endre Tarjan}, title = {A Linear-Time Algorithm for Testing the Truth of Certain Quantified Boolean Formulas}, diff --git a/problemreductions-cli/src/cli.rs b/problemreductions-cli/src/cli.rs index 9fdeea3da..df74d308e 100644 --- a/problemreductions-cli/src/cli.rs +++ b/problemreductions-cli/src/cli.rs @@ -242,6 +242,7 @@ Flags by problem type: PaintShop --sequence MaximumSetPacking --sets [--weights] MinimumSetCovering --universe, --sets [--weights] + EnsembleComputation --universe, --sets, --budget ComparativeContainment --universe, --r-sets, --s-sets [--r-weights] [--s-weights] X3C (ExactCoverBy3Sets) --universe, --sets (3 elements each) SetBasis --universe, --sets, --k diff --git a/problemreductions-cli/src/commands/create.rs b/problemreductions-cli/src/commands/create.rs index f04afb8cb..58ac71074 100644 --- a/problemreductions-cli/src/commands/create.rs +++ b/problemreductions-cli/src/commands/create.rs @@ -18,7 +18,7 @@ use problemreductions::models::graph::{ }; use problemreductions::models::misc::{ AdditionalKey, BinPacking, BoyceCoddNormalFormViolation, CbqRelation, ConjunctiveBooleanQuery, - FlowShopScheduling, LongestCommonSubsequence, MinimumTardinessSequencing, + EnsembleComputation, FlowShopScheduling, LongestCommonSubsequence, MinimumTardinessSequencing, MultiprocessorScheduling, PaintShop, PartiallyOrderedKnapsack, QueryArg, RectilinearPictureCompression, ResourceConstrainedScheduling, SchedulingWithIndividualDeadlines, SequencingToMinimizeMaximumCumulativeCost, @@ -368,6 +368,7 @@ fn type_format_hint(type_name: &str, graph_type: Option<&str>) -> 
&'static str { fn cli_flag_name(field_name: &str) -> String { match field_name { + "universe_size" => "universe".to_string(), "vertex_weights" => "weights".to_string(), "edge_lengths" => "edge-weights".to_string(), _ => field_name.replace('_', "-"), @@ -423,6 +424,7 @@ fn example_for(canonical: &str, graph_type: Option<&str>) -> &'static str { "SpinGlass" => "--graph 0-1,1-2 --couplings 1,1", "KColoring" => "--graph 0-1,1-2,2-0 --k 3", "HamiltonianCircuit" => "--graph 0-1,1-2,2-3,3-0", + "EnsembleComputation" => "--universe 4 --sets \"0,1,2;0,1,3\" --budget 4", "MinMaxMulticenter" => { "--graph 0-1,1-2,2-3 --weights 1,1,1,1 --edge-weights 1,1,1 --k 2 --bound 2" } @@ -1761,6 +1763,31 @@ pub fn create(args: &CreateArgs, out: &OutputConfig) -> Result<()> { ) } + // EnsembleComputation + "EnsembleComputation" => { + let usage = + "Usage: pred create EnsembleComputation --universe 4 --sets \"0,1,2;0,1,3\" --budget 4"; + let universe_size = args.universe.ok_or_else(|| { + anyhow::anyhow!("EnsembleComputation requires --universe\n\n{usage}") + })?; + let subsets = parse_sets(args)?; + let budget = args + .budget + .as_deref() + .ok_or_else(|| anyhow::anyhow!("EnsembleComputation requires --budget\n\n{usage}"))? 
+ .parse::<usize>() + .map_err(|e| { + anyhow::anyhow!( + "Invalid --budget value for EnsembleComputation: {e}\n\n{usage}" + ) + })?; + ( + ser(EnsembleComputation::try_new(universe_size, subsets, budget) + .map_err(anyhow::Error::msg)?)?, + resolved_variant.clone(), + ) + } + // ComparativeContainment "ComparativeContainment" => { let universe = args.universe.ok_or_else(|| { @@ -5290,6 +5317,37 @@ mod tests { std::fs::remove_file(output_path).ok(); } + #[test] + fn test_create_ensemble_computation_json() { + let mut args = empty_args(); + args.problem = Some("EnsembleComputation".to_string()); + args.universe = Some(4); + args.sets = Some("0,1,2;0,1,3".to_string()); + args.budget = Some("4".to_string()); + + let output_path = std::env::temp_dir().join("pred_test_create_ensemble_computation.json"); + let out = OutputConfig { + output: Some(output_path.clone()), + quiet: true, + json: false, + auto_json: false, + }; + + create(&args, &out).unwrap(); + + let content = std::fs::read_to_string(&output_path).unwrap(); + let json: serde_json::Value = serde_json::from_str(&content).unwrap(); + assert_eq!(json["type"], "EnsembleComputation"); + assert_eq!(json["data"]["universe_size"], 4); + assert_eq!( + json["data"]["subsets"], + serde_json::json!([[0, 1, 2], [0, 1, 3]]) + ); + assert_eq!(json["data"]["budget"], 4); + + std::fs::remove_file(output_path).ok(); + } + #[test] fn test_create_balanced_complete_bipartite_subgraph() { use crate::dispatch::ProblemJsonOutput; diff --git a/problemreductions-cli/tests/cli_tests.rs b/problemreductions-cli/tests/cli_tests.rs index a6aa1b924..e5df1c559 100644 --- a/problemreductions-cli/tests/cli_tests.rs +++ b/problemreductions-cli/tests/cli_tests.rs @@ -7194,6 +7194,97 @@ fn test_create_sequencing_within_intervals() { std::fs::remove_file(&output_file).ok(); } +#[test] +fn test_create_ensemble_computation() { + let output_file = std::env::temp_dir().join("pred_test_create_ensemble_computation.json"); + let output = pred() + .args([ + "-o", 
+ output_file.to_str().unwrap(), + "create", + "EnsembleComputation", + "--universe", + "4", + "--sets", + "0,1,2;0,1,3", + "--budget", + "4", + ]) + .output() + .unwrap(); + assert!( + output.status.success(), + "stderr: {}", + String::from_utf8_lossy(&output.stderr) + ); + let content = std::fs::read_to_string(&output_file).unwrap(); + let json: serde_json::Value = serde_json::from_str(&content).unwrap(); + assert_eq!(json["type"], "EnsembleComputation"); + assert_eq!(json["data"]["universe_size"], 4); + assert_eq!( + json["data"]["subsets"], + serde_json::json!([[0, 1, 2], [0, 1, 3]]) + ); + assert_eq!(json["data"]["budget"], 4); + std::fs::remove_file(&output_file).ok(); +} + +#[test] +fn test_create_ensemble_computation_no_flags_uses_cli_flag_names() { + let output = pred() + .args(["create", "EnsembleComputation"]) + .output() + .unwrap(); + assert!( + !output.status.success(), + "problem-specific help should exit non-zero" + ); + let stderr = String::from_utf8_lossy(&output.stderr); + assert!( + stderr.contains("--universe"), + "expected --universe in help, got: {stderr}" + ); + assert!( + stderr.contains("--sets"), + "expected --sets in help, got: {stderr}" + ); + assert!( + stderr.contains("--budget"), + "expected --budget in help, got: {stderr}" + ); + assert!( + !stderr.contains("--universe-size"), + "help should use actual CLI flags, got: {stderr}" + ); +} + +#[test] +fn test_create_ensemble_computation_rejects_out_of_range_elements_without_panicking() { + let output = pred() + .args([ + "create", + "EnsembleComputation", + "--universe", + "4", + "--sets", + "0,1,5", + "--budget", + "4", + ]) + .output() + .unwrap(); + assert!(!output.status.success()); + let stderr = String::from_utf8_lossy(&output.stderr); + assert!( + !stderr.contains("panicked at"), + "expected graceful CLI error, got panic: {stderr}" + ); + assert!( + stderr.contains("outside universe") || stderr.contains("universe of size"), + "expected out-of-range subset error, got: {stderr}" + 
); +} + #[test] fn test_create_scheduling_with_individual_deadlines_with_m_alias() { let output_file = @@ -7320,6 +7411,22 @@ fn test_create_model_example_sequencing_within_intervals() { assert_eq!(json["type"], "SequencingWithinIntervals"); } +#[test] +fn test_create_model_example_ensemble_computation() { + let output = pred() + .args(["create", "--example", "EnsembleComputation"]) + .output() + .unwrap(); + assert!( + output.status.success(), + "stderr: {}", + String::from_utf8_lossy(&output.stderr) + ); + let stdout = String::from_utf8(output.stdout).unwrap(); + let json: serde_json::Value = serde_json::from_str(&stdout).unwrap(); + assert_eq!(json["type"], "EnsembleComputation"); +} + #[test] fn test_create_minimum_multiway_cut_rejects_single_terminal() { let output = pred() diff --git a/src/lib.rs b/src/lib.rs index 973c8476e..9ad70c59c 100644 --- a/src/lib.rs +++ b/src/lib.rs @@ -65,9 +65,9 @@ pub mod prelude { }; pub use crate::models::misc::{ AdditionalKey, BinPacking, BoyceCoddNormalFormViolation, CbqRelation, - ConjunctiveBooleanQuery, ConjunctiveQueryFoldability, Factoring, FlowShopScheduling, - Knapsack, LongestCommonSubsequence, MinimumTardinessSequencing, MultiprocessorScheduling, - PaintShop, Partition, QueryArg, RectilinearPictureCompression, + ConjunctiveBooleanQuery, ConjunctiveQueryFoldability, EnsembleComputation, Factoring, + FlowShopScheduling, Knapsack, LongestCommonSubsequence, MinimumTardinessSequencing, + MultiprocessorScheduling, PaintShop, Partition, QueryArg, RectilinearPictureCompression, ResourceConstrainedScheduling, SchedulingWithIndividualDeadlines, SequencingToMinimizeMaximumCumulativeCost, SequencingToMinimizeWeightedCompletionTime, SequencingToMinimizeWeightedTardiness, SequencingWithReleaseTimesAndDeadlines, diff --git a/src/models/misc/ensemble_computation.rs b/src/models/misc/ensemble_computation.rs new file mode 100644 index 000000000..0f710a072 --- /dev/null +++ b/src/models/misc/ensemble_computation.rs @@ -0,0 +1,236 @@ 
+//! Ensemble Computation problem implementation. + +use crate::registry::{FieldInfo, ProblemSchemaEntry}; +use crate::traits::{Problem, SatisfactionProblem}; +use serde::{Deserialize, Serialize}; + +inventory::submit! { + ProblemSchemaEntry { + name: "EnsembleComputation", + display_name: "Ensemble Computation", + aliases: &[], + dimensions: &[], + module_path: module_path!(), + description: "Determine whether required subsets can be built by a bounded sequence of disjoint unions", + fields: &[ + FieldInfo { name: "universe_size", type_name: "usize", description: "Number of elements in the universe A" }, + FieldInfo { name: "subsets", type_name: "Vec<Vec<usize>>", description: "Required subsets that must appear among the computed z_i values" }, + FieldInfo { name: "budget", type_name: "usize", description: "Maximum number of union operations J" }, + ], + } +} + +#[derive(Debug, Clone, Serialize, Deserialize)] +#[serde(try_from = "EnsembleComputationDef")] +pub struct EnsembleComputation { + universe_size: usize, + subsets: Vec<Vec<usize>>, + budget: usize, +} + +impl EnsembleComputation { + pub fn new(universe_size: usize, subsets: Vec<Vec<usize>>, budget: usize) -> Self { + Self::try_new(universe_size, subsets, budget).unwrap_or_else(|err| panic!("{err}")) + } + + pub fn try_new( + universe_size: usize, + subsets: Vec<Vec<usize>>, + budget: usize, + ) -> Result<Self, String> { + if budget == 0 { + return Err("budget must be positive".to_string()); + } + let subsets = subsets + .into_iter() + .enumerate() + .map(|(subset_index, subset)| { + Self::normalize_subset(universe_size, subset).ok_or_else(|| { + format!( + "subset {subset_index} contains element outside universe of size {universe_size}" + ) + }) + }) + .collect::<Result<Vec<_>, _>>()?; + Ok(Self { + universe_size, + subsets, + budget, + }) + } + + pub fn universe_size(&self) -> usize { + self.universe_size + } + + pub fn subsets(&self) -> &[Vec<usize>] { + &self.subsets + } + + pub fn num_subsets(&self) -> usize { + self.subsets.len() + } + + pub fn budget(&self) -> usize { + 
self.budget + } + + fn normalize_subset(universe_size: usize, mut subset: Vec<usize>) -> Option<Vec<usize>> { + if subset.iter().any(|&element| element >= universe_size) { + return None; + } + subset.sort_unstable(); + subset.dedup(); + Some(subset) + } + + fn decode_operand(&self, operand: usize, computed: &[Vec<usize>]) -> Option<Vec<usize>> { + if operand < self.universe_size { + return Some(vec![operand]); + } + computed.get(operand - self.universe_size).cloned() + } + + fn are_disjoint(left: &[usize], right: &[usize]) -> bool { + let mut i = 0; + let mut j = 0; + + while i < left.len() && j < right.len() { + match left[i].cmp(&right[j]) { + std::cmp::Ordering::Less => i += 1, + std::cmp::Ordering::Greater => j += 1, + std::cmp::Ordering::Equal => return false, + } + } + + true + } + + fn union_disjoint(left: &[usize], right: &[usize]) -> Vec<usize> { + let mut union = Vec::with_capacity(left.len() + right.len()); + let mut i = 0; + let mut j = 0; + + while i < left.len() && j < right.len() { + if left[i] < right[j] { + union.push(left[i]); + i += 1; + } else { + union.push(right[j]); + j += 1; + } + } + + union.extend_from_slice(&left[i..]); + union.extend_from_slice(&right[j..]); + union + } + + fn required_subsets(&self) -> Option<Vec<Vec<usize>>> { + self.subsets + .iter() + .cloned() + .map(|subset| Self::normalize_subset(self.universe_size, subset)) + .collect() + } + + fn all_required_subsets_present( + required_subsets: &[Vec<usize>], + computed: &[Vec<usize>], + ) -> bool { + required_subsets + .iter() + .all(|subset| computed.iter().any(|candidate| candidate == subset)) + } +} + +impl Problem for EnsembleComputation { + const NAME: &'static str = "EnsembleComputation"; + type Metric = bool; + + fn dims(&self) -> Vec<usize> { + vec![self.universe_size + self.budget; 2 * self.budget] + } + + fn evaluate(&self, config: &[usize]) -> bool { + if config.len() != 2 * self.budget { + return false; + } + + let Some(required_subsets) = self.required_subsets() else { + return false; + }; + if required_subsets.is_empty() { + return true; + 
} + + let mut computed = Vec::with_capacity(self.budget); + for step in 0..self.budget { + let left_operand = config[2 * step]; + let right_operand = config[2 * step + 1]; + + let Some(left) = self.decode_operand(left_operand, &computed) else { + return false; + }; + let Some(right) = self.decode_operand(right_operand, &computed) else { + return false; + }; + + if !Self::are_disjoint(&left, &right) { + return false; + } + + computed.push(Self::union_disjoint(&left, &right)); + if Self::all_required_subsets_present(&required_subsets, &computed) { + return true; + } + } + + false + } + + fn variant() -> Vec<(&'static str, &'static str)> { + crate::variant_params![] + } +} + +impl SatisfactionProblem for EnsembleComputation {} + +crate::declare_variants! { + default sat EnsembleComputation => "(universe_size + budget)^(2 * budget)", +} + +#[derive(Debug, Clone, Deserialize)] +struct EnsembleComputationDef { + universe_size: usize, + subsets: Vec<Vec<usize>>, + budget: usize, +} + +impl TryFrom<EnsembleComputationDef> for EnsembleComputation { + type Error = String; + + fn try_from(value: EnsembleComputationDef) -> Result<Self, Self::Error> { + Self::try_new(value.universe_size, value.subsets, value.budget) + } +} + +#[cfg(feature = "example-db")] +pub(crate) fn canonical_model_example_specs() -> Vec<crate::example_db::specs::ModelExampleSpec> { + // Keep the canonical example small enough for the example-db optimality check to solve + // it via brute force, while still demonstrating reuse of a previously computed set. 
+ vec![crate::example_db::specs::ModelExampleSpec { + id: "ensemble_computation", + instance: Box::new(EnsembleComputation::new( + 3, + vec![vec![0, 1], vec![0, 1, 2]], + 2, + )), + optimal_config: vec![0, 1, 3, 2], + optimal_value: serde_json::json!(true), + }] +} + +#[cfg(test)] +#[path = "../../unit_tests/models/misc/ensemble_computation.rs"] +mod tests; diff --git a/src/models/misc/mod.rs b/src/models/misc/mod.rs index 37f421cb3..e2623d890 100644 --- a/src/models/misc/mod.rs +++ b/src/models/misc/mod.rs @@ -34,6 +34,7 @@ mod bin_packing; mod boyce_codd_normal_form_violation; pub(crate) mod conjunctive_boolean_query; pub(crate) mod conjunctive_query_foldability; +mod ensemble_computation; pub(crate) mod factoring; mod flow_shop_scheduling; mod knapsack; @@ -63,6 +64,7 @@ pub use bin_packing::BinPacking; pub use boyce_codd_normal_form_violation::BoyceCoddNormalFormViolation; pub use conjunctive_boolean_query::{ConjunctiveBooleanQuery, QueryArg, Relation as CbqRelation}; pub use conjunctive_query_foldability::{ConjunctiveQueryFoldability, Term}; +pub use ensemble_computation::EnsembleComputation; pub use factoring::Factoring; pub use flow_shop_scheduling::FlowShopScheduling; pub use knapsack::Knapsack; @@ -93,6 +95,7 @@ pub(crate) fn canonical_model_example_specs() -> Vec EnsembleComputation { + EnsembleComputation::new(4, vec![vec![0, 1, 2], vec![0, 1, 3]], 4) +} + +#[test] +fn test_ensemble_computation_creation() { + let problem = issue_problem(); + + assert_eq!(problem.universe_size(), 4); + assert_eq!(problem.num_subsets(), 2); + assert_eq!(problem.budget(), 4); + assert_eq!(problem.num_variables(), 8); + assert_eq!(problem.dims(), vec![8; 8]); + assert_eq!( + ::NAME, + "EnsembleComputation" + ); + assert!(::variant().is_empty()); +} + +#[test] +fn test_ensemble_computation_issue_witness() { + let problem = issue_problem(); + + assert!(problem.evaluate(&[0, 1, 4, 2, 4, 3, 0, 1])); +} + +#[test] +fn test_ensemble_computation_rejects_future_reference() { + let 
problem = issue_problem(); + + assert!(!problem.evaluate(&[4, 1, 0, 1, 0, 1, 0, 1])); +} + +#[test] +fn test_ensemble_computation_rejects_overlapping_operands() { + let problem = issue_problem(); + + assert!(!problem.evaluate(&[0, 0, 4, 2, 4, 3, 0, 1])); +} + +#[test] +fn test_ensemble_computation_rejects_missing_required_subset() { + let problem = issue_problem(); + + assert!(!problem.evaluate(&[0, 1, 0, 1, 0, 1, 0, 1])); +} + +#[test] +fn test_ensemble_computation_rejects_wrong_config_length() { + let problem = issue_problem(); + + assert!(!problem.evaluate(&[0, 1, 4, 2])); +} + +#[test] +fn test_ensemble_computation_small_bruteforce_instance() { + let problem = EnsembleComputation::new(2, vec![vec![0, 1]], 1); + let solver = BruteForce::new(); + + let satisfying = solver.find_all_satisfying(&problem); + assert_eq!(satisfying.len(), 2); + assert!(satisfying.contains(&vec![0, 1])); + assert!(satisfying.contains(&vec![1, 0])); + assert_eq!(solver.find_satisfying(&problem), Some(vec![0, 1])); +} + +#[test] +fn test_ensemble_computation_serialization_round_trip() { + let problem = issue_problem(); + let json = serde_json::to_string(&problem).unwrap(); + let round_trip: EnsembleComputation = serde_json::from_str(&json).unwrap(); + + assert_eq!(round_trip.universe_size(), 4); + assert_eq!(round_trip.num_subsets(), 2); + assert_eq!(round_trip.budget(), 4); + assert!(round_trip.evaluate(&[0, 1, 4, 2, 4, 3, 0, 1])); +} + +#[test] +fn test_ensemble_computation_try_new_rejects_out_of_range_subset() { + let result = EnsembleComputation::try_new(4, vec![vec![0, 1, 5]], 4); + + assert!(result.is_err()); +} + +#[test] +fn test_ensemble_computation_deserialization_rejects_zero_budget() { + let json = r#"{"universe_size":4,"subsets":[[0,1,2]],"budget":0}"#; + let result: Result<EnsembleComputation, _> = serde_json::from_str(json); + + assert!(result.is_err()); +} + +#[test] +fn test_ensemble_computation_paper_example() { + let problem = issue_problem(); + + assert!(problem.evaluate(&[0, 1, 4, 2, 4, 3, 
0, 1])); +} From d29a5fab514098408d566bddd072d2958d304c5e Mon Sep 17 00:00:00 2001 From: GiggleLiu Date: Sat, 21 Mar 2026 15:27:00 +0800 Subject: [PATCH 3/3] chore: remove plan file after implementation --- docs/plans/2026-03-21-ensemble-computation.md | 209 ------------------ 1 file changed, 209 deletions(-) delete mode 100644 docs/plans/2026-03-21-ensemble-computation.md diff --git a/docs/plans/2026-03-21-ensemble-computation.md b/docs/plans/2026-03-21-ensemble-computation.md deleted file mode 100644 index 4fd8658b2..000000000 --- a/docs/plans/2026-03-21-ensemble-computation.md +++ /dev/null @@ -1,209 +0,0 @@ -# EnsembleComputation Implementation Plan - -> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task. - -**Goal:** Add the `EnsembleComputation` satisfaction model from issue #215, including registry/CLI/example-db integration, tests, and a paper entry. - -**Architecture:** Implement `EnsembleComputation` as a new `misc` satisfaction problem with fields `universe_size`, `subsets`, and `budget`. Encode each union step by two operand indices over the fixed domain `0..(universe_size + budget)`, validate operand references against the prefix of previously computed sets, and treat the issue's `j <= J` semantics as "a satisfying prefix of at most `budget` operations exists" while still using a fixed-length config for brute force. - -**Tech Stack:** Rust workspace (`problemreductions` + `problemreductions-cli`), serde, inventory registry, clap CLI, Typst paper, existing brute-force solver. 
- ---- - -## Batch 1: Model, Tests, Registry, CLI, Example DB - -### Task 1: Write the failing model tests first - -**Files:** -- Create: `src/unit_tests/models/misc/ensemble_computation.rs` -- Reference: `src/unit_tests/models/misc/multiprocessor_scheduling.rs` -- Reference: `src/unit_tests/models/misc/sequencing_within_intervals.rs` - -**Step 1: Write the failing tests** - -Add tests for: -- construction/getters/dims for `EnsembleComputation::new(4, vec![vec![0, 1, 2], vec![0, 1, 3]], 4)` -- satisfiable witness evaluation using a concrete full-length config such as `[0, 1, 4, 2, 4, 3, 0, 1]` -- invalid configs for future references, overlapping operands, out-of-range config length, and missing required subsets -- a small brute-force-solvable instance (for example `universe_size = 3`, `subsets = [[0, 1]]`, `budget = 1`) -- serde round-trip -- paper/canonical example validity without brute-force exhaustiveness if the search space is too large - -**Step 2: Run the targeted test to verify it fails** - -Run: `cargo test ensemble_computation --lib` - -Expected: FAIL because `EnsembleComputation` does not exist yet. - -**Step 3: Do not write production code until the failure is confirmed** - -Use this failing run as the RED checkpoint for the model implementation. 
- -### Task 2: Implement the model and register it in the library - -**Files:** -- Create: `src/models/misc/ensemble_computation.rs` -- Modify: `src/models/misc/mod.rs` -- Modify: `src/models/mod.rs` -- Modify: `src/lib.rs` - -**Step 1: Write the minimal model implementation** - -Implement: -- `ProblemSchemaEntry` with display name `Ensemble Computation` -- struct fields `universe_size: usize`, `subsets: Vec<Vec<usize>>`, `budget: usize` -- getters `universe_size()`, `num_subsets()`, `budget()` -- `Problem` with `Metric = bool`, `variant_params![]`, `dims() = vec![universe_size + budget; 2 * budget]` -- `evaluate()` that: - - rejects wrong config length - - decodes operand references as either singletons or previously computed `z_k` - - rejects non-disjoint unions - - tracks computed sets in sequence order - - returns `true` once every required subset has appeared as some computed `z_i` -- `SatisfactionProblem` -- `declare_variants! { default sat EnsembleComputation => "(universe_size + budget)^(2 * budget)" }` -- `canonical_model_example_specs()` using the issue's satisfiable example instance - -**Step 2: Register exports** - -Wire the new model through: -- `src/models/misc/mod.rs` -- `src/models/mod.rs` -- `src/lib.rs` / `prelude` -- misc example-spec aggregation in `src/models/misc/mod.rs` - -**Step 3: Run the targeted tests** - -Run: `cargo test ensemble_computation --lib` - -Expected: PASS for the new model test file. - -**Step 4: Refactor only if needed** - -Keep helpers local to the model file unless another existing model clearly needs reuse.
- -### Task 3: Add CLI creation support and example-path integration - -**Files:** -- Modify: `problemreductions-cli/src/commands/create.rs` -- Modify: `problemreductions-cli/src/cli.rs` -- Modify: `problemreductions-cli/src/problem_name.rs` only if a lowercase alias mapping is needed - -**Step 1: Write or extend a failing CLI-focused test if there is an existing pattern** - -If there is a nearby unit/integration test pattern for `pred create`, add a focused failing test for `EnsembleComputation`. -If there is no practical pattern already in the workspace, skip adding a new CLI test file and rely on `cargo test` plus manual `pred create` verification later. - -**Step 2: Implement the CLI arm** - -In `create.rs`, add a new `EnsembleComputation` arm that parses: -- `--universe` as `universe_size` -- `--sets` as required subsets -- `--budget` as the union-operation bound - -Also update: -- `example_for(...)` -- any field-to-flag mapping needed so schema-driven help shows `--universe`, `--sets`, and `--budget` -- `cli.rs` "Flags by problem type" help table - -Do not invent a short literature alias unless one is clearly standard. - -**Step 3: Verify the CLI path** - -Run: -- `cargo test -p problemreductions-cli create` -- `cargo run -p problemreductions-cli -- create EnsembleComputation --universe 4 --sets "0,1,2;0,1,3" --budget 4 --json` - -Expected: -- tests pass -- the CLI emits a valid serialized `EnsembleComputation` JSON object - -### Task 4: Run batch-1 verification before moving to paper work - -**Files:** -- No new files - -**Step 1: Run focused workspace verification** - -Run: -- `cargo test ensemble_computation` -- `cargo test -p problemreductions-cli create` -- `cargo clippy --all-targets --all-features -- -D warnings` - -Expected: all pass. 
- -**Step 2: Commit batch-1 work** - -Run: -- `git add src/models/misc/ensemble_computation.rs src/models/misc/mod.rs src/models/mod.rs src/lib.rs src/unit_tests/models/misc/ensemble_computation.rs problemreductions-cli/src/commands/create.rs problemreductions-cli/src/cli.rs problemreductions-cli/src/problem_name.rs` -- `git commit -m "Add EnsembleComputation model"` - -Only include `problem_name.rs` in the commit if it was actually changed. - -## Batch 2: Paper Entry and Documentation-Specific Verification - -### Task 5: Add the paper entry and any missing bibliography entry - -**Files:** -- Modify: `docs/paper/reductions.typ` -- Modify: `docs/paper/references.bib` if the Järvisalo et al. 2012 citation is not already present - -**Step 1: Write the paper example after implementation is stable** - -Add: -- `display-name` entry for `EnsembleComputation` -- `problem-def("EnsembleComputation")` with a self-contained definition -- short background tying Garey & Johnson PO9 to monotone/ensemble circuit computation -- algorithm note using the brute-force search-space expression and, if cited, the SAT 2012 practical approach -- a worked example based on the issue instance, explicitly explaining the union sequence - -Keep the paper text consistent with the implemented encoding: -- the mathematical problem remains "at most `J` operations" -- the code-level config uses `2 * budget` operand slots -- the example should explain how the witness sequence maps onto that encoding - -**Step 2: Run paper verification** - -Run: `make paper` - -Expected: PASS with regenerated untracked docs artifacts only. - -**Step 3: Re-run the paper/example test if needed** - -Run: `cargo test ensemble_computation_paper_example --lib` - -Expected: PASS. 
- -### Task 6: Final verification and pipeline handoff - -**Files:** -- No new files - -**Step 1: Run the full verification required before claiming completion** - -Run: -- `make test` -- `make clippy` -- `git status --short` - -Expected: -- test and clippy succeed -- only intended tracked changes remain -- ignored generated doc exports may appear but are not staged - -**Step 2: Commit the documentation batch** - -Run: -- `git add docs/paper/reductions.typ docs/paper/references.bib` -- `git commit -m "Document EnsembleComputation"` - -Only include `references.bib` if it changed. - -**Step 3: Prepare PR summary inputs** - -Collect: -- files changed -- any deviation from the issue (especially the fixed-length encoding choice for `j <= J`) -- verification commands actually run - -This summary will be posted to the PR before the final push in the pipeline skill.
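The `evaluate()` rules listed in Task 2 of the plan (length check, operand decoding, disjointness, prefix tracking, early success) can be illustrated with a stand-alone function. This is a minimal sketch, not the crate's implementation: `evaluate_sketch` is a hypothetical name, and it assumes that an operand index below `universe_size` denotes a singleton, that index `universe_size + k` denotes the previously computed set `z_k`, and that evaluation returns `true` as soon as every required subset has appeared, without validating later steps.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-alone evaluator (an assumption for illustration;
/// it is not the `EnsembleComputation::evaluate` method from the patch).
fn evaluate_sketch(
    universe_size: usize,
    subsets: &[Vec<usize>],
    budget: usize,
    config: &[usize],
) -> bool {
    // A config carries two operand slots per union step.
    if config.len() != 2 * budget {
        return false;
    }

    // Required subsets as sets, plus a flag per subset once it has been produced.
    let required: Vec<BTreeSet<usize>> = subsets
        .iter()
        .map(|s| s.iter().copied().collect())
        .collect();
    let mut found = vec![false; required.len()];

    // computed[k] holds z_k, the result of union step k.
    let mut computed: Vec<BTreeSet<usize>> = Vec::with_capacity(budget);

    for step in config.chunks(2) {
        let mut operands = Vec::with_capacity(2);
        for &index in step {
            let set = if index < universe_size {
                // Indices below universe_size denote singletons {index}.
                BTreeSet::from([index])
            } else {
                // Larger indices must point at an already computed z_k;
                // future or out-of-range references are rejected.
                match computed.get(index - universe_size) {
                    Some(previous) => previous.clone(),
                    None => return false,
                }
            };
            operands.push(set);
        }

        // Only disjoint unions are allowed.
        if !operands[0].is_disjoint(&operands[1]) {
            return false;
        }
        let union: BTreeSet<usize> = operands[0].union(&operands[1]).copied().collect();

        for (flag, target) in found.iter_mut().zip(&required) {
            if *target == union {
                *flag = true;
            }
        }
        computed.push(union);

        // Assumed early exit: a satisfying prefix of at most `budget` steps suffices.
        if found.iter().all(|&f| f) {
            return true;
        }
    }
    false
}
```

Under these assumptions, the witness `[0, 1, 4, 2, 4, 3, 0, 1]` for the issue instance is accepted after its third step, which is how a fixed-length config can still express "at most `budget` operations".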