Skip to content

[Model] EnsembleComputation #215

@isPANN

Description

@isPANN

Motivation

ENSEMBLE COMPUTATION (P301) from Garey & Johnson, A11 PO9. A classical NP-complete problem useful for reductions. It asks whether a given collection of target subsets can all be computed by a short sequence of disjoint union operations, starting from singletons or previously computed intermediate sets.

Planned reduction: MinimumVertexCover → EnsembleComputation (per Garey & Johnson, Section 3.2.2 — transformation from VERTEX COVER). See also rule issue #204.

Definition

Name: EnsembleComputation
Reference: Garey & Johnson, Computers and Intractability, A11 PO9

Mathematical definition:

INSTANCE: Collection C of subsets of a finite set A, positive integer J.
QUESTION: Is there a sequence S = (z₁ ← x₁ ∪ y₁, z₂ ← x₂ ∪ y₂, ..., zⱼ ← xⱼ ∪ yⱼ) of j ≤ J union operations, where each xᵢ and yᵢ is either {a} for some a ∈ A or z_k for some k < i, such that xᵢ and yᵢ are disjoint, 1 ≤ i ≤ j, and such that for every subset c ∈ C there exists some zᵢ, 1 ≤ i ≤ j, that is identical to c?

This is a satisfaction (decision) problem. The decision framing with budget J is the canonical formulation in the literature (Garey & Johnson, Järvisalo et al. 2012). The budget J determines the number of variables, analogous to k in KColoring.

Variables

  • Count: 2 * J variables, encoding J union operations. Each operation i ∈ {1, ..., J} has two operand variables: operand1_i and operand2_i.
  • Per-variable domain: Each operand variable ranges over |A| + J values (indices 0..|A| represent singletons {a} for a ∈ A; indices |A|..|A|+J represent previously computed sets z₁..zⱼ). This gives dims() = vec![|A| + J; 2 * J].
  • Meaning: A configuration [op1_x, op1_y, op2_x, op2_y, ..., opJ_x, opJ_y] specifies the two operands for each of the J union operations.
  • evaluate(): Returns true if and only if:
    1. For each operation i, both operands refer to valid sources (singletons or z_k with k < i — i.e., operand indices ≥ |A| must satisfy the index < |A| + i),
    2. For each operation i, the sets x_i and y_i are disjoint,
    3. Every subset c ∈ C appears as some z_i in the computed sequence.

Schema (data type)

Type name: EnsembleComputation
Variants: No generic parameters needed (elements and subsets represented by integer indices); a single concrete variant.

Field Type Description
universe_size usize Number of elements in A (elements are represented as 0..universe_size)
subsets Vec<Vec<usize>> The collection C: each inner Vec lists elements of one required subset
budget usize Maximum number of union operations J allowed

Complexity

  • Decision complexity: NP-complete (Garey & Johnson, Section 3.2.2; transformation from VERTEX COVER). Remains NP-complete even if each c ∈ C satisfies |c| ≤ 3.
  • Best known exact algorithm: No specialized exact algorithm is known; brute-force enumeration of the search space is the baseline approach. Each of the 2J operand variables ranges over |A| + J values, giving a brute-force complexity of O((|A| + J)^(2J)).
  • declare_variants! complexity string: "(universe_size + budget)^(2 * budget)"
  • References:
    • [Garey & Johnson, 1979] M. R. Garey, D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman, 1979. Problem PO9, Appendix A11.
    • [Järvisalo et al., 2012] M. Järvisalo, P. Kaski, M. Koivisto, J. Korhonen, "Finding Efficient Circuits for Ensemble Computation", Proc. SAT 2012, LNCS 7317, pp. 369–382. https://doi.org/10.1007/978-3-642-31612-8_28

Extra Remark

Full book text:

INSTANCE: Collection C of subsets of a finite set A, positive integer J.
QUESTION: Is there a sequence S = (z1 ← x1 ∪ y1, z2 ← x2 ∪ y2,...,zj ← xj ∪ yj) of j ≤ J union operations, where each xi and yi is either {a} for some a ∈ A or zk for some k < i, such that xi and yi are disjoint, 1 ≤ i ≤ j, and such that for every subset c ∈ C there exists some zi, 1 ≤ i ≤ j, that is identical to c?
Reference: [Garey and Johnson, ——]. Transformation from VERTEX COVER (see Section 3.2.2).
Comment: Remains NP-complete even if each c ∈ C satisfies |c| ≤ 3. The analogous problem in which xi and yi need not be disjoint for 1 ≤ i ≤ j is also NP-complete under the same restriction.

How to solve

  • It can be solved by (existing) bruteforce: enumerate all valid sequences of ≤ J disjoint-union operations over elements of A (and previously built sets), check if every c ∈ C is produced.
  • It can be solved by reducing to integer programming.
  • Other: N/A — no other known tractable special case applies to the general problem

Example Instance

Small satisfiable instance:

  • Universe: A = {0, 1, 2, 3, 4} (5 elements)
  • Required subsets: C = {{0,1,2}, {0,1,3}, {2,3,4}}
  • Budget: J = 5

Brute-force complexity: (5 + 5)^(2 × 5) = 10^10 = 10 billion configurations (illustrates the rapid growth of the search space).

Witness sequence (j = 4 ≤ J = 5):

  • z₁ = {0} ∪ {1} = {0,1} (disjoint ✓)
  • z₂ = z₁ ∪ {2} = {0,1,2} = C[0] ✓
  • z₃ = z₁ ∪ {3} = {0,1,3} = C[1] ✓
  • z₄ = {2} ∪ {3} = {2,3} (disjoint ✓; intermediate set)
  • z₅ = z₄ ∪ {4} = {2,3,4} = C[2] ✓

All 3 subsets in C appear (j = 5 ≤ J = 5 ✓). Note z₁ = {0,1} is reused by both z₂ and z₃, demonstrating sharing.

Greedy trap: Building each subset independently requires (|c|−1) operations per subset: 2 + 2 + 2 = 6 operations. Sharing the intermediate z₁ = {0,1} between C[0] and C[1] saves one operation, reducing the total to 5.

Why budget J = 4 is infeasible: With only 4 operations, we cannot produce all 3 target subsets. Building C[0] = {0,1,2} requires at least 2 operations (e.g., z₁ = {0}∪{1}, z₂ = z₁∪{2}). Building C[2] = {2,3,4} requires at least 2 operations (e.g., z₃ = {2}∪{3}, z₄ = z₃∪{4}). That's 4 operations consumed, but C[1] = {0,1,3} still needs z₁∪{3} — requiring a 5th operation. So J = 4 is a NO instance.

Suboptimal witness: Using J = 6 (budget exceeds need): build each subset independently without sharing — z₁={0}∪{1}, z₂=z₁∪{2}=C[0], z₃={0}∪{1}, z₄=z₃∪{3}=C[1], z₅={2}∪{3}, z₆=z₅∪{4}=C[2]. Valid but wasteful (j=6 > optimal j=5).

Metadata

Metadata

Assignees

No one assigned

    Labels

    GoodAn issue passed all checks.modelA model problem to be implemented.

    Type

    No type

    Projects

    Status

    Done

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions