Motivation
SET BASIS (P134) from Garey & Johnson, A3 SP7. An NP-complete problem in set/storage theory: given a collection C of subsets of a finite set S, find a minimum-size "basis" B of subsets such that every set in C can be expressed as a union of sets in B. This has applications in data compression, database schema design, and Boolean function minimization -- finding a compact representation of a family of sets. The problem is closely related to set cover but has a fundamentally different structure: instead of covering elements, we must reconstruct exact sets via unions.
Associated reduction rules:
- As target: R74 (VERTEX COVER -> SET BASIS)
Definition
Name: SetBasis
Canonical name: Set Basis Problem; also: Minimum Test Set Basis, Minimum Set Basis
Reference: Garey & Johnson, Computers and Intractability, A3 SP7
Mathematical definition:
INSTANCE: Collection C of subsets of a finite set S, positive integer K ≤ |C|.
QUESTION: Is there a collection B of subsets of S with |B| = K such that, for each c ∈ C, there is a subcollection of B whose union is exactly c?
The optimization version asks: find the minimum K such that a basis B of size K exists.
Variables
- Count: K × |S| binary variables. The K basis elements are each encoded as a binary vector of length |S|, indicating which elements of S are included in that basis set.
- Per-variable domain: binary {0, 1} — whether element j ∈ S belongs to basis set i.
- Meaning: The configuration is a K × |S| binary matrix where row i defines basis element b_i ⊆ S. The assignment is valid (returns true) if every set c ∈ C can be exactly reconstructed as a union of some subcollection of B = {b_1, ..., b_K}.
dims() spec: vec![2; k * universe_size] — K × |S| binary variables, where variables [i*|S| .. (i+1)*|S|) encode basis element b_i.
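A minimal sketch of how a configuration in this encoding might be evaluated. The function names (decode_basis, is_union_of_basis, evaluate) are hypothetical and not taken from the crate. The check relies on one observation: a target c is a union of basis sets exactly when the union of all basis sets contained in c equals c, so no search over subcollections is needed. The sketch assumes universe_size > 0 and does not enforce that the K rows are distinct.

```rust
/// Decode a flat 0/1 configuration into K basis sets (hypothetical helper).
/// Variables [i*n .. (i+1)*n) form the indicator vector of basis element b_i.
fn decode_basis(config: &[usize], k: usize, universe_size: usize) -> Vec<Vec<bool>> {
    assert!(universe_size > 0, "sketch assumes a non-empty ground set");
    assert_eq!(config.len(), k * universe_size);
    config
        .chunks(universe_size)
        .map(|row| row.iter().map(|&bit| bit == 1).collect())
        .collect()
}

/// Is `target` (element indices) a union of some of the basis sets?
/// It suffices to union every basis set contained in the target and compare.
fn is_union_of_basis(target: &[usize], basis: &[Vec<bool>], universe_size: usize) -> bool {
    let mut in_target = vec![false; universe_size];
    for &e in target {
        in_target[e] = true;
    }
    let mut covered = vec![false; universe_size];
    for b in basis {
        // A basis set is usable only if it is a subset of the target.
        let is_subset = (0..universe_size).all(|j| !b[j] || in_target[j]);
        if is_subset {
            for j in 0..universe_size {
                covered[j] |= b[j];
            }
        }
    }
    covered == in_target
}

/// Returns true if every set in the collection can be rebuilt from the decoded basis.
fn evaluate(config: &[usize], k: usize, universe_size: usize, collection: &[Vec<usize>]) -> bool {
    let basis = decode_basis(config, k, universe_size);
    collection
        .iter()
        .all(|c| is_union_of_basis(c, &basis, universe_size))
}
```

Each target is checked in O(K × |S|) time, so evaluating one configuration is polynomial in the instance size.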
Schema (data type)
Type name: SetBasis
Variants: none (pure set-theoretic problem)
| Field | Type | Description |
|---|---|---|
| universe_size | usize | Size of the ground set S |
| collection | Vec<Vec<usize>> | The collection C of target subsets of S (each represented as sorted element indices) |
| k | usize | Maximum allowed basis size K |
Size fields (for overhead expressions): universe_size (= |S|), num_sets (= |C|), basis_size (= K)
Notes:
- This is a satisfaction (decision) problem: Metric = bool, implementing SatisfactionProblem.
- Key getter methods: universe_size() (= |S|), num_sets() (= |C|), basis_size() (= K).
- Basis elements are arbitrary subsets of S (not necessarily members of C).
- Field naming follows MinimumSetCovering conventions (universe_size, not set_size).
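The schema and getters described in this section could look roughly like the following. This is a hedged sketch only: the constructor `new`, the normalization it performs, and the derive list are assumptions, and the SatisfactionProblem trait is not implemented here because its signature is not given in this document.

```rust
/// Hypothetical sketch of the SetBasis instance type; field names follow the schema.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SetBasis {
    universe_size: usize,
    collection: Vec<Vec<usize>>,
    k: usize,
}

impl SetBasis {
    /// Assumed constructor: sorts and deduplicates each target set and
    /// rejects element indices outside 0..universe_size.
    pub fn new(universe_size: usize, mut collection: Vec<Vec<usize>>, k: usize) -> Self {
        for set in &mut collection {
            set.sort_unstable();
            set.dedup();
            assert!(
                set.iter().all(|&e| e < universe_size),
                "element outside universe"
            );
        }
        Self { universe_size, collection, k }
    }

    /// |S|
    pub fn universe_size(&self) -> usize {
        self.universe_size
    }

    /// |C|
    pub fn num_sets(&self) -> usize {
        self.collection.len()
    }

    /// K
    pub fn basis_size(&self) -> usize {
        self.k
    }

    /// One binary variable per (basis element, ground element) pair.
    pub fn dims(&self) -> Vec<usize> {
        vec![2; self.k * self.universe_size]
    }
}
```

For example, `SetBasis::new(4, vec![vec![0, 1], vec![1, 2]], 3).dims()` has length 12.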
Complexity
- Decision complexity: NP-complete (Stockmeyer, 1975; transformation from VERTEX COVER). Remains NP-complete when all c ∈ C have |c| ≤ 3, but is trivial when all |c| ≤ 2.
- Best known exact algorithm: No specialized exact exponential algorithm is known. The problem can be formulated as an ILP, and constraint programming or SAT solvers are natural choices for practical instances.
- Complexity string: "2^(basis_size * universe_size)" (brute force over K × |S| binary variables)
declare_variants! guidance:
crate::declare_variants! {
SetBasis => "2^(basis_size * universe_size)",
}
- References:
- L. J. Stockmeyer (1975). "The Set Basis Problem is NP-Complete." IBM Research Report RC 5431, IBM Research Center, Yorktown Heights, NY.
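The brute-force bound above can be illustrated with a small exhaustive search. This is a sketch for tiny instances only, not the crate's solver: `to_mask` and `brute_force` are hypothetical names, subsets are packed into u64 bitmasks, and the sketch requires K × |S| < 64 and does not enforce that the K basis sets are distinct.

```rust
/// Convert a set given as element indices into a bitmask.
fn to_mask(set: &[usize]) -> u64 {
    set.iter().fold(0u64, |m, &e| m | (1u64 << e))
}

/// Exhaustively try all 2^(k * universe_size) configurations.
/// Returns the basis (as bitmasks) of the first satisfying configuration, if any.
fn brute_force(universe_size: usize, collection: &[Vec<usize>], k: usize) -> Option<Vec<u64>> {
    let bits = k * universe_size;
    assert!(universe_size < 64 && bits < 64, "sketch only handles tiny instances");
    let targets: Vec<u64> = collection.iter().map(|c| to_mask(c)).collect();
    let row_mask = (1u64 << universe_size) - 1;
    for config in 0..(1u64 << bits) {
        // Row i of the K x |S| matrix occupies bits [i*|S| .. (i+1)*|S|).
        let basis: Vec<u64> = (0..k)
            .map(|i| (config >> (i * universe_size)) & row_mask)
            .collect();
        let ok = targets.iter().all(|&t| {
            // The union of all basis sets contained in t must equal t.
            let union = basis
                .iter()
                .filter(|&&b| (b & !t) == 0)
                .fold(0u64, |u, &b| u | b);
            union == t
        });
        if ok {
            return Some(basis);
        }
    }
    None
}
```

With |S| = 4 this enumerates 2^8 = 256 configurations for K = 2 and 2^12 = 4096 for K = 3, which shows how quickly the search space grows.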
Specialization
- This is a special case of: General set representation / compression problems
- Known special cases:
- All sets in C have size ≤ 2: trivial according to Garey & Johnson (a set of size ≤ 2 that is not itself in B can only be the union of two singletons in B)
- All sets in C have size ≤ 3: still NP-complete
- Related problems: SET COVER (select sets from C that cover S, rather than reconstruct each set in C), EXACT COVER (disjoint union), MINIMUM EQUIVALENT EXPRESSION
Extra Remark
Full book text:
INSTANCE: Collection C of subsets of a finite set S, positive integer K ≤ |C|.
QUESTION: Is there a collection B of subsets of S with |B| = K such that, for each c ∈ C, there is a subcollection of B whose union is exactly c?
Reference: [Stockmeyer, 1975]. Transformation from VERTEX COVER.
Comment: Remains NP-complete if all c ∈ C have |c| ≤ 3, but is trivial if all c ∈ C have |c| ≤ 2.
How to solve
Example Instance
Ground set S = {a, b, c, d} (|S| = 4), Collection C of 4 subsets, K = 3:
- c_1 = {a, b}
- c_2 = {b, c}
- c_3 = {a, c}
- c_4 = {a, b, c}
Question: Is there a basis B of 3 subsets of S such that each c_i is a union of sets in B?
Solution: B = { {a}, {b}, {c} } with |B| = 3.
- c_1 = {a, b} = {a} ∪ {b} ✓
- c_2 = {b, c} = {b} ∪ {c} ✓
- c_3 = {a, c} = {a} ∪ {c} ✓
- c_4 = {a, b, c} = {a} ∪ {b} ∪ {c} ✓
Answer for K = 3: YES.
Infeasibility for K = 2: Suppose B = {b_1, b_2}. The only non-empty unions available are b_1, b_2, and b_1 ∪ b_2, so c_1 = {a,b}, c_2 = {b,c}, and c_3 = {a,c} must each equal one of these. Since the three sets are distinct, they must be exactly b_1, b_2, and b_1 ∪ b_2 in some order. But the union of any two of c_1, c_2, c_3 is {a,b,c}, which has three elements and so cannot equal the remaining two-element set. So K = 2 is infeasible.
Answer for K = 2: NO. Minimum basis size = 3.
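The K = 2 result can also be cross-checked by counting: K basis sets have 2^K − 1 non-empty subcollections, so they produce at most 2^K − 1 distinct non-empty unions, and C here contains 4 distinct non-empty sets. The sketch below computes that lower bound. The function name `counting_lower_bound` is hypothetical, and the bound is only necessary, not sufficient, so it is not tight in general.

```rust
use std::collections::BTreeSet;

/// Smallest K with 2^K - 1 >= number of distinct non-empty sets in the collection.
/// This is only a lower bound on the minimum basis size.
fn counting_lower_bound(collection: &[Vec<usize>]) -> usize {
    // Normalize each set so that equal sets compare equal.
    let distinct: BTreeSet<Vec<usize>> = collection
        .iter()
        .filter(|c| !c.is_empty())
        .map(|c| {
            let mut s = c.clone();
            s.sort_unstable();
            s.dedup();
            s
        })
        .collect();
    let mut k = 0usize;
    // (1 << k) - 1 counts the non-empty subcollections of a K-element basis.
    while (1usize << k) - 1 < distinct.len() {
        k += 1;
    }
    k
}
```

For the example collection the bound is 3, matching the minimum basis size. For c_1, c_2, c_3 alone the bound is only 2, although the argument above shows that those three sets already need 3 basis elements.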