You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Source: 3SAT Target: Consistency of Database Frequency Tables Motivation: Establishes NP-completeness of Consistency of Database Frequency Tables via polynomial-time reduction from 3SAT. This result has practical implications for statistical database security: it shows that no polynomial-time algorithm can determine whether a set of published frequency tables can be used to "compromise" a database by deducing specific attribute values of individual records, unless P = NP. The reduction encodes Boolean variables as attribute values and clauses as frequency table constraints, so that satisfying all clauses corresponds to finding an assignment of attribute values consistent with the given frequency tables.
Reference: Garey & Johnson, Computers and Intractability, Appendix A4.3, p.235
GJ Source Entry
[SR35] CONSISTENCY OF DATABASE FREQUENCY TABLES
INSTANCE: Set A of attribute names, domain set D_a for each a E A, set V of objects, collection F of frequency tables for some pairs a,b E A (where a frequency table for a,b E A is a function f_{a,b}: D_a × D_b → Z+ with the sum, over all pairs x E D_a and y E D_b, of f_{a,b}(x,y) equal to |V|), and a set K of triples (v,a,x) with v E V, a E A, and x E D_a, representing the known attribute values.
QUESTION: Are the frequency tables in F consistent with the known attribute values in K, i.e., is there a collection of functions g_a: V → D_a, for each a E A, such that g_a(v) = x if (v,a,x) E K and such that, for each f_{a,b} E F, x E D_a, and y E D_b, the number of v E V for which g_a(v) = x and g_b(v) = y is exactly f_{a,b}(x,y)?
Reference: [Reiss, 1977b]. Transformation from 3SAT.
Comment: Above result implies that no polynomial time algorithm can be given for "compromising" a data base from its frequency tables by deducing prespecified attribute values, unless P = NP (see reference for details).
Reduction Algorithm
Summary:
Given a 3SAT instance with variables x_1, ..., x_n and clauses C_1, ..., C_m (each clause having exactly 3 literals), construct a Consistency of Database Frequency Tables instance as follows:
Object construction: Create one object v_i for each variable x_i in the 3SAT formula. Thus |V| = n (the number of variables).
Attribute construction for variables: Create one attribute a_i for each variable x_i, with domain D_{a_i} = {T, F} (representing True and False). The assignment g_{a_i}(v_i) encodes the truth value of variable x_i.
Attribute construction for clauses: For each clause C_j = (l_{j1} ∨ l_{j2} ∨ l_{j3}), create an attribute b_j with domain D_{b_j} = {1, 2, 3, ..., 7} representing which of the 7 satisfying truth assignments for the 3 literals in C_j is realized. (There are 2^3 - 1 = 7 ways to satisfy a 3-literal clause.)
Frequency table construction: For each clause C_j involving variables x_{p}, x_{q}, x_{r}:
Create frequency tables f_{a_p, b_j}, f_{a_q, b_j}, and f_{a_r, b_j} that encode the relationship between the truth value of each variable and the satisfying assignment chosen for clause C_j.
The frequency table f_{a_p, b_j}(T, k) = 1 if the k-th satisfying assignment of C_j has x_p = True, and 0 otherwise (similarly for F). These tables enforce that the attribute value of object v_p (the truth value of x_p) is consistent with the satisfying assignment chosen for clause C_j.
Known attribute values (K): The set K is initially empty (no attribute values are pre-specified), or may contain specific triples to encode unit propagation constraints.
Marginal consistency constraints: Additional frequency tables between variable-attributes a_p and a_q for variables appearing together in clauses enforce that each object v_i has a unique, globally consistent truth value.
Solution extraction: The frequency tables in F are consistent with K if and only if there exists an assignment of truth values to x_1, ..., x_n that satisfies all clauses. A consistent set of functions g_a corresponds directly to a satisfying assignment.
Key invariant: Each object represents a Boolean variable, each variable-attribute encodes {T, F}, and the frequency tables between variable-attributes and clause-attributes ensure that every clause has at least one true literal — which is exactly the 3SAT satisfiability condition.
Size Overhead
Symbols:
n = number of variables in the 3SAT instance
m = number of clauses in the 3SAT instance
Target metric (code name)
Polynomial (using symbols above)
num_objects
num_variables
num_attributes
num_variables + num_clauses
num_frequency_tables
3 * num_clauses
Derivation:
Objects: one per Boolean variable -> |V| = n
Attributes: one per variable (domain {T, F}) plus one per clause (domain {1,...,7}) -> |A| = n + m
Frequency tables: 3 tables per clause (one for each literal's variable paired with the clause attribute) -> |F| = 3m
Domain sizes: variable attributes have |D| = 2; clause attributes have |D| <= 7
Known values: |K| = O(n) at most (possibly empty)
Validation Method
Closed-loop test: reduce a 3SAT instance to a Consistency of Database Frequency Tables instance, solve the consistency problem by brute-force enumeration of all possible attribute-value assignments, extract the truth assignment, and verify it satisfies all original clauses
Check that the number of objects, attributes, and frequency tables matches the overhead formula
Test with a 3SAT instance that is satisfiable and verify that at least one consistent assignment exists
Test with an unsatisfiable 3SAT instance and verify that no consistent assignment exists
Verify that frequency table marginals sum to |V| as required by the problem definition
Constructed target instance (Consistency of Database Frequency Tables):
Objects V = {v_1, v_2, v_3, v_4, v_5, v_6} (6 objects, one per variable)
Attributes A:
Source: 3SAT
Target: Consistency of Database Frequency Tables
Motivation: Establishes NP-completeness of Consistency of Database Frequency Tables via polynomial-time reduction from 3SAT. This result has practical implications for statistical database security: it shows that no polynomial-time algorithm can determine whether a set of published frequency tables can be used to "compromise" a database by deducing specific attribute values of individual records, unless P = NP. The reduction encodes Boolean variables as attribute values and clauses as frequency table constraints, so that satisfying all clauses corresponds to finding an assignment of attribute values consistent with the given frequency tables.
Reference: Garey & Johnson, Computers and Intractability, Appendix A4.3, p.235
GJ Source Entry
Reduction Algorithm
Summary:
Given a 3SAT instance with variables x_1, ..., x_n and clauses C_1, ..., C_m (each clause having exactly 3 literals), construct a Consistency of Database Frequency Tables instance as follows:
Object construction: Create one object v_i for each variable x_i in the 3SAT formula. Thus |V| = n (the number of variables).
Attribute construction for variables: Create one attribute a_i for each variable x_i, with domain D_{a_i} = {T, F} (representing True and False). The assignment g_{a_i}(v_i) encodes the truth value of variable x_i.
Attribute construction for clauses: For each clause C_j = (l_{j1} ∨ l_{j2} ∨ l_{j3}), create an attribute b_j with domain D_{b_j} = {1, 2, 3, ..., 7} representing which of the 7 satisfying truth assignments for the 3 literals in C_j is realized. (There are 2^3 - 1 = 7 ways to satisfy a 3-literal clause.)
Frequency table construction: For each clause C_j involving variables x_{p}, x_{q}, x_{r}:
Known attribute values (K): The set K is initially empty (no attribute values are pre-specified), or may contain specific triples to encode unit propagation constraints.
Marginal consistency constraints: Additional frequency tables between variable-attributes a_p and a_q for variables appearing together in clauses enforce that each object v_i has a unique, globally consistent truth value.
Solution extraction: The frequency tables in F are consistent with K if and only if there exists an assignment of truth values to x_1, ..., x_n that satisfies all clauses. A consistent set of functions g_a corresponds directly to a satisfying assignment.
Key invariant: Each object represents a Boolean variable, each variable-attribute encodes {T, F}, and the frequency tables between variable-attributes and clause-attributes ensure that every clause has at least one true literal — which is exactly the 3SAT satisfiability condition.
Size Overhead
Symbols:
num_objectsnum_variablesnum_attributesnum_variables + num_clausesnum_frequency_tables3 * num_clausesDerivation:
Validation Method
Example
Source instance (3SAT):
Variables: x_1, x_2, x_3, x_4, x_5, x_6
Clauses (7 clauses):
Satisfying assignment: x_1=T, x_2=T, x_3=F, x_4=T, x_5=F, x_6=T
Constructed target instance (Consistency of Database Frequency Tables):
Objects V = {v_1, v_2, v_3, v_4, v_5, v_6} (6 objects, one per variable)
Attributes A:
Total: 13 attributes
Frequency tables F (21 tables, 3 per clause):
Example frequency table f_{a_1, b_1} (for variable x_1 in clause C_1 = (x_1 ∨ x_2 ∨ x_3)):
The 7 satisfying assignments of (x_1 ∨ x_2 ∨ x_3) are:
1: (T,T,T), 2: (T,T,F), 3: (T,F,T), 4: (T,F,F), 5: (F,T,T), 6: (F,T,F), 7: (F,F,T)
(Entries marked * are determined by the assignment; each column sums to the number of objects that realize that satisfying pattern.)
Known values K = {} (empty)
Solution mapping:
References
Reiss1977b] S. P. Reiss (1977). "Statistical database confidentiality". Dept. of Statistics, University of Stockholm.