Skip to content

[Rule] SubsetSum to ClosestVectorProblem #125

@zazabap

Description

@zazabap

Source: SubsetSum
Target: ClosestVectorProblem
Motivation: Embeds the Subset Sum problem into a lattice structure, enabling solving via lattice reduction algorithms (LLL, BKZ). This is a classical connection between number-theoretic problems and lattice problems, used in cryptanalysis of knapsack-based cryptosystems (Merkle–Hellman).
Reference: Lagarias & Odlyzko, "Solving Low-Density Subset Sum Problems", JACM 32(1), 1985; Coster et al., "Improved low-density subset sum algorithms", Computational Complexity 2, 1992

Note: The original Lagarias-Odlyzko paper reduces Subset Sum to SVP by embedding the target sum inside the lattice basis. The CVP formulation with $\tfrac{1}{2}$ target entries follows the CJLOSS construction (Coster, Joux, LaMacchia, Odlyzko, Schnorr, Stern, 1992), adapted from SVP to CVP: the identity block with target $\tfrac{1}{2}$ enforces binary structure, while the sizes row encodes the subset sum constraint.

Reduction Algorithm

Notation:

  • Source: $n$ elements with sizes $s_0, \ldots, s_{n-1}$ and target sum $B$
  • Target: CVP with lattice basis $\mathbf{B} \in \mathbb{Z}^{(n+1) \times n}$ and target vector $\mathbf{t} \in \mathbb{Q}^{n+1}$

Step 1 — Construct lattice basis:

$$\mathbf{B} = \begin{bmatrix} I_n \ \mathbf{s}^T \end{bmatrix} = \begin{bmatrix} 1 & 0 & \cdots & 0 \ 0 & 1 & \cdots & 0 \ \vdots & \vdots & \ddots & \vdots \ 0 & 0 & \cdots & 1 \ s_0 & s_1 & \cdots & s_{n-1} \end{bmatrix}$$

  • Rows $0$ to $n-1$: identity matrix $I_n$ (enforces binary structure via target $\tfrac{1}{2}$)
  • Row $n$: element sizes $\mathbf{s}$

Step 2 — Construct target vector:

$$\mathbf{t} = \left(\tfrac{1}{2}, \tfrac{1}{2}, \ldots, \tfrac{1}{2}, B\right) \in \mathbb{Q}^{n+1}$$

where $B$ is the SubsetSum target (part of the input, not computed).

Step 3 — Set variable bounds:

Each $x_i \in {0, 1}$ (binary bounds).

CVP problem: Find $\mathbf{x} \in \mathbb{Z}^n$ minimizing $|\mathbf{B}\mathbf{x} - \mathbf{t}|^2$.

Why it works:

For any binary vector $\mathbf{x} \in {0, 1}^n$:
$$|\mathbf{B}\mathbf{x} - \mathbf{t}|^2 = \sum_{i=0}^{n-1}(x_i - \tfrac{1}{2})^2 + \left(\sum_{i} s_i x_i - B\right)^2 = \frac{n}{4} + \left(\sum_{i} s_i x_i - B\right)^2$$

  • The identity block contributes a constant $\tfrac{n}{4}$ for all binary vectors (each $(x_i - \tfrac{1}{2})^2 = \tfrac{1}{4}$), and penalizes non-binary values.
  • The size row contributes $(\sum s_i x_i - B)^2 = 0$ if and only if the subset sums to $B$.
  • Therefore, the minimum distance $\tfrac{n}{4}$ is achieved if and only if the SubsetSum instance is satisfiable, and every CVP minimizer corresponds to a satisfying assignment.

Solution extraction: The CVP solution $\mathbf{x}$ directly gives the SubsetSum selection vector. If $x_i = 1$, element $i$ is included.

Size Overhead

Target metric (code name) Formula (using source getters)
ambient_dimension num_elements + 1
num_basis_vectors num_elements

Validation Method

Closed-loop testing: enumerate all $2^n$ binary vectors for SubsetSum, solve the reduced CVP by brute-force, and verify that the CVP minimizers are exactly the satisfying SubsetSum assignments. The minimum squared distance should equal $\tfrac{n}{4}$ when a solution exists. Test edge cases: no solution (minimum distance $> \tfrac{n}{4}$), multiple solutions, single-element subset.

Example

Source instance: $n = 4$ elements, target $B = 11$.

Element Size
0 3
1 7
2 1
3 8

Satisfying assignments: ${0, 3}$ ($3 + 8 = 11$) and ${0, 1, 2}$ ($3 + 7 + 1 = 11$).

CVP formulation:

Basis $\mathbf{B}$ ($5 \times 4$):

col 0 col 1 col 2 col 3
row 0 1 0 0 0
row 1 0 1 0 0
row 2 0 0 1 0
row 3 0 0 0 1
row 4 3 7 1 8

Target: $\mathbf{t} = (0.5,; 0.5,; 0.5,; 0.5,; 11)$

Verification (all 16 binary vectors):

$\mathbf{x}$ Sum Satisfies? $|\mathbf{B}\mathbf{x} - \mathbf{t}|^2$
$(1,0,0,1)$ 11 yes 1.00
$(1,1,1,0)$ 11 yes 1.00
$(0,1,0,1)$ 15 no 17.00
$(1,1,0,0)$ 10 no 2.00
$(0,0,1,1)$ 9 no 5.00
$(1,0,1,0)$ 4 no 50.00
$(0,1,1,0)$ 8 no 10.00
$(1,1,0,1)$ 18 no 50.00
$(0,0,0,1)$ 8 no 10.00
$(1,0,1,1)$ 12 no 2.00
$(0,1,1,1)$ 16 no 26.00
$(1,1,1,1)$ 19 no 65.00
$(1,0,0,0)$ 3 no 65.00
$(0,1,0,0)$ 7 no 17.00
$(0,0,1,0)$ 1 no 101.00
$(0,0,0,0)$ 0 no 122.00

The two CVP minimizers $(1,0,0,1)$ and $(1,1,1,0)$ both achieve distance $1.00 = \tfrac{n}{4} = \tfrac{4}{4}$, and are exactly the two satisfying SubsetSum assignments. All non-satisfying vectors have strictly larger distance.

Metadata

Metadata

Assignees

No one assigned

    Labels

    GoodAn issue passed all checks.ruleA new reduction rule to be added.

    Type

    No type

    Projects

    Status

    Done

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions