Source: LongestCommonSubsequence
Target: ILP
Motivation: Enables solving LCS instances via ILP solvers; the match-pair formulation naturally encodes subsequence ordering as linear constraints.
Reference: Blum et al., 2021 — discusses ILP formulation with match-pair binary variables and conflict/no-crossing constraints for LCS (Section 2, ILP model); the match-pair ILP formulation for sequence problems is also described in Althaus et al., 2006 in the context of sequence alignment.
Reduction Algorithm
Notation:
- Source: $k = 2$ strings $s_1$ (length $n_1$) and $s_2$ (length $n_2$) over alphabet $\Sigma$
- Target: ILP with binary variables, linear constraints, and a maximize objective
Scope: This reduction handles the 2-string LCS case only.
Variable mapping:
For each pair of positions $(j_1, j_2)$ where $s_1[j_1] = s_2[j_2]$, create a binary variable:
$$m_{j_1, j_2} \in {0, 1}$$
meaning "position $j_1$ of $s_1$ is matched to position $j_2$ of $s_2$ in the common subsequence."
Total variables: $|M|$ where $M = {(j_1, j_2) : s_1[j_1] = s_2[j_2]}$, bounded by $n_1 \cdot n_2$.
Constraints:
-
Each position in $s_1$ matched at most once: for each $j_1 \in {0, \ldots, n_1-1}$,
$$\sum_{j_2 : (j_1,j_2) \in M} m_{j_1, j_2} \leq 1$$
-
Each position in $s_2$ matched at most once: for each $j_2 \in {0, \ldots, n_2-1}$,
$$\sum_{j_1 : (j_1,j_2) \in M} m_{j_1, j_2} \leq 1$$
-
Order preservation (no crossings): for all $(j_1, j_2), (j_1', j_2') \in M$ with $j_1 < j_1'$ and $j_2 > j_2'$,
$$m_{j_1, j_2} + m_{j_1', j_2'} \leq 1$$
Objective: maximize $\sum_{(j_1,j_2) \in M} m_{j_1, j_2}$
Solution extraction: The matched positions with $m_{j_1,j_2} = 1$, sorted by $j_1$, give the characters of the LCS.
Size Overhead
Worst-case bounds (all characters match, i.e., $|M| = n_1 \cdot n_2$):
| Target metric (code name) |
Worst-case bound |
num_vars |
$n_1 \cdot n_2$ |
num_constraints |
$n_1 + n_2 + \binom{n_1 \cdot n_2}{2}$ |
In overhead expression syntax (using source problem getters):
num_vars = num_chars_first * num_chars_second
num_constraints = num_chars_first + num_chars_second + (num_chars_first * num_chars_second) ^ 2
Note: the actual number of variables and crossing constraints depends on the input content (character matches), so these are upper bounds. The crossing constraint count $\binom{|M|}{2}$ is $O(n_1^2 \cdot n_2^2)$ in the worst case.
Validation Method
Closed-loop testing: solve LCS by brute-force, solve the reduced ILP, and verify that both give the same optimal objective. Test with varying alphabet sizes and string lengths, including edge cases (identical strings, no common characters, single-character alphabet).
Example
Source instance: $s_1$ = ABAC, $s_2$ = BACA, alphabet $\Sigma = {A, B, C}$.
Variables (6 match pairs):
| Variable |
$s_1$ pos |
$s_2$ pos |
Character |
| $m_{0,1}$ |
0 |
1 |
A |
| $m_{0,3}$ |
0 |
3 |
A |
| $m_{1,0}$ |
1 |
0 |
B |
| $m_{2,1}$ |
2 |
1 |
A |
| $m_{2,3}$ |
2 |
3 |
A |
| $m_{3,2}$ |
3 |
2 |
C |
Constraints (13 total):
Assignment for $s_1$: $m_{0,1} + m_{0,3} \leq 1$, $m_{2,1} + m_{2,3} \leq 1$ (plus 2 trivial single-variable constraints).
Assignment for $s_2$: $m_{0,1} + m_{2,1} \leq 1$, $m_{0,3} + m_{2,3} \leq 1$ (plus 2 trivial).
No-crossing: $m_{0,1} + m_{1,0} \leq 1$, $m_{0,3} + m_{1,0} \leq 1$, $m_{0,3} + m_{2,1} \leq 1$, $m_{0,3} + m_{3,2} \leq 1$, $m_{2,3} + m_{3,2} \leq 1$.
Objective: maximize $m_{0,1} + m_{0,3} + m_{1,0} + m_{2,1} + m_{2,3} + m_{3,2}$
Optimal ILP solution: objective = 3, with $m_{1,0} = m_{2,1} = m_{3,2} = 1$ (all others 0).
Extracted LCS: $s_1[1] = s_2[0]$ = B, $s_1[2] = s_2[1]$ = A, $s_1[3] = s_2[2]$ = C → BAC (length 3), matching the brute-force LCS.
Source: LongestCommonSubsequence
Target: ILP
Motivation: Enables solving LCS instances via ILP solvers; the match-pair formulation naturally encodes subsequence ordering as linear constraints.
Reference: Blum et al., 2021 — discusses ILP formulation with match-pair binary variables and conflict/no-crossing constraints for LCS (Section 2, ILP model); the match-pair ILP formulation for sequence problems is also described in Althaus et al., 2006 in the context of sequence alignment.
Reduction Algorithm
Notation:
Scope: This reduction handles the 2-string LCS case only.
Variable mapping:
For each pair of positions$(j_1, j_2)$ where $s_1[j_1] = s_2[j_2]$ , create a binary variable:
meaning "position$j_1$ of $s_1$ is matched to position $j_2$ of $s_2$ in the common subsequence."
Total variables:$|M|$ where $M = {(j_1, j_2) : s_1[j_1] = s_2[j_2]}$ , bounded by $n_1 \cdot n_2$ .
Constraints:
Each position in$s_1$ matched at most once: for each $j_1 \in {0, \ldots, n_1-1}$ ,
$$\sum_{j_2 : (j_1,j_2) \in M} m_{j_1, j_2} \leq 1$$
Each position in$s_2$ matched at most once: for each $j_2 \in {0, \ldots, n_2-1}$ ,
$$\sum_{j_1 : (j_1,j_2) \in M} m_{j_1, j_2} \leq 1$$
Order preservation (no crossings): for all$(j_1, j_2), (j_1', j_2') \in M$ with $j_1 < j_1'$ and $j_2 > j_2'$ ,
$$m_{j_1, j_2} + m_{j_1', j_2'} \leq 1$$
Objective: maximize$\sum_{(j_1,j_2) \in M} m_{j_1, j_2}$
Solution extraction: The matched positions with$m_{j_1,j_2} = 1$ , sorted by $j_1$ , give the characters of the LCS.
Size Overhead
Worst-case bounds (all characters match, i.e.,$|M| = n_1 \cdot n_2$ ):
num_varsnum_constraintsIn overhead expression syntax (using source problem getters):
num_vars = num_chars_first * num_chars_secondnum_constraints = num_chars_first + num_chars_second + (num_chars_first * num_chars_second) ^ 2Note: the actual number of variables and crossing constraints depends on the input content (character matches), so these are upper bounds. The crossing constraint count$\binom{|M|}{2}$ is $O(n_1^2 \cdot n_2^2)$ in the worst case.
Validation Method
Closed-loop testing: solve LCS by brute-force, solve the reduced ILP, and verify that both give the same optimal objective. Test with varying alphabet sizes and string lengths, including edge cases (identical strings, no common characters, single-character alphabet).
Example
Source instance:$s_1$ = $s_2$ = $\Sigma = {A, B, C}$ .
ABAC,BACA, alphabetVariables (6 match pairs):
Constraints (13 total):
Assignment for$s_1$ : $m_{0,1} + m_{0,3} \leq 1$ , $m_{2,1} + m_{2,3} \leq 1$ (plus 2 trivial single-variable constraints).
Assignment for$s_2$ : $m_{0,1} + m_{2,1} \leq 1$ , $m_{0,3} + m_{2,3} \leq 1$ (plus 2 trivial).
No-crossing:$m_{0,1} + m_{1,0} \leq 1$ , $m_{0,3} + m_{1,0} \leq 1$ , $m_{0,3} + m_{2,1} \leq 1$ , $m_{0,3} + m_{3,2} \leq 1$ , $m_{2,3} + m_{3,2} \leq 1$ .
Objective: maximize$m_{0,1} + m_{0,3} + m_{1,0} + m_{2,1} + m_{2,3} + m_{3,2}$
Optimal ILP solution: objective = 3, with$m_{1,0} = m_{2,1} = m_{3,2} = 1$ (all others 0).
Extracted LCS:$s_1[1] = s_2[0]$ = B, $s_1[2] = s_2[1]$ = A, $s_1[3] = s_2[2]$ = C → BAC (length 3), matching the brute-force LCS.