[Rule] LongestCommonSubsequence to ILP

**Source:** LongestCommonSubsequence
**Target:** ILP
**Motivation:** Enables solving LCS instances via ILP solvers; the match-pair formulation naturally encodes subsequence ordering as linear constraints.
**Reference:** [Blum et al., 2021](https://doi.org/10.1016/j.cor.2020.105089) — discusses ILP formulation with match-pair binary variables and conflict/no-crossing constraints for LCS (Section 2, ILP model); the match-pair ILP formulation for sequence problems is also described in [Althaus et al., 2006](https://doi.org/10.1007/s10107-005-0659-3) in the context of sequence alignment.

## Reduction Algorithm

**Notation:**
- Source: $k = 2$ strings $s_1$ (length $n_1$) and $s_2$ (length $n_2$) over alphabet $\Sigma$
- Target: ILP with binary variables, linear constraints, and a maximize objective

**Scope:** This reduction handles the 2-string LCS case only.

**Variable mapping:**

For each pair of positions $(j_1, j_2)$ where $s_1[j_1] = s_2[j_2]$, create a binary variable:

$$m_{j_1, j_2} \in \{0, 1\}$$

meaning "position $j_1$ of $s_1$ is matched to position $j_2$ of $s_2$ in the common subsequence."

Total variables: $|M|$ where $M = \{(j_1, j_2) : s_1[j_1] = s_2[j_2]\}$, bounded by $n_1 \cdot n_2$.

**Constraints:**

1. **Each position in $s_1$ matched at most once:** for each $j_1 \in \{0, \ldots, n_1-1\}$,
$$\sum_{j_2 : (j_1,j_2) \in M} m_{j_1, j_2} \leq 1$$

2. **Each position in $s_2$ matched at most once:** for each $j_2 \in \{0, \ldots, n_2-1\}$,
$$\sum_{j_1 : (j_1,j_2) \in M} m_{j_1, j_2} \leq 1$$

3. **Order preservation (no crossings):** for all $(j_1, j_2), (j_1', j_2') \in M$ with $j_1 < j_1'$ and $j_2 > j_2'$,
$$m_{j_1, j_2} + m_{j_1', j_2'} \leq 1$$

**Objective:** maximize $\sum_{(j_1,j_2) \in M} m_{j_1, j_2}$

**Solution extraction:** The matched positions with $m_{j_1,j_2} = 1$, sorted by $j_1$, give the characters of the LCS.

## Size Overhead

Worst-case bounds (all characters match, i.e., $|M| = n_1 \cdot n_2$):

| Target metric (code name) | Worst-case bound |
|---|---|
| `num_vars` | $n_1 \cdot n_2$ |
| `num_constraints` | $n_1 + n_2 + \binom{n_1 \cdot n_2}{2}$ |

In overhead expression syntax (using source problem getters):
- `num_vars = num_chars_first * num_chars_second`
- `num_constraints = num_chars_first + num_chars_second + (num_chars_first * num_chars_second) ^ 2`

Note: the actual number of variables and crossing constraints depends on the input content (character matches), so these are upper bounds. The crossing constraint count $\binom{|M|}{2}$ is $O(n_1^2 \cdot n_2^2)$ in the worst case.

## Validation Method

Closed-loop testing: solve LCS by brute-force, solve the reduced ILP, and verify that both give the same optimal objective. Test with varying alphabet sizes and string lengths, including edge cases (identical strings, no common characters, single-character alphabet).

## Example

**Source instance:** $s_1$ = `ABAC`, $s_2$ = `BACA`, alphabet $\Sigma = \{A, B, C\}$.

**Variables** (6 match pairs):

| Variable | $s_1$ pos | $s_2$ pos | Character |
|:---:|:---:|:---:|:---:|
| $m_{0,1}$ | 0 | 1 | A |
| $m_{0,3}$ | 0 | 3 | A |
| $m_{1,0}$ | 1 | 0 | B |
| $m_{2,1}$ | 2 | 1 | A |
| $m_{2,3}$ | 2 | 3 | A |
| $m_{3,2}$ | 3 | 2 | C |

**Constraints** (13 total):

Assignment for $s_1$: $m_{0,1} + m_{0,3} \leq 1$, $m_{2,1} + m_{2,3} \leq 1$ (plus 2 trivial single-variable constraints).

Assignment for $s_2$: $m_{0,1} + m_{2,1} \leq 1$, $m_{0,3} + m_{2,3} \leq 1$ (plus 2 trivial).

No-crossing: $m_{0,1} + m_{1,0} \leq 1$, $m_{0,3} + m_{1,0} \leq 1$, $m_{0,3} + m_{2,1} \leq 1$, $m_{0,3} + m_{3,2} \leq 1$, $m_{2,3} + m_{3,2} \leq 1$.

**Objective:** maximize $m_{0,1} + m_{0,3} + m_{1,0} + m_{2,1} + m_{2,3} + m_{3,2}$

**Optimal ILP solution:** objective = 3, with $m_{1,0} = m_{2,1} = m_{3,2} = 1$ (all others 0).

Extracted LCS: $s_1[1] = s_2[0]$ = B, $s_1[2] = s_2[1]$ = A, $s_1[3] = s_2[2]$ = C → **BAC** (length 3), matching the brute-force LCS.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Rule] LongestCommonSubsequence to ILP #110

Reduction Algorithm

Size Overhead

Validation Method

Example

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Target metric (code name)	Worst-case bound
`num_vars`	$n_1 \cdot n_2$
`num_constraints`	$n_1 + n_2 + \binom{n_1 \cdot n_2}{2}$

Variable	$s_1$ pos	$s_2$ pos	Character
$m_{0,1}$	0	1	A
$m_{0,3}$	0	3	A
$m_{1,0}$	1	0	B
$m_{2,1}$	2	1	A
$m_{2,3}$	2	3	A
$m_{3,2}$	3	2	C

[Rule] LongestCommonSubsequence to ILP #110

Description

Reduction Algorithm

Size Overhead

Validation Method

Example

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions