Skip to content

[Rule] MinimumVertexCover to ShortestCommonSupersequence #427

@isPANN

Description

@isPANN

Source: MinimumVertexCover
Target: ShortestCommonSupersequence
Motivation: Establishes NP-completeness of SHORTEST COMMON SUPERSEQUENCE via polynomial-time reduction from VERTEX COVER. The SCS problem asks for the shortest string containing each input string as a subsequence. Maier (1978) showed this is NP-complete even for alphabets of size 5 by encoding the "at least one endpoint" constraint of vertex cover through subsequence containment requirements.

Reference: Garey & Johnson, Computers and Intractability, SR8, p.228. [Maier, 1978].

GJ Source Entry

[SR8] SHORTEST COMMON SUPERSEQUENCE
INSTANCE: Finite alphabet Σ, finite set R of strings from Σ*, and a positive integer K.
QUESTION: Is there a string w ∈ Σ* with |w| ≤ K such that each string x ∈ R is a subsequence of w?
Reference: [Maier, 1978]. Transformation from VERTEX COVER.
Comment: Remains NP-complete even if |Σ| = 5. Solvable in polynomial time if |R| = 2 (by first computing the longest common subsequence) or if all x ∈ R have |x| ≤ 2.

Reduction Algorithm

Summary:
Given a VERTEX COVER instance G = (V, E) with V = {v_1, ..., v_n}, E = {e_1, ..., e_m}, and integer K, construct a SHORTEST COMMON SUPERSEQUENCE instance as follows (based on Maier's 1978 construction):

  1. Alphabet: Σ = {v_1, v_2, ..., v_n} ∪ {#} where # is a separator symbol not in V. The alphabet has |V| + 1 symbols. (For the fixed-alphabet variant with |Σ| = 5, a further encoding step is applied.)

  2. String construction: For each edge e_j = {v_a, v_b} (with a < b), create the string:
    s_j = v_a · v_b
    This string of length 2 encodes the constraint that in any supersequence, the symbols v_a and v_b must both appear (at least one needs to be "shared" across edges).

  3. Vertex-ordering string: Create a "backbone" string:
    T = v_1 · v_2 · ... · v_n
    This ensures the supersequence respects the vertex ordering.

  4. Additional constraint strings: For each pair of adjacent vertices in an edge, separator-delimited strings enforce that the vertex symbols appear in specific positions. The full construction uses the separator # to create blocks so that the supersequence can be divided into n blocks, where each block corresponds to a vertex. A vertex is "selected" (in the cover) if its block contains the vertex symbol plus extra copies needed by incident edges; a vertex not in the cover has its symbol appear only once.

  5. Bound: K' = n + m - K, where n = |V|, m = |E|, K = vertex cover size bound. (The exact formula depends on the padding used in the construction.)

  6. Correctness (forward): If G has a vertex cover S of size ≤ K, the supersequence is constructed by placing all vertex symbols in order, and for each edge e = {v_a, v_b}, the subsequence v_a · v_b is embedded by having both symbols present. Because S covers all edges, at most K vertices carry extra "load," keeping the total length within K'.

  7. Correctness (reverse): If a supersequence w of length ≤ K' exists, the vertex symbols that appear in positions accommodating multiple edge-strings correspond to a vertex cover of G with size ≤ K.

Key insight: Subsequence containment allows encoding the "at least one endpoint must be selected" constraint. The supersequence must "schedule" vertex symbols so that every edge-string is a subsequence, and minimizing the supersequence length corresponds to minimizing the vertex cover.

Time complexity of reduction: O(n + m) to construct the instance.

Size Overhead

Symbols:

  • n = num_vertices of source VertexCover instance (|V|)
  • m = num_edges of source VertexCover instance (|E|)
  • K = vertex cover bound
Target metric (code name) Polynomial (using symbols above)
alphabet_size num_vertices + 1
num_strings num_edges + 1
max_string_length num_vertices
bound_K num_vertices + num_edges - cover_bound

Derivation: One symbol per vertex plus separator; one string per edge plus one backbone string; bound relates linearly to n, m, and K.

Validation Method

  • Closed-loop test: reduce a MinimumVertexCover instance to ShortestCommonSupersequence, solve target with BruteForce (enumerate candidate supersequences up to length K'), extract solution, verify on source
  • Test with known YES instance: triangle graph K_3, vertex cover of size 2
  • Test with known NO instance: star graph K_{1,5}, vertex cover must include center vertex
  • Verify that every constructed edge-string is indeed a subsequence of the constructed supersequence
  • Compare with known results from literature

Example

Source instance (MinimumVertexCover):
Graph G with 6 vertices V = {v_1, v_2, v_3, v_4, v_5, v_6} and 7 edges:

  • Edges: {v_1,v_2}, {v_1,v_3}, {v_2,v_3}, {v_3,v_4}, {v_4,v_5}, {v_4,v_6}, {v_5,v_6}
  • (Triangle v_1-v_2-v_3 connected to triangle v_4-v_5-v_6 via edge {v_3,v_4})
  • Vertex cover of size K = 3: {v_2, v_3, v_4} covers all edges. Check:
    • {v_1,v_2}: v_2 ✓; {v_1,v_3}: v_3 ✓; {v_2,v_3}: v_2 ✓; {v_3,v_4}: v_3 ✓; {v_4,v_5}: v_4 ✓; {v_4,v_6}: v_4 ✓; {v_5,v_6}: needs v_5 or v_6 -- FAIL.
  • Correct cover of size K = 4: {v_1, v_3, v_4, v_6} covers all edges:
    • {v_1,v_2}: v_1 ✓; {v_1,v_3}: v_1 ✓; {v_2,v_3}: v_3 ✓; {v_3,v_4}: v_3 ✓; {v_4,v_5}: v_4 ✓; {v_4,v_6}: v_4 ✓; {v_5,v_6}: v_6 ✓.

Constructed target instance (ShortestCommonSupersequence):

  • Alphabet: Σ = {v_1, v_2, v_3, v_4, v_5, v_6, #}
  • Strings (one per edge): R = {v_1v_2, v_1v_3, v_2v_3, v_3v_4, v_4v_5, v_4v_6, v_5v_6}
  • Backbone string: T = v_1v_2v_3v_4v_5v_6
  • All strings in R must be subsequences of the supersequence w

Solution mapping:

  • The supersequence w = v_1v_2v_3v_4v_5v_6 of length 6 already contains every 2-symbol edge-string as a subsequence (since vertex symbols appear in order). The optimal SCS length relates to how many vertex symbols can be "shared" across edges.
  • The vertex cover {v_1, v_3, v_4, v_6} identifies which vertices serve as shared anchors in the supersequence.

Verification:

  • Each edge-string v_av_b (a < b) is a subsequence of v_1v_2v_3v_4v_5v_6 ✓
  • The solution length relates to the vertex cover size through the reduction formula

References

  • [Maier, 1978]: [Maier1978] David Maier (1978). "The complexity of some problems on subsequences and supersequences". Journal of the Association for Computing Machinery 25(2), pp. 322-336.
  • [Räihä and Ukkonen, 1981]: K. J. Räihä and E. Ukkonen (1981). "The shortest common supersequence problem over binary alphabet is NP-complete". Theoretical Computer Science 16(2), pp. 187-198.
  • [Lagoutte and Tavenas, 2017]: Aurélie Lagoutte and Sébastien Tavenas (2017). "The complexity of Shortest Common Supersequence for inputs with no identical consecutive letters". arXiv:1309.0422.

Metadata

Metadata

Assignees

No one assigned

    Labels

    NeedsOriginalPaperIssue requires consulting the original paper (paywalled/unavailable)PoorWrittenruleA new reduction rule to be added.

    Type

    No type

    Projects

    Status

    OnHold

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions