Skip to content

feat: implement Best compression level (zstd level 11, btlazy2 strategy) #7

@polaz

Description

@polaz

Summary

CompressionLevel::Best currently triggers unimplemented!(). Level 11 uses binary tree with lazy matching — highest compression ratio in the current enum.

Analysis

C reference implementation (zstd level 11)

  • Strategy: ZSTD_btlazy2 (binary tree + lazy2)
  • Parameters: windowLog=24, chainLog=25, hashLog=24, searchLog=7, minMatch=3, targetLength=128
  • Key file: lib/compress/zstd_lazy.c (ZSTD_insertDUBT1, ZSTD_DUBT_findBestMatch)

Algorithm

  1. Binary search tree — sorted by suffix content for O(log n) match finding
  2. DUBT (Doubly-Unsorted Binary Tree) — insertion marks entries as sorted/unsorted (ZSTD_DUBT_UNSORTED_MARK = 1)
  3. Common-length optimizationcommonLengthSmaller/Larger to skip already-compared bytes during tree traversal
  4. Lazy2 on top — same 2-step look-ahead as level 7 but with btree match finder instead of hash chains

What needs to be implemented

  1. Binary tree data structure — left/right pointers in chain table, sorted insertion
  2. DUBT insertioninsertDUBT1 equivalent with sorted/unsorted mark handling
  3. Tree searchZSTD_DUBT_findBestMatch with common-length caching
  4. Integration with lazy2 — same lazy evaluation but better matches from btree

Acceptance criteria

  • CompressionLevel::Best no longer panics
  • Roundtrip + interop tests pass
  • Compression ratio within 10% of C zstd level 11 on silesia corpus
  • Noticeably better ratio than Better level

Dependencies

Time estimate

4d

Blocked by

Metadata

Metadata

Assignees

No one assigned

    Labels

    P2-mediumMedium priority — important improvementenhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions