Panic decoding primitive miniblock dictionary page (2.0.1): unreachable in primitive.rs:1305 #5994

@kszlim

Summary

lance==2.0.1 can panic while decoding a single primitive timestamp column when metadata forces
structural-encoding=miniblock + compression=zstd.

Panic site:

  • lance-encoding/src/encodings/logical/primitive.rs:1305
  • unreachable!: Mini-block dictionary encoding must use Variable, Flat, or General compression

Python surface error:

  • ArrowInvalid: External error: RuntimeError: Task was aborted

Versions

  • Python lance==2.0.1
  • Rust crates on the writer side: lance = 2.0.1, lance-encoding = 2.0.1

Repro

import tempfile
from pathlib import Path

import lance
import numpy as np
import pyarrow as pa

rng = np.random.default_rng(6)
runs = np.minimum(rng.geometric(0.45, 500_000), 200)
incs = rng.integers(1, 201, runs.size)
vals = np.repeat(np.cumsum(np.r_[1_586_995_200_000, incs[:-1]]), runs)[:10_000]
# Field metadata that forces miniblock structural encoding with zstd compression.
meta = {b'lance-encoding:structural-encoding': b'miniblock', b'lance-encoding:compression': b'zstd', b'lance-encoding:compression-level': b'3'}
field = pa.field('timestamp', pa.timestamp('ms'), False, metadata=meta)
table = pa.Table.from_arrays([pa.array(vals, type=pa.timestamp('ms'))], schema=pa.schema([field]))
with tempfile.TemporaryDirectory() as d:
    uri = str(Path(d) / 'repro.lance')
    lance.write_dataset(table, uri, mode='create', max_rows_per_file=1_048_576, max_rows_per_group=131_072, data_storage_version='2.2', enable_stable_row_ids=True, enable_v2_manifest_paths=True)
    lance.dataset(uri).to_table(columns=['timestamp'])  # the decode fails during this read
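
The generated column consists of non-decreasing timestamps with short runs of repeated values. A minimal sketch to confirm that shape is below; it only re-creates the input array and needs NumPy, not Lance. Whether the repetition is what sends the writer down a dictionary path is an assumption here, not something this snippet demonstrates.

```python
import numpy as np

# Same generator and seed as the repro script.
rng = np.random.default_rng(6)
runs = np.minimum(rng.geometric(0.45, 500_000), 200)
incs = rng.integers(1, 201, runs.size)
vals = np.repeat(np.cumsum(np.r_[1_586_995_200_000, incs[:-1]]), runs)[:10_000]

# Count distinct timestamps versus total rows.
n_unique = np.unique(vals).size
print(f"{n_unique} distinct values in {vals.size} rows")
```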

Observed

  • The repro fails consistently: a background thread panics and Python raises ArrowInvalid.

Expected

  • No panic in reader threads.
  • Either successful decode or a graceful validation error.

Additional data point

Setting LANCE_ENCODING_DICT_TOO_SMALL=99999999 before write avoids this failure in my environment,
which suggests a dictionary/miniblock interaction in this path.
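
For anyone who wants to try the same workaround from Python, here is a minimal sketch that sets the variable only around the write. The helper name `dict_too_small_override` is hypothetical, and the sketch assumes Lance reads the variable at write time; if it is read once at startup, export it in the shell before launching Python instead.

```python
import os
from contextlib import contextmanager

ENV_NAME = "LANCE_ENCODING_DICT_TOO_SMALL"


@contextmanager
def dict_too_small_override(value: str = "99999999"):
    """Temporarily set the env var, restoring the previous state on exit."""
    previous = os.environ.get(ENV_NAME)
    os.environ[ENV_NAME] = value
    try:
        yield
    finally:
        if previous is None:
            os.environ.pop(ENV_NAME, None)
        else:
            os.environ[ENV_NAME] = previous


# Intended use (not executed here, requires lance):
# with dict_too_small_override():
#     lance.write_dataset(table, uri, mode='create', data_storage_version='2.2')
```

Scoping the override to the write keeps the setting from leaking into unrelated writes in the same process.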

Labels

critical-fix: Bugs that cause crashes, security vulnerabilities, or incorrect data.