[Bug] Time field for record-level expire may block compaction #4724

@luowanghaoyun

Description

Search before asking

  • I searched in the issues and found nothing similar.

Paimon version

0.9.0

Compute Engine

Flink

Minimal reproduce step

CREATE TABLE T (pk INT, a INT, ts INT)
WITH (
  'bucket' = '1',
  'primary-key' = 'pk',
  'full-compaction.delta-commits' = '1',
  'record-level.expire-time' = '1s',
  'record-level.time-field' = 'ts'
);

-- batch SQL 1: no compaction yet, writes "dirty data" (null time field)
INSERT INTO T VALUES (1, 1, CAST(NULL AS INT));

-- batch SQL 2: triggers compaction
INSERT INTO T VALUES (2, 2, 2);    -- ERROR: Time field for record-level expire should not be null.

What doesn't meet your expectations?

When a row of dirty data (a record whose record-level expire time field is null) is written into the L0 files, the subsequent compaction fails. I think this is dangerous: if there is no snapshot to roll back to, the table can no longer be compacted.
So should there be stricter constraints on this time field? For example, when creating a table, require that the type of this field is not nullable, as sketched below.
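
A minimal sketch of that constraint, reusing the repro schema above. Note that the rejection here relies on Flink's default NOT NULL sink enforcement (table.exec.sink.not-null-enforcer = 'ERROR'), not on anything Paimon checks at table-creation time, which is what this issue proposes adding:

-- Same table as the repro, but the time field is declared NOT NULL.
CREATE TABLE T (pk INT, a INT, ts INT NOT NULL)
WITH (
  'bucket' = '1',
  'primary-key' = 'pk',
  'full-compaction.delta-commits' = '1',
  'record-level.expire-time' = '1s',
  'record-level.time-field' = 'ts'
);

-- Flink's NOT NULL enforcer rejects this insert at write time,
-- so the dirty row never reaches the L0 files and compaction stays healthy:
INSERT INTO T VALUES (1, 1, CAST(NULL AS INT));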

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!

Labels

bug (Something isn't working)
