Enhance the capabilities of Lance Change Data Feed #5445

@zhangyue19921010

Description

Currently, the Change Data Feed (CDF) lets users specify a begin version and an end version to retrieve the data inserted or updated between them. In practice, users doing incremental reads often have two further requirements:

  1. Limit the read to a time window by specifying a start date and an end date.
  2. Retrieve incremental upsert data (inserts and updates) through a single SQL query.

For the first requirement:

  • Allow users to specify a start timestamp and an end timestamp. Iterate through the versions to find the first version whose timestamp is later than the start timestamp and the last version whose timestamp is earlier than the end timestamp, then use these as the begin version and end version.
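
The lookup described above could be sketched roughly as follows. This is only an illustration, not the proposed implementation: the function name `resolve_version_window` is hypothetical, and it assumes the version history is available as `(version, timestamp)` pairs sorted by ascending timestamp, with both bounds treated as exclusive.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime
from typing import Optional, Sequence, Tuple


def resolve_version_window(
    versions: Sequence[Tuple[int, datetime]],
    start: datetime,
    end: datetime,
) -> Optional[Tuple[int, int]]:
    """Map a (start, end) time window to a (begin_version, end_version) pair.

    `versions` is assumed to be sorted by ascending timestamp.
    Returns None when no version falls strictly inside the window.
    """
    timestamps = [ts for _, ts in versions]
    # Index of the first version whose timestamp is strictly later than `start`.
    lo = bisect_right(timestamps, start)
    # Index of the last version whose timestamp is strictly earlier than `end`.
    hi = bisect_left(timestamps, end) - 1
    if lo > hi:
        return None
    return versions[lo][0], versions[hi][0]


# Hypothetical version history: one commit per day.
history = [
    (1, datetime(2024, 1, 1)),
    (2, datetime(2024, 1, 2)),
    (3, datetime(2024, 1, 3)),
    (4, datetime(2024, 1, 4)),
]
window = resolve_version_window(
    history, datetime(2024, 1, 1, 12), datetime(2024, 1, 4)
)
```

A binary search replaces the linear iteration here, which only works if version timestamps never decrease. Whether the bounds should be inclusive or exclusive is a design choice the issue leaves open.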

For the second requirement:

  • Construct a filter condition for upserts so that inserted and updated rows are retrieved together in one query.
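
One possible shape for such a filter is sketched below. The helper `upsert_filter` is hypothetical. The column names `_row_created_at_version` and `_row_last_updated_at_version`, and the "exclusive begin, inclusive end" semantics, are assumptions made for illustration rather than confirmed behavior.

```python
def upsert_filter(
    begin_version: int,
    end_version: int,
    created_col: str = "_row_created_at_version",       # assumed column name
    updated_col: str = "_row_last_updated_at_version",  # assumed column name
) -> str:
    """Build a SQL predicate matching rows inserted or updated in (begin, end]."""
    if begin_version > end_version:
        raise ValueError("begin_version must not exceed end_version")
    # Rows created inside the window count as inserts.
    inserted = (
        f"({created_col} > {begin_version} AND {created_col} <= {end_version})"
    )
    # Rows created at or before the window start, but modified inside it,
    # count as updates.
    updated = (
        f"({created_col} <= {begin_version} AND {updated_col} > {begin_version} "
        f"AND {updated_col} <= {end_version})"
    )
    return f"{inserted} OR {updated}"


predicate = upsert_filter(3, 5)
```

The two branches are mutually exclusive, so a row created and then updated within the same window is returned once, as an insert.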
