[Feature] A FileIO API to list files iteratively #4791

@smdsbz

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

Currently the FileIO interface can only list all files and directories under a given path in a single call. As a consequence, callers of FileIO, e.g. ObjectRefresh, have no choice but to load the entire file listing into memory, which may lead to poor performance and OOM errors.

Solution

Introduce a paged list API like the following:

Pair<FileStatus[], String> listFilesPaged(
        Path path, boolean recursive, long pageSize, @Nullable String continuationToken)

This should allow implementations to take advantage of the batched list APIs commonly offered by object stores, e.g. ListObjectsV2 with its continuation token.
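
To illustrate how a caller might consume such an API, here is a minimal, self-contained sketch. It is not the proposed implementation: `PagedListingSketch`, `Page`, and `InMemoryStore` are hypothetical stand-ins (file names as plain strings instead of `FileStatus`, an `int` page size, and no `recursive` flag), and the continuation token is assumed to be the last key returned, resumed exclusively.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class PagedListingSketch {

    /** Hypothetical stand-in for Pair<FileStatus[], String>: one page plus the next token. */
    static final class Page {
        final List<String> files;
        final String nextToken; // null when no further pages exist

        Page(List<String> files, String nextToken) {
            this.files = files;
            this.nextToken = nextToken;
        }
    }

    /** Toy store that keeps keys sorted, mimicking the ordering of an object store listing. */
    static final class InMemoryStore {
        private final TreeSet<String> keys = new TreeSet<>();

        void put(String key) {
            keys.add(key);
        }

        Page listFilesPaged(String prefix, int pageSize, String continuationToken) {
            if (pageSize <= 0) {
                throw new IllegalArgumentException("pageSize must be positive");
            }
            // Assumption: the token is the last key of the previous page; resume strictly after it.
            SortedSet<String> tail = continuationToken == null
                    ? keys.tailSet(prefix, true)
                    : keys.tailSet(continuationToken, false);
            List<String> page = new ArrayList<>();
            String next = null;
            for (String key : tail) {
                if (!key.startsWith(prefix)) {
                    break; // sorted order: no later key can match the prefix
                }
                if (page.size() == pageSize) {
                    next = page.get(page.size() - 1); // more matches remain, hand out a token
                    break;
                }
                page.add(key);
            }
            return new Page(page, next);
        }
    }

    /** Walks all pages while holding at most one page in memory at a time. */
    static int countFiles(InMemoryStore store, String prefix, int pageSize) {
        int count = 0;
        String token = null;
        do {
            Page page = store.listFilesPaged(prefix, pageSize, token);
            count += page.files.size();
            token = page.nextToken;
        } while (token != null);
        return count;
    }
}
```

The caller only ever sees one page at a time, which is the property that would let something like ObjectRefresh avoid materializing the full listing.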

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!

Labels

  • enhancement: New feature or request