Expose huggingface hfFileSystem protocol via OpenDAL in pylance #5346

@pavanramkumar

Description

I would like fragment-level control over Lance tables hosted on the Hugging Face Hub, for example iterating over each fragment's batches separately:

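import lance

# dataset_id and path_in_repo are placeholders for the Hub repository ID
# and the table's path inside that repository.
lance_ds_path = f"hf://datasets/{dataset_id}/{path_in_repo}.lance"
ds = lance.dataset(lance_ds_path)
fragments = ds.get_fragments()
fragment_generators = []

for fragment in fragments:
    # Collect one batch iterator per fragment instead of overwriting the list.
    fragment_generators.append(fragment.to_batches())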

OpenDAL already supports a Hugging Face endpoint. See the recent pull request here.

It would be great to expose the hfFileSystem in pylance via OpenDAL as a next step, potentially by creating a new provider called huggingface.rs or similar here.
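As a rough illustration of what such a provider would need to do first, the sketch below splits an hf:// URI of the form used in the example above into a repository type, a repository ID, and a path inside the repository. Everything here is hypothetical: `HfLocation` and `parse_hf_uri` are names invented for this sketch and are not part of pylance, Lance, or OpenDAL. It assumes repository IDs have the form `<namespace>/<name>`, handles only the `datasets/` prefix, and ignores revisions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HfLocation:
    """Hypothetical container for the pieces of an hf:// dataset URI."""
    repo_type: str  # always "dataset" in this sketch
    repo_id: str    # assumed to be "<namespace>/<name>"
    path: str       # path of the table inside the repository


def parse_hf_uri(uri: str) -> HfLocation:
    """Split hf://datasets/<namespace>/<name>/<path> into its parts.

    Illustrative only: other repository types and revisions are rejected
    or ignored rather than handled.
    """
    scheme = "hf://"
    if not uri.startswith(scheme):
        raise ValueError(f"not an hf:// URI: {uri!r}")

    # At most four pieces: prefix, namespace, name, and the remaining path.
    parts = uri[len(scheme):].split("/", 3)
    if len(parts) < 4 or not all(parts):
        raise ValueError(f"expected hf://datasets/<namespace>/<name>/<path>: {uri!r}")

    prefix, namespace, name, path = parts
    if prefix != "datasets":
        raise ValueError(f"only the 'datasets/' prefix is handled here: {uri!r}")

    return HfLocation(repo_type="dataset", repo_id=f"{namespace}/{name}", path=path)
```

For instance, `parse_hf_uri("hf://datasets/user/name/data/table.lance")` yields a location with `repo_id` set to `"user/name"` and `path` set to `"data/table.lance"`. A real provider would presumably pass values like these to OpenDAL's Hugging Face service configuration rather than keep them in a standalone dataclass.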

Additional context is in this huggingface/datasets issue.

Tagging @Xuanwo as requested in the above thread.

Labels

enhancement: New feature or request