I would like fragment-level control on lance tables on huggingface hub.
import lance
lance_ds_path = f"hf://datasets/{dataset_id}/{path_in_repo}.lance"
ds = lance.dataset(lance_ds_path)
fragments = ds.get_fragments()
fragment_generators = []
for fragment in fragments:
fragment_generators = fragment.to_batches()
OpenDAL already supports a huggingface endpoint. See recent pull request here.
It would be great to expose the hfFileSystem in pylance via OpenDAL as a next step, potentially by creating a new provider called huggingface.rs or similar here.
Additional context in this huggingface/datasets issue
Tagging @Xuanwo as requested in the above thread.
I would like fragment-level control on lance tables on huggingface hub.
OpenDAL already supports a huggingface endpoint. See recent pull request here.
It would be great to expose the
hfFileSystemin pylance via OpenDAL as a next step, potentially by creating a new provider calledhuggingface.rsor similar here.Additional context in this
huggingface/datasetsissueTagging @Xuanwo as requested in the above thread.