Add API to support overriding protocol_config in WrappedFileSystemFlavour #291

@rahuliyer95

I know that WrappedFileSystemFlavour is an internal and experimental class, but I need to override its protocol_config to support a custom UPath that mimics the behavior of S3Path. Here is a minimal example that reproduces the problem, in which I register a UPath for a protocol called foo:

test.py
import fsspec  # type: ignore [import-untyped]
from fsspec.implementations.arrow import ArrowFSWrapper  # type: ignore [import-untyped]
from fsspec.utils import infer_storage_options  # type: ignore [import-untyped]
from upath import UPath
from upath import registry as upath_registry
from upath._flavour import WrappedFileSystemFlavour
from upath.implementations.cloud import S3Path


class FooFileSystem(ArrowFSWrapper):
    protocol = "foo"

    def __init__(self, *args, **kwargs):
        from pyarrow.fs import S3FileSystem  # type: ignore [import-untyped]

        fs = S3FileSystem()
        super().__init__(fs=fs, **kwargs)

    @classmethod
    def _strip_protocol(cls, path: str) -> str:
        # upstream fsspec has hardcoded `host + path` for s3/s3a we need this for `foo` as well.
        storage_opts = infer_storage_options(path)
        if host := storage_opts.get("host"):
            storage_opts["path"] = host + storage_opts["path"]
        path_without_protocol = str(storage_opts["path"])
        if path_without_protocol.startswith("//"):
            # special case for "hdfs://path" (without the triple slash)
            path_without_protocol = path_without_protocol[1:]
        return path_without_protocol


class FooPath(S3Path):
    pass


fsspec.register_implementation("foo", FooFileSystem)
upath_registry.register_implementation("foo", FooPath)
path = UPath("foo://bar/baz")

Running this raises the following error:

Traceback (most recent call last):
  File "/Users/rahuliyer/test.py", line 47, in <module>
    path = UPath("foo://bar/baz")
  File "/Users/rahuliyer/.venv/lib/python3.10/site-packages/upath/implementations/cloud.py", line 92, in __init__
    raise ValueError("non key-like path provided (bucket/container missing)")
ValueError: non key-like path provided (bucket/container missing)

The same code works if I extend protocol_config as follows before initializing the UPath:

WrappedFileSystemFlavour.protocol_config["netloc_is_anchor"] |= {"foo"}
WrappedFileSystemFlavour.protocol_config["supports_empty_parts"] |= {"foo"}
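
For now, the two lines above could at least be kept in one place. The following is only a minimal sketch of that idea: `add_protocol_flags` and `DEFAULT_FLAGS` are hypothetical names, not part of upath, and the demo uses a plain stand-in dict so it runs without upath installed.

```python
from typing import Dict, Sequence, Set

# Flag names taken from the two override lines above.
DEFAULT_FLAGS = ("netloc_is_anchor", "supports_empty_parts")


def add_protocol_flags(
    protocol_config: Dict[str, Set[str]],
    protocol: str,
    flags: Sequence[str] = DEFAULT_FLAGS,
) -> None:
    """Add `protocol` to each named flag set (hypothetical helper, not a upath API)."""
    for flag in flags:
        # Rebind the key to a new set so the previous set object is left untouched.
        protocol_config[flag] = set(protocol_config[flag]) | {protocol}


if __name__ == "__main__":
    # Stand-in for WrappedFileSystemFlavour.protocol_config (assumed shape:
    # a dict mapping flag names to sets of protocol names).
    config = {"netloc_is_anchor": {"s3"}, "supports_empty_parts": {"s3"}}
    add_protocol_flags(config, "foo")
    print(sorted(config["netloc_is_anchor"]))
```

With upath, the call would presumably be `add_protocol_flags(WrappedFileSystemFlavour.protocol_config, "foo")`, made before the first `UPath("foo://...")` is constructed. It still touches the internal class, so it only tidies the workaround and does not replace a supported API.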

What's the best approach to get this working without having to override protocol_config on an internal class?
