spec: implement bucket region caching by isidentical · Pull Request #495 · fsspec/s3fs

isidentical · 2021-06-10T09:41:02Z

Resolves #494

isidentical · 2021-06-10T12:59:07Z

I don't really have tests for this at the moment, but I will try to come up with something (at worst just mocking, but I'd rather use some sort of real stuff, like moto).

martindurant · 2021-06-10T13:03:56Z

I see the clear evidence from the linked issue that caching can significantly speed up calls, but can you summarise, please, why this is? My understanding is that currently botocore will look up the region for the bucket of a given call every time; but once we know the region for some bucket, we should be able to reuse it and avoid that call. It is surprising (to me) that this could be 3x slower, but maybe it depends on where the region of the bucket is versus the S3 lookup endpoint.

Does this require a separate client for each region (or each bucket?), or could the region be specified at call time? Maybe that's actually what you do.

The case against making this default or the only path would be that HEAD_BUCKET can fail. Is the bucket region not available via an initial region-less call response's metadata?

martindurant · 2021-06-10T13:05:04Z

I don't think moto handles regions and likely neither does minio - not that we use the latter in tests (yet #423 )

isidentical · 2021-06-10T13:19:52Z

> I see the clear evidence from the linked issue that caching can significantly speed up calls, but can you summarise, please, why this is?

The most basic explanation is that every time the region is unknown for a HEAD_OBJECT call, boto first fails with a 400 and only then goes down the route of resolving the region. And when a connection gets a 400, the underlying aiohttp connection is terminated, so for every call you are now paying the cost of creating the entire transport again.

> It is surprising (to me) that this could be 3x slower, but maybe it depends on where the region of the bucket is versus the s3 lookup endpoint

If you check the numbers in the linked issue, there is a huge gap, which was noted by multiple testers. And as the overall connection gets faster, the gap gets wider.

> Does this require a separate client for each region (or each bucket?), or could the region be specified at call time. Maybe that's actually what you do.

That was the first thing that I wanted to do, since client creation costs some time (0.05 seconds or something) and the experience is not great either. But from what I can see, you have to create a new client for each region you want to support. We do cache those clients, so if you have 6 buckets from 2 different regions, you only create 3 different clients: a generic one (to be used in HEAD_BUCKET calls), and 2 clients for the 2 different regions.
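The client counting described here (6 buckets, 2 regions, 3 clients) can be illustrated with a minimal sketch. It is not the code from this PR: `RegionClientCache`, `make_client`, and `lookup_region` are hypothetical names, and the client factory and region lookup are injected as fakes so the example runs without AWS access.

```python
import asyncio


class RegionClientCache:
    """Hypothetical helper: one client per region plus one region-less client."""

    def __init__(self, make_client, lookup_region):
        # Both callables are injected so the sketch runs without AWS access:
        #   make_client(region_or_None) -> awaitable returning a client
        #   lookup_region(general_client, bucket) -> awaitable returning a region
        self._make_client = make_client
        self._lookup_region = lookup_region
        self._clients = {}  # region name (None = general client) -> client
        self._regions = {}  # bucket name -> region name

    async def _client_for(self, region):
        if region not in self._clients:
            self._clients[region] = await self._make_client(region)
        return self._clients[region]

    async def get_bucket_client(self, bucket):
        if bucket not in self._regions:
            # Only the first request for a bucket pays for the region lookup.
            general = await self._client_for(None)
            self._regions[bucket] = await self._lookup_region(general, bucket)
        return await self._client_for(self._regions[bucket])


async def demo():
    created = []  # records every client construction, in order

    async def make_client(region):
        created.append(region)
        return f"client<{region}>"

    async def lookup_region(general_client, bucket):
        # Fake lookup: buckets starting with a, b, c live in one region.
        return "us-east-1" if bucket[0] in "abc" else "eu-west-1"

    cache = RegionClientCache(make_client, lookup_region)
    for bucket in ["a1", "b1", "c1", "x1", "y1", "z1"]:
        await cache.get_bucket_client(bucket)
    return created
```

Running `asyncio.run(demo())` constructs one general client and one client per region. The sketch is not safe against concurrent first lookups for the same bucket; a real implementation would need to decide how to handle that.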

> The case against making this default or the only path would be that HEAD_BUCKET can fail. Is the bucket region not available via an initial region-less call response's metadata?

I'd rather separate these. I'll look into whether we can retrieve the region from an initial call, though even so I've experienced some problems on the DVC side with pinning the region/signature version, some users' settings might be broken, etc. Though if we could retrieve the metadata from an existing call, then we could perhaps make the initial case a bit faster. Need to look into it.

isidentical · 2021-06-10T13:20:28Z

> I don't think moto handles regions and likely neither does minio - not that we use the latter in tests (yet #423 )

Hmmm.

martindurant · 2021-06-10T13:28:51Z

> Though if we could retrieve the metadata from an existing call, then we could perhaps make it just a bit more faster for the initial case.

Even if not, it should make for simpler code. When debugging some future issues, it would be great not to have to ask whether region client caching was on, and whether, to their knowledge, the HEAD_BUCKET had succeeded.

By the way, is HEAD_BUCKET typically allowed for publicly accessible buckets?

isidentical · 2021-06-11T09:42:12Z

> By the way, is HEAD_BUCKET typically allowed for publicly accessible buckets?

AFAIK, yes, ACL public-read/public-read-write both support HEAD_BUCKET.

isidentical · 2021-06-14T07:51:20Z

> Even if not, it should make for simpler code. When debugging some future issues, it would be great not to have to ask whether region client caching was on, and whether, to their knowledge, the HEAD_BUCKET had succeeded.

After checking a bit, I don't think it is possible to defer these calls. At least from what I can see, we should make them initially. I'll add a bit of logging to help with debugging, and also a guard around HEAD_BUCKET to ensure that we return the generic (region-unbound) client if the call fails.
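A minimal sketch of such a guard follows, under stated assumptions. `client_for_bucket`, `regional_clients`, and `FakeClient` are hypothetical names used only for illustration, and reading the region from the `x-amz-bucket-region` response header is an assumption about the response layout rather than a quote from the PR. The `ClientError` stand-in exists only so the example runs where botocore is not installed.

```python
import asyncio
import logging

try:
    from botocore.exceptions import ClientError
except ImportError:
    class ClientError(Exception):
        """Stand-in so the sketch runs without botocore installed."""

logger = logging.getLogger("region_cache_sketch")


async def client_for_bucket(general_client, regional_clients, bucket):
    """Return a region-bound client, or the general one if HEAD_BUCKET fails."""
    try:
        response = await general_client.head_bucket(Bucket=bucket)
    except ClientError:
        logger.debug("HEAD_BUCKET failed for %r, using the general client", bucket)
        return general_client
    headers = response["ResponseMetadata"]["HTTPHeaders"]
    region = headers.get("x-amz-bucket-region")
    # Also fall back if the header is missing or no client exists for the region.
    return regional_clients.get(region, general_client)


class FakeClient:
    """Hypothetical test double standing in for an async S3 client."""

    def __init__(self, regions):
        self.regions = regions  # bucket -> region; unknown buckets are denied

    async def head_bucket(self, Bucket):
        if Bucket not in self.regions:
            error = {"Error": {"Code": "403", "Message": "Forbidden"}}
            raise ClientError(error, "HeadBucket")
        headers = {"x-amz-bucket-region": self.regions[Bucket]}
        return {"ResponseMetadata": {"HTTPHeaders": headers}}


general = FakeClient({"open-bucket": "eu-west-1"})
regional = {"eu-west-1": "eu-client"}
```

With these fakes, `client_for_bucket(general, regional, "open-bucket")` resolves to the regional client, while a bucket whose HEAD_BUCKET call is denied gets the general client back, so the caller never sees the lookup failure.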

isidentical · 2021-06-17T11:03:01Z

One idea that might make this even faster is permanent caching. It might sound a bit odd at first glance, but bucket names are globally unique and their regions are constant, so this optimization really does make sense. You can't change a bucket's region without deleting the bucket and re-creating it with the same name, which is really unlikely, so I don't even see a need to handle that. If we do want to, it could be handled in an EAFP way: try the known region from the permanent cache on the system, and if that fails, go back, resolve it again, and invalidate the current cache entry. That would be a bit less performant for the initial call in a really rare case. This is especially useful for short-running processes (e.g. CLI applications) that use s3fs. If they are making only a couple of info calls, they might shave 50% off their runtime by simply enabling this option.
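The EAFP flow proposed here could look roughly like the sketch below. Everything in it is hypothetical: `WrongRegion` stands in for whatever error a request to the wrong region would actually raise, and `resolve_region` and `do_call` are injected placeholders rather than s3fs functions.

```python
class WrongRegion(Exception):
    """Hypothetical error raised when a request is sent to the wrong region."""


def call_with_cached_region(cache, bucket, resolve_region, do_call):
    """EAFP: trust the cached region first, re-resolve only if the call fails."""
    region = cache.get(bucket)
    if region is not None:
        try:
            return do_call(bucket, region)
        except WrongRegion:
            del cache[bucket]  # stale entry, e.g. the bucket was re-created
    region = resolve_region(bucket)
    cache[bucket] = region
    return do_call(bucket, region)


def demo():
    actual = {"data": "eu-west-1"}   # where the bucket really lives
    cache = {"data": "us-east-1"}    # stale entry from the persistent cache
    lookups = []                     # records every region resolution

    def resolve_region(bucket):
        lookups.append(bucket)
        return actual[bucket]

    def do_call(bucket, region):
        if region != actual[bucket]:
            raise WrongRegion(bucket)
        return f"ok:{region}"

    first = call_with_cached_region(cache, "data", resolve_region, do_call)
    second = call_with_cached_region(cache, "data", resolve_region, do_call)
    return first, second, cache, lookups
```

In the demo, the first call hits the stale entry, drops it, resolves the region once, and retries. The second call uses the refreshed entry with no further lookup, which is the common case the optimization targets.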

martindurant · 2021-06-17T12:47:53Z

I would be fine with persisting bucket regions and invalidating on error (this should indeed be rare). Now you have to worry about writing and reading a file!

By the way, I am not advertising 2x speed (or 3x) with this improvement, but reduced latency; for cases where the data volumes are large, users may not notice the difference.

isidentical · 2021-06-17T13:03:20Z

> Now you have to worry about writing and reading a file!

Not really though, this is something totally optional. If the region cache is enabled and we fail to read from or write to the file, that is perfectly fine. We can still keep the cache in memory by constructing it on demand.
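One way to keep the file strictly optional is to treat every read or write failure as a cache miss. The sketch below assumes a JSON file mapping bucket names to regions; the file format and the function names `load_region_cache` and `save_region_cache` are assumptions made for illustration, not part of this PR.

```python
import json
import os
import tempfile


def load_region_cache(path):
    """Best-effort read: any failure just yields an empty in-memory cache."""
    try:
        with open(path) as f:
            data = json.load(f)
    except (OSError, ValueError):  # missing, unreadable, or corrupt file
        return {}
    return data if isinstance(data, dict) else {}


def save_region_cache(path, regions):
    """Best-effort write: report failure instead of raising."""
    try:
        with open(path, "w") as f:
            json.dump(regions, f)
    except OSError:
        return False
    return True


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "regions.json")
        missing = load_region_cache(path)  # no file yet
        saved = save_region_cache(path, {"my-bucket": "eu-west-1"})
        loaded = load_region_cache(path)
        # Writing into a directory that does not exist fails quietly.
        bad = save_region_cache(os.path.join(tmp, "nope", "r.json"), {})
    return missing, saved, loaded, bad
```

Because both helpers swallow I/O errors, callers can always continue with whatever dictionary they have in memory, which matches the "perfectly fine if it fails" behaviour described above.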

> By the way, I am not advertising 2x speed (or 3x) with this improvement, but reduced latency; for cases where the data volumes are large, users may not notice the difference.

Right. As I stressed in the original issue, this is something that mainly happens on HeadObject (not even on GetObject), so if you make 2 or 3 HeadObject calls and upload, say, 5 GiB of data, then of course it is unlikely that you will notice the difference.

martindurant

As far as I can tell, a call to list_objects_v2 also gives the bucket region with the key 'x-amz-bucket-region', so maybe we can avoid calls to HEAD_BUCKET for cases when ls (or related) is the first call on a bucket?
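If the header is indeed present on such responses, picking it up could be as small as the sketch below. It assumes the botocore-style response layout, where HTTP headers appear lower-cased under `ResponseMetadata` / `HTTPHeaders`; `region_from_response` is a hypothetical helper and the sample dictionary is made up for illustration.

```python
def region_from_response(response):
    """Return the bucket region advertised in a botocore-style response, if any."""
    headers = response.get("ResponseMetadata", {}).get("HTTPHeaders", {})
    return headers.get("x-amz-bucket-region")


# Made-up response shaped like the output of a listing call.
sample = {
    "ResponseMetadata": {
        "HTTPStatusCode": 200,
        "HTTPHeaders": {"x-amz-bucket-region": "eu-west-1"},
    },
    "KeyCount": 0,
}
```

Returning `None` when the header is absent lets the caller fall back to the HEAD_BUCKET path instead of failing.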

martindurant

OK, I am persuaded. I have a couple of small comments, but let's get this merged before release.

martindurant · 2021-06-18T15:42:10Z

+            response = await general_client.head_bucket(Bucket=bucket_name)
+        except ClientError:
+            logger.debug(
+                "RC: HEAD_BUCKET clal for %r has failed, returning the general client",


martindurant · 2021-06-18T15:42:33Z

        return self.url(path, expires=expiration, **kwargs)

+    async def _invalidate_region_cache(self):
+        if not self.cache_regions:


Please add a docstring for this

martindurant · 2021-06-18T18:55:24Z

OK, it's in!
This seems like the kind of thing you might well want to write an article about, since you potentially just increased the efficiency of some workloads by a factor of 3!

isidentical mentioned this pull request Jun 10, 2021

s3: performance implications of s3fs treeverse/dvc#5969

Closed

martindurant reviewed Jun 10, 2021

Comment thread s3fs/core.py

isidentical added 6 commits June 10, 2021 15:59

utils: add the bucket cache

2678663

create client in the cache manager

907fcd3

implement cache.clear()

aa74666

implement the caching to the spec

040b883

create the cache invalidation for the fs

2e643f9

implement AsyncExitStack for 3.6<=

c3b27e3

isidentical force-pushed the s3fs-bucket-cache branch from 8a62dbe to c3b27e3 Compare June 10, 2021 13:00

handle when HEAD_BUCKET fails

7f255be

isidentical force-pushed the s3fs-bucket-cache branch from 58dc632 to 7f255be Compare June 15, 2021 08:50

martindurant changed the title ~~spec: implement the region caching~~ spec: implement bucket region caching Jun 16, 2021

martindurant mentioned this pull request Jun 16, 2021

asyn: introduce fsspec.asyn.fsspec_loop fsspec/filesystem_spec#671

Merged

martindurant reviewed Jun 17, 2021

Comment thread s3fs/core.py Outdated

Comment thread s3fs/utils.py Outdated

isidentical added 2 commits June 18, 2021 13:22

allow cache invalidation when not using cache_regions

30fd7b8

implicitly disable region cache when region_name is specified

9417d19

isidentical force-pushed the s3fs-bucket-cache branch from 86fdad7 to 9417d19 Compare June 18, 2021 10:50

martindurant reviewed Jun 18, 2021

fix some typos and add docs

b37b5e9

martindurant merged commit 9416184 into fsspec:main Jun 18, 2021

isidentical mentioned this pull request Aug 10, 2021

s3: inline bucket region caching treeverse/dvc#6406

Merged
