[WIP] gdrive: a pair of cleanup refactors#3376
Conversation
|
@efiop do we skip GDrive tests on travis? |
Employ @Retry(filter_errors) and @wrap_prop() to pack things up a bit.
- cache root transparently - make cache dirs and ids symmetric, i.e. both using full paths - implement .list_cache_paths() instead of .all() like in the rest of the remotes
|
@Suor, it only runs on |
|
@skshetry is correct, but we also run on upstream branches too. So you could create a PR from an upstream branch instead. |
| raise | ||
| return result | ||
|
|
||
| retry_re = re.compile(r"HttpError (403|500|502|503|504)") |
There was a problem hiding this comment.
great!, but we'll need to rebuilt it on the PyDrive2 side to get rid of this fragile stuff
| ), | ||
| _wrap_pydrive_retriable, | ||
| ) | ||
| return retry( |
There was a problem hiding this comment.
is there a simple way to write a debug message on a retry, btw?
There was a problem hiding this comment.
You can wrap a func, which you retry into @log_errors():
# ...
return retry(...)(log_errors(logger.error, stack=False)(func))This will show errors, but there is no non-hacky way to show retries.
| def _get_cached_remote_ids(self, path): | ||
| if not path: | ||
| return [self._bucket] | ||
| if self._cache_initialized: |
There was a problem hiding this comment.
this logic is important, we don't want to initialize cache if it's not initialized yet
There was a problem hiding this comment.
cache property init makes a call to self._get_remote_id itself to get root id. I'm not sure who would it behave with these changes
| cache = {"dirs": defaultdict(list), "ids": {}} | ||
|
|
||
| cache["root_id"] = self._get_remote_id(self.path_info) | ||
| self._cache_path(self.path_info.path, cache["root_id"], cache) |
There was a problem hiding this comment.
I believe we did't cache root id before in the cache itself, it might affect list all below - before it was returning only actual files in cache, now also directories under the root?
There was a problem hiding this comment.
We cached path -> id in a way, made it separately. Plus extra code on cache read side. We did not cache id -> path though. This might affect listing cache, you are right.
| if not self._gdrive: | ||
| from pydrive2.auth import GoogleAuth | ||
| from pydrive2.drive import GoogleDrive | ||
| GoogleAuth.DEFAULT_SETTINGS["client_config_backend"] = "settings" |
There was a problem hiding this comment.
not directly related - would really love this to go away from the class ... is there a good way? or at least assign the whole dict at once to DEFAULT_SETTINGS? This way it is done now is quite ugly and bloated
There was a problem hiding this comment.
Yeah, it is. One can use GoogleAuth.DEFAULT_SETTINGS.update(), but the right way would be to make GoogleAuth accept settings as a dict not only file.
|
Closed in favor if #3382 |
No description provided.