hubstorage/resourcetype.py
Outdated
        url = urlpathjoin(self.url, _path)
        logger.error('Failed %d times reading items from %s, params %s, '
                     'last error was: %s', self.MAX_RETRIES, url,
                     apiparams, lastexc)
Can you explain why we don't stream msgpack?
The only differences should be the "Accept" header and the decode function, right? The iteration and retrying should be reused from the json-lines code path.
I assumed BytesIO would stream the data while reading it, but I noticed that requests downloads everything anyway [1]. I factored out the common code. Thanks for pointing it out.
Is there any reason the retry logic isn't implemented with retrying?
[1] https://github.com/kennethreitz/requests/blob/master/requests/models.py#L737
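A minimal sketch of the refactoring discussed above, where the retry loop is shared and only the Accept header and the decode function vary per format. The names `FORMATS`, `iter_decoded`, and `fetch`, the retry count, and the `application/x-jsonlines` MIME type are assumptions for illustration, not taken from hubstorage. A msgpack entry is left out so the example stays stdlib-only.

```python
import json
import logging

logger = logging.getLogger(__name__)

MAX_RETRIES = 3  # hypothetical value, not hubstorage's setting


def decode_jsonlines(body):
    """Decode a JSON-lines payload (bytes) into a list of objects."""
    lines = body.decode('utf-8').splitlines()
    return [json.loads(line) for line in lines if line.strip()]


# Per-format settings: only the Accept header and the decoder differ.
# A msgpack entry would add its own MIME type and decoder here.
FORMATS = {
    'jl': ('application/x-jsonlines', decode_jsonlines),
}


def iter_decoded(fetch, fmt='jl', max_retries=MAX_RETRIES):
    """Call fetch(headers) with retries and iterate over decoded items.

    `fetch` is a hypothetical callable that performs the HTTP request
    and returns the response body as bytes.
    """
    accept, decode = FORMATS[fmt]
    lastexc = None
    for _ in range(max_retries):
        try:
            body = fetch({'Accept': accept})
            return iter(decode(body))
        except IOError as exc:
            lastexc = exc
    logger.error('Failed %d times reading items, last error was: %s',
                 max_retries, lastexc)
    raise lastexc
```

With this layout, supporting another format means adding one entry to `FORMATS`; the retry loop itself stays untouched.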
|
How |
bcf7f01 to fd72001
|
Here [1]
[1] https://github.com/scrapinghub/python-hubstorage/blob/master/tests/test_project.py#L241
hubstorage/resourcetype.py
Outdated
        return r.iter_lines()

    def apirequest(self, _path=None, **kwargs):
        if self._supports_msgpack and kwargs.get('method') == 'GET':
It's possible to get method='get' or method='gEt' here; the method is converted to upper case only inside the requests library.
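One way to address this is to normalize the method before comparing. The helper below is a hypothetical sketch, not the code that was eventually merged; `wants_msgpack` and its arguments are illustrative names.

```python
def wants_msgpack(supports_msgpack, kwargs):
    """Return True when msgpack should be requested for this call.

    The HTTP method is upper-cased before the comparison, so 'get',
    'gEt', and 'GET' are all treated the same way.
    """
    method = kwargs.get('method') or ''
    return bool(supports_msgpack) and method.upper() == 'GET'
```

A missing or `None` method falls back to an empty string, so the check returns False instead of raising.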
|
The last push has some improvements (suggestions) and fixes for test failures that were shadowed before #64.
cd81655 to f4ef0a8
|
@chekunkov Updated, review please.
|
LGTM |
|
Hey guys, there are two issues with this PR:
|
|
@jdemaeyer We will fix py3 compatibility ASAP. I'm afraid that for now you have to pin the python-hubstorage version in shub's setup.py to <= 0.23.2. Logs use the msgpack format by default, and it seems we have a bug in Hubstorage: it returns the _key field only for the json and jsonlines formats.
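The pin suggested above could look roughly like this in a setup.py. This is a sketch only: the distribution name `hubstorage` is an assumption, so check the name shub's setup.py actually uses before copying it.

```python
# Hypothetical excerpt from a setup.py; the distribution name
# 'hubstorage' is an assumption, not confirmed by this thread.
install_requires = [
    'hubstorage<=0.23.2',  # upper bound until py3 compatibility is fixed
]
```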
Any thoughts?
I'm assuming current tests, like [1], still cover this patch.
Addresses #58.
[1] https://github.com/scrapinghub/python-hubstorage/blob/msgpack/tests/test_project.py#L228