Hi
Some repositories serve packages (*.tar.gz) with an incorrect Content-Encoding: gzip header, so requests automatically decompresses them while downloading (see http://docs.python-requests.org/en/master/user/quickstart/#raw-response-content). The resulting file is therefore a plain tar archive rather than a gzipped one, and the following exception is raised:
tar = tarfile.TarFile(str(filepath), fileobj=gz)
/usr/lib/python3.6/tarfile.py in __init__() at line 1480
self.firstmember = self.next()
/usr/lib/python3.6/tarfile.py in next() at line 2295
tarinfo = self.tarinfo.fromtarfile(self)
/usr/lib/python3.6/tarfile.py in fromtarfile() at line 1090
buf = tarfile.fileobj.read(BLOCKSIZE)
/usr/lib/python3.6/gzip.py in read() at line 276
return self._buffer.read(size)
/usr/lib/python3.6/_compression.py in readinto() at line 68
data = self.read(len(byte_view))
/usr/lib/python3.6/gzip.py in read() at line 463
if not self._read_gzip_header():
/usr/lib/python3.6/gzip.py in _read_gzip_header() at line 411
raise OSError('Not a gzipped file (%r)' % magic)
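The failure can be reproduced without any network access. This is a minimal sketch, assuming the download left an uncompressed tar archive on disk under a .tar.gz name; the helper names, the archive contents, and the file name are hypothetical and not taken from poetry.

```python
import gzip
import io
import os
import tarfile
import tempfile


def make_plain_tar(path):
    """Write an uncompressed tar archive, mimicking an already-decompressed download."""
    payload = b"hello"
    info = tarfile.TarInfo(name="pkg/README")
    info.size = len(payload)
    with tarfile.open(path, mode="w") as tar:  # "w" means no compression
        tar.addfile(info, io.BytesIO(payload))


def open_via_gzip(path):
    """Wrap the file in gzip first, then hand it to tarfile.TarFile, as in the traceback."""
    with gzip.open(path) as gz:
        with tarfile.TarFile(path, fileobj=gz) as tar:
            return tar.getnames()


workdir = tempfile.mkdtemp()
filepath = os.path.join(workdir, "pkg-1.0.tar.gz")  # hypothetical file name
make_plain_tar(filepath)

try:
    open_via_gzip(filepath)
    error = None
except OSError as exc:  # gzip.BadGzipFile (Python 3.8+) is a subclass of OSError
    error = exc

print(type(error).__name__, error)
```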
There are two options:
- Use tarfile.open, which can deal with both cases, instead of calling tarfile.TarFile(str(filepath), fileobj=gz) directly.
- Use the requests raw stream instead of iter_content: replace r.iter_content(chunk_size=1024) with r.raw.stream(1024) in poetry.repositories.pypi_repository.PyPiRepository#_download and poetry.repositories.legacy_repository.LegacyRepository#_download. By the way, these two spots violate the DRY principle.
I suggest option 1, because it is more robust and does not depend on the download method.
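Option 1 could look roughly like the sketch below. It relies on the transparent "r:*" mode of tarfile.open, which detects the compression from the file contents rather than from the file name. The helper names, archive contents, and file names are hypothetical and only serve the illustration; this is not poetry's actual code.

```python
import io
import os
import tarfile
import tempfile


def write_archive(path, mode):
    """Create a one-member archive; mode "w:gz" is gzipped, "w" is a plain tar."""
    payload = b"hello"
    info = tarfile.TarInfo(name="pkg/README")
    info.size = len(payload)
    with tarfile.open(path, mode=mode) as tar:
        tar.addfile(info, io.BytesIO(payload))


def list_members(path):
    """Open the archive with transparent compression detection."""
    with tarfile.open(path, mode="r:*") as tar:
        return tar.getnames()


workdir = tempfile.mkdtemp()
gzipped = os.path.join(workdir, "gzipped.tar.gz")
plain = os.path.join(workdir, "plain.tar.gz")  # plain tar despite the .gz suffix
write_archive(gzipped, "w:gz")
write_archive(plain, "w")

names_gzipped = list_members(gzipped)
names_plain = list_members(plain)
print(names_gzipped, names_plain)
```

Both archives open the same way, so the reading code no longer needs to know whether the download step decompressed the file.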
I can prepare a pull request, although the tests will be tricky, because the _download methods are currently not tested at all ;)
Best regards
PS: pip can deal with it.