
Storage: TimeoutGuard raises TimeoutException even though upload successful #14

@mcsimps2


Environment Details:

  • Using google-cloud-storage==1.23.0 and 1.24.1
  • Using Mac OSX 10.14 and Windows 7 64 bit
  • Using Python 3.7.3

Issue: A file can upload completely to Google Cloud Storage, yet still raise a TimeoutException if the upload process took longer than ~60 seconds (not 100% sure on the timedelta, but I'm guessing that it's 60 seconds from a brief analysis of the code).

Details: The use of AuthorizedSession.request for blob uploads in the Google Cloud Storage Python library causes an unwarranted TimeoutException. The TimeoutGuard class raises it on file uploads to Cloud Storage even when the Cloud Storage server is responding in a timely manner. In fact, a file can upload completely and the TimeoutGuard will still raise a TimeoutException even though a true request timeout never occurred. The reason why is explained below.

Steps to reproduce:
I first encountered this when uploading a large file (1 GB) on a medium upload connection (10 Mbps upload). Although the upload was technically successful, I was still receiving a TimeoutException at the end of the upload from a call to blob.upload_from_filename(filepath) (a resumable upload, not multipart upload).
The stacktrace is below:

  File "site-packages\google\cloud\storage\blob.py", line 1320, in upload_from_filename
  File "site-packages\google\cloud\storage\blob.py", line 1265, in upload_from_file
  File "site-packages\google\cloud\storage\blob.py", line 1175, in _do_upload
  File "site-packages\google\cloud\storage\blob.py", line 1122, in _do_resumable_upload
  File "site-packages\google\resumable_media\requests\upload.py", line 425, in transmit_next_chunk
  File "site-packages\google\resumable_media\requests\_helpers.py", line 136, in http_request
  File "site-packages\google\resumable_media\_helpers.py", line 150, in wait_and_retry
  File "site-packages\google\auth\transport\requests.py", line 287, in request
  File "site-packages\google\auth\transport\requests.py", line 110, in __exit__
requests.exceptions.Timeout

The core of the issue is the TimeoutGuard class when used in a context like AuthorizedSession.request. Specifically, look at the following code in the aforementioned method:

with TimeoutGuard(timeout) as guard:
    response = super(AuthorizedSession, self).request(
        method,
        url,
        data=data,
        headers=request_headers,
        timeout=timeout,
        **kwargs
    )
timeout = guard.remaining_timeout

There are two timeouts going on. One of them is a true request timeout used by the requests library (note AuthorizedSession is a subclass of requests.Session), and this is functioning correctly. The other timeout is a naive timeout set by TimeoutGuard that is causing problems. Essentially, it starts a clock that will raise a TimeoutException if a certain amount of time passes, even if the Google Cloud Storage servers are responding in a timely manner. In this case, the requests library will not raise a TimeoutException (because a true network timeout never occurred), but the TimeoutGuard will.
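To make the difference between the two timeouts concrete, here is a minimal sketch. It is not the library's code: WallClockGuard, FakeClock, and slow_but_healthy_upload are hypothetical stand-ins, and a fake clock replaces real waiting so the example runs instantly. It assumes the guard simply compares total elapsed time against the budget on exit, which is the behaviour described above. By contrast, the requests timeout applies to connecting and to waits for data from the server, not to the total duration of the call.

```python
import time


class FakeClock:
    """Hypothetical controllable clock so the demo does not actually sleep."""

    def __init__(self):
        self.now = 0.0

    def advance(self, seconds):
        self.now += seconds

    def __call__(self):
        return self.now


class WallClockGuard:
    """Simplified stand-in for a wall-clock guard (NOT the library's class)."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self.remaining_timeout = timeout
        self._clock = clock

    def __enter__(self):
        self._start = self._clock()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_value is not None:
            return False  # a real error is already propagating
        elapsed = self._clock() - self._start
        self.remaining_timeout = self.timeout - elapsed
        if self.remaining_timeout <= 0:
            # Raised only after the guarded work has already finished.
            raise TimeoutError("wall-clock budget exceeded")
        return False


def slow_but_healthy_upload(clock, chunks, seconds_per_chunk):
    """Pretend upload: every chunk succeeds, each one just takes a while."""
    sent = 0
    for _ in range(chunks):
        clock.advance(seconds_per_chunk)
        sent += 1
    return sent


def demo():
    clock = FakeClock()
    sent = None
    try:
        with WallClockGuard(timeout=60, clock=clock):
            sent = slow_but_healthy_upload(clock, chunks=10, seconds_per_chunk=15)
    except TimeoutError:
        return sent, True  # work completed, yet a timeout was raised
    return sent, False
```

Calling demo() reports that all ten chunks were sent and that a timeout was still raised, which mirrors the symptom in this issue: the error appears only after the work is done.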

This causes issues with large file uploads or slow internet connections. If a user uploads a file that takes a long time to transfer, then even if the upload succeeds and the requests library never raises a TimeoutException (i.e. the server was responding in a timely fashion for the entire upload), the TimeoutGuard will raise an unsolicited TimeoutException during TimeoutGuard.__exit__.

Here's a walkthrough of the error:
(1) File upload initiated
(2) File uploads for a couple of minutes, exceeding the default timeout of 60/61 seconds that the TimeoutGuard uses (_DEFAULT_CONNECT_TIMEOUT and _DEFAULT_READ_TIMEOUT in resumable_media/requests/_helpers.py, although it looks like the TimeoutGuard takes the minimum of the two). The server is responding normally to all chunk uploads. The Python requests library never raises a TimeoutException because the server is consistently responding.
(3) File finishes upload, TimeoutGuard raises TimeoutException even though file upload was successful.
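The walkthrough above can be checked with some rough arithmetic. This sketch assumes decimal units (1 GB = 10^9 bytes, 10 Mbps = 10^7 bits per second), ignores protocol overhead, and assumes the guard subtracts the elapsed time from each timeout value and trips when the smallest remainder reaches zero. The function names are hypothetical, and the 61/60 pair stands for the two defaults quoted above; their order does not affect the minimum.

```python
def estimated_upload_seconds(size_bytes, uplink_bits_per_second):
    """Ideal transfer time, ignoring overhead and retries."""
    return size_bytes * 8 / uplink_bits_per_second


def remaining_budget(timeouts, elapsed):
    """Subtract the elapsed time from every timeout in the tuple."""
    return tuple(t - elapsed for t in timeouts)


TIMEOUTS = (61, 60)  # the two default values mentioned in the issue

elapsed = estimated_upload_seconds(10**9, 10 * 10**6)  # 1 GB over 10 Mbps
remaining = remaining_budget(TIMEOUTS, elapsed)
deadline_hit = min(remaining) <= 0

print(elapsed, remaining, deadline_hit)
```

Under these assumptions the transfer takes about 800 seconds, more than 13 minutes, so a budget of roughly 60 seconds is exhausted long before the upload completes.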

I've been able to work around this problem by monkeypatching the TimeoutGuard code, but I believe a proper fix is needed in the codebase. I would be happy to contribute or open a pull request if a maintainer can elaborate on the need for the TimeoutGuard TimeoutException when there is already a TimeoutException being used by the requests.Session class.
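For readers unfamiliar with the workaround, this is roughly what monkeypatching a guard looks like. It is an illustrative sketch only, not the actual patch used here: `transport` is a hypothetical stand-in namespace for the module that owns the guard, and StrictGuard and PassiveGuard are made-up classes. StrictGuard always raises on a clean exit, purely to keep the demo short.

```python
import types


class StrictGuard:
    """Stand-in guard that raises on a clean exit (always, for this demo)."""

    def __init__(self, timeout):
        self.remaining_timeout = timeout

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_value is None:
            raise TimeoutError("budget exceeded")
        return False


class PassiveGuard:
    """Replacement with the same interface that never raises on its own."""

    def __init__(self, timeout):
        self.remaining_timeout = timeout

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        return False  # real errors still propagate; none are added


# Hypothetical stand-in for the module that defines and uses the guard.
transport = types.SimpleNamespace(TimeoutGuard=StrictGuard)


def request(timeout):
    # The guard is looked up at call time, so swapping the attribute
    # changes behaviour for every later call.
    with transport.TimeoutGuard(timeout) as guard:
        response = "200 OK"
    return response, guard.remaining_timeout


def request_fails():
    try:
        request(60)
    except TimeoutError:
        return True
    return False


before = request_fails()               # strict guard: the call raises
transport.TimeoutGuard = PassiveGuard  # the monkeypatch
after = request(60)                    # passive guard: the call returns
```

A patch like this hides the symptom but also disables whatever overall deadline the guard was meant to enforce, which is why a fix in the codebase is preferable.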

Labels:

  • 🚨 This issue needs some love.
  • api: storage (Issues related to the googleapis/python-storage API.)
  • priority: p1 (Important issue which blocks shipping the next release. Will be fixed prior to next release.)
  • type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
