
Needs work: tolerate chunked responses without a DOS vulnerability #8

Open
roscoemcinerney wants to merge 2 commits into master from bugfix/2023-01/tolerate-chunked-transfer

Conversation

@roscoemcinerney
Collaborator

Reiterating the code comments:

  • We encountered the file https://logo.clearbit.com/https%3A/duckduckgo.com/, which is a small, perfectly normal PNG but currently can't be served.
  • It's because logo.clearbit.com serves via chunked transfer and doesn't provide a Content-Length header, which we currently use to decide if a file is small enough for us to cache.
  • Accepting chunked transfers is fine (code in this branch already does it), but there are 2 wrinkles it seems wise to resolve before merging.
    1: This code will block until the entire file - which could be gigabytes, or even endless from a malicious server - has been transferred, rather than aborting the transfer once the size limit is passed. So we need to do something a little fiddlier than just using FileUtils.copy on the connection's InputStream (see the sketch at the end of this comment).
    2: This code will allow repeated requests for the same file, transferring up to the limit and then returning a 400 error each time. My instinct is that when a cache request is made for a too-large file, we should copy a placeholder image into the destination path with readable text saying "File too large to cache" - so subsequent requests become a cache hit, and (probably more importantly) legitimate users immediately know what the problem is.

We're not in a hurry to tie off this branch (it's due to a problem found in Aidan's testing, not live) and the changes above aren't hard to implement at all - but I'm raising this PR as a sanity check for when I come back to it.
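
To make wrinkle 1 concrete, this is roughly what I mean by "fiddlier" - just a sketch, and the names here (BoundedCopySketch, copyAtMost, the 8KB buffer) are illustrative rather than code that exists in this branch:

```java
import java.io.*;

class BoundedCopySketch {
    /** Copy at most maxBytes from in to dest, aborting as soon as the limit is
     *  passed so a chunked (or malicious) response can't tie us up indefinitely. */
    static void copyAtMost(InputStream in, File dest, long maxBytes) throws IOException {
        byte[] buf = new byte[8192];
        long copied = 0;
        try (OutputStream out = new FileOutputStream(dest)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                copied += n;
                if (copied > maxBytes) {
                    // Abort mid-transfer instead of draining the rest of the response
                    throw new IOException("Response exceeded cache limit of " + maxBytes + " bytes");
                }
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            dest.delete(); // don't leave a truncated file where a cache hit would find it
            throw e;
        }
    }
}
```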

@winterstein
Contributor

1: This code will block until the entire file - which could be gigabytes, or even endless from a malicious server - has been transferred, rather than aborting the transfer once the size limit is passed. So we need to do something a little fiddlier than just using FileUtils.copy on the connection's InputStream.

We have code for that :)

new LimitedInputStream(conn.getInputStream(), MAX_SIZE)

FYI: the same code is available in FakeBrowser.setMaxDownload()

Sticking with FakeBrowser might be best, as it can also handle HTTP redirects.
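
If the direct-connection route is kept, the wrapper could replace the hand-rolled loop along these lines - a sketch only, assuming LimitedInputStream errors out (rather than silently truncating) at the limit, and that a FileUtils.copy overload taking an InputStream and a destination file matches what the branch already calls:

```java
// Sketch: conn, MAX_SIZE and destFile are placeholder names, not code from this branch.
try (InputStream in = new LimitedInputStream(conn.getInputStream(), MAX_SIZE)) {
    FileUtils.copy(in, destFile); // bounded by the wrapper instead of a hand-rolled counter
}
```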

2: This code will allow repeated requests for the same file, transferring up to the limit and then returning a 400 error each time. My instinct is that when a cache request is made for a too-large file, we should copy a placeholder image into the destination path with readable text saying "File too large to cache" - so subsequent requests become a cache hit, and (probably more importantly) legitimate users immediately know what the problem is.

Good idea :)
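
If it helps, a rough sketch of that placeholder write, assuming a "too large" PNG is bundled as a classpath resource - the resource name and the class here are made up for illustration:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class TooLargePlaceholderSketch {
    /** Write a bundled "File too large to cache" image to the cache path, so later
     *  requests for the same URL become cache hits instead of repeat downloads. */
    static void writeTooLargePlaceholder(Path cachePath) throws IOException {
        try (InputStream placeholder = TooLargePlaceholderSketch.class
                .getResourceAsStream("/too-large-placeholder.png")) {
            if (placeholder == null) {
                throw new IOException("too-large-placeholder.png missing from resources");
            }
            Files.copy(placeholder, cachePath, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```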

…ird failure modes around files without extension on source URL
