fix(dataset): enforce max file size for multipart upload #4146
xuang7
left a comment
Thanks for the PR. LGTM!
file-service/src/main/scala/org/apache/texera/service/resource/DatasetResource.scala
aicam
left a comment
I think the life cycle of upload session records needs a better design. If needed, we can meet to discuss it.
file-service/src/main/scala/org/apache/texera/service/resource/DatasetResource.scala
aicam
left a comment
LGTM! Thanks for the great PR.
Head branch was pushed to by a user without write access
Signed-off-by: carloea2 <carloea2@uci.edu>
@aicam Can you run the tests? Thanks
What changes were proposed in this PR?
- Enforce the `single_file_upload_max_size_mib` limit for multipart uploads at init by requiring `fileSizeBytes` + `partSizeBytes` and rejecting when the total declared file size exceeds the configured max.
- Add `file_size_bytes` and `part_size_bytes` to `dataset_upload_session`, plus constraints to keep them valid.
- Harden `uploadPart` against size bypasses by computing the expected part size from the stored session metadata and rejecting any request whose `Content-Length` does not exactly match the expected size (including the final part).
- Send `fileSizeBytes` and `partSizeBytes` when initializing multipart uploads.
- Add a migration (`sql/updates/18.sql`) to apply the schema change on existing deployments.

Any related issues, documentation, discussions?
Close #4147
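
For readers who want to see the size rules from the change list in concrete form, the following is a minimal sketch, not the code in `DatasetResource.scala`. The object name, method names, error strings, and the 1-based part numbering are all assumptions made for illustration. Only the idea comes from the PR description: reject at init when the declared size exceeds the configured MiB limit, and derive each part's expected size from the stored `fileSizeBytes` and `partSizeBytes`.

```scala
// Hypothetical sketch: names and messages are illustrative, not Texera's API.
object MultipartSizeCheck {
  private val BytesPerMiB: Long = 1024L * 1024L

  /** Number of parts needed to cover the file (ceiling division without overflow). */
  def numParts(fileSizeBytes: Long, partSizeBytes: Long): Long =
    fileSizeBytes / partSizeBytes + (if (fileSizeBytes % partSizeBytes == 0) 0L else 1L)

  /** Init-time check: Left(reason) on rejection, Right(number of parts) otherwise. */
  def validateInit(
      fileSizeBytes: Long,
      partSizeBytes: Long,
      maxSizeMiB: Long
  ): Either[String, Long] = {
    val maxBytes = Math.multiplyExact(maxSizeMiB, BytesPerMiB)
    if (fileSizeBytes <= 0) Left("fileSizeBytes must be positive")
    else if (partSizeBytes <= 0) Left("partSizeBytes must be positive")
    else if (fileSizeBytes > maxBytes) Left(s"declared size exceeds $maxSizeMiB MiB")
    else Right(numParts(fileSizeBytes, partSizeBytes))
  }

  /** Expected byte count for a part (assumed 1-based); None if the part number is out of range. */
  def expectedPartSize(fileSizeBytes: Long, partSizeBytes: Long, partNumber: Long): Option[Long] = {
    val total = numParts(fileSizeBytes, partSizeBytes)
    if (partNumber < 1 || partNumber > total) None
    else if (partNumber < total) Some(partSizeBytes)
    // The final part carries whatever remains after the full-size parts.
    else Some(fileSizeBytes - (total - 1) * partSizeBytes)
  }
}
```

Under these assumptions, a 10 MiB file declared with 4 MiB parts yields three parts, and the final part is expected to be exactly 2 MiB. A request for part 4 has no expected size and would be rejected.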
How was this PR tested?
Added/updated unit tests for multipart upload validation and malicious cases, including:
- `Content-Length` mismatch rejection (non-numeric/overflow/mismatch)

Was this PR authored or co-authored using generative AI tooling?
Co-authored-by: ChatGPT
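
The `Content-Length` cases named in the testing notes (non-numeric, overflow, mismatch) could be handled by a helper along the following lines. This is a sketch under assumptions: `ContentLengthCheck`, `parseContentLength`, and `matchesExpected` are hypothetical names, and the PR's actual parsing logic is not shown in this conversation.

```scala
import scala.util.Try

// Hypothetical sketch: strict Content-Length validation against an expected part size.
object ContentLengthCheck {

  /** Parses a header value made only of ASCII digits; None for non-numeric or overflowing input. */
  def parseContentLength(raw: String): Option[Long] = {
    val s = raw.trim
    if (s.isEmpty || !s.forall(c => c >= '0' && c <= '9')) None
    else Try(s.toLong).toOption // toLong throws NumberFormatException on overflow
  }

  /** True only when the header is present, well formed, and equal to the expected size. */
  def matchesExpected(header: Option[String], expectedBytes: Long): Boolean =
    header.flatMap(parseContentLength).contains(expectedBytes)
}
```

Treating a missing or malformed header the same way as a mismatch keeps the check fail-closed: the upload proceeds only when the declared length equals the expected part size exactly.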