Optimize Zipping Process on the backend #6505

@djbrooke

Description

We currently cap zip file downloads at a maximum size (controlled by a setting) because zipping large files through Glassfish performs poorly. We should investigate:

  • how to store files so that on-demand zipping is not needed,
  • how to modularize the zipping functionality so that it does not tax the application server,
  • some other option.

This will improve system stability and avoid the issue where a user downloads a zip file and only finds out after the fact that it is not a complete archive (because the download exceeds the limit).
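
To make the size-limit problem concrete, here is a minimal Python sketch of a zipper that streams files into an output stream in chunks, stops adding files once a cap is reached, and records what it left out in a manifest entry so the user can tell the archive is incomplete. This is an illustration only, not how the application currently does it: the function name `stream_zip`, the `(name, size, stream)` input shape, and the `MANIFEST.TXT` entry name are all assumptions made for the example.

```python
import io
import shutil
import zipfile
from typing import BinaryIO, Iterable, List, Tuple


def stream_zip(
    files: Iterable[Tuple[str, int, BinaryIO]],
    out: BinaryIO,
    max_total_bytes: int,
) -> List[str]:
    """Write files into a zip on `out`, skipping any that would exceed the cap.

    `files` yields hypothetical (name, size_in_bytes, readable_stream) tuples.
    Returns the names of the files that were skipped.
    """
    skipped: List[str] = []
    total = 0
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, size, stream in files:
            if total + size > max_total_bytes:
                # Over the cap: remember the file instead of silently dropping it.
                skipped.append(name)
                continue
            # Copy in chunks so a large file is never held fully in memory.
            # force_zip64 allows entries larger than 2 GiB.
            with archive.open(name, "w", force_zip64=True) as entry:
                shutil.copyfileobj(stream, entry)
            total += size
        if skipped:
            # Tell the user inside the archive that it is incomplete.
            manifest = "Files omitted because the size limit was exceeded:\n"
            manifest += "\n".join(skipped) + "\n"
            archive.writestr("MANIFEST.TXT", manifest)
    return skipped
```

Because the function only needs a writable stream, it could in principle run in a separate process or service rather than inside the application server; that split is the design question this issue raises, and the sketch does not settle it.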

For S3, it will be interesting to architect this with an eye towards cost. If we store zipped versions of the dataset so that we can provide zips without processing time, we incur storage cost. If we instead optimize the zipping so that it takes place on demand, we incur some computation cost on Lambda, Fargate, or something similar. Not sure which is preferred.
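
One rough way to frame the storage-versus-compute question is a break-even calculation. The Python sketch below is a simplification under stated assumptions: every parameter (prices, zip size, seconds per zip) is a placeholder to be filled in from real measurements and real pricing, and it ignores data transfer and request charges entirely.

```python
def monthly_cost_prezipped(zip_size_gb: float, storage_price_per_gb_month: float) -> float:
    """Cost of keeping a pre-built zip in storage for one month."""
    return zip_size_gb * storage_price_per_gb_month


def monthly_cost_on_demand(
    downloads_per_month: float,
    seconds_per_zip: float,
    compute_price_per_second: float,
) -> float:
    """Cost of building the zip from scratch on every download."""
    return downloads_per_month * seconds_per_zip * compute_price_per_second


def break_even_downloads(
    zip_size_gb: float,
    storage_price_per_gb_month: float,
    seconds_per_zip: float,
    compute_price_per_second: float,
) -> float:
    """Downloads per month at which both approaches cost the same.

    Below this number on-demand zipping is cheaper; above it, storing
    the zip is cheaper (under the simplifications named above).
    """
    storage = monthly_cost_prezipped(zip_size_gb, storage_price_per_gb_month)
    return storage / (seconds_per_zip * compute_price_per_second)
```

For datasets that are downloaded rarely, a calculation like this would tend to favor on-demand zipping, and for frequently downloaded ones it would tend to favor stored zips, but the actual answer depends on numbers this issue does not provide.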
