Conversation

@tcnichol
Contributor

@tcnichol tcnichol commented Jun 4, 2022

The backend endpoint currently works. You can upload a zip and it will create a dataset with folders and subfolders.
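One way to picture the folder creation step is to derive the folder hierarchy from the entry names in the archive. The following is only a rough sketch using the standard `zipfile` module, not the code in this PR; `folder_paths` is a hypothetical helper name.

```python
import zipfile
from pathlib import PurePosixPath


def folder_paths(archive: zipfile.ZipFile) -> list[str]:
    """Return every folder implied by the archive's entries, parents first."""
    folders: set[str] = set()
    for info in archive.infolist():
        path = PurePosixPath(info.filename)
        # A file entry contributes only its parent folders;
        # an explicit directory entry also contributes itself.
        candidates = list(path.parents)
        if info.is_dir():
            candidates.append(path)
        for candidate in candidates:
            if str(candidate) != ".":
                folders.add(str(candidate))
    # Sort by depth so a parent folder is always created before its children.
    return sorted(folders, key=lambda p: (p.count("/"), p))
```

Each returned path could then be turned into a folder record before the files themselves are attached to it.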

Right now the zip file is unzipped in the working directory. That should probably change, but I'm not sure what the best practice would be.

Not connected to the front end.

tcnichol added 2 commits June 4, 2022 16:03
creates dataset with folders and subfolders

not sure if this is a good way of handling unzipping files
@tcnichol tcnichol requested review from lmarini and max-zilla June 4, 2022 22:19
@tcnichol tcnichol linked an issue Jun 4, 2022 that may be closed by this pull request
@tcnichol
Contributor Author

tcnichol commented Jun 5, 2022

The backend endpoint works, but two things are still needed.

  1. This isn't added to the frontend.
  2. Right now the zip file is unzipped in the backend's current working directory, the files are uploaded from there, and once the dataset is created both the zip and the unzipped folder are deleted. This does not seem like best practice, so please comment if you know a better solution.
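One commonly used alternative to extracting into the working directory is a temporary directory that cleans itself up. This is a minimal sketch, assuming the whole archive is available as bytes; `extract_to_temp` is a hypothetical name, and the upload step is only marked by a comment.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def extract_to_temp(zip_bytes: bytes) -> list[str]:
    """Extract an archive into a throwaway directory and list the extracted files."""
    with tempfile.TemporaryDirectory(prefix="dataset-zip-") as tmp:
        root = Path(tmp)
        with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
            archive.extractall(root)
        # The files would be uploaded here, while the directory still exists.
        extracted = sorted(
            p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
        )
    # Leaving the `with` block deletes the directory and everything in it,
    # even if an exception was raised during extraction or upload.
    return extracted
```

With this approach nothing is written to the backend's working directory, and no explicit delete step is needed after the dataset is created.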

@max-zilla
Contributor

I updated this a little:

  • created a shared function to add a file to Mongo and Minio, so we don't duplicate the somewhat complex code between uploading a file and uploading from a dataset
  • modified the iteration to unzip only one file from the archive at a time. I tried to get it fully streamed from the request but no luck yet; this is slightly less vulnerable, however.
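The one-file-at-a-time iteration can be sketched roughly as below. This is an illustration with the standard `zipfile` module, not the merged code; `upload_members` is a hypothetical name, and `store` stands in for whatever shared function writes a file to Mongo and Minio.

```python
import zipfile
from typing import IO, Callable


def upload_members(zip_stream: IO[bytes], store: Callable[[str, IO[bytes]], None]) -> int:
    """Hand each file in the archive to `store`, one member at a time."""
    count = 0
    with zipfile.ZipFile(zip_stream) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries carry no content to upload
            # archive.open() returns a file-like object that decompresses
            # this member as it is read; nothing is extracted to disk.
            with archive.open(info) as member:
                store(info.filename, member)
            count += 1
    return count
```

Note that `zipfile` needs a seekable source, because the archive's index sits at the end of the file, which is one reason fully streaming from the request body is harder.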

Tested round-trip upload zip -> download zip and it seems to work as intended. Gonna test a couple more edge cases before merge.

@max-zilla max-zilla merged commit 4366000 into main Jul 27, 2022
@max-zilla max-zilla deleted the 12-create-dataset-from-zip branch July 27, 2022 15:12

Development

Successfully merging this pull request may close these issues.

create dataset from zip

3 participants