We have a dataset whose 15 files have been replaced across 200 versions (200 x 15 dvobjects in total; only 15 files in each version). We already know that datasets with many versions are expensive to operate on. So when a user attempts a full download via "Access Dataset" on the dataset page, there is a delay of roughly 10 seconds before the redirect to the zipper is issued.
However, if a user attempts the same download via /api/access/dataset/..., the delay before the redirect grows to roughly 15 * 10 seconds. On close inspection, the increase comes from this call in the multi-file download method, which is repeated for every file:
GuestbookResponse gbr = guestbookResponseService.initAPIGuestbookResponse(file.getOwner(), file, session, apiTokenUser);
guestbookResponseService.save(gbr);
The method in FileDownloadService creates the same guestbook entries when the download is initiated from the page, without causing this delay, so there has to be an easy way to avoid it in the API as well.
As described above, this scenario involves the external zipper. Without the zipper in place, writing these guestbook entries costs the same amount of extra time. The difference is that when the application zips the content itself, the guestbook entry for each file is generated as that file is being streamed. The performance would still be atrocious, but there would be no single continuous idle wait long enough to cause a timeout on the load balancer, as happened on our production instance.
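To make the difference between the two delay patterns concrete, here is a rough back-of-the-envelope model in Java. It is only a sketch: the class and method names are hypothetical and not part of the application, the 60-second load balancer idle timeout is an assumed value, and the 15 files and ~10 seconds per guestbook entry are the approximate figures from this report. Time spent actually streaming file content is ignored.

```java
// Illustrative model only; none of these names exist in the application.
final class IdleWaitModel {

    // API path with the external zipper: every guestbook entry is written
    // before the redirect is issued, so the client sees one continuous wait.
    static int upFrontIdleSeconds(int fileCount, int secondsPerEntry) {
        return fileCount * secondsPerEntry;
    }

    // In-application zipping: an entry is written as each file is streamed,
    // so the longest silent gap is the cost of a single entry.
    static int longestGapWhenStreamingSeconds(int fileCount, int secondsPerEntry) {
        return fileCount > 0 ? secondsPerEntry : 0;
    }

    // True when a silent period is longer than the idle timeout.
    static boolean exceedsIdleTimeout(int idleSeconds, int timeoutSeconds) {
        return idleSeconds > timeoutSeconds;
    }

    public static void main(String[] args) {
        int files = 15;           // from the report
        int perEntry = 10;        // approximate, from the report
        int assumedTimeout = 60;  // ASSUMPTION: the real load balancer value is not stated

        int upFront = upFrontIdleSeconds(files, perEntry);
        int gap = longestGapWhenStreamingSeconds(files, perEntry);

        System.out.println("Up-front idle wait (s): " + upFront
                + ", exceeds timeout: " + exceedsIdleTimeout(upFront, assumedTimeout));
        System.out.println("Longest gap when streaming (s): " + gap
                + ", exceeds timeout: " + exceedsIdleTimeout(gap, assumedTimeout));
    }
}
```

The total time spent writing guestbook entries is the same in both cases; only its distribution changes, which is why the zipper redirect path is the one that hits the load balancer timeout.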