pull from external repo: support --j to limit the number of parallel connections #3396

@pommedeterresautee

We are introducing DVC in our company and were quite happy with it until we started using it on a large project containing a few hundred thousand files, approximately 300 GB in total.
We use S3 as storage.
When someone from our team ran dvc pull on this project, it consumed the entire internet bandwidth of our office.

We tried to mitigate the issue by limiting the number of concurrent jobs to 1 (option -j 1), but it was not enough.
Our IT Ops team told us that dvc opened hundreds of concurrent connections to download files from our S3 bucket, which explains how it was able to consume most of the bandwidth.

Is there an option other than --jobs that we should use to limit the number of parallel connections?
Is there an existing workaround for this situation?
