performance: what to do if DVC is slow #4167

@jorgeorpinel

Generalise performance issues, e.g. based on this support case:

I try to dvc pull from my AWS bucket to an Ubuntu machine and it seems really slow, although the internet connection should be fast. On my own Mac the download works well. It seems like Ubuntu is downloading data sequentially while the Mac is able to do it in parallel?

Troubleshooting (1/2) GENERAL

Try the same DVC version on both operating systems.
Does aws s3 cp copy things faster?
Could you also please run dvc version on that Ubuntu machine, inside the repo?
Also run cProfile first and check the results: https://github.com/iterative/dvc/wiki/Debugging,-Profiling-and-Benchmarking-DVC#profiling-dvc
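The comparison suggested above (the same operation through DVC and through another tool such as aws s3 cp, or on two machines) is easier to judge with wall-clock numbers. The following is a minimal sketch, not part of DVC: `time_command` is a hypothetical helper, and the commands in the trailing comment are placeholders to be replaced with the ones relevant to your setup.

```python
import subprocess
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return (exit_code, seconds).

    exit_code is None when the executable cannot be found, so a missing
    tool is reported instead of raising an exception.
    """
    start = time.perf_counter()
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        exit_code = result.returncode
    except FileNotFoundError:
        exit_code = None
    return exit_code, time.perf_counter() - start


# Example use (placeholders; run inside the DVC repo on each machine):
#   for cmd in (["dvc", "version"], ["dvc", "pull"]):
#       code, seconds = time_command(cmd)
#       print(" ".join(cmd), "exit:", code, f"{seconds:.1f}s")
```

Running the same list of commands on both machines gives numbers that can be compared directly, which helps separate a slow network from a slow local step.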


Yesterday I moved the data directory to the Ubuntu computer manually (very quick: downloading took 10 s) and ran dvc add on that directory to test. It was also incredibly slow, so this is clearly not related to AWS.

Attached is the cProfile file for reference (looks like a thread locking issue?).
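A profile dump like the one mentioned above can be inspected with Python's standard pstats module. This is a minimal sketch, assuming the dump is in the standard cProfile format; `top_by_cumulative_time` and the file name in the trailing comment are hypothetical. Time spent waiting on locks typically appears as an entry whose name contains "acquire", though what dominates will depend on the actual profile.

```python
import pstats


def top_by_cumulative_time(dump_path, limit=10):
    """Return up to `limit` rows of (cumulative_s, own_s, location),
    sorted with the most expensive entries first."""
    stats = pstats.Stats(dump_path)
    rows = []
    # stats.stats maps (filename, lineno, function) to
    # (primitive_calls, total_calls, own_time, cumulative_time, callers).
    for (filename, lineno, func_name), entry in stats.stats.items():
        own_time, cumulative_time = entry[2], entry[3]
        rows.append((cumulative_time, own_time, f"{filename}:{lineno}({func_name})"))
    rows.sort(reverse=True)
    return rows[:limit]


# Example use (hypothetical file name):
#   for cumulative, own, location in top_by_cumulative_time("dvc.prof"):
#       print(f"{cumulative:8.2f}s {own:8.2f}s  {location}")
```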

Troubleshooting (2/2) SPECIFIC

Is your repo/workspace located on a network-mounted drive? (Yes.) In that case you may need to configure state.dir and index.dir to point to a local directory.
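The check above can be sketched in code. This is only an illustration under stated assumptions: it targets Linux, where /proc/mounts lists mounted file systems; the set of network file system types is not exhaustive; the helper names and the local directory layout are hypothetical; and whether state.dir and index.dir are honored may depend on the DVC version. The functions only build the dvc config commands, they do not run them.

```python
import os

# File system types commonly used for network mounts (assumption, not exhaustive).
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "smbfs", "smb3", "fuse.sshfs", "ceph", "glusterfs", "9p"}


def find_mount(path, mounts_text):
    """Return (mount_point, fs_type) for the deepest mount containing path.

    mounts_text uses the /proc/mounts layout: device mount_point fs_type ...
    (mount points containing spaces are escaped there and are not handled).
    """
    path = os.path.abspath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest mount point; on ties the later entry wins.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best


def is_network_mount(path, mounts_text):
    """True if path sits on one of the network file system types above."""
    mount = find_mount(path, mounts_text)
    return mount is not None and mount[1] in NETWORK_FS_TYPES


def local_dir_config_commands(local_dir):
    """Build dvc config commands pointing state.dir and index.dir to local_dir.

    The "state" and "index" subdirectory names are placeholders.
    """
    return [
        ["dvc", "config", "--local", "state.dir", os.path.join(local_dir, "state")],
        ["dvc", "config", "--local", "index.dir", os.path.join(local_dir, "index")],
    ]


# Example use on Linux (paths are placeholders):
#   with open("/proc/mounts") as f:
#       if is_network_mount(os.getcwd(), f.read()):
#           for cmd in local_dir_config_commands("/var/tmp/dvc"):
#               print(" ".join(cmd))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; they can then be run by hand inside the repo.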


That worked for dvc add ✅, but dvc pull ❌ still has the same issue as before...
Actually, even dvc pull ✅ now works as expected.

Labels: A: docs (Area: user documentation, gatsby-theme-iterative); C: guide (Content of /doc/user-guide); p2-nice-to-have (Less of a priority at the moment. We don't usually deal with this immediately.)
