blog: remote optimization post#1451

Closed
pmrowla wants to merge 2 commits into treeverse:master from pmrowla:blog-remote-optimization

Contributor

@pmrowla pmrowla commented Jun 19, 2020

You may disregard these recommendations if you used the Edit on GitHub button from dvc.org to improve a doc in place.

❗ Please read the guidelines in the Contributing to the Documentation list if you make any substantial changes to the documentation or JS engine.

🐛 Please make sure to mention Fix #issue (if applicable) in the description of the PR. This causes GitHub to close it automatically when the PR is merged.

Please choose to allow us to edit your branch when creating the PR.

Thank you for the contribution - we'll try to review it as soon as possible. 🙏

Initial draft for the remote optimization write up

TODO

  • improve introduction
  • needs conclusion
  • update placeholder image
  • update placeholder date

@pmrowla pmrowla self-assigned this Jun 19, 2020
Contributor Author

pmrowla commented Jun 19, 2020

Not sure if the initial draft is too in depth/technical.

@andronovhopf I'd appreciate it if you can take a look at this and give some suggestions on how to make it more interesting/applicable for users from an ML perspective

@@ -0,0 +1,174 @@
---
title: Optimizing DVC Remotes
date: 2020-06-29

Contributor Author

placeholder date

Contributor

I think this can be whatever you prefer.

date: 2020-06-29
description: |
  An overview of how syncing data to and from remote storage is optimized in DVC.
picture: 2020-05-04/owl.png

Contributor Author

placeholder image

Contributor

I have the impression if you leave it blank it uses a default img BTW.

Contributor Author

I was having issues running the dev server (via yarn develop) when picture was unset, maybe that's just some problem with my local environment though?

Contributor

TBH I'm not sure exactly how the blog engine works! You can create a bug report though and Ivan or Roger will probably answer to that 🙂

@@ -0,0 +1,174 @@
---
title: Optimizing DVC Remotes

Contributor Author

probably needs a more interesting title

Contributor

From the intro, I think this post is more about "Optimization improvements in DVC 1.0".

@skshetry skshetry added A: docs Area: user documentation (gatsby-theme-iterative) and removed A: docs Area: user documentation (gatsby-theme-iterative) labels Jun 19, 2020
Comment on lines +7 to +13
author: peter_rowlands
---

One of the key features provided by DVC is the ability to efficiently sync
versioned datasets between a user's local machine and
[remote storage](https://dvc.org/doc/command-reference/remote), and version 1.0
includes several performance optimizations related to syncing data with remotes.

Contributor

I would start if possible with something like "Our users have presented the need for optimizing remotes blah blah" and give some examples e.g. Discord message screenshots.

Contributor

Also, minor: I personally prefer "synchronizing" or "syncing". The pronunciation of the latter is questionable, no?

Contributor

jorgeorpinel left a comment

Quick review of blog intro. Some of these suggestions can probably be applied to other places in the blog.

Comment thread content/blog/2020-06-29-optimizing-dvc-remotes.md Outdated
Comment thread content/blog/2020-06-29-optimizing-dvc-remotes.md Outdated
Comment thread content/blog/2020-06-29-optimizing-dvc-remotes.md Outdated
3. Determine the difference between the two sets of files

Commonly used cloud sync utilities, such as [rclone](https://rclone.org/), must
be generalized to support any arbitrary file structure, which can come at the

Contributor

Suggested change
- be generalized to support any arbitrary file structure, which can come at the
+ be generalized to support any file structure, which can come at the

Comment thread content/blog/2020-06-29-optimizing-dvc-remotes.md Outdated
Comment on lines +32 to +35
operations (i.e. `status -c`,
[push](https://dvc.org/doc/command-reference/push),
[pull](https://dvc.org/doc/command-reference/pull),
[fetch](https://dvc.org/doc/command-reference/fetch)). In DVC version 1.0, these

Contributor

Suggested change
- operations (i.e. `status -c`,
- [push](https://dvc.org/doc/command-reference/push),
- [pull](https://dvc.org/doc/command-reference/pull),
- [fetch](https://dvc.org/doc/command-reference/fetch)). In DVC version 1.0, these
+ operations (i.e. `dvc status -c`,
+ `dvc push`,
+ `dvc pull`,
+ `dvc fetch`). In DVC version 1.0, these

Contributor

jorgeorpinel commented Jun 19, 2020

@pmrowla very nice! Please note on this repo we don't mind if you push a branch directly to upstream, in fact that's usually better because it fires up a review app automatically. I created one manually for this PR, you can see your post here: https://dvc-landing-blog-remote-uhiudf.herokuapp.com/blog/optimizing-dvc-remotes Cheers

@shcheklein shcheklein temporarily deployed to dvc-landing-blog-remote-uhiudf June 22, 2020 05:45 Inactive
@pmrowla pmrowla closed this Jun 22, 2020
@pmrowla pmrowla deleted the blog-remote-optimization branch June 22, 2020 05:48
@pmrowla pmrowla mentioned this pull request Jun 22, 2020
4 tasks
@shcheklein shcheklein temporarily deployed to dvc-landing-blog-remote-rbug3z June 22, 2020 05:50 Inactive

4 participants