June gems #1510

Merged: shcheklein merged 5 commits into master from june_gems on Jun 30, 2020


@elleobrien
Contributor

June Gems are here. Going to try to get them out while it's still June! :)

@shcheklein temporarily deployed to dvc-landing-june-gems-fvxu7n5o June 29, 2020 19:26 Inactive
@elleobrien requested a review from shcheklein June 29, 2020 19:26
Comment thread content/blog/2020-06-29-june-20-community-gems copy.md
consuming, and the dependencies and outputs haven't changed. You can use the
`--no-exec` flag to get around this:

`` ``` ``
Contributor

add dvc

Contributor

add `$` before the command, here and in other places

Contributor

there are some bugs like this in the previous Gems btw

Contributor Author

good catch. might be good to revise previous gems then too

Comment thread content/blog/2020-06-29-june-20-community-gems copy.md

_Just like this but with technical documentation._

### Q: After I pushed my local data to remote S3 storage, I noticed the file names are different in S3- they're hash values. [Can I make them more meaningful names?](https://discord.com/channels/485586884165107732/563406153334128681/717737163122540585)
Contributor

no need to mention S3 - we can generalize it

Contributor

also, it would be great to briefly provide motivation, e.g. deduplication, security (files are immutable), GitFlow, etc.

In addition to `dvc list`, mention the data registry article and/or other commands: `dvc get`, `dvc import`, and the Python `dvc.api`. All of them provide a holistic data access layer for DVC-tracked objects (files, ML models, directories), which can usually be used as a drop-in replacement for regular data access libraries (e.g. boto or the AWS CLI, in the case of S3)

Contributor Author

OK, I have developed this answer more in the next version. Let me know what you think

Comment thread content/blog/2020-06-29-june-20-community-gems copy.md Outdated
Comment thread content/blog/2020-06-29-june-20-community-gems copy.md Outdated
Comment thread content/blog/2020-06-29-june-20-community-gems copy.md Outdated
@shcheklein temporarily deployed to dvc-landing-june-gems-fvxu7n5o June 29, 2020 21:46 Inactive
@elleobrien
Contributor Author

@shcheklein revisions are pushed

### Q: After I pushed my local data to remote storage, I noticed the file names are different in my storage repository: they're hash values. [Can I make them more meaningful names?](https://discord.com/channels/485586884165107732/563406153334128681/717737163122540585)

No, but for a good reason! What you're seeing are cached files, and they're
stored in a special format that makes DVC versioning and addressing possible-
Contributor

format -> way (we don't change format, it might confuse some folks). CSV stays CSV, we only change its name

Contributor Author

good point

Contributor Author

i'm going to say "naming convention"

Contributor Author

@elleobrien, Jun 29, 2020

This has been updated in the latest commit
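
To illustrate the point made in this thread (the file itself is untouched, only its name changes), here is a minimal sketch of content-addressed naming. It assumes an MD5 digest split into a two-character directory plus the remaining characters, which resembles the DVC cache layout at the time but is stated here as an assumption. `cache_path_for` and `store` are hypothetical helper names, not part of DVC's API.

```python
import hashlib
from pathlib import Path


def cache_path_for(data: bytes, cache_root: Path) -> Path:
    """Derive a storage path from the content itself (hypothetical helper)."""
    digest = hashlib.md5(data).hexdigest()  # 32 hex characters
    # Assumed layout: the first two characters become a directory name.
    return cache_root / digest[:2] / digest[2:]


def store(data: bytes, cache_root: Path) -> Path:
    """Write data under its content-derived name; identical content is stored once."""
    target = cache_path_for(data, cache_root)
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():
        # The bytes are written unchanged: a CSV stays a CSV, only the name differs.
        target.write_bytes(data)
    return target
```

Under these assumptions, storing the same bytes twice resolves to the same path, so duplicates cost nothing, and any change to the content produces a different name. That is roughly why the names in remote storage are hash values rather than human-readable ones.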

@restyled-io (bot) mentioned this pull request Jun 29, 2020
@shcheklein temporarily deployed to dvc-landing-june-gems-fvxu7n5o June 29, 2020 23:04 Inactive
@elleobrien
Contributor Author

I think all issues are addressed. Aiming to publish tomorrow AM so let's merge then?

Co-authored-by: Restyled.io <commits@restyled.io>
@shcheklein temporarily deployed to dvc-landing-june-gems-fvxu7n5o June 29, 2020 23:33 Inactive
@shcheklein merged commit 194d5e2 into master Jun 30, 2020
Contributor

@casperdcl left a comment

minor comments

DVC cache and remote if the contents of your dataset change frequently.

Generally, we would recommend first trying a plain unzipped directory. DVC is
designed to work with large numbers of files (on the order of millions) and has
Contributor

extra has :)

must set the `endpointurl` too. For example:

```dvc
$ dvc remote add -d myremote s3://mybucket/path/to/dir
```
Contributor

Isn't it best to use long flag names for commands in documentation, i.e. `--default` rather than `-d`? Otherwise people may accidentally change their default remote when copy-pasting this command.
