Skip to content

Add chunking to put_rows and delete_rows#90

Merged
JackWilb merged 2 commits intomainfrom
chunked-upload
Jan 14, 2022
Merged

Add chunking to put_rows and delete_rows#90
JackWilb merged 2 commits intomainfrom
chunked-upload

Conversation

@JackWilb
Copy link
Copy Markdown
Member

Depends #89

Does this PR close any open issues?

Closes #85

Give a longer description of what this PR addresses and why it's needed

This chunks the upload using hints from #85. I used almost exactly the same syntax, while making sure to not use the readonly user. The upload runs with chunks of 5000 rows, which may be too small.

Provide pictures/videos of the behavior before and after these changes (optional)

N/A

Are there any additional TODOs before this PR is ready to go?

No.

@JackWilb JackWilb requested review from jjnesbitt and waxlamp January 13, 2022 21:07
@waxlamp waxlamp removed their request for review January 13, 2022 21:18
@waxlamp
Copy link
Copy Markdown
Contributor

waxlamp commented Jan 13, 2022

Removed myself from the reviewer list--please go with Jake's recommendations.

Copy link
Copy Markdown
Member

@jjnesbitt jjnesbitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there's an import failure in a sub-dependency, I can look into fixing that (unless you know what to do @JackWilb). Otherwise, I think this PR looks good.

One thing I'm curious about, is 5000 the right size for DOCUMENT_CHUNK_SIZE? It's pretty arbitrary, although I think any number we set will be arbitrary.

@JackWilb
Copy link
Copy Markdown
Member Author

Looks like there's an import failure in a sub-dependency, I can look into fixing that (unless you know what to do @JackWilb). Otherwise, I think this PR looks good.

CircleCI was trying to use django 4, so I just rebased onto main after #89. Now the tests are failing because of #91

One thing I'm curious about, is 5000 the right size for DOCUMENT_CHUNK_SIZE? It's pretty arbitrary, although I think any number we set will be arbitrary.

Yeah, I'm not sure where the sweet spot is. Let's leave it at this for now and if we need to do some fine-tuning in the future, we can.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Large table uploads fail

3 participants