perf: speed up uuid column generation #11209
Codecov Report
@@ Coverage Diff @@
## master #11209 +/- ##
==========================================
- Coverage 65.65% 61.47% -4.18%
==========================================
Files 829 829
Lines 39213 39244 +31
Branches 3593 3593
==========================================
- Hits 25744 24126 -1618
- Misses 13357 14937 +1580
- Partials 112 181 +69
Bigger is not always better here. I tested several batch sizes (100, 200, 500, 1000, 2000) and 200 seemed to work best. This will obviously depend on the machine, so I'm providing a way to override it via environment variables.
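A minimal sketch of the batching idea described in this comment, not the code from this PR. It uses the standard-library `sqlite3` module so it runs anywhere. The environment variable name `BATCH_SIZE`, the helper `assign_uuids`, and the `id`/`uuid` column layout are assumptions made for illustration.

```python
import os
import sqlite3
import uuid

# Hypothetical variable name; the default of 200 is the value the comment
# reports as working best on the author's machine.
BATCH_SIZE = int(os.environ.get("BATCH_SIZE", 200))


def assign_uuids(conn, table, batch_size=BATCH_SIZE):
    """Fill NULL uuid values in `table`, committing once per batch.

    Returns the number of rows updated. `table` is interpolated into the
    SQL, so it must come from trusted code, never from user input.
    """
    total = 0
    while True:
        # Fetch the next batch of rows that still lack a uuid.
        rows = conn.execute(
            f"SELECT id FROM {table} WHERE uuid IS NULL LIMIT ?", (batch_size,)
        ).fetchall()
        if not rows:
            return total
        # Generate the uuids in Python and write the whole batch at once.
        conn.executemany(
            f"UPDATE {table} SET uuid = ? WHERE id = ?",
            [(uuid.uuid4().bytes, row_id) for (row_id,) in rows],
        )
        conn.commit()
        total += len(rows)
```

Running the migration with, for example, `BATCH_SIZE=500` set in the environment would change the default in this sketch. Committing per batch keeps each transaction small, which is one plausible reason a mid-sized batch can outperform a very large one.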
MySQL will throw an error if type_ is not specified.
betodealmeida
left a comment
This looks great, Jessie! Thanks for improving the perf and making the migration more resilient!
Got to this PR a bit too late. Thanks @ktmud for the fix!
SUMMARY
We have tens of thousands of dashboards, more than 200k slices, and 1.3 million table columns in our Superset deployment. It takes forever to run the db migration for #11098. This PR tries to speed up the migration process by
I also tried to use ThreadPoolExecutor to parallelize the db operations and Python uuid generation, but it doesn't seem to help much.
This PR also fixes certain API errors while downgrading in MySQL and adds the option to adjust the batch size from the command line.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TEST PLAN
Make a copy of your very large database, then try:
ADDITIONAL INFORMATION