Reduce memory usage in TF building#24046
Conversation
The documentation is not available anymore as the PR was closed or merged.

Let me run it on CI and see.
amyeroberts left a comment
Change LGTM - thanks for updating!
Happy to merge once @ydshieh gives the 👍 from CI runs
```diff
 for key, spec in sig.items():
-    # 3 is the most correct arbitrary size. I will not be taking questions
     dummies[key] = tf.ones(shape=[dim if dim is not None else 3 for dim in spec.shape], dtype=spec.dtype)
+    # 2 is the most correct arbitrary size. I will not be taking questions
```
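For context, the loop in the diff above fills every unspecified dimension of each input spec with a small arbitrary size before calling `tf.ones`. A minimal runnable sketch of that logic, using NumPy in place of TensorFlow and a hypothetical `sig` dict (the signature names and dtypes are illustrative assumptions, not the PR's actual code):

```python
import numpy as np

# Hypothetical input signature: None marks a free dimension, as in a
# tf.TensorSpec shape. Names and dtypes here are illustrative only.
sig = {
    "input_ids": {"shape": (None, None), "dtype": np.int64},
    "attention_mask": {"shape": (None, None), "dtype": np.int64},
}

dummies = {}
for key, spec in sig.items():
    # Fill every unknown dimension with 2 - the new arbitrary default
    shape = [dim if dim is not None else 2 for dim in spec["shape"]]
    dummies[key] = np.ones(shape, dtype=spec["dtype"])

print(dummies["input_ids"].shape)  # (2, 2)
```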
I wish to file this diff as evidence to the contrary #team3
Sorry for the delay - there's an issue with Funnel that wasn't reproducing on my machine. I eventually figured out that the problem is the classic TF one: indices for

I also tried to run the change in this PR, and got the same failure and 5 other ones (probably due to the above one). @Rocketknight1 I think we will have to iterate (change->run->change->run) a bit more before we merge.

Yep, working on it now!

I am OK to trigger the run (a subset) whenever you feel it's time. Otherwise I can show you a modified workflow file for you to trigger manually.
@ydshieh the issues with Funnel have been resolved, so this should be ready for a CI run now!

You can watch it live here. It will take 20-30 min to finish.

Looks like they're still failing even with very small dummies. I'll investigate those models and try to figure out why - the new dummies should be smaller than the old ones!
Maybe this is a sign that we should transition the dummies to symbolic tensors for those models, even if it's probably too slow for our tests to do it across the whole codebase. |
* Make the default dummies (2, 2) instead of (3, 3)
* Fix for Funnel
* Actually fix Funnel
This PR reduces the default shape of dummy inputs from (3, 3) to (2, 2). This slightly reduces the memory usage when building TF models, which should hopefully fix some of our pipeline tests.
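Back-of-the-envelope arithmetic (mine, not from the PR): a (2, 2) dummy carries 4 elements versus 9 for a (3, 3) one, so the dummy batch that flows through the model at build time is a bit more than half the size:

```python
# Element counts of the default dummy inputs before and after this change.
old_elems = 3 * 3   # (3, 3) dummy
new_elems = 2 * 2   # (2, 2) dummy

reduction = 1 - new_elems / old_elems
print(old_elems, new_elems)                     # 9 4
print(f"{reduction:.0%} fewer dummy elements")  # 56% fewer dummy elements
```

Activation memory during the forward pass scales with the number of dummy tokens, which is why shrinking the default shape helps the memory-constrained pipeline tests.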
We could replace the dummy inputs with symbolic tensors, which would mean we could build TF models with 0 memory usage, but this would make TF model building slower (~4X) because it would implicitly compile the model when building, which is probably not an acceptable tradeoff.
cc @ydshieh and @amyeroberts as core maintainers