fixed model output function when computing gradients in float16 #36
Merged
kristian-georgiev merged 3 commits into 0.2.0 on May 31, 2023
Conversation
Member: Great catch, thanks!
kristian-georgiev added a commit that referenced this pull request on Jun 1, 2023
* clean up old nb
* trak scores quickstart fig
* clean up quickstart
* minor docs updates
* no-op projector
* bump version
* test for scoring in shards
* test for featurizing in shards
* tie experiment name to scoring targets; simplify saver; add logging
* support dataset sharding during featurizing and scoring
* save scores as mmap
* migrate to torch.func
* bump torch dep requirement to 2.0.0 bc of torch.func
* project and store in float16 by default
* test autocast vs .half() on the model with functional_call
* test_install function
* minor edits in tests and install docs
* pass in an instance of a class for tasks, rather than init inside of gradientcomputer
* bug fix
* normalization factor for numerical stability
* fixed model output function when computing gradients in float16 (#36)
  * fixed model output function when computing gradients in float16
  * also fix for text clsf MOF
  * instantiate on device directly
  ---------
  Co-authored-by: alaakh <alaakh@mit.edu>
  Co-authored-by: Kristian Georgiev <krisgrg@mit.edu>
* _is_featurized array
* handle pre-emption for featurizing
* vectorize without stacking to save memory
* add assertion to load ckpt
* python >=3.8 for pytorch 2.0
* make it easy to use GPU with smaller cuda mem
* pytest cuda markers
* fix CLIP modelout function
* bring back iter gradient computer
---------
Co-authored-by: alaakh <alaakh@mit.edu>
When computing the margins from the `image_classification` task, the default dtype of `ch.tensor(-ch.inf)` is float32. This leads to a datatype mismatch if the model gradients and output were computed in float16.
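For illustration, here is a minimal sketch of the kind of margin computation described above. It is not the library's actual model output function: the helper names `mask_correct_class` and `margin_output` are hypothetical, and the sketch assumes the margin is the correct-class logit minus the log-sum-exp of the remaining logits. The point it shows is that the `-inf` fill value is created with the same dtype and device as the logits, instead of relying on the float32 default.

```python
import torch as ch


def mask_correct_class(logits: ch.Tensor, labels: ch.Tensor) -> ch.Tensor:
    """Hypothetical helper: copy `logits` and set each row's correct-class
    entry to -inf, so a later logsumexp ignores that class."""
    bindex = ch.arange(logits.shape[0], device=logits.device)
    masked = logits.clone()
    # ch.tensor(-ch.inf) alone would be float32 by default. Building the fill
    # value with the logits' own dtype and device avoids a mismatch when the
    # logits are float16.
    fill = ch.tensor(-ch.inf, device=logits.device, dtype=logits.dtype)
    masked[bindex, labels] = fill
    return masked


def margin_output(logits: ch.Tensor, labels: ch.Tensor) -> ch.Tensor:
    """Hypothetical sketch of a margin-style output (assumed form):
    sum over the batch of correct logit minus logsumexp of the other logits."""
    bindex = ch.arange(logits.shape[0], device=logits.device)
    logits_correct = logits[bindex, labels]
    masked = mask_correct_class(logits, labels)
    margins = logits_correct - masked.logsumexp(dim=-1)
    return margins.sum()
```

With float16 logits, `mask_correct_class` keeps the result in float16 because the fill value inherits `logits.dtype`. Depending on the PyTorch version and device, `logsumexp` on float16 tensors may only be available on GPU, so `margin_output` is best tried with float32 inputs on CPU.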