
fixed model output function when computing gradients in float16 #36

Merged
kristian-georgiev merged 3 commits into 0.2.0 from 0.2.0_float16 on May 31, 2023
Conversation

@AlaaKhaddaj

When computing margins in the image_classification task, the default dtype of ch.tensor(-ch.inf) is float32. This causes a dtype mismatch when the model's gradients and outputs are computed in float16.
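A minimal sketch of the mismatch and the fix (shapes, variable names, and the standalone masking step are illustrative, not TRAK's actual model output function):

```python
import torch as ch

# Simulated float16 model outputs, as produced when gradients are
# computed in half precision.
logits = ch.randn(4, 10, dtype=ch.float16)
labels = ch.tensor([0, 3, 7, 9])
bindex = ch.arange(logits.shape[0])

cloned = logits.clone()

# Buggy version: ch.tensor(-ch.inf) defaults to float32, but cloned is
# float16, so the indexed assignment raises a dtype-mismatch RuntimeError.
try:
    cloned[bindex, labels] = ch.tensor(-ch.inf)
    raised = False
except RuntimeError:
    raised = True

# Fixed version: create the scalar with the logits' own dtype and device.
cloned[bindex, labels] = ch.tensor(
    -ch.inf, dtype=logits.dtype, device=logits.device
)
```

After the fix, the correct-class entries are masked to -inf while the tensor stays in float16, so downstream margin computations proceed without a dtype error.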

@kristian-georgiev
Member

Great catch, thanks!

@kristian-georgiev kristian-georgiev merged commit e18838a into 0.2.0 May 31, 2023
@kristian-georgiev kristian-georgiev deleted the 0.2.0_float16 branch May 31, 2023 17:12
kristian-georgiev added a commit that referenced this pull request Jun 1, 2023
* clean up old nb

* trak scores quickstart fig

* clean up quickstart

* minor docs updates

* no-op projector

* bump version

* test for scoring in shards

* test for featurizing in shards

* tie experiment name to scoring targets; simplify saver; add logging

* support dataset sharding during featurizing and scoring

* save scores as mmap

* migrate to torch.func

* bump torch dep requirement to 2.0.0 bc of torch.func

* project and store in float16 by default

* test autocast vs .half() on the model with functional_call

* test_install function

* minor edits in tests and install docs

* pass in an instance of a class for tasks, rather than init inside of gradientcomputer

* bug fix

* normalization factor for numerical stability

* fixed model output function when computing gradients in float16 (#36)

* fixed model output function when computing gradients in float16

* also fix for text clsf MOF

* instantiate on device directly

---------

Co-authored-by: alaakh <alaakh@mit.edu>
Co-authored-by: Kristian Georgiev <krisgrg@mit.edu>

* _is_featurized array

* handle pre-emption for featurizing

* vectorize without stacking to save memory

* add assertion to load ckpt

* python >=3.8 for pytorch 2.0

* make it easy to use GPU with smaller cuda mem

* pytest cuda markers

* fix CLIP modelout function

* bring back iter gradient computer

---------

Co-authored-by: alaakh <alaakh@mit.edu>
