
Erm Hyper Init #843

Merged
smilesun merged 26 commits into mhof_dev_merge from erm_hyper_init
Jul 2, 2024
Conversation


@MatteoWohlrapp MatteoWohlrapp commented May 28, 2024

Added functionality to use ERM with the hyperparameter scheduling. As alternatives to adding the hyper_init and hyper_update methods to ERM, we could also add them to the a_model superclass, or check whether the methods exist before invoking them in the scheduler.
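The two alternatives mentioned above could be sketched roughly as follows. Class and method names (AModel, hyper_init, hyper_update, the scheduler) follow the discussion, but all signatures and bodies here are illustrative assumptions, not DomainLab's actual code:

```python
class AModel:
    """Hypothetical superclass sketch: no-op hyperparameter hooks mean every
    subclass, including plain ERM, satisfies the scheduler's interface."""

    def hyper_init(self):
        return {}  # ERM has no hyperparameters to schedule

    def hyper_update(self, epoch):
        pass  # nothing to update for ERM


class Erm(AModel):
    pass  # inherits the no-op hooks, so the scheduler can drive it unchanged


class Scheduler:
    """The other alternative: leave ERM untouched and guard the call site."""

    def step(self, model, epoch):
        if hasattr(model, "hyper_update"):  # duck-typing check
            model.hyper_update(epoch)
```

Either approach avoids special-casing ERM inside the scheduler loop; the superclass variant keeps the call site unconditional, while the hasattr variant keeps the model hierarchy untouched.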


codecov-commenter commented Jun 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.90%. Comparing base (d2ac388) to head (976c25a).

Additional details and impacted files
@@                Coverage Diff                 @@
##           mhof_dev_merge     #843      +/-   ##
==================================================
+ Coverage           90.77%   90.90%   +0.12%     
==================================================
  Files                 137      137              
  Lines                5853     5858       +5     
==================================================
+ Hits                 5313     5325      +12     
+ Misses                540      533       -7     
Flag       Coverage Δ
unittests  90.90% <100.00%> (+0.12%) ⬆️



@smilesun smilesun left a comment


I think we need a comment on "flag_info" so that code readers/reviewers know what this variable does in general.



Has this YAML file been tested?


@MatteoWohlrapp MatteoWohlrapp Jul 2, 2024


Yes, that was a separate issue (831), which is also linked in this PR.



I tested it, and it resulted in an error. Will paste it below.



Looks like nothing big: zdata does not have pacs yet.



domainlab/zdata/pacs/PACS/art_painting



Now I get:

OutOfMemoryError in file 
/ictstr01/home/aih/xudong.sun/domainlab_master/domainlab/exp_protocol/benchmark.smk, line 154:      
zoutput/slurm_logs/run_experiment/run_experiment-index=14-21649209.err-251-CUDA out of memory. Tried to allocate 1.98 GiB. GPU 0 has a total capacty of 19.50 GiB of which 221.88 MiB is free. Including
non-PyTorch memory, this process has 19.24 GiB memory in use. Process 1322808 has 19.24 GiB memory in use. Of the allocated memory 18.93 GiB is allocated by PyTorch, and 71.99 MiB is rese

Is it because some GPUs have more memory, so your run went through? @MatteoWohlrapp
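If cluster GPUs differ in memory size, that would explain runs succeeding on some nodes only. A minimal sketch of a pre-flight check along those lines (the helper name and 5% safety margin are assumptions; on a real node, `torch.cuda.mem_get_info()` in PyTorch returns the `(free, total)` byte counts that would feed into it):

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

def fits_on_gpu(needed_bytes, free_bytes, margin=0.05):
    """Return True if the allocation plus a safety margin fits in free memory."""
    return needed_bytes * (1 + margin) <= free_bytes

# The failing run above: 1.98 GiB requested, only 221.88 MiB free.
print(fits_on_gpu(int(1.98 * GIB), int(221.88 * MIB)))  # False
```

Such a check lets a benchmark job fail fast (or pick a smaller batch size) instead of dying mid-epoch with a CUDA OutOfMemoryError.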


Does it say which of the two experiments in the yaml it was? We could try a different dataset to see if it works then. I do remember that it ran on the cluster.

@marrlab marrlab deleted a comment from smilesun Jul 2, 2024

MatteoWohlrapp commented Jul 2, 2024

You introduced 'flag_info' in your mhof_dev branch. Can you give a brief explanation? I don't think I fully understand the naming. I added it because otherwise training was not possible. It is set to self.flag_setpoint_updated in train_fbopt_b.py.
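One possible reading of the thread, sketched as code. Everything beyond the names quoted above (flag_info, flag_setpoint_updated, train_fbopt_b.py) is an assumption about intent, not the actual implementation:

```python
class ModelStub:
    """Stand-in for a model the trainer drives; flag_info is the attribute
    whose purpose the review comment asks to document."""
    flag_info = False


class TrainerFbOptStub:
    """Stand-in for the trainer in train_fbopt_b.py."""

    def __init__(self):
        # whether the feedback controller updated its setpoint this epoch
        self.flag_setpoint_updated = False

    def sync_flags(self, model):
        # flag_info mirrors the scheduler state onto the model, so that
        # model-side code can react when the setpoint changes
        model.flag_info = self.flag_setpoint_updated
```

If that reading is right, a docstring like the comments above on both attributes would address the earlier review request.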

@smilesun smilesun self-assigned this Jul 2, 2024
@smilesun smilesun marked this pull request as ready for review July 2, 2024 10:52
@smilesun smilesun merged commit a99c9f5 into mhof_dev_merge Jul 2, 2024
@smilesun smilesun deleted the erm_hyper_init branch July 2, 2024 13:54

Labels: None yet
Projects: None yet
3 participants