Requesting to make it easier to restart the tracker. In the HuggingFace BigScience project we need to be able to restart the tracker in various places. After writing the code I discovered that start() and stop() aren't a pair: one can't call start() again after stop(). So the current stop() behaves more like destroy(), but one can't tell that from the name. Calling start() after stop() gives:
[codecarbon WARNING @ 19:55:24] <class 'Exception'>
Traceback (most recent call last):
  File "/home/stas/anaconda3/envs/py38-pt19/lib/python3.8/site-packages/codecarbon/core/util.py", line 10, in suppress
    yield
  File "/home/stas/anaconda3/envs/py38-pt19/lib/python3.8/contextlib.py", line 75, in inner
    return func(*args, **kwds)
  File "/home/stas/anaconda3/envs/py38-pt19/lib/python3.8/site-packages/codecarbon/emissions_tracker.py", line 314, in stop
    self._scheduler.shutdown()
  File "/home/stas/anaconda3/envs/py38-pt19/lib/python3.8/site-packages/apscheduler/schedulers/background.py", line 41, in shutdown
    super(BackgroundScheduler, self).shutdown(*args, **kwargs)
  File "/home/stas/anaconda3/envs/py38-pt19/lib/python3.8/site-packages/apscheduler/schedulers/blocking.py", line 24, in shutdown
    super(BlockingScheduler, self).shutdown(wait)
  File "/home/stas/anaconda3/envs/py38-pt19/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 189, in shutdown
    raise SchedulerNotRunningError
apscheduler.schedulers.SchedulerNotRunningError: Scheduler is not running
Any chance you could adjust the code so that the following code could work?
cc = codecarbon.OfflineEmissionsTracker(...)
cc.start()
cc.stop(); cc.start() # restart
...
cc.stop(); cc.start() # restart
cc.stop()
Or, the longer version, in the context of how I'm trying to write the wrapper to fit into Megatron-LM's setup:
from pathlib import Path

_GLOBAL_CODECARBON_TRACKER = None


def _set_codecarbon_tracker(args):
    global _GLOBAL_CODECARBON_TRACKER
    if hasattr(args, 'codecarbon_dir'):
        import codecarbon
        print('> setting codecarbon ...')

        output_dir = args.codecarbon_dir
        output_file = f"emissions-{args.rank:03d}.csv"  # one file per rank
        log_level = "info"
        country_iso_code = "FRA"

        Path(output_dir).mkdir(parents=True, exist_ok=True)
        _GLOBAL_CODECARBON_TRACKER = codecarbon.OfflineEmissionsTracker(
            output_dir=output_dir,
            output_file=output_file,
            log_level=log_level,
            country_iso_code=country_iso_code,
        )


def codecarbon_tracker_start():
    global _GLOBAL_CODECARBON_TRACKER
    if _GLOBAL_CODECARBON_TRACKER is None:
        return
    print('codecarbon START')
    _GLOBAL_CODECARBON_TRACKER.start()


def codecarbon_tracker_stop():
    global _GLOBAL_CODECARBON_TRACKER
    if _GLOBAL_CODECARBON_TRACKER is None:
        return
    print('codecarbon STOP')
    _GLOBAL_CODECARBON_TRACKER.stop()


def codecarbon_tracker_restart():
    global _GLOBAL_CODECARBON_TRACKER
    if _GLOBAL_CODECARBON_TRACKER is None:
        return
    # Re-creating the tracker here would require reading private attributes:
    # output_dir = _GLOBAL_CODECARBON_TRACKER._output_dir
    # output_file = _GLOBAL_CODECARBON_TRACKER._output_file
    # log_level = _GLOBAL_CODECARBON_TRACKER._log_level
    # country_iso_code = _GLOBAL_CODECARBON_TRACKER._country_iso_code
    codecarbon_tracker_stop()
    codecarbon_tracker_start()
Otherwise I have to re-create the tracker every time, but to do that I can't even access the arguments passed to the tracker without reaching into private variables.
It would be even better if there were an API method restart() that simply flushes the data to the file and continues. That would probably be a simpler change to make.
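For reference, the re-creation workaround could be sketched roughly like this, without touching private variables. RestartableTracker is a hypothetical helper of mine, not part of codecarbon: it keeps the constructor arguments itself and builds a fresh tracker on every start(). It assumes only that the wrapped object has start() and stop() methods.

```python
from typing import Any, Callable, Optional


class RestartableTracker:
    """Hypothetical wrapper (not a codecarbon API).

    Builds a fresh tracker on every start(), so start()/stop() can be
    called repeatedly even if the underlying tracker is single-use.
    """

    def __init__(self, factory: Callable[..., Any], **kwargs: Any) -> None:
        self._factory = factory  # e.g. codecarbon.OfflineEmissionsTracker
        self._kwargs = kwargs    # stored here so no private attributes are read
        self._tracker: Optional[Any] = None

    def start(self) -> None:
        if self._tracker is not None:
            return  # already running, nothing to do
        self._tracker = self._factory(**self._kwargs)
        self._tracker.start()

    def stop(self) -> Any:
        if self._tracker is None:
            return None  # never started, or already stopped
        # Drop the reference first so a failing stop() can't leave a dead tracker behind.
        tracker, self._tracker = self._tracker, None
        return tracker.stop()

    def restart(self) -> Any:
        # Stop the current tracker, then start a new one with the same arguments.
        result = self.stop()
        self.start()
        return result
```

With this, something like `cc = RestartableTracker(codecarbon.OfflineEmissionsTracker, output_dir=output_dir, country_iso_code="FRA")` would allow `cc.stop(); cc.start()` any number of times. It is still a workaround: each restart constructs a new tracker object, which is exactly the overhead a built-in restart() would avoid.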
Thank you!
@JetRunner, @whobbes