Closed
Description
We have written the startup script for the TransferQueue + DataSystem backend as follows:
from omegaconf import OmegaConf
# TransferQueueController, TransferQueueClient and process_zmq_server_info
# come from the TransferQueue package (imports omitted here).

class Trainer:
    def __init__(self, config: dict):
        self.config = config
        self._initialize_transferqueue()

    def _initialize_transferqueue(self):
        # 1. Initialize TransferQueueController (single controller only)
        self.tq_controller = TransferQueueController.remote()

        # 2. Prepare necessary information of the controller
        self.tq_controller_info = process_zmq_server_info(self.tq_controller)
        # Note: a new DictConfig must be created with allow_objects=True to keep
        # the ZMQServerInfo instance intact. Otherwise it is flattened to a dict.
        tq_config = OmegaConf.create({}, flags={"allow_objects": True})
        tq_config.controller_info = self.tq_controller_info
        self.config = OmegaConf.merge(tq_config, self.config)

        # 3. Create TransferQueueClient
        self.tq_client = TransferQueueClient(
            client_id="Trainer",
            controller_info=self.tq_controller_info,
        )

        # 4. Connect to DataSystem
        self.tq_client.initialize_storage_manager(
            manager_type=self.config["manager_type"], config=self.config
        )
        return self.tq_client

We found that TransferQueueClient requires controller_info during initialization and holds on to it, yet StorageManager also needs controller_info to be passed in, through its config, during its own initialization.
From the user's perspective, the relationship between StorageManager and the controller is not obvious. Users might forget to add controller_info to the configuration they pass in, which results in a ValueError. Since the client already holds controller_info, perhaps we should tolerate a missing entry in the config instead of raising an exception directly.
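To make the suggestion concrete, here is a minimal sketch of one possible fallback, not the actual TransferQueue implementation. The helper name `resolve_storage_config` and its parameters are hypothetical, and a plain `dict` stands in for the real config object. The idea is to use the `controller_info` from the config when it is present, otherwise reuse the one the client already holds, and raise only if neither is available.

```python
from typing import Any, Mapping, Optional


def resolve_storage_config(
    config: Mapping[str, Any],
    client_controller_info: Optional[Any],
) -> dict:
    """Return a copy of `config` that is guaranteed to carry controller_info.

    Hypothetical helper: the name and signature are assumptions made for
    illustration and are not part of the TransferQueue API.
    """
    resolved = dict(config)  # shallow copy, so the caller's config is not mutated
    if resolved.get("controller_info") is None:
        if client_controller_info is None:
            # Nothing to fall back to: this is the only case that still fails.
            raise ValueError(
                "controller_info is missing from both the config and the client"
            )
        # Fall back to the value the client received at construction time.
        resolved["controller_info"] = client_controller_info
    return resolved


# The config omits controller_info, so the client's value is filled in.
cfg = resolve_storage_config(
    {"manager_type": "example"},
    client_controller_info="controller-info-placeholder",
)
```

With a fallback like this, an explicitly configured `controller_info` still takes precedence, so existing startup scripts that already pass it would behave as before.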