From b1abd22d26fce02a1eaccc7291fe7f831304a474 Mon Sep 17 00:00:00 2001 From: Elena Rastorgueva Date: Mon, 22 Jul 2024 16:05:43 -0700 Subject: [PATCH 01/12] add nemo fundamentals page Signed-off-by: Elena Rastorgueva --- docs/source/index.rst | 1 + docs/source/starthere/fundamentals.rst | 242 +++++++++++++++++++++++++ 2 files changed, 243 insertions(+) create mode 100644 docs/source/starthere/fundamentals.rst diff --git a/docs/source/index.rst b/docs/source/index.rst index f10ae126267b..ae0d5692c286 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -41,6 +41,7 @@ For quick guides and tutorials, see the "Getting started" section below. :titlesonly: starthere/intro + starthere/fundamentals starthere/tutorials For more information, browse the developer docs for your area of interest in the contents section below or on the left sidebar. diff --git a/docs/source/starthere/fundamentals.rst b/docs/source/starthere/fundamentals.rst new file mode 100644 index 000000000000..96f4a643dd63 --- /dev/null +++ b/docs/source/starthere/fundamentals.rst @@ -0,0 +1,242 @@ +NeMo fundamentals +================= + +On this page we will go a little bit deeper on how NeMo works, to give you a good foundation for using NeMo for your :ref:`desired usecase `. + +.. _nemo_model: + +NeMo Models +----------- + +A NeMo "model" includes all of the below components wrapped into a singular, cohesive unit: + +* neural network architecture, + +* dataset & data loaders, + +* data preprocessing & postprocessing, + +* optimizer & schedulers, + +* any other supporting infrastructure: tokenizers, language model configuration, data augmentation etc. + +NeMo models are based on PyTorch. Many of their components are subclasses of ``torch.nn.Module``. NeMo models use PyTorch Lightning (PTL) for training, thus reducing the amount of boilerplate code needed. + +NeMo models are also designed to be easily configurable; often this is done with YAML files. Below we show simplified examples of a NeMo model defined in pseudocode, and a config defined in YAML. We highlight the lines where the Python config parameter is read from the YAML file. + +.. list-table:: Simplified examples of a model and config. + :widths: 1 1 + :header-rows: 0 + + * - .. code-block:: python + :caption: NeMo model definition (Python pseudocode) + :linenos: + :emphasize-lines: 4, 7, 10, 13, 16, 20 + + class ExampleEncDecModel: + # cfg is passed so it only contains "model" section + def __init__(self, cfg, trainer): + self.tokenizer = init_from_cfg(cfg.tokenizer) + + + self.encoder = init_from_cfg(cfg.encoder) + + + self.decoder = init_from_cfg(cfg.decoder) + + + self.loss = init_from_cfg(cfg.loss) + + + # optimizer configured via parent class + + + def setup_training_data(self, cfg): + self.train_dl = init_dl_from_cfg(cfg.train_ds) + + def forward(self, batch): + # forward pass defined, + # as is standard for PyTorch models + ... + + def training_step(self, batch): + log_probs = self.forward(batch) + loss = self.loss(log_probs, labels) + return loss + + + - .. code-block:: yaml + :caption: Experiment config (YAML) + :linenos: + :emphasize-lines: 4, 7, 10, 13, 16, 20 + + # + # desired configuration of the NeMo model + model: + tokenizer: + ... + + encoder: + ... + + decoder: + ... + + loss: + ... + + optim: + ... + + + train_ds: + ... + + # desired configuration of the + # PyTorch Lightning trainer object + trainer: + ... 
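To make the connection between the two panels concrete: the YAML file is loaded into an OmegaConf ``DictConfig`` object (discussed further below), and the model reads nested sections from it using dot notation. The following is a minimal illustrative sketch; the file path is made up for the example.

.. code-block:: python

    from omegaconf import OmegaConf

    # Load the YAML config shown on the right (path is illustrative)
    cfg = OmegaConf.load("examples/path/to/config_file_name.yaml")

    # Dot-notation access, mirroring the model pseudocode on the left
    print(cfg.model.encoder)  # the sub-config the model passes to init_from_cfg(cfg.encoder)
    print(cfg.trainer)        # the section used to configure the PyTorch Lightning trainer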
+ + +Configuring and training NeMo models +------------------------------------ + +During initialization of the model, a lot of key parameters are read from the config (``cfg``), which gets passed in to the model construtor (left panel above, line 2). + +The other object that passed into the constructor is a PyTorch Lightning ``trainer`` object, which handles the training process. The trainer will take care of the standard training `boilerplate `__. For things that are not-standard, PTL will refer to any specific methods that we may have defined in our NeMo model. For example, PTL requires every model to have a specified ``training_step`` method (left panel above, line 15). + +The configuration of the trainer is also specified in the config (right panel above, line 20 onwards). This will include parameters such as (number of) ``devices``, ``max_steps``, (numerical) ``precision`` and `more `__. + + +Example training script +----------------------- + +Putting it all together, here is an example training script for our ``ExampleEncDecModel`` model. We highlight the 3 most important lines, which put together everything we discussed in the previous section. + +.. code-block:: python + :caption: run_example_training.py + :linenos: + :emphasize-lines: 10, 11, 12 + + import pytorch_lightning as pl + from nemo.collections.path_to_model_class import ExampleEncDecModel + from nemo.core.config import hydra_runner + + @hydra_runner( + config_path="config_file_dir_path", + config_name="config_file_name" + ) + def main(cfg): + trainer = pl.Trainer(**cfg.trainer) + model = ExampleEncDecModel(cfg.model, trainer) + trainer.fit(model) + + if __name__ == '__main__': + main(cfg) + + +Let's go through the code: + +* *Lines 1-3*: import statements (second one is made up for the example). +* *Lines 5-8*: a decorator on lines 5-8 of ``run_example_training.py`` will look for a config file at ``{config_path}/{config_name}.yaml``, and load its contents into the ``cfg`` object that is passed into the ``main`` function. This functionality is provided by `Hydra `__. Instead of a YAML file, we could also have specified the default config as a dataclass, and passed that into the ``@hydra_runner`` decorator. +* *Line 7*: initialize a PTL trainer object, using the parameters specified in the ``trainer`` section of the config. +* *Line 8*: initialize a NeMo model, passing in both the parameters in the ``model`` section of the config, and a PTL trainer. +* *Line 9*: call ``trainer.fit`` on the model. This one unassuming line will carry out our entire training process. PTL will make sure we iterate over our data and call the ``training_step`` we define for each batch (as well as any other PTL `callbacks `__ that may have been defined). + + + +Overriding configs +------------------ + +The ``cfg`` object in the script above is a dictionary-like object that contains our configuration parameters. Specifically, it is an `OmegaConf `__ ``DictConfig`` object. These objects have special features such as dot-notation `access `__, `variable interpolation `__ and ability to set `mandatory values `__. + +We can run the script above like this: + +.. code-block:: bash + + python run_example_training.py + +This will use the default config file specified inside the ``@hydra_runner`` decorator. + +We can specify a different config file to use by calling the script like this: + +.. 
code-block:: diff + + python run_example_training.py \ + + --config_path="different_config_file_dir_path" \ + + --config_name="different_config_file_name" + +We can also override, delete or add elements to the config when we call the script like this: + + +.. code-block:: diff + + python run_example_training.py \ + --config_path="different_config_file_dir_path" \ + --config_name="different_config_file_name" \ + + model.optim.lr=0.001 \ # overwriting + + model.train_ds.manifest_filepath="your_train_data.json" \ # overwriting + + ~trainer.max_epochs \ # deleting + + +trainer.max_steps=1000 # adding + +Running NeMo scripts +-------------------- + +NeMo scripts typically take on the form shown above, where the Python script relies on a config object which has some specified default values that you can choose to override. + +The NeMo `examples `__ directory contains many scripts for training and inference of various existing NeMo models. Note that this includes default configs whose default values for model, optimizer and trainer parameters were tuned over the course of many GPU-hours of the NeMo team's experiments. We thus recommend using these as a starting point for your own experiments. + +.. note:: + **NeMo inference scripts** + + The examples scripts directory also contains many inference scripts, e.g. `transcribe_speech.py `_. These normally have a different structure to the training scripts, as they have a lot of additional utilities for reading and saving files. The inference scripts also use configs, but these naturally do not require the ``trainer``, ``model``, ``exp_manager`` sections. Additionally, due to having fewer elements, the default configs for inference scripts are normally specified as dataclasses rather than separate files. Elements also can be overwritten/added/deleted via the command line. + + +Specifying training data +------------------------ + +NeMo will handle creation of data loaders for you, as long as you put your data into the expected input format. You may also need to train a tokenizer before starting training. Learn more about data formats for :doc:`LLM <../nlp/nemo_megatron/gpt/gpt_training>`, :doc:`Multimodal <../multimodal/mllm/datasets>`, :ref:`Speech AI `, and :doc:`Vision models <../vision/datasets>`. + + +Model checkpoints +----------------- + +Throughout training, model checkpoints will be saved inside ``.nemo`` files. These are archive files containing all the necessary components to restore a usable model, e.g.: + +* model weights (``.ckpt`` files), +* model configuration (``.yaml`` files), +* tokenizer files + +The NeMo team also releases pretrained models which you browse on `NGC `_ and `HuggingFace Hub `_. + + +Finetuning +---------- + +NeMo allows you to finetune models as well as train them from scratch. + +You can do this by initializing a model with random weights, replacing some/all the weights with those of a pretrained model, and then continuing training as normal, potentially with some small changes such as reducing your learning rate or freezing some model parameters. + + +.. _where_next: + +Where next? +----------- + +Here are some options: + +* dive in to `examples `_ or :doc:`tutorials <./tutorials>` +* read docs of the domain (e.g. 
:doc:`LLM <../nlp/nemo_megatron/intro>`, :doc:`Multimodal <../multimodal/mllm/intro>`, :doc:`ASR <../asr/intro>`, :doc:`TTS <../tts/intro>`, :doc:`Vision Models <../vision/intro>`) you want to work with +* learn more about the inner workings of NeMo: + + * `NeMo Primer `_ notebook tutorial + + * hands-on intro to NeMo, PyTorch Lightning, and OmegaConf + * shows how to use, modify, save, and restore NeMo models + + * `NeMo Models `__ notebook tutorial + + * explains the fundamentals of how NeMo models are created + + * :doc:`NeMo Core <../core/core>` documentation + From 1b10a74be4effca5a8c6769bc74658f656d83162 Mon Sep 17 00:00:00 2001 From: Elena Rastorgueva Date: Mon, 22 Jul 2024 16:11:42 -0700 Subject: [PATCH 02/12] remove unused reference tag Signed-off-by: Elena Rastorgueva --- docs/source/starthere/fundamentals.rst | 2 -- 1 file changed, 2 deletions(-) diff --git a/docs/source/starthere/fundamentals.rst b/docs/source/starthere/fundamentals.rst index 96f4a643dd63..b65902925494 100644 --- a/docs/source/starthere/fundamentals.rst +++ b/docs/source/starthere/fundamentals.rst @@ -3,8 +3,6 @@ NeMo fundamentals On this page we will go a little bit deeper on how NeMo works, to give you a good foundation for using NeMo for your :ref:`desired usecase `. -.. _nemo_model: - NeMo Models ----------- From 6ee5fad04b293cc30eb8a407e30d53e8eb15028a Mon Sep 17 00:00:00 2001 From: Elena Rastorgueva Date: Tue, 23 Jul 2024 14:23:49 -0700 Subject: [PATCH 03/12] add link to checkpoints intro Signed-off-by: Elena Rastorgueva --- docs/source/starthere/fundamentals.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/source/starthere/fundamentals.rst b/docs/source/starthere/fundamentals.rst index b65902925494..97d6a66a738f 100644 --- a/docs/source/starthere/fundamentals.rst +++ b/docs/source/starthere/fundamentals.rst @@ -199,7 +199,7 @@ NeMo will handle creation of data loaders for you, as long as you put your data Model checkpoints ----------------- -Throughout training, model checkpoints will be saved inside ``.nemo`` files. These are archive files containing all the necessary components to restore a usable model, e.g.: +Throughout training, model :doc:`checkpoints <../checkpoints/intro>` will be saved inside ``.nemo`` files. These are archive files containing all the necessary components to restore a usable model, e.g.: * model weights (``.ckpt`` files), * model configuration (``.yaml`` files), From 85660ab45b9d9f08cab25ee206366407f1182243 Mon Sep 17 00:00:00 2001 From: Elena Rastorgueva Date: Tue, 23 Jul 2024 14:38:48 -0700 Subject: [PATCH 04/12] clarify postprocessing and mention loss function Signed-off-by: Elena Rastorgueva --- docs/source/starthere/fundamentals.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/source/starthere/fundamentals.rst b/docs/source/starthere/fundamentals.rst index 97d6a66a738f..c818f44aa193 100644 --- a/docs/source/starthere/fundamentals.rst +++ b/docs/source/starthere/fundamentals.rst @@ -12,9 +12,9 @@ A NeMo "model" includes all of the below components wrapped into a singular, coh * dataset & data loaders, -* data preprocessing & postprocessing, +* preprocessing of input data & postprocessing of model outputs, -* optimizer & schedulers, +* loss function, optimizer & schedulers, * any other supporting infrastructure: tokenizers, language model configuration, data augmentation etc. 
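To make the "loss function, optimizer & schedulers" bullet above more concrete, here is an illustrative ``optim`` section of a model config. The exact keys and values vary by model and recipe, so treat this as a sketch rather than a schema.

.. code-block:: yaml

    # Illustrative values only; consult the example configs shipped with NeMo for real recipes
    optim:
      name: adamw
      lr: 0.001
      weight_decay: 0.01
      sched:
        name: CosineAnnealing
        warmup_steps: 1000
        min_lr: 1.0e-5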
From 5fe5e2d449728106f326c77213be5d2f77d7b03a Mon Sep 17 00:00:00 2001 From: Elena Rastorgueva Date: Tue, 23 Jul 2024 15:25:23 -0700 Subject: [PATCH 05/12] rephrase key parameters Signed-off-by: Elena Rastorgueva --- docs/source/starthere/fundamentals.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/source/starthere/fundamentals.rst b/docs/source/starthere/fundamentals.rst index c818f44aa193..ede000829956 100644 --- a/docs/source/starthere/fundamentals.rst +++ b/docs/source/starthere/fundamentals.rst @@ -99,9 +99,9 @@ NeMo models are also designed to be easily configurable; often this is done with Configuring and training NeMo models ------------------------------------ -During initialization of the model, a lot of key parameters are read from the config (``cfg``), which gets passed in to the model construtor (left panel above, line 2). +During initialization of the model, key parameters are read from the config (``cfg``), which gets passed in to the model construtor (left panel above, line 2). -The other object that passed into the constructor is a PyTorch Lightning ``trainer`` object, which handles the training process. The trainer will take care of the standard training `boilerplate `__. For things that are not-standard, PTL will refer to any specific methods that we may have defined in our NeMo model. For example, PTL requires every model to have a specified ``training_step`` method (left panel above, line 15). +The other object that passed into the constructor is a PyTorch Lightning ``trainer`` object, which handles the training process. The trainer will take care of the standard training `boilerplate `__. For things that are not standard, PTL will refer to any specific methods that we may have defined in our NeMo model. For example, PTL requires every model to have a specified ``training_step`` method (left panel above, line 15). The configuration of the trainer is also specified in the config (right panel above, line 20 onwards). This will include parameters such as (number of) ``devices``, ``max_steps``, (numerical) ``precision`` and `more `__. From 372139e740a0457f3b0d0389625a5778f941aec5 Mon Sep 17 00:00:00 2001 From: Elena Rastorgueva Date: Tue, 23 Jul 2024 15:35:46 -0700 Subject: [PATCH 06/12] fix typo Signed-off-by: Elena Rastorgueva --- docs/source/starthere/fundamentals.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/source/starthere/fundamentals.rst b/docs/source/starthere/fundamentals.rst index ede000829956..1cf5cea3b789 100644 --- a/docs/source/starthere/fundamentals.rst +++ b/docs/source/starthere/fundamentals.rst @@ -101,7 +101,7 @@ Configuring and training NeMo models During initialization of the model, key parameters are read from the config (``cfg``), which gets passed in to the model construtor (left panel above, line 2). -The other object that passed into the constructor is a PyTorch Lightning ``trainer`` object, which handles the training process. The trainer will take care of the standard training `boilerplate `__. For things that are not standard, PTL will refer to any specific methods that we may have defined in our NeMo model. For example, PTL requires every model to have a specified ``training_step`` method (left panel above, line 15). +The other object passed into the model's constructor is a PyTorch Lightning ``trainer`` object, which handles the training process. The trainer will take care of the standard training `boilerplate `__. 
For things that are not standard, PTL will refer to any specific methods that we may have defined in our NeMo model. For example, PTL requires every model to have a specified ``training_step`` method (left panel above, line 15). The configuration of the trainer is also specified in the config (right panel above, line 20 onwards). This will include parameters such as (number of) ``devices``, ``max_steps``, (numerical) ``precision`` and `more `__. From a663337f25773cd3462e8f8f2067c52bdee5eedd Mon Sep 17 00:00:00 2001 From: Elena Rastorgueva Date: Wed, 24 Jul 2024 10:52:43 -0700 Subject: [PATCH 07/12] mention trainer accelerator param Signed-off-by: Elena Rastorgueva --- docs/source/starthere/fundamentals.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/source/starthere/fundamentals.rst b/docs/source/starthere/fundamentals.rst index 1cf5cea3b789..9bc88cfa5e52 100644 --- a/docs/source/starthere/fundamentals.rst +++ b/docs/source/starthere/fundamentals.rst @@ -103,7 +103,7 @@ During initialization of the model, key parameters are read from the config (``c The other object passed into the model's constructor is a PyTorch Lightning ``trainer`` object, which handles the training process. The trainer will take care of the standard training `boilerplate `__. For things that are not standard, PTL will refer to any specific methods that we may have defined in our NeMo model. For example, PTL requires every model to have a specified ``training_step`` method (left panel above, line 15). -The configuration of the trainer is also specified in the config (right panel above, line 20 onwards). This will include parameters such as (number of) ``devices``, ``max_steps``, (numerical) ``precision`` and `more `__. +The configuration of the trainer is also specified in the config (right panel above, line 20 onwards). This will include parameters such as ``accelerator``, (number of) ``devices``, ``max_steps``, (numerical) ``precision`` and `more `__. Example training script From d2362d37eaa29ee3629d8a40a2967f0d15808898 Mon Sep 17 00:00:00 2001 From: Elena Rastorgueva Date: Wed, 24 Jul 2024 10:58:12 -0700 Subject: [PATCH 08/12] fix bulletpoint formatting Signed-off-by: Elena Rastorgueva --- docs/source/starthere/fundamentals.rst | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/source/starthere/fundamentals.rst b/docs/source/starthere/fundamentals.rst index 9bc88cfa5e52..90d573f89bd3 100644 --- a/docs/source/starthere/fundamentals.rst +++ b/docs/source/starthere/fundamentals.rst @@ -1,4 +1,4 @@ -NeMo fundamentals +NeMo Fundamentals ================= On this page we will go a little bit deeper on how NeMo works, to give you a good foundation for using NeMo for your :ref:`desired usecase `. @@ -8,15 +8,15 @@ NeMo Models A NeMo "model" includes all of the below components wrapped into a singular, cohesive unit: -* neural network architecture, +* neural network architecture -* dataset & data loaders, +* dataset and data loaders -* preprocessing of input data & postprocessing of model outputs, +* preprocessing of input data and postprocessing of model outputs -* loss function, optimizer & schedulers, +* loss function, optimizer, and schedulers -* any other supporting infrastructure: tokenizers, language model configuration, data augmentation etc. +* any other supporting infrastructure such as tokenizers, language model configuration, data augmentation NeMo models are based on PyTorch. Many of their components are subclasses of ``torch.nn.Module``. 
NeMo models use PyTorch Lightning (PTL) for training, thus reducing the amount of boilerplate code needed. From dcfc9d46bb3eb171d40246a6c3290132a40ded52 Mon Sep 17 00:00:00 2001 From: Elena Rastorgueva Date: Wed, 24 Jul 2024 10:59:31 -0700 Subject: [PATCH 09/12] fix bullet points part 2 Signed-off-by: Elena Rastorgueva --- docs/source/starthere/fundamentals.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/source/starthere/fundamentals.rst b/docs/source/starthere/fundamentals.rst index 90d573f89bd3..cef370c3d891 100644 --- a/docs/source/starthere/fundamentals.rst +++ b/docs/source/starthere/fundamentals.rst @@ -16,7 +16,7 @@ A NeMo "model" includes all of the below components wrapped into a singular, coh * loss function, optimizer, and schedulers -* any other supporting infrastructure such as tokenizers, language model configuration, data augmentation +* any other supporting infrastructure, such as tokenizers, language model configuration, and data augmentation NeMo models are based on PyTorch. Many of their components are subclasses of ``torch.nn.Module``. NeMo models use PyTorch Lightning (PTL) for training, thus reducing the amount of boilerplate code needed. From 0bba8bb4857dd564b1fbadd43e1631d6c44f9ea0 Mon Sep 17 00:00:00 2001 From: Elena Rastorgueva Date: Wed, 24 Jul 2024 11:16:15 -0700 Subject: [PATCH 10/12] quick formatting fixes Signed-off-by: Elena Rastorgueva --- docs/source/starthere/fundamentals.rst | 41 +++++++++++++------------- 1 file changed, 20 insertions(+), 21 deletions(-) diff --git a/docs/source/starthere/fundamentals.rst b/docs/source/starthere/fundamentals.rst index cef370c3d891..524670b661d6 100644 --- a/docs/source/starthere/fundamentals.rst +++ b/docs/source/starthere/fundamentals.rst @@ -18,9 +18,9 @@ A NeMo "model" includes all of the below components wrapped into a singular, coh * any other supporting infrastructure, such as tokenizers, language model configuration, and data augmentation -NeMo models are based on PyTorch. Many of their components are subclasses of ``torch.nn.Module``. NeMo models use PyTorch Lightning (PTL) for training, thus reducing the amount of boilerplate code needed. +NeMo models are built on PyTorch, with many of their components being subclasses of ``torch.nn.Module``. Additionally, NeMo models utilize PyTorch Lightning (PTL) for training, which helps reduce the boilerplate code required. -NeMo models are also designed to be easily configurable; often this is done with YAML files. Below we show simplified examples of a NeMo model defined in pseudocode, and a config defined in YAML. We highlight the lines where the Python config parameter is read from the YAML file. +NeMo models are also designed to be easily configurable; often this is done with YAML files. Below we show simplified examples of a NeMo model defined in pseudocode and a config defined in YAML. We highlight the lines where the Python config parameter is read from the YAML file. .. list-table:: Simplified examples of a model and config. :widths: 1 1 @@ -96,7 +96,7 @@ NeMo models are also designed to be easily configurable; often this is done with ... -Configuring and training NeMo models +Configuring and Training NeMo Models ------------------------------------ During initialization of the model, key parameters are read from the config (``cfg``), which gets passed in to the model construtor (left panel above, line 2). 
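For orientation while reading this section: the ``trainer`` part of the config simply collects PyTorch Lightning ``Trainer`` arguments, including the ``accelerator``, ``devices``, ``max_steps``, and ``precision`` parameters mentioned below. A minimal illustrative sketch follows; the values are examples, not recommendations.

.. code-block:: yaml

    # Illustrative values only; any argument accepted by pytorch_lightning.Trainer can appear here
    trainer:
      accelerator: gpu
      devices: 2
      max_steps: 100000
      precision: bf16-mixed
      log_every_n_steps: 50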
@@ -106,7 +106,7 @@ The other object passed into the model's constructor is a PyTorch Lightning ``tr The configuration of the trainer is also specified in the config (right panel above, line 20 onwards). This will include parameters such as ``accelerator``, (number of) ``devices``, ``max_steps``, (numerical) ``precision`` and `more `__. -Example training script +Example Training Script ----------------------- Putting it all together, here is an example training script for our ``ExampleEncDecModel`` model. We highlight the 3 most important lines, which put together everything we discussed in the previous section. @@ -136,27 +136,26 @@ Putting it all together, here is an example training script for our ``ExampleEnc Let's go through the code: * *Lines 1-3*: import statements (second one is made up for the example). -* *Lines 5-8*: a decorator on lines 5-8 of ``run_example_training.py`` will look for a config file at ``{config_path}/{config_name}.yaml``, and load its contents into the ``cfg`` object that is passed into the ``main`` function. This functionality is provided by `Hydra `__. Instead of a YAML file, we could also have specified the default config as a dataclass, and passed that into the ``@hydra_runner`` decorator. -* *Line 7*: initialize a PTL trainer object, using the parameters specified in the ``trainer`` section of the config. +* *Lines 5-8*: a decorator on lines 5-8 of ``run_example_training.py`` will look for a config file at ``{config_path}/{config_name}.yaml`` and load its contents into the ``cfg`` object that is passed into the ``main`` function. This functionality is provided by `Hydra `__. Instead of a YAML file, we could also have specified the default config as a dataclass and passed that into the ``@hydra_runner`` decorator. +* *Line 7*: initialize a PTL trainer object using the parameters specified in the ``trainer`` section of the config. * *Line 8*: initialize a NeMo model, passing in both the parameters in the ``model`` section of the config, and a PTL trainer. * *Line 9*: call ``trainer.fit`` on the model. This one unassuming line will carry out our entire training process. PTL will make sure we iterate over our data and call the ``training_step`` we define for each batch (as well as any other PTL `callbacks `__ that may have been defined). -Overriding configs +Overriding Configs ------------------ -The ``cfg`` object in the script above is a dictionary-like object that contains our configuration parameters. Specifically, it is an `OmegaConf `__ ``DictConfig`` object. These objects have special features such as dot-notation `access `__, `variable interpolation `__ and ability to set `mandatory values `__. - -We can run the script above like this: +The ``cfg`` object in the script above is a dictionary-like object that contains our configuration parameters. Specifically, it is an `OmegaConf `__ ``DictConfig`` object. These objects have special features such as dot-notation `access `__, `variable interpolation `__, and the ability to set `mandatory values `__. +You can run the script above by running the following: .. code-block:: bash python run_example_training.py -This will use the default config file specified inside the ``@hydra_runner`` decorator. +The script will use the default config file specified inside the ``@hydra_runner`` decorator. -We can specify a different config file to use by calling the script like this: +To specify a different config file, you can call the script like this: .. 
code-block:: diff @@ -164,7 +163,7 @@ We can specify a different config file to use by calling the script like this: + --config_path="different_config_file_dir_path" \ + --config_name="different_config_file_name" -We can also override, delete or add elements to the config when we call the script like this: +You can also override, delete, or add elements to the config by calling a script like this: .. code-block:: diff @@ -177,7 +176,7 @@ We can also override, delete or add elements to the config when we call the scri + ~trainer.max_epochs \ # deleting + +trainer.max_steps=1000 # adding -Running NeMo scripts +Running NeMo Scripts -------------------- NeMo scripts typically take on the form shown above, where the Python script relies on a config object which has some specified default values that you can choose to override. @@ -196,29 +195,29 @@ Specifying training data NeMo will handle creation of data loaders for you, as long as you put your data into the expected input format. You may also need to train a tokenizer before starting training. Learn more about data formats for :doc:`LLM <../nlp/nemo_megatron/gpt/gpt_training>`, :doc:`Multimodal <../multimodal/mllm/datasets>`, :ref:`Speech AI `, and :doc:`Vision models <../vision/datasets>`. -Model checkpoints +Model Checkpoints ----------------- Throughout training, model :doc:`checkpoints <../checkpoints/intro>` will be saved inside ``.nemo`` files. These are archive files containing all the necessary components to restore a usable model, e.g.: -* model weights (``.ckpt`` files), -* model configuration (``.yaml`` files), +* model weights (``.ckpt`` files) +* model configuration (``.yaml`` files) * tokenizer files -The NeMo team also releases pretrained models which you browse on `NGC `_ and `HuggingFace Hub `_. +The NeMo team also releases pretrained models which you can browse on `NGC `_ and `HuggingFace Hub `_. -Finetuning +Fine-Tuning ---------- -NeMo allows you to finetune models as well as train them from scratch. +NeMo allows you to fine-tune models as well as train them from scratch. You can do this by initializing a model with random weights, replacing some/all the weights with those of a pretrained model, and then continuing training as normal, potentially with some small changes such as reducing your learning rate or freezing some model parameters. .. _where_next: -Where next? +Where To Go Next? ----------- Here are some options: From 521ea3d365a3318b7982cca1c633f0af1289f5a7 Mon Sep 17 00:00:00 2001 From: Elena Rastorgueva Date: Wed, 24 Jul 2024 14:24:53 -0700 Subject: [PATCH 11/12] fix phrasing Signed-off-by: Elena Rastorgueva --- docs/source/starthere/fundamentals.rst | 45 ++++++++++++++------------ 1 file changed, 24 insertions(+), 21 deletions(-) diff --git a/docs/source/starthere/fundamentals.rst b/docs/source/starthere/fundamentals.rst index 524670b661d6..2612726f6d27 100644 --- a/docs/source/starthere/fundamentals.rst +++ b/docs/source/starthere/fundamentals.rst @@ -1,12 +1,12 @@ NeMo Fundamentals ================= -On this page we will go a little bit deeper on how NeMo works, to give you a good foundation for using NeMo for your :ref:`desired usecase `. +On this page, we’ll look into how NeMo works, providing you with a solid foundation to effectively use NeMo for you :ref:`specific use case `. 
NeMo Models ----------- -A NeMo "model" includes all of the below components wrapped into a singular, cohesive unit: +NVIDIA NeMo is a powerful framework for building and deploying neural network models, including those used in generative AI, speech recognition, and natural language processing. NeMo stands for “Neural Modules,” which are the building blocks of the models created using this platform. NeMo includes all of the following components wrapped into a singular, cohesive unit: * neural network architecture @@ -101,15 +101,15 @@ Configuring and Training NeMo Models During initialization of the model, key parameters are read from the config (``cfg``), which gets passed in to the model construtor (left panel above, line 2). -The other object passed into the model's constructor is a PyTorch Lightning ``trainer`` object, which handles the training process. The trainer will take care of the standard training `boilerplate `__. For things that are not standard, PTL will refer to any specific methods that we may have defined in our NeMo model. For example, PTL requires every model to have a specified ``training_step`` method (left panel above, line 15). +The other object passed into the model's constructor is a PyTorch Lightning ``trainer`` object, which manages the training process. The trainer handles the standard training `boilerplate `__. For non-standard tasks, PyTorch Lightning (PTL) relies on specific methods defined in our NeMo model. For example, PTL mandates that every model must have a specified ``training_step`` method (left panel above, line 15). -The configuration of the trainer is also specified in the config (right panel above, line 20 onwards). This will include parameters such as ``accelerator``, (number of) ``devices``, ``max_steps``, (numerical) ``precision`` and `more `__. +The trainer’s configuration is also specified in the config (right panel above, line 20 onwards). This includes parameters such as ``accelerator``, (number of) ``devices``, ``max_steps``, (numerical) ``precision`` and `more `__. Example Training Script ----------------------- -Putting it all together, here is an example training script for our ``ExampleEncDecModel`` model. We highlight the 3 most important lines, which put together everything we discussed in the previous section. +Below is an example training script for our ``ExampleEncDecModel`` model. We highlight the three most important lines that combine everything we discussed in the previous section: .. code-block:: python :caption: run_example_training.py @@ -149,6 +149,7 @@ Overriding Configs The ``cfg`` object in the script above is a dictionary-like object that contains our configuration parameters. Specifically, it is an `OmegaConf `__ ``DictConfig`` object. These objects have special features such as dot-notation `access `__, `variable interpolation `__, and the ability to set `mandatory values `__. You can run the script above by running the following: + .. code-block:: bash python run_example_training.py @@ -181,24 +182,24 @@ Running NeMo Scripts NeMo scripts typically take on the form shown above, where the Python script relies on a config object which has some specified default values that you can choose to override. -The NeMo `examples `__ directory contains many scripts for training and inference of various existing NeMo models. Note that this includes default configs whose default values for model, optimizer and trainer parameters were tuned over the course of many GPU-hours of the NeMo team's experiments. 
We thus recommend using these as a starting point for your own experiments. +The NeMo `examples `__ directory provides numerous scripts for training and inference of various existing NeMo models. It’s important to note that these scripts include default configurations for model, optimizer, and training parameters, which have been fine-tuned by the NeMo team over extensive GPU-hours of experimentation. As a result, we recommend using these default configurations as a starting point for your own experiments. -.. note:: - **NeMo inference scripts** - The examples scripts directory also contains many inference scripts, e.g. `transcribe_speech.py `_. These normally have a different structure to the training scripts, as they have a lot of additional utilities for reading and saving files. The inference scripts also use configs, but these naturally do not require the ``trainer``, ``model``, ``exp_manager`` sections. Additionally, due to having fewer elements, the default configs for inference scripts are normally specified as dataclasses rather than separate files. Elements also can be overwritten/added/deleted via the command line. +NeMo Inference Scripts +###################### +The examples directory also contains many inference scripts such as `transcribe_speech.py `_. These inference scripts typically differ in structure from training scripts, as they include additional utilities for file I/O (reading and saving files). While inference scripts still use configurations (configs), they don’t require the ``trainer`` and ``model`` sections. Additionally, the default configs for inference scripts are usually specified as dataclasses rather than separate files. You can also modify elements via the command line. Specifying training data ------------------------ -NeMo will handle creation of data loaders for you, as long as you put your data into the expected input format. You may also need to train a tokenizer before starting training. Learn more about data formats for :doc:`LLM <../nlp/nemo_megatron/gpt/gpt_training>`, :doc:`Multimodal <../multimodal/mllm/datasets>`, :ref:`Speech AI `, and :doc:`Vision models <../vision/datasets>`. +NeMo will handle creation of data loaders for you, as long as you put your data into the expected input format. You may also need to train a tokenizer before starting training. To learn more about data formats, see :doc:`LLM <../nlp/nemo_megatron/gpt/gpt_training>`, :doc:`Multimodal <../multimodal/mllm/datasets>`, :ref:`Speech AI `, and :doc:`Vision models <../vision/datasets>`. Model Checkpoints ----------------- -Throughout training, model :doc:`checkpoints <../checkpoints/intro>` will be saved inside ``.nemo`` files. These are archive files containing all the necessary components to restore a usable model, e.g.: +Throughout training, the model :doc:`checkpoints <../checkpoints/intro>` will be saved inside ``.nemo`` files. These are archive files containing all the necessary components to restore a usable model. For example: * model weights (``.ckpt`` files) * model configuration (``.yaml`` files) * tokenizer files @@ -212,7 +213,7 @@ Fine-Tuning ---------- NeMo allows you to fine-tune models as well as train them from scratch. -You can do this by initializing a model with random weights, replacing some/all the weights with those of a pretrained model, and then continuing training as normal, potentially with some small changes such as reducing your learning rate or freezing some model parameters.
+You can achieve this by initializing a model with random weights, then replacing some or all of those weights with the pretrained model’s weights. Afterward, continue training as usual, possibly making minor adjustments like reducing the learning rate or freezing specific model parameters. .. _where_next: @@ -222,18 +223,20 @@ Where To Go Next? Here are some options: -* dive in to `examples `_ or :doc:`tutorials <./tutorials>` -* read docs of the domain (e.g. :doc:`LLM <../nlp/nemo_megatron/intro>`, :doc:`Multimodal <../multimodal/mllm/intro>`, :doc:`ASR <../asr/intro>`, :doc:`TTS <../tts/intro>`, :doc:`Vision Models <../vision/intro>`) you want to work with -* learn more about the inner workings of NeMo: +* Explore Examples or Tutorials: dive into NeMo by exploring our `examples `_ or :doc:`tutorials <./tutorials>` + +* Domain-Specific Documentation: - * `NeMo Primer `_ notebook tutorial + * For Large Language Models (LLMs), checkout out the :doc:`LLM <../nlp/nemo_megatron/intro>` documentation. + * For Multimodal tasks, refer to the :doc:`Multimodal <../multimodal/mllm/intro>` documentation. - * hands-on intro to NeMo, PyTorch Lightning, and OmegaConf - * shows how to use, modify, save, and restore NeMo models + * If you’re interested in Automatic Speech Recognition (ASR), explore the :doc:`ASR <../asr/intro>`` documentation. + * For Text-to-Speech (TTS), find details in the :doc:`TTS <../tts/intro>` documentation. + * Lastly, for Vision Models, consult the :doc:`Vision Models <../vision/intro>` documentation. - * `NeMo Models `__ notebook tutorial +* `NeMo Primer `__: This tutorial provides a hands-on introduction to NeMo, PyTorch Lightning, and OmegaConf. It covers how to use, modify, save, and restore NeMo models. - * explains the fundamentals of how NeMo models are created +* `NeMo Models `__: In this tutorial, you'll learn the fundamentals of creating NeMo models. - * :doc:`NeMo Core <../core/core>` documentation +* NeMo Core Documentation: Explore the :doc:`NeMo Core <../core/core>` documentation for NeMo, which explains the inner workings of the framework. From 2350c00c3a5ae89120325a04a8e2346c6853a6e1 Mon Sep 17 00:00:00 2001 From: Elena Rastorgueva Date: Wed, 24 Jul 2024 15:29:38 -0700 Subject: [PATCH 12/12] update based on review plus other small fixes Signed-off-by: Elena Rastorgueva --- docs/source/starthere/fundamentals.rst | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/docs/source/starthere/fundamentals.rst b/docs/source/starthere/fundamentals.rst index 2612726f6d27..6413cb9d376a 100644 --- a/docs/source/starthere/fundamentals.rst +++ b/docs/source/starthere/fundamentals.rst @@ -69,7 +69,7 @@ NeMo models are also designed to be easily configurable; often this is done with :emphasize-lines: 4, 7, 10, 13, 16, 20 # - # desired configuration of the NeMo model + # configuration of the NeMo model model: tokenizer: ... @@ -90,7 +90,7 @@ NeMo models are also designed to be easily configurable; often this is done with train_ds: ... - # desired configuration of the + # configuration of the # PyTorch Lightning trainer object trainer: ... @@ -99,11 +99,11 @@ NeMo models are also designed to be easily configurable; often this is done with Configuring and Training NeMo Models ------------------------------------ -During initialization of the model, key parameters are read from the config (``cfg``), which gets passed in to the model construtor (left panel above, line 2). 
+During initialization of the model, the "model" section of the config is passed into the model's constructor (as the variable ``cfg``, see line 3 of the left panel above). The model class will read key parameters from the ``cfg`` variable to configure the model (see highlighted lines in the left panel above). -The other object passed into the model's constructor is a PyTorch Lightning ``trainer`` object, which manages the training process. The trainer handles the standard training `boilerplate `__. For non-standard tasks, PyTorch Lightning (PTL) relies on specific methods defined in our NeMo model. For example, PTL mandates that every model must have a specified ``training_step`` method (left panel above, line 15). +The other object passed into the model's constructor is a PyTorch Lightning ``trainer`` object, which manages the training process. The trainer handles the standard training `boilerplate `__. For non-standard tasks, PyTorch Lightning (PTL) relies on specific methods defined in our NeMo model. For example, PTL mandates that every model must have a specified ``training_step`` method (left panel above, line 27). -The trainer’s configuration is also specified in the config (right panel above, line 20 onwards). This includes parameters such as ``accelerator``, (number of) ``devices``, ``max_steps``, (numerical) ``precision`` and `more `__. +The trainer’s configuration is also specified in the config (right panel above, line 25 onwards). This includes parameters such as ``accelerator``, (number of) ``devices``, ``max_steps``, (numerical) ``precision`` and `more `__. Example Training Script @@ -136,10 +136,10 @@ Below is an example training script for our ``ExampleEncDecModel`` model. We hig Let's go through the code: * *Lines 1-3*: import statements (second one is made up for the example). -* *Lines 5-8*: a decorator on lines 5-8 of ``run_example_training.py`` will look for a config file at ``{config_path}/{config_name}.yaml`` and load its contents into the ``cfg`` object that is passed into the ``main`` function. This functionality is provided by `Hydra `__. Instead of a YAML file, we could also have specified the default config as a dataclass and passed that into the ``@hydra_runner`` decorator. -* *Line 7*: initialize a PTL trainer object using the parameters specified in the ``trainer`` section of the config. -* *Line 8*: initialize a NeMo model, passing in both the parameters in the ``model`` section of the config, and a PTL trainer. -* *Line 9*: call ``trainer.fit`` on the model. This one unassuming line will carry out our entire training process. PTL will make sure we iterate over our data and call the ``training_step`` we define for each batch (as well as any other PTL `callbacks `__ that may have been defined). +* *Lines 5-8*: the decorator will look for a config file at ``{config_path}/{config_name}.yaml`` and load its contents into the ``cfg`` object that is passed into the ``main`` function on line 9. This functionality is provided by `Hydra `__. Instead of a YAML file, we could also have specified the default config as a dataclass and passed that into the ``@hydra_runner`` decorator. +* *Line 10*: initialize a PTL trainer object using the parameters specified in the ``trainer`` section of the config. +* *Line 11*: initialize a NeMo model, passing in both the parameters in the ``model`` section of the config, and a PTL ``trainer`` object. +* *Line 12*: call ``trainer.fit`` on the model. This one unassuming line will carry out our entire training process. 
PTL will make sure we iterate over our data and call the ``training_step`` we define for each batch (as well as any other PTL `callbacks `__ that may have been defined). @@ -230,7 +230,7 @@ Here are some options: * For Large Language Models (LLMs), check out the :doc:`LLM <../nlp/nemo_megatron/intro>` documentation. * For Multimodal tasks, refer to the :doc:`Multimodal <../multimodal/mllm/intro>` documentation. - * If you’re interested in Automatic Speech Recognition (ASR), explore the :doc:`ASR <../asr/intro>`` documentation. + * If you’re interested in Automatic Speech Recognition (ASR), explore the :doc:`ASR <../asr/intro>` documentation. * For Text-to-Speech (TTS), find details in the :doc:`TTS <../tts/intro>` documentation. * Lastly, for Vision Models, consult the :doc:`Vision Models <../vision/intro>` documentation.
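Two short sketches follow to tie together points made earlier on this page. First, the inference-script passage above notes that the default configs for inference scripts are usually defined as dataclasses and passed to ``@hydra_runner`` rather than stored in separate YAML files. A minimal sketch of that pattern is shown below; the class name and fields are illustrative rather than the config of any particular NeMo script, and the ``schema`` keyword argument should be checked against the ``hydra_runner`` implementation in your NeMo version.

.. code-block:: python

    from dataclasses import dataclass
    from typing import Optional

    from nemo.core.config import hydra_runner


    @dataclass
    class ExampleInferenceConfig:
        # illustrative fields only
        model_path: Optional[str] = None        # path to a .nemo checkpoint
        dataset_manifest: Optional[str] = None  # data to run inference on
        batch_size: int = 32


    @hydra_runner(config_name="ExampleInferenceConfig", schema=ExampleInferenceConfig)
    def main(cfg):
        ...  # run inference using cfg


    if __name__ == "__main__":
        main()

Individual fields can still be overridden from the command line, for example ``python example_inference.py model_path=model.nemo batch_size=8``.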
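Second, to make the fine-tuning description above concrete, here is a rough outline. It reuses the made-up ``ExampleEncDecModel`` and the config keys from earlier on this page, and assumes the model inherits from NeMo's ``ModelPT`` base class, which provides ``restore_from`` and ``set_trainer``. Treat it as a sketch of the workflow, not an exact recipe.

.. code-block:: python

    import pytorch_lightning as pl
    from nemo.collections.path_to_model_class import ExampleEncDecModel  # made-up import, as before

    trainer = pl.Trainer(accelerator="gpu", devices=1, max_steps=1000)

    # Restore the weights and config packaged inside a .nemo checkpoint file
    model = ExampleEncDecModel.restore_from("pretrained_model.nemo")
    model.set_trainer(trainer)

    # Point the model at your own data and lower the learning rate for fine-tuning
    model.cfg.train_ds.manifest_filepath = "your_train_data.json"
    model.cfg.optim.lr = 1e-5
    model.setup_training_data(model.cfg.train_ds)

    # Optionally freeze part of the network, for example the encoder
    for param in model.encoder.parameters():
        param.requires_grad = False

    trainer.fit(model)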