Add Bert HF checkpoint converter by yaoyu-33 · Pull Request #8088 · NVIDIA-NeMo/NeMo

yaoyu-33 · 2023-12-27T16:53:03Z

What does this PR do ?

Add BERT HF checkpoint converter and HF Bert Support in NeMo

Collection: [Note which collection this PR will affect]

Changelog

Add specific line by line info of high level changes in this PR.

Usage

You can potentially add a usage example below

# Add a code snippet demonstrating how to use this

Jenkins CI

To run Jenkins, a NeMo User with write access must comment jenkins on the PR.

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you add or update any necessary documentation?
Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
- Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

New Feature
Bugfix
Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

Related to # (issue)

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

for more information, see https://pre-commit.ci

scripts/nlp_language_modeling/convert_hf_bert_to_nemo.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

for more information, see https://pre-commit.ci

scripts/nlp_language_modeling/convert_nemo_bert_to_hf.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

for more information, see https://pre-commit.ci

yaoyu-33 · 2024-01-22T22:31:49Z

jenkins

yaoyu-33 · 2024-01-22T22:38:04Z

jenkins

yaoyu-33 · 2024-01-22T22:56:23Z

jenkins

shanmugamr1992 · 2024-01-22T23:36:50Z

examples/nlp/language_modeling/conf/megatron_bert_config.yaml

 restore_from_path: null # used when starting from a .nemo file

 trainer:
-  devices: 2


Why make changes to this file ?

default conversion script uses this yaml file and use 1 gpu by default. and other models like https://github.com/NVIDIA/NeMo/blob/36a31c03497985c8f2165d4f9be788b574443391/examples/nlp/language_modeling/conf/megatron_gpt_config.yaml uses 1 gpu as well. Is there a reason to do 2?

shanmugamr1992 · 2024-01-22T23:36:59Z

examples/nlp/language_modeling/conf/megatron_bert_config.yaml

    #   - /raid/data/pile/my-gpt3_00_text_document
    #   - .5
    #   - /raid/data/pile/my-gpt3_01_text_document
-    data_prefix: ???


Here as well ?

similar to above reason, try to make conversion work easier, since it's directly load this yaml. If put ??? it will raise an error while loading I think

shanmugamr1992 · 2024-01-22T23:37:40Z

examples/nlp/language_modeling/conf/megatron_bert_config.yaml

  ffn_hidden_size: 3072 # Transformer FFN hidden size. Usually 4 * hidden_size.
  num_attention_heads: 12
+  skip_head: False
+  transformer_block_type: post_ln


Default is preln?

previous default is preln. you want me to change it back? I am okay with it

@ericharper What do you think ?

shanmugamr1992 · 2024-01-22T23:38:08Z

nemo/collections/nlp/models/language_modeling/megatron/bert_model.py

        activations_checkpoint_layers_per_pipeline=None,
        layernorm_epsilon=1e-5,
+        normalization='layernorm',
+        transformer_block_type='pre_ln',


Not necessary to change, but should we consider making these values enums ? So that people dont do typos ?

yeah, that will affect all models, maybe add in nemo refactor plan

shanmugamr1992 · 2024-01-22T23:42:17Z

nemo/collections/nlp/models/language_modeling/megatron/bert_model.py


+        if skip_head:
+            self.post_process = False
        if self.post_process:


if not skip_head and if self.post_process : ?? Or maybe if self.post_process is used in other areas just leave it as such.

yeah, it affect other places as well. But I can change it all to if not skip_head and if self.post_process. Do you want me to change? I remember current way is cleaner in terms of code but not necessarily more readable

Well, if you think its cleaner then leave it as such.

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

yaoyu-33 · 2024-01-26T19:35:13Z

jenkins

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

yaoyu-33 · 2024-01-30T23:08:19Z

jenkins

Add Bert HF checkpoint converter

fe87ed8

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

github-actions bot added the NLP label Dec 27, 2023

[pre-commit.ci] auto fixes from pre-commit.com hooks

0cbdf71

for more information, see https://pre-commit.ci

github-advanced-security bot found potential problems Dec 27, 2023

View reviewed changes

yaoyu-33 and others added 5 commits January 8, 2024 15:12

Reformat

1c85c19

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

Add BERT ONNX export

40ff7b4

[pre-commit.ci] auto fixes from pre-commit.com hooks

11f3d76

for more information, see https://pre-commit.ci

Add NeMo BERT to HF BERT script

b03839b

[pre-commit.ci] auto fixes from pre-commit.com hooks

eab88a9

for more information, see https://pre-commit.ci

github-advanced-security bot found potential problems Jan 12, 2024

View reviewed changes

scripts/nlp_language_modeling/convert_nemo_bert_to_hf.py Fixed Show fixed Hide fixed

yaoyu-33 and others added 2 commits January 22, 2024 13:38

Clean code

93ee6f9

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

a4402f3

for more information, see https://pre-commit.ci

yaoyu-33 changed the title ~~[DRAFT] Add Bert HF checkpoint converter~~ Add Bert HF checkpoint converter Jan 22, 2024

Merge branch 'main' into yuya/add_bert_hf_converter

36a31c0

shanmugamr1992 reviewed Jan 22, 2024

View reviewed changes

yaoyu-33 mentioned this pull request Jan 22, 2024

Add support for HF style Bert #8219

Closed

shanmugamr1992 reviewed Jan 22, 2024

View reviewed changes

yaoyu-33 mentioned this pull request Jan 22, 2024

[ENHANCEMENT] Add support for HF style Bert NVIDIA/Megatron-LM#671

Closed

shanmugamr1992 reviewed Jan 22, 2024

View reviewed changes

shanmugamr1992 previously approved these changes Jan 23, 2024

View reviewed changes

Update argument names

471b802

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

yaoyu-33 dismissed shanmugamr1992’s stale review via 471b802 January 26, 2024 19:34

Update build_transformer_config in Bert

c61c7d6

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

Conversation

yaoyu-33 commented Dec 27, 2023

What does this PR do ?

Changelog

Usage

Jenkins CI

Before your PR is "Ready for review"

Who can review?

Additional Information

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

yaoyu-33 commented Jan 22, 2024

Uh oh!

yaoyu-33 commented Jan 22, 2024

Uh oh!

yaoyu-33 commented Jan 22, 2024

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

yaoyu-33 commented Jan 26, 2024

Uh oh!

yaoyu-33 commented Jan 30, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants