This repository was archived by the owner on Nov 17, 2023. It is now read-only.

Bert ONNX export produces nan on Tensor RT in fp16 mode #19747

@Zha0q1

Description


I am trying to enable the path mxnet/gluon-nlp --> ONNX --> TensorRT.
There is a bug: if I use a pretrained BERT model, running inference with TensorRT in fp16 mode produces NaNs.

Using pretrained weights:

bert, _ = nlp.model.get_model(
    name=model_name,
    ctx=ctx,
    dataset_name=dataset,
    pretrained=True,  # load the pretrained weights
    use_pooler=True,
    use_decoder=False,
    num_layers=3, # hardcode this as 3 layer since this is what the customer uses
    use_classifier=False,
    hparam_allow_override=True)
model = bert

Not using pretrained weights:

bert, _ = nlp.model.get_model(
    name=model_name,
    ctx=ctx,
    dataset_name=dataset,
    pretrained=False,  # do not load pretrained weights
    use_pooler=True,
    use_decoder=False,
    num_layers=3, # hardcode this as 3 layer since this is what the customer uses
    use_classifier=False,
    hparam_allow_override=True)
model = bert
model.initialize(ctx=ctx)  # initialize parameters instead of loading pretrained ones

More specifically, WITHOUT pretrained weights, TensorRT produces reasonable outputs in both fp16 mode and regular fp32 mode. WITH pretrained weights, however, TensorRT produces NaN outputs in fp16 mode, while fp32 mode seems to work fine. Furthermore, the NaN issue seems to be triggered by the size of seq_length: when seq_length<=16, even fp16 mode produces reasonable outputs; when seq_length>17, fp16 mode starts to produce NaNs. Batch size does not seem to affect the NaN behavior.
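To pin down the seq_length threshold more precisely, a sweep along the following lines might help. This is only a minimal sketch: `first_nan_seq_length` and `fake_fp16_inference` are hypothetical names, not part of MXNet, gluon-nlp, or TensorRT. The fake inference function stands in for a real TensorRT fp16 call, and its threshold of 20 is arbitrary, chosen only to exercise the helper.

```python
import numpy as np


def first_nan_seq_length(run_inference, seq_lengths, batch_size=1):
    """Return the smallest seq_length whose output contains NaN/inf, else None.

    run_inference is any callable (batch_size, seq_length) -> array-like.
    In a real setup it would wrap the TensorRT fp16 engine (assumption).
    """
    for seq_length in sorted(seq_lengths):
        out = np.asarray(run_inference(batch_size, seq_length), dtype=np.float32)
        if not np.all(np.isfinite(out)):
            return seq_length
    return None


# Hypothetical stand-in for a TensorRT fp16 call. The threshold of 20 is
# arbitrary and says nothing about the real model's behavior.
def fake_fp16_inference(batch_size, seq_length):
    out = np.ones((batch_size, seq_length), dtype=np.float16)
    if seq_length > 20:
        out[:] = np.nan
    return out


print(first_nan_seq_length(fake_fp16_inference, range(8, 33)))  # prints 21
```

Replacing `fake_fp16_inference` with a wrapper around the actual engine, and repeating the sweep for a few batch sizes, would show whether the threshold really is independent of batch size.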

Reproducible code and steps can be found in #19746. Because we have a customer requesting this feature, it would be great if friends at Nvidia could help look into this issue. Please let me know how I can provide further info/help.

@sandeep-krishnamurthy @MoisesHer @Kh4L @chinakook
