This repository was archived by the owner on Nov 17, 2023. It is now read-only.

[Performance Regression] GPU memory increase for training and inference models #18280

@karan6181

Description


  • The MXNet nightly benchmark, which runs CV and NLP models against the MXNet nightly pip wheel and reports metrics, showed a regression in GPU memory usage.
  • After bisecting the PRs, the one that introduces the 120-130 MB increase in GPU memory is 17767. I ran SSD training with and without that PR's commit and saw an increase of around 120 MB in GPU memory. I haven't run all of the models below myself, but the reproduction script is minimal and applies to all of them.
  • The affected models are listed below. The GPU memory data comes from our internal benchmarking system (G = Gluon, M = Module):

| Model | GPU memory (from : to) |
|---|---|
| VGG16_training | 9.63k to 9.78k |
| SSD_training | 4.75k to 4.91k |
| LSTM_inference: Gluon and Module | G: 1.55k to 1.71k; M: 1.47k to 1.64k |
| Caffenet_inference: Gluon and Module | G: 2.17k to 2.31k; M: 2.0k to 2.16k |
| YoloV3_GPU: Gluon and Module | G: 1.87k to 2.03k; M: 1.84k to 2.0k |
| Resnet50_v2_FP16_inference_GPU: Gluon and Module | G: 1.58k to 1.74k; M: 1.55k to 1.71k |
| Resnet50_v2_inference_GPU: Gluon and Module | G: 1.55k to 1.71k; M: 1.48k to 1.64k |
| Resnet152_v2_inference_GPU: Gluon and Module | G: 1.92k to 2.08k; M: 1.86k to 2.02k |
| Inception_Inference_GPU: Gluon and Module | G: 1.43k to 1.59k; M: 1.40k to 1.56k |
| SSD_inference_GPU: Gluon and Module | G: 1.65k to 1.81k; M: 1.54k to 1.70k |
| A3C_inference_GPU: Gluon and Module | G: 1.32k to 1.48k; M: 1.31k to 1.47k |
| word_language_model_hybrid_p3.16_training | 2.25k to 2.43k |
| Mobile_pose_training | 2.38k to 2.54k |
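
To make the size of the increase easier to read off, here is a minimal sketch that converts a few of the reported readings into per-model deltas. It assumes the "k" suffix means thousands of MB as shown by the benchmarking dashboard. The helper names `to_mb` and `increases` are hypothetical and are not part of the benchmark.

```python
# A few (before, after) readings copied from the table in this issue.
# Assumption: the "k" suffix means thousands of MB.
REPORTED = {
    "VGG16_training": ("9.63k", "9.78k"),
    "SSD_training": ("4.75k", "4.91k"),
    "Resnet50_v2_inference_GPU (Gluon)": ("1.55k", "1.71k"),
    "Resnet50_v2_inference_GPU (Module)": ("1.48k", "1.64k"),
    "word_language_model_hybrid_p3.16_training": ("2.25k", "2.43k"),
}


def to_mb(value):
    """Convert a reading such as '1.55k' to an integer number of MB."""
    text = value.strip().lower()
    if text.endswith("k"):
        return round(float(text[:-1]) * 1000)
    return round(float(text))


def increases(reported):
    """Return {model: after - before} in MB for each reported pair."""
    return {
        model: to_mb(after) - to_mb(before)
        for model, (before, after) in reported.items()
    }


if __name__ == "__main__":
    for model, delta in increases(REPORTED).items():
        print(f"{model}: +{delta} MB")
```

Because the dashboard values are rounded to two decimals, the computed deltas are only accurate to about 10 MB.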

To Reproduce

  • Run the code below and monitor GPU memory usage with the nvidia-smi command.
import mxnet as mx
a = mx.nd.zeros((1,), ctx=mx.gpu())

Output:

MXNet build from source:

Instance: p3.16xlarge

Without PR [17767] Commit id: f882de0c7ecd6ff1f0fdba492865afc6d7e29271
GPU usage: gpu(0): 1279MiB / 16160MiB

With PR [17767] Commit id: 5542d03695b4a2589afb88acf128d4ba8ac94d0d
GPU usage: gpu(0): 1407MiB / 16160MiB
  • I am tagging the PR author here: @ptrendx
  • Thanks @ptrendx for sharing the simple reproducible script.
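
For anyone repeating the comparison, the nvidia-smi reading can also be captured programmatically instead of read by eye. The following is a rough sketch, not part of the original report: the helper names are hypothetical, and it assumes `nvidia-smi` is on the PATH and that the process holding the GPU context is still alive when the query runs.

```python
import subprocess
from typing import List


def parse_used_mib(csv_output: str) -> List[int]:
    """Parse 'memory.used' values (one integer per GPU, in MiB)."""
    return [int(line.strip()) for line in csv_output.splitlines() if line.strip()]


def query_used_mib() -> List[int]:
    """Ask nvidia-smi for the used memory of every visible GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_used_mib(result.stdout)


def delta_mib(before: int, after: int) -> int:
    """Difference between two readings, in MiB."""
    return after - before


if __name__ == "__main__":
    # Readings for gpu(0) reported in this issue, without and with PR 17767.
    print(delta_mib(1279, 1407), "MiB increase")
```

Note that nvidia-smi reports total usage per GPU across all processes, so the comparison is only meaningful on an otherwise idle device.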
