Description
See NVIDIA/DIGITS#310.
/cc @ajsander
I've trained a couple of models (AlexNet and GoogLeNet) using DIGITS successfully, with statistics shown for test and validation accuracy, but when I try to classify a single image using the web interface I get the following error:
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0915 14:10:45.809661 98789 common.cpp:266] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***

When I check nvidia-smi, it appears that the amount of memory in use increases by around 100MB per attempt, but it's still nowhere near the card's full 3GB capacity.
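For scale, that ~100MB step is roughly what a single data blob for a batch of images would occupy. A minimal sketch of the arithmetic (the batch size of 128 and float32 elements are assumptions for illustration, not values reported in the thread):

```python
def blob_bytes(batch, channels, height, width, bytes_per_elem=4):
    """Size in bytes of one float32 data blob, e.g. a batch of RGB images."""
    return batch * channels * height * width * bytes_per_elem

# Hypothetical: a batch of 128 images at the dataset's 256x256 RGB size.
size = blob_bytes(128, 3, 256, 256)
print(size / 2**20)  # 96.0 MiB -- on the order of the jump seen in nvidia-smi
```

If the web interface allocates a blob like this per classification request without freeing it, memory would climb in similar increments.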
NVIDIA/DIGITS#310 (comment)
Here is some information about his system:
Running on an Amazon g2.8xlarge
GPU[s]: 4x GRID K520
CUDA 7.0
cuDNN 7.0
Caffe version 0.12 NVIDIA fork
DIGITS 2.1

Both AlexNet and GoogLeNet experienced the same problem.
NVIDIA/DIGITS#310 (comment)
Here's how I reproduced it:
- Start up Ubuntu 14.04 on a g2.8xlarge EC2 instance
- Install the 346 driver
- Install DIGITS 2.0 and Caffe 0.13.1 (with CNMeM) using the web installer
- Create a small dataset of 256x256 images
- Train AlexNet on it
- Try to classify an image
The big question
Why would we run out of memory during inference but not while training?
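One way to frame it: the training process already holds its allocations, while classifying from the web interface may spin up a separate Caffe process that needs its own CUDA context plus a full copy of the network weights on top of whatever training still occupies. A back-of-envelope sketch (the parameter count and per-process context cost below are rough assumptions, not measurements from this issue):

```python
# Rough estimate of the *extra* device memory a fresh inference process needs.
ALEXNET_PARAMS = 61_000_000   # approximate AlexNet parameter count (assumption)
BYTES_PER_FLOAT = 4
CONTEXT_MB = 100              # ballpark per-process CUDA context cost (assumption)

weights_mb = ALEXNET_PARAMS * BYTES_PER_FLOAT / 2**20
overhead_mb = weights_mb + CONTEXT_MB
print(round(overhead_mb))  # ~333 MB before any data blobs are allocated
```

On a 3GB card already hosting a training job (model, gradients, solver state, and a CNMeM pool if enabled), a few hundred extra MB per inference attempt could plausibly tip it over even though nvidia-smi looks far from full at the moment of the check.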