novasearch/cuda

 
 


.. highlight:: rest

This roll installs:

  • NVIDIA CUDA Toolkit 9.2.148 + Patch 1
  • NVIDIA Driver 396.54
  • NVIDIA CUDA Deep Neural Network library (cuDNN) 7.4.2.24

For more information about the NVIDIA CUDA Toolkit, please see the official NVIDIA developer website.

To build or install this roll, download the CUDA toolkit and driver source files (*.run format) and place them in their respective directories under src/.

The toolkit distribution is about 1 GB. Make sure / has enough free space (about 1.5 GB) when building the roll.
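The free-space requirement can be checked before starting a build. The following is a minimal sketch, not part of the roll: the helper names `avail_kb` and `has_space` are hypothetical, and the threshold is the approximate 1.5 GB figure quoted above.

```shell
# Sketch only: avail_kb and has_space are hypothetical helper names.

# Print the available space, in KB, on the filesystem holding the given path.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if the available space ($1, KB) is at least the required space ($2, KB).
has_space() {
    [ "$1" -ge "$2" ]
}

need_kb=$((1536 * 1024))   # about 1.5 GB expressed in KB
if has_space "$(avail_kb /)" "$need_kb"; then
    echo "OK: enough free space in /"
else
    echo "WARNING: less than 1.5 GB free in /"
fi
```

The `-P` option keeps the `df` output on one line per filesystem, so the fourth column of the second line is reliably the available space.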

To build the roll, execute:

# make 2>&1 | tee build.log

A successful build creates a cuda-*.x86_64*.iso file.

To add this roll to an existing cluster, execute these instructions on a Rocks frontend node:

# rocks add roll *.iso
# rocks enable roll cuda
# cd /export/rocks/install
# rocks create distro
# rocks run roll cuda > add-roll.sh

Then, on the login node, execute the resulting add-roll.sh:

# bash add-roll.sh 2>&1 | tee add-roll.out

Reinstall the compute nodes (GPU-enabled nodes only):

# rocks set host attr compute-X-Y cuda true
# rocks set host boot compute-X-Y action=install
# rocks run host compute-X-Y reboot
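When several GPU-enabled nodes need reinstalling, the three commands above can be generated per host. The sketch below only prints the commands so they can be reviewed first. The function name `reinstall_cmds` and the host names are hypothetical; the `rocks` commands are the ones shown above.

```shell
# Sketch only: reinstall_cmds is a hypothetical helper, not part of the roll.
# Print (do not run) the reinstall commands for each host name given.
reinstall_cmds() {
    for host in "$@"; do
        echo "rocks set host attr $host cuda true"
        echo "rocks set host boot $host action=install"
        echo "rocks run host $host reboot"
    done
}

# Hypothetical host names; replace them with your GPU-enabled nodes.
reinstall_cmds compute-0-0 compute-0-1
```

After checking the printed commands, they can be piped to `bash` on the frontend to run them.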

After the compute node comes up, reboot it again to initiate the driver installation and loading.

In addition to the software, the roll installs CUDA environment module files in:

/opt/modulefiles/applications/cuda

To use the modules:

% module load cuda

The following is installed with the cuda roll:

/opt/cuda/driver - NVIDIA driver
/etc/init.d/nvidia - nvidia startup/shutdown script (disabled on login node)
/opt/cuda - toolkit (without samples on compute nodes)
/opt/modules/applications/cuda - module environment

On login nodes:

/opt/cuda/samples - code samples
/var/www/html/cuda - link to cuda html documentation

Run the test commands on GPU-enabled nodes.

To find information about the installed GPU card, execute:

nvidia-smi

Run the GPU device tests:

% /opt/cuda/bin/deviceQuery
% /opt/cuda/bin/deviceQueryDrv
% /opt/cuda/bin/bandwidthTest
% /opt/cuda/bin/p2pBandwidthLatencyTest
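The device tests can also be run in one pass with a pass/fail summary. This is a rough sketch under one assumption: that each test binary returns a non-zero exit status on failure. The function name `run_gpu_tests` and the log location are hypothetical.

```shell
# Sketch only: run_gpu_tests is a hypothetical helper, not part of the roll.
# Assumption: each test binary exits non-zero when it fails.
# Usage: run_gpu_tests DIR TEST [TEST ...]
run_gpu_tests() {
    dir=$1
    shift
    failed=0
    for t in "$@"; do
        # Keep each test's full output in a per-test log file.
        if "$dir/$t" > "${TMPDIR:-/tmp}/$t.log" 2>&1; then
            echo "PASS $t"
        else
            echo "FAIL $t"
            failed=1
        fi
    done
    # Non-zero return if any test failed.
    return "$failed"
}
```

For example, `run_gpu_tests /opt/cuda/bin deviceQuery deviceQueryDrv bandwidthTest p2pBandwidthLatencyTest` runs the four tests listed above and leaves their output in the log files for inspection.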

Some users report an increase in virtual memory use when using CUDA. See the following links for additional information.

Useful commands:

pmap -x PID
more /proc/PID/smaps
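To compare virtual and resident memory for a process, the per-mapping figures in smaps can be totalled. The following is a minimal sketch. The function name `smaps_totals` is hypothetical, and it relies on the standard `Size:` and `Rss:` fields that Linux reports in kB for each mapping.

```shell
# Sketch only: smaps_totals is a hypothetical helper, not part of the roll.
# Sum the Size (virtual) and Rss (resident) fields, in kB, of a smaps file.
smaps_totals() {
    awk '/^Size:/ { size += $2 }
         /^Rss:/  { rss  += $2 }
         END { printf "virtual_kb=%d resident_kb=%d\n", size, rss }' "$1"
}
```

For example, `smaps_totals /proc/PID/smaps` prints both totals for the process with that PID. A large virtual total alongside a small resident total means the address space is reserved but mostly not backed by physical memory.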

GPU monitoring plugin for gmond
