CUDA

CUDA (formerly Compute Unified Device Architecture) is a parallel computing platform and programming model created by NVIDIA and implemented by the graphics processing units (GPUs) that it produces. CUDA gives developers access to the virtual instruction set and memory of the parallel computational elements in CUDA GPUs.

Policy

The CUDA Toolkit is freely available to users at HPC2N.

Citations

The following paper is frequently cited:

John Nickolls, Ian Buck, Michael Garland, Kevin Skadron
Scalable Parallel Programming with CUDA
ACM Queue, vol. 6 no. 2, March/April 2008, pp. 40-53 

Overview

The NVIDIA® CUDA® Toolkit provides a development environment for creating high performance GPU-accelerated applications. With the CUDA Toolkit, you can develop, optimize and deploy your applications on GPU-accelerated embedded systems, desktop workstations, enterprise data centers, cloud-based platforms and HPC supercomputers. The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler and a runtime library to deploy your application.

Usage at HPC2N

At HPC2N, CUDA is available as a module.

Loading

To use CUDA, add it to your environment. You can list the available versions with

module spider CUDA 

and then you can find how to load a specific version (including prerequisites) with

module spider CUDA/<VERSION>

Loading the module should set any needed environment variables as well as the path.

Compiling and linking

After you have loaded the compiler toolchain module and CUDA, you can compile and link against CUDA as follows:

| Language | GCC, OpenMPI | Intel, Intel MPI | NVCC |
|---|---|---|---|
| Fortran calling CUDA functions | 1) nvcc -c CUDAPROGRAM.cu; 2) gfortran -lcudart -lcuda PROGRAM.f90 CUDAPROGRAM.o | | |
| C / C++ with CUDA | mpicc CUDAPROGRAM.cu -lcuda -lcudart | mpiicc CUDAPROGRAM.cu -lcuda -lcudart | nvcc CUDAPROGRAM.cu |

You can add other flags, for instance -o my-binary to give the executable a name other than the default a.out.

NOTE: CUDA functions can be called directly from Fortran programs:

  1. First use the nvcc compiler to create an object file from the .cu file.
  2. Then compile the Fortran code together with the object file from the .cu file.

External info: CUDA Toolkit
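
For the two-step Fortran build to work, the .cu file has to export a routine whose symbol name the Fortran compiler can find. The following is a minimal sketch of what such a CUDAPROGRAM.cu might look like; the names scale_kernel and scale_array are hypothetical and chosen only for illustration, and error handling is kept to a bare minimum.

```cuda
// CUDAPROGRAM.cu: hypothetical example of a CUDA routine callable from Fortran.
// The names scale_kernel and scale_array are illustrative assumptions.
#include <cuda_runtime.h>

// Kernel: multiply each of the n elements of x by factor.
__global__ void scale_kernel(float *x, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= factor;
}

// extern "C" disables C++ name mangling, so the exported symbol is exactly
// "scale_array". Returns 0 on success, otherwise the CUDA error code.
extern "C" int scale_array(float *host_x, int n, float factor)
{
    if (n <= 0) return 0;  // nothing to do, avoid an empty kernel launch

    float *dev_x = nullptr;
    size_t bytes = (size_t)n * sizeof(float);

    cudaError_t err = cudaMalloc((void **)&dev_x, bytes);
    if (err != cudaSuccess) return (int)err;

    // Copy the input to the GPU, scale it there, and copy it back.
    err = cudaMemcpy(dev_x, host_x, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        scale_kernel<<<blocks, threads>>>(dev_x, n, factor);
        err = cudaGetLastError();  // reports kernel launch errors
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(host_x, dev_x, bytes, cudaMemcpyDeviceToHost);

    cudaFree(dev_x);
    return (int)err;
}
```

On the Fortran side, one way to call this (again an assumption, not taken from the HPC2N documentation) is to declare an interface with bind(C, name="scale_array") from the iso_c_binding module, passing the array as real(c_float) and the arguments n and factor by value.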

Example, nvcc

To compile a CUDA program with the NVIDIA CUDA compiler driver nvcc, you first need to load CUDA.

We will be compiling the small test program “hello-world.cu” (and naming the executable “hello”):

nvcc hello-world.cu -o hello
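
The contents of hello-world.cu are not shown here. As a stand-in, the following is a minimal sketch of what such a test program could look like; it is not the HPC2N original, and the names hello_kernel and launch_hello are hypothetical. It launches a small kernel in which every GPU thread prints one line.

```cuda
// hello-world.cu: hypothetical minimal test program (not the HPC2N original).
#include <cstdio>
#include <cuda_runtime.h>

// Kernel: each GPU thread prints its own block and thread index.
__global__ void hello_kernel()
{
    printf("Hello world from block %d, thread %d\n", blockIdx.x, threadIdx.x);
}

// Launch the kernel on 2 blocks of 4 threads and wait for it to finish.
// Returns the first CUDA error encountered, or cudaSuccess.
cudaError_t launch_hello()
{
    hello_kernel<<<2, 4>>>();
    cudaError_t err = cudaGetLastError();  // reports kernel launch errors
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize();        // waits and flushes device printf
}

int main()
{
    cudaError_t err = launch_hello();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

With this sketch the program prints eight lines, one per thread; the order in which the lines appear is not guaranteed. If no usable GPU is present, it prints a CUDA error message and exits with a non-zero status, which is a quick way to notice that a job did not land on a GPU node.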

Submitting as a batch job

Let us submit a small job that compiles and runs the above program:

#!/bin/bash 
# Remember to change this to your own project ID! 
#SBATCH -A hpc2nXXXX-YYY
#SBATCH --time=00:05:00
#SBATCH --error=job.%J.err
#SBATCH --output=job.%J.out
# We need to run on GPUs. Here asking for 1 Nvidia GPU 
#SBATCH --gpus=1 
#SBATCH -C nvidia_gpu

ml purge > /dev/null 2>&1
ml CUDA/12.6.0 

nvcc hello-world.cu -o hello
./hello

We submit the above job script with

sbatch <jobscript>

Additional info