CUDA

CUDA (formerly Compute Unified Device Architecture) is a parallel computing platform and programming model created by NVIDIA and implemented by the graphics processing units (GPUs) that they produce. CUDA gives developers access to the virtual instruction set and memory of the parallel computational elements in CUDA GPUs.

Policy

The CUDA Toolkit is freely available to users at HPC2N.

Citations

The following paper is frequently cited:

John Nickolls, Ian Buck, Michael Garland, Kevin Skadron
Scalable Parallel Programming with CUDA
ACM Queue, vol. 6 no. 2, March/April 2008, pp. 40-53

Overview

The NVIDIA® CUDA® Toolkit provides a development environment for creating high performance GPU-accelerated applications. With the CUDA Toolkit, you can develop, optimize and deploy your applications on GPU-accelerated embedded systems, desktop workstations, enterprise data centers, cloud-based platforms and HPC supercomputers. The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler and a runtime library to deploy your application.

Usage at HPC2N

At HPC2N, CUDA is available as a module.

Loading

To use CUDA, add it to your environment. You can list the available versions with

module spider CUDA

and then you can find how to load a specific version (including prerequisites) with

module spider CUDA/<VERSION>

Loading the module should set any needed environment variables as well as the path.
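As a quick check after loading, the following sketch confirms that the nvcc compiler driver is on the path. It assumes CUDA/12.6.0 (the version used in the batch script on this page) and skips the load step if the module command is not defined in the current shell; the variable name cuda_status is only illustrative.

```shell
#!/bin/bash
# Load CUDA if the module command exists in this shell.
# CUDA/12.6.0 is the version used in the batch script on this page; adjust as needed.
if type module >/dev/null 2>&1; then
    module load CUDA/12.6.0
fi

# Check whether the nvcc compiler driver is now on the PATH.
if command -v nvcc >/dev/null 2>&1; then
    cuda_status="nvcc found at $(command -v nvcc)"
    nvcc --version
else
    cuda_status="nvcc not found: load a CUDA module first"
fi
echo "$cuda_status"
```

If nvcc is reported as not found, run module spider CUDA/<VERSION> to see whether the chosen version needs other modules loaded first.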

Compiling and linking

After you have loaded a compiler toolchain module and CUDA, compile and link as follows:

Language	GCC, OpenMPI	Intel, Intel MPI	NVCC
Fortran calling CUDA functions			1) nvcc -c CUDAPROGRAM.cu 2) gfortran PROGRAM.f90 CUDAPROGRAM.o -lcudart -lcuda
C / C++ with CUDA	mpicc CUDAPROGRAM.cu -lcuda -lcudart	mpiicc CUDAPROGRAM.cu -lcuda -lcudart	nvcc CUDAPROGRAM.cu

You can add other flags, for instance -o my-binary to give the output a name other than the default a.out.

NOTE: CUDA functions can be called directly from Fortran programs:

First use the nvcc compiler to create an object file from the .cu file.
Then compile the Fortran code together with the object file from the .cu file.
External info: CUDA Toolkit
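The two steps above can be sketched end to end as follows. This is a hypothetical minimal pair of files, not code from HPC2N: the function scale_on_gpu, the kernel scale_kernel and the output name fortran-cuda are invented for illustration. The sketch assumes that gfortran and nvcc are both loaded, and it adds -lstdc++ to the link line, which is not in the table above, in case the object file produced by nvcc references C++ runtime symbols.

```shell
#!/bin/bash
# CUDA side: a kernel plus a C-linkage wrapper that Fortran can bind to.
cat > CUDAPROGRAM.cu <<'EOF'
#include <cuda_runtime.h>

// Kernel: multiply every element of x by a.
__global__ void scale_kernel(float *x, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

// extern "C" keeps the name unmangled so Fortran can find it.
// Returns 0 on success, otherwise the CUDA error code.
extern "C" int scale_on_gpu(float *x, float a, int n)
{
    float *d_x = nullptr;
    size_t bytes = n * sizeof(float);
    cudaError_t err = cudaMalloc(&d_x, bytes);
    if (err != cudaSuccess) return (int)err;
    cudaMemcpy(d_x, x, bytes, cudaMemcpyHostToDevice);
    scale_kernel<<<(n + 255) / 256, 256>>>(d_x, a, n);
    err = cudaMemcpy(x, d_x, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    return (int)err;
}
EOF

# Fortran side: declare the C function through iso_c_binding and call it.
cat > PROGRAM.f90 <<'EOF'
program call_cuda
    use iso_c_binding, only: c_float, c_int
    implicit none

    interface
        ! Matches: extern "C" int scale_on_gpu(float *x, float a, int n)
        function scale_on_gpu(x, a, n) bind(C, name="scale_on_gpu") result(ierr)
            import :: c_float, c_int
            real(c_float), intent(inout) :: x(*)
            real(c_float), value :: a
            integer(c_int), value :: n
            integer(c_int) :: ierr
        end function scale_on_gpu
    end interface

    real(c_float) :: x(4)
    integer(c_int) :: ierr

    x = [1.0_c_float, 2.0_c_float, 3.0_c_float, 4.0_c_float]
    ierr = scale_on_gpu(x, 2.0_c_float, size(x, kind=c_int))
    if (ierr /= 0) stop 1
    print *, x
end program call_cuda
EOF

# Step 1: compile the .cu file to an object file.
# Step 2: compile the Fortran code and link it with that object file.
if command -v nvcc >/dev/null 2>&1 && command -v gfortran >/dev/null 2>&1; then
    nvcc -c CUDAPROGRAM.cu
    # -lstdc++ is an assumption added here; the libraries come after the objects.
    gfortran PROGRAM.f90 CUDAPROGRAM.o -lcudart -lcuda -lstdc++ -o fortran-cuda
else
    echo "nvcc or gfortran not found: load CUDA and a compiler toolchain first"
fi
```

Passing a and n by value in the interface matches the C signature; without the value attribute Fortran would pass them by reference and the wrapper would read addresses instead of numbers.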

Example, nvcc

To compile a CUDA program with the NVIDIA CUDA compiler driver nvcc, you first need to load CUDA.

We will be compiling the small test program “hello-world.cu” (and naming the executable “hello”):

nvcc hello-world.cu -o hello
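This page does not list the contents of hello-world.cu. If you do not have such a file, a hypothetical minimal stand-in can be created as below before running the nvcc command above. The kernel name hello and the launch of one block with four threads are arbitrary choices for illustration.

```shell
#!/bin/bash
# Write a minimal CUDA test program to hello-world.cu.
cat > hello-world.cu <<'EOF'
#include <cstdio>

// Kernel: each GPU thread prints its own index.
__global__ void hello()
{
    printf("Hello from GPU thread %d\n", threadIdx.x);
}

int main()
{
    // Launch one block of four threads.
    hello<<<1, 4>>>();

    // Wait for the kernel to finish so the device-side output is flushed.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
EOF
```

The error check matters on nodes without a usable GPU: there the program prints the CUDA error and exits with a non-zero status instead of silently printing nothing.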

Submitting as a batch job

Let us submit a small job that compiles and runs the above program:

#!/bin/bash 
# Remember to change this to your own project ID! 
#SBATCH -A hpc2nXXXX-YYY
#SBATCH --time=00:05:00
#SBATCH --error=job.%J.err
#SBATCH --output=job.%J.out
# We need to run on GPUs. Here asking for 1 Nvidia GPU 
#SBATCH --gpus=1 
#SBATCH -C nvidia_gpu

ml purge > /dev/null 2>&1
ml CUDA/12.6.0

nvcc hello-world.cu -o hello
./hello

We submit the above job script with

sbatch <jobscript>
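A small sketch of submitting and then checking on the job, assuming the job script above was saved as jobscript.sh (a hypothetical file name) and that the Slurm commands sbatch and squeue are available on the login node. The variable names jobid and submit_msg are only illustrative.

```shell
#!/bin/bash
# jobscript.sh is a hypothetical file name for the job script shown above.
if command -v sbatch >/dev/null 2>&1; then
    # --parsable makes sbatch print only the job ID (optionally followed by ";cluster").
    if jobid=$(sbatch --parsable jobscript.sh); then
        jobid=${jobid%%;*}
        submit_msg="Submitted job ${jobid}; output goes to files starting with job.${jobid}"
        # Show the state of the job in the queue.
        squeue -j "$jobid"
    else
        submit_msg="sbatch failed: check the project ID and the job script"
    fi
else
    submit_msg="sbatch not found: run this on a cluster login node"
fi
echo "$submit_msg"
```

Once the job has finished, the program output is in the file given by --output and any error messages are in the file given by --error.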

Additional info

CUDA Toolkit homepage
CUDA Toolkit Programming guide
Parallel Programming with CUDA (Slides from SC08)