CUDA
CUDA (formerly Compute Unified Device Architecture) is a parallel computing platform and programming model created by NVIDIA and implemented by the graphics processing units (GPUs) that they produce. CUDA gives developers access to the virtual instruction set and memory of the parallel computational elements in CUDA GPUs.
Policy
The CUDA Toolkit is freely available to users at HPC2N.
Citations
The following paper is frequently cited:
John Nickolls, Ian Buck, Michael Garland, and Kevin Skadron. "Scalable Parallel Programming with CUDA." ACM Queue, vol. 6, no. 2, March/April 2008, pp. 40-53.
Overview
The NVIDIA® CUDA® Toolkit provides a development environment for creating high performance GPU-accelerated applications. With the CUDA Toolkit, you can develop, optimize and deploy your applications on GPU-accelerated embedded systems, desktop workstations, enterprise data centers, cloud-based platforms and HPC supercomputers. The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler and a runtime library to deploy your application.
Usage at HPC2N
At HPC2N, CUDA is available as a module.
Loading
To use CUDA, add it to your environment. You can find the available versions with
module spider CUDA
and then find out how to load a specific version (including prerequisites) with
module spider CUDA/<version>
Loading the module should set any needed environmental variables as well as the path.
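As a rough sketch of the loading steps, assuming an Lmod-style `module` command (the `ml` used in the job script further down is its shorthand) and the version `CUDA/12.6.0` that the job script loads, the sequence might look like the following. The last two commands are one way to check that the path was set. Any prerequisites reported by `module spider` are site-specific and are not shown here.

```shell
#!/bin/bash
# Sketch only. Assumes an Lmod-style `module` command; the version string is
# the one used in the batch script on this page and may differ on your system.
cuda_module="CUDA/12.6.0"

if command -v module >/dev/null 2>&1; then
    module spider "$cuda_module"   # shows how to load it, including prerequisites
    module load "$cuda_module"     # load any reported prerequisites first
    command -v nvcc                # prints the nvcc path if the PATH was updated
    nvcc --version                 # prints the toolkit version
else
    echo "No module command found; run this on a cluster login node." >&2
fi
```

The same commands can also be typed one by one at the prompt of a login shell.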
Compiling and linking
After you have loaded the compiler toolchain module and CUDA, you can compile and link your program as shown in the table:
Language | GCC, OpenMPI | Intel, Intel MPI | NVCC |
---|---|---|---|
Fortran calling CUDA functions | 1) nvcc -c CUDAPROGRAM.cu 2) gfortran -lcudart -lcuda PROGRAM.f90 CUDAPROGRAM.o | | |
C / C++ with CUDA | mpicc CUDAPROGRAM.cu -lcuda -lcudart | mpiicc CUDAPROGRAM.cu -lcuda -lcudart | nvcc CUDAPROGRAM.cu |
You can add other flags, for instance -o my-binary to give the output a name other than the default a.out.
NOTE: CUDA functions can be called directly from Fortran programs:
- First use the nvcc compiler to create an object file from the .cu file.
- Then compile the Fortran code together with the object file from the .cu file.
- External info: CUDA Toolkit
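The two steps in the note can be sketched as follows. The file names CUDAPROGRAM.cu and PROGRAM.f90 come from the table above, but their contents and the function `scale_on_gpu` are hypothetical and only serve as illustration. The sketch assumes the Fortran side binds to an `extern "C"` wrapper through `iso_c_binding`, which is one common approach and not necessarily the one intended here. The libraries are placed after the object file, which some linkers require, and `-lstdc++` is an assumption that is often needed because nvcc compiles the host code as C++.

```shell
#!/bin/bash
# Sketch only: the file contents and the function scale_on_gpu are hypothetical.

# CUDA side: an extern "C" wrapper gives the function a name Fortran can bind to.
cat > CUDAPROGRAM.cu <<'EOF'
#include <cuda_runtime.h>

__global__ void scale(float *x, int n, float a) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

extern "C" void scale_on_gpu(float *x, int n, float a) {
    float *d_x;
    size_t bytes = n * sizeof(float);
    cudaMalloc(&d_x, bytes);
    cudaMemcpy(d_x, x, bytes, cudaMemcpyHostToDevice);
    scale<<<(n + 255) / 256, 256>>>(d_x, n, a);
    cudaMemcpy(x, d_x, bytes, cudaMemcpyDeviceToHost);  // also waits for the kernel
    cudaFree(d_x);
}
EOF

# Fortran side: declare the C function through iso_c_binding and call it.
cat > PROGRAM.f90 <<'EOF'
program main
    use iso_c_binding, only: c_float, c_int
    implicit none
    interface
        subroutine scale_on_gpu(x, n, a) bind(C, name="scale_on_gpu")
            import :: c_float, c_int
            real(c_float), intent(inout) :: x(*)
            integer(c_int), value :: n
            real(c_float), value :: a
        end subroutine scale_on_gpu
    end interface
    real(c_float) :: x(4)
    x = [1.0, 2.0, 3.0, 4.0]
    call scale_on_gpu(x, 4_c_int, 2.0_c_float)
    print *, x
end program main
EOF

if command -v nvcc >/dev/null 2>&1 && command -v gfortran >/dev/null 2>&1; then
    # Step 1: create the object file CUDAPROGRAM.o from the .cu file.
    nvcc -c CUDAPROGRAM.cu
    # Step 2: compile the Fortran code together with the object file.
    gfortran PROGRAM.f90 CUDAPROGRAM.o -lcudart -lstdc++ -o fortran-cuda
else
    echo "nvcc and/or gfortran not found; load the toolchain and CUDA modules first." >&2
fi
```

The resulting binary needs a GPU to run, so on a cluster it would normally be started from a batch job and not on the login node.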
Example, nvcc
To compile a CUDA program with the NVIDIA CUDA compiler driver nvcc, you first need to load CUDA.
We will compile the small test program “hello-world.cu” and name the executable “hello”, using the same command as in the batch script below: nvcc hello-world.cu -o hello
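The contents of hello-world.cu are not reproduced on this page. As a stand-in, the sketch below writes a minimal, hypothetical version of the file (the kernel name `hello` and the launch configuration of 2 blocks with 4 threads each are illustrative choices) and compiles it if nvcc is on the path.

```shell
#!/bin/bash
# Write a minimal, hypothetical hello-world.cu (a stand-in, not the site's own file).
cat > hello-world.cu <<'EOF'
#include <cstdio>

// Kernel: every GPU thread prints its block and thread index.
__global__ void hello() {
    printf("Hello from block %d, thread %d\n", blockIdx.x, threadIdx.x);
}

int main() {
    hello<<<2, 4>>>();                          // launch 2 blocks of 4 threads
    cudaError_t err = cudaDeviceSynchronize();  // wait for the kernel to finish
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
EOF

# Compile only if nvcc is on the PATH, i.e. after the CUDA module has been loaded.
if command -v nvcc >/dev/null 2>&1; then
    nvcc hello-world.cu -o hello
else
    echo "nvcc not found; load the CUDA module first." >&2
fi
```

The call to `cudaDeviceSynchronize` matters here: without it the program could exit before the kernel's output has been printed.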
Submitting as a batch job
Let us submit a small job that compiles and runs the above program:
#!/bin/bash
# Remember to change this to your own project ID!
#SBATCH -A hpc2nXXXX-YYY
#SBATCH --time=00:05:00
#SBATCH --error=job.%J.err
#SBATCH --output=job.%J.out
# We need to run on GPUs. Here asking for 1 Nvidia GPU
#SBATCH --gpus=1
#SBATCH -C nvidia_gpu
ml purge > /dev/null 2>&1
ml CUDA/12.6.0
nvcc hello-world.cu -o hello
./hello
We submit the above job script with sbatch, followed by the name of the file that the job script is saved in.
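A minimal sketch of the submission step, assuming the job script was saved under the hypothetical name `cuda-hello.sh` in the same directory as hello-world.cu:

```shell
#!/bin/bash
# Sketch only: "cuda-hello.sh" is a hypothetical name for the job script above.
jobscript="cuda-hello.sh"

if command -v sbatch >/dev/null 2>&1; then
    sbatch "$jobscript"    # prints the ID of the submitted job
    squeue -u "$USER"      # lists your pending and running jobs
else
    echo "sbatch not found; run this on a system with Slurm." >&2
fi
```

Once the job has finished, the program's output and any error messages end up in the files named by the --output and --error lines of the job script.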
Additional info
- CUDA Toolkit homepage
- CUDA Toolkit Programming guide
- Parallel Programming with CUDA (Slides from SC08)