Pytorch and CUDA11
Nam Pho
Director for Research Computinginfo
During the January 12, 2021 mox maintenance period long overdue package updates will be applied. The most user impactful upgrade is the GPU driver from to 418.40.04 to 460.27.04 that will allow for CUDA11 support (up from CUDA10).
The single biggest research use for GPUs on HYAK is for machine learning and artificial intelligence and the community has been clammoring for CUDA11 support for some time. Unfortunately, it's not easy to separate the GPU driver from the node images so it had to wait until the next maintenance window and some testing for non-ML GPU workflows on HYAK like our gromacs users in the molecular dynamics community.
tl;dr your existing Pytorch codes should work but if you wanted to use the new features in Pytorch that required CUDA11 you can upgrade Pytorch and it will work.
#
Installing Pytorch with CUDA11Since this is now the latest and greatest on HYAK I've taken the opportunity to update the Python documentation on how to install Pytorch with CUDA11 support within a miniconda3 environment, check out the step-by-step here.
#
Reverse compatibility with CUDA10Before the January 12, 2021 cluster maintenance every GPU on HYAK had a driver with CUDA10 and all of your codes were previously compiled against it. To test that the GPU driver update to CUDA11 wouldn't impact the most popular machine learning libraries we are compiling Pytorch against our pre-maintenance CUDA10 and testing it against a GPU with the newer CUDA11 installed.
Activate your new pytorch-cuda10
environment:
The Pytorch website [www] has a nice getting started matrix that generates the requisite install commands against CUDA10.
The command shown above to copy-and-paste below:
Now we can load the Python interpreter and confirm Pytorch is installed and the CUDA10 compiled library recognizes this GPU with CUDA11 [www].
Success!
Previously compiled libraries against CUDA10 from pre-January 12, 2021 maintenance times should still work on the GPUs now with CUDA11. However, if you want to use the full features of libraries that take advantage of newer capabilities in CUDA11 then you should definitely upgrade your libraries.