Learn Docker GPU on Ubuntu 24
Verify that you have a CUDA-capable GPU
lspci | grep -i nvidia
If you have an NVIDIA GPU, you should see something like this:
01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)
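The check above can be scripted so that it reports a clear result instead of relying on reading the lspci output by eye. This is a minimal sketch; has_nvidia_gpu is a hypothetical helper name, not part of any tool.

```shell
# Hypothetical helper: succeeds if the text on stdin mentions an NVIDIA device.
has_nvidia_gpu() {
    grep -qi 'nvidia'
}

# Only call lspci if it is installed; otherwise report that the check was skipped.
if command -v lspci >/dev/null 2>&1; then
    if lspci | has_nvidia_gpu; then
        echo "NVIDIA GPU detected"
    else
        echo "No NVIDIA GPU found"
    fi
else
    echo "lspci not installed; skipping check"
fi
```

The match is case-insensitive, the same as grep -i in the original command.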
Clean your docker leftovers
sudo snap remove --purge docker # removes docker without making snapshots
sudo apt-get remove docker docker-engine docker.io containerd runc
sudo apt-get purge docker-ce docker-ce-cli containerd.io
Remove any existing NVIDIA Container Toolkit install
sudo apt-get remove nvidia-container-toolkit
Install nvidia drivers
sudo apt install nvidia-driver-550-server
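To confirm which driver version ended up installed, you can query nvidia-smi. The sketch below assumes nvidia-smi is on PATH once the driver is loaded, which may only be the case after a reboot. driver_major is a hypothetical helper, and the version string in its comment is only a sample.

```shell
# Hypothetical helper: print the major version from a driver version
# string such as "550.90.07" (sample value).
driver_major() {
    printf '%s\n' "${1%%.*}"
}

# Only query the driver if nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
    # With several GPUs there is one line per GPU; the first is enough here.
    version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    echo "driver version: $version (major $(driver_major "$version"))"
else
    echo "nvidia-smi not found; the driver may not be installed or loaded yet"
fi
```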
Install docker using the convenience script
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
Add your user to the docker group to run docker without sudo
sudo groupadd docker
sudo usermod -aG docker $USER
newgrp docker
Verify that you can run docker without sudo
docker run hello-world
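If hello-world still fails with a permission error, the docker group is probably not active in the current shell yet. A minimal sketch for checking that is shown here; in_group_list is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if group $2 appears in the
# space-separated group list $1.
in_group_list() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# id -nG prints the group names of the current process.
if in_group_list "$(id -nG)" docker; then
    echo "docker group is active in this shell"
else
    echo "docker group not active; log out and back in, or run: newgrp docker"
fi
```

Padding both strings with spaces keeps a group such as dockerx from matching docker.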
Install the NVIDIA Container Toolkit (provides nvidia-container-runtime)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install nvidia-container-toolkit
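The sed step in the repository setup rewrites each deb line so that apt only trusts packages signed with the downloaded key. The sketch below runs the same expression on a sample line to show the effect. add_signed_by is a hypothetical wrapper, and the sample input is an assumption about what the upstream list file looks like.

```shell
# Hypothetical wrapper around the same sed expression used in the install step.
add_signed_by() {
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g'
}

# Sample input line (assumed shape of an entry in the upstream list file).
echo 'deb https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /' | add_signed_by
```

After installation you can compare this with the contents of /etc/apt/sources.list.d/nvidia-container-toolkit.list.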
Configure docker to use nvidia-container-runtime
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
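nvidia-ctk registers the runtime by adding an "nvidia" entry to the Docker daemon config. Assuming the default location /etc/docker/daemon.json, a rough way to confirm the entry is there is sketched below; has_nvidia_runtime is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if the given daemon.json is readable
# and mentions an "nvidia" runtime.
has_nvidia_runtime() {
    [ -r "$1" ] && grep -q '"nvidia"' "$1"
}

config=/etc/docker/daemon.json
if has_nvidia_runtime "$config"; then
    echo "nvidia runtime is registered in $config"
else
    echo "nvidia runtime not found in $config"
fi
```

This is a plain text match, not a JSON parse, so treat it as a quick sanity check only.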
Configure the NVIDIA runtime for rootless mode
nvidia-ctk runtime configure --runtime=docker --config=$HOME/.config/docker/daemon.json
sudo nvidia-ctk config --set nvidia-container-cli.no-cgroups --in-place
sudo systemctl restart docker
Reboot
sudo reboot
Verify that you can run docker with nvidia runtime
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
Troubleshooting
Failed to initialize NVML: Unknown Error
Disable cgroups in the nvidia-container-runtime config
sudo nano /etc/nvidia-container-runtime/config.toml
# in the [nvidia-container-cli] section, set the following line
no-cgroups = true
Restart docker
sudo systemctl restart docker
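To see which no-cgroups value is currently active without opening an editor, something like the following can help. It is a sketch that assumes the setting sits on its own line in the config file; no_cgroups_value is a hypothetical helper name.

```shell
# Hypothetical helper: print the active (uncommented) no-cgroups value
# from the given config file. Commented-out lines are ignored.
no_cgroups_value() {
    sed -n 's/^[[:space:]]*no-cgroups[[:space:]]*=[[:space:]]*\([a-z]*\).*/\1/p' "$1"
}

cfg=/etc/nvidia-container-runtime/config.toml
if [ -r "$cfg" ]; then
    echo "no-cgroups = $(no_cgroups_value "$cfg")"
else
    echo "$cfg not found or not readable"
fi
```

An empty value in the output means the line is missing or still commented out.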