WSL + Windows

👉 Read more: ‣.

With TensorFlow or PyTorch

👉 Official doc for TF + Docker.

👉 Note about Docker and TensorFlow: TensorFlow.

👉 An example of Docker + PyTorch with GPU support.

Basic installation

<aside> 🚨 You must (successfully) install the GPU driver on your (Linux) machine before proceeding with the steps in this note. Go to the "Check info" section to check the availability of your drivers.

</aside>

<aside> ☝ (Maybe just for me) It works perfectly on Pop!_OS 20.04. I tried Pop!_OS 21.10 and ran into a lot of problems, so stay with 20.04!

</aside>

sudo apt update

sudo apt install -y nvidia-container-runtime
# You may need to replace the line above with
sudo apt install nvidia-docker2
sudo apt install nvidia-container-toolkit

sudo apt install -y nvidia-cuda-toolkit
# restart required

If you have problems installing nvidia-docker2, read this section!

Check info

# Verify that your computer has an NVIDIA graphics card
lspci | grep -i nvidia
# Install the drivers first, then check that they work
nvidia-smi
# output: NVIDIA-SMI 450.80.02 Driver Version: 450.80.02    CUDA Version: 11.0
# The CUDA version shown here is the maximum version that your driver supports
# Check the currently installed version of the CUDA toolkit
nvcc --version
# If nvcc is not found, it may be in /usr/local/cuda/bin/
# Add this location to PATH by adding the line below
# to ~/.zshrc or ~/.bashrc
export PATH=/usr/local/cuda/bin:$PATH

# You may need to install
sudo apt install -y nvidia-cuda-toolkit
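
As a rough sketch of the checks above, the two helpers below pull the version numbers out of `nvidia-smi` and `nvcc --version` so they can be read side by side. The function names are made up for illustration, and the `nvcc` pattern assumes the usual `release X.Y,` wording in its output, which may differ on your toolkit version.

```shell
# Hypothetical helper: read `nvidia-smi` output on stdin and print the
# maximum CUDA version that the driver supports.
driver_max_cuda() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: read `nvcc --version` output on stdin and print the
# installed toolkit version. Assumes a line containing "release X.Y,".
toolkit_cuda() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p' | head -n 1
}

# Only query the tools that are actually on PATH.
if command -v nvidia-smi >/dev/null 2>&1; then
    echo "driver supports up to CUDA: $(nvidia-smi | driver_max_cuda)"
fi
if command -v nvcc >/dev/null 2>&1; then
    echo "installed CUDA toolkit:     $(nvcc --version | toolkit_cuda)"
fi
```

If the toolkit version is higher than the version the driver reports, the driver is probably too old for that toolkit.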

If the commands below don't work, try installing nvidia-docker2 (read this section).

# Install and check nvidia-docker
dpkg -l | grep nvidia-docker
# or
nvidia-docker version
# Verify that docker run has the --gpus option
docker run --help | grep -i gpus
# output: --gpus gpu-request GPU devices to add to the container ('all' to pass all GPUs)
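
The `dpkg -l | grep` check above can also be scripted. This is a minimal sketch, assuming a Debian-based system where `dpkg-query` is available; `pkg_installed` is a hypothetical helper name, and the package list simply repeats the names used in this note.

```shell
# Hypothetical helper: succeed only if the named Debian package is fully installed.
pkg_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q 'install ok installed'
}

# Report the status of each package mentioned in this note.
for pkg in nvidia-docker2 nvidia-container-toolkit nvidia-container-runtime; do
    if pkg_installed "$pkg"; then
        echo "$pkg: installed"
    else
        echo "$pkg: missing"
    fi
done
```

Not all three packages need to be present at once. The loop only shows which ones are there, so you know what to install next.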

Does Docker work with GPU?

# List all GPU devices
docker run -it --rm --gpus all ubuntu nvidia-smi -L
# output: GPU 0: GeForce GTX 1650 (...)
# ERROR ?
# docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.
# ERROR ?
# Error response from daemon: could not select device driver "" with capabilities: [[gpu]]

# Solution: install nvidia-docker2
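
To spot these two errors quickly, the output of the test command can be piped through a small filter. This is only a sketch: `gpu_docker_hint` is a hypothetical helper, it matches just the two messages quoted above, and the fix it prints is the one given in this note.

```shell
# Hypothetical helper: read a docker error message on stdin and print the
# fix suggested in this note if one of the two known errors is found.
gpu_docker_hint() {
    if grep -q -F -e 'libnvidia-ml.so.1' -e 'could not select device driver'; then
        echo "install nvidia-docker2"
    else
        echo "no known GPU error found"
    fi
}

# Usage (needs Docker; stderr is merged so the error text reaches the filter):
#   docker run --rm --gpus all ubuntu nvidia-smi -L 2>&1 | gpu_docker_hint
```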