Quickstart – Installing nvidia docker in one guide

Installing nvidia-docker has frustrated me because the instructions are spread across several documents. NVIDIA’s own documentation tells you to install Docker first, but doesn’t tell you how.

For Ubuntu 16.04 and 18.04, here are all the steps you need, in order:

Install Docker CE

  1. sudo apt update
  2. sudo apt install apt-transport-https ca-certificates curl software-properties-common
  3. curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
  4. sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
  5. sudo apt update
  6. Check that the install candidate comes from the Docker repository: apt-cache policy docker-ce
  7. sudo apt install docker-ce
  8. Verify docker is running: sudo systemctl status docker
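
The point of step 6 is to confirm that apt will take docker-ce from Docker's repository rather than from Ubuntu's own archive. If you want to check that in a script, here is a minimal sketch. The helper name from_docker_repo is my own invention, not part of apt or Docker, and all it does is look for the repository hostname in whatever text it is given.

```shell
# Hypothetical helper (not part of apt or Docker): succeed if the text on
# stdin mentions Docker's apt repository host.
from_docker_repo() {
    # -F treats the pattern as a fixed string, -q suppresses output.
    grep -qF 'download.docker.com'
}
```

Usage, once step 5 has run: apt-cache policy docker-ce | from_docker_repo && echo "docker-ce will come from the Docker repository"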

Optional – ensure Docker can run without sudo

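  1. Add your user to the docker group: sudo usermod -aG docker ${USER}
  2. Start a new login shell so the group change takes effect, entering your password when prompted: su - ${USER}
  3. Verify that the user is now in the docker group: id -nG
  4. You should see something like: username adm cdrom sudo dip plugdev lpadmin sambashare docker
  5. Leave the login shell: exit

The new group only applies to sessions started after the usermod command, so log out and back in before running docker without sudo in your usual shell.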
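
Instead of reading the id -nG output by eye, you can test it. This is a minimal sketch; has_group is a name I made up for this post, and it expects the space-separated list that id -nG prints.

```shell
# Hypothetical helper: succeed if group $1 appears as a whole word in the
# space-separated group list $2 (the format printed by `id -nG`).
has_group() {
    # Padding both sides with spaces makes the match exact, so "docker"
    # does not match a group such as "dockerx".
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Usage: has_group docker "$(id -nG)" && echo "docker group is active in this shell"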

Install NVIDIA Docker

  1. curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
  2. distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
  3. curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
  4. sudo apt-get update
  5. sudo apt-get install nvidia-docker2
  6. Reload the Docker daemon configuration so it picks up the nvidia runtime: sudo pkill -SIGHUP dockerd
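
Step 2 is the least obvious line in this list: it sources /etc/os-release and glues ID and VERSION_ID together, and the result picks the repository file downloaded in step 3. Here is a small sketch of the same logic as a function, so you can see what it produces. The name distribution_of is hypothetical, and the function takes the file path as an argument only to make it easy to try on other files.

```shell
# Hypothetical helper: print ID followed by VERSION_ID from an os-release
# style file, the same value step 2 stores in $distribution.
distribution_of() {
    # Source the file in a subshell so its variables do not leak into
    # the calling shell.
    ( . "$1" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}
```

Usage: distribution_of /etc/os-release
On Ubuntu 18.04 this prints ubuntu18.04.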

Verify the installation by running nvidia-smi inside a container:

sudo docker run --runtime=nvidia --rm nvidia/cuda:10.1-base nvidia-smi

You should see something like:

Unable to find image 'nvidia/cuda:10.1-base' locally
 10.1-base: Pulling from nvidia/cuda
 898c46f3b1a1: Pull complete
 63366dfa0a50: Pull complete
 041d4cd74a92: Pull complete
 6e1bee0f8701: Pull complete
 c15c863cc43e: Pull complete
 4a9de8159c48: Pull complete
 0b62278979d8: Pull complete
 Digest: sha256:686a849123ab369523400e699bfe5d653a063c8ef983a76e24ab18a03be27f26
 Status: Downloaded newer image for nvidia/cuda:10.1-base

Followed by a successful run of nvidia-smi:

Tue Apr  2 08:23:30 2019
 +-----------------------------------------------------------------------------+
 | NVIDIA-SMI 418.40.04    Driver Version: 418.40.04    CUDA Version: 10.1     |
 |-------------------------------+----------------------+----------------------+
 | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
 | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
 |===============================+======================+======================|
 |   0  Tesla V100-PCIE…  On   | 00000000:25:00.0 Off |                    0 |
 | N/A   38C    P0    27W / 250W |      0MiB / 32480MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 |   1  Tesla V100-PCIE…  On   | 00000000:5B:00.0 Off |                    0 |
 | N/A   38C    P0    26W / 250W |      0MiB / 32480MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 |   2  Tesla V100-PCIE…  On   | 00000000:9B:00.0 Off |                    0 |
 | N/A   37C    P0    27W / 250W |      0MiB / 32480MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 |   3  Tesla V100-PCIE…  On   | 00000000:C8:00.0 Off |                    0 |
 | N/A   35C    P0    26W / 250W |      0MiB / 16130MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 +-----------------------------------------------------------------------------+
 | Processes:                                                       GPU Memory |
 |  GPU       PID   Type   Process name                             Usage      |
 |=============================================================================|
 |  No running processes found                                                 |
 +-----------------------------------------------------------------------------+ 
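
If you want a scripted check rather than eyeballing the table, one option is to pull the driver version out of the banner line. This is only a sketch: driver_version is a hypothetical helper of mine, and it assumes the banner keeps the "Driver Version: x.y.z" wording shown above.

```shell
# Hypothetical helper: print the driver version found in nvidia-smi's
# banner, read from stdin. Prints nothing if no banner line is present.
driver_version() {
    sed -n 's/.*Driver Version: *\([0-9.]*\).*/\1/p'
}
```

Usage: sudo docker run --runtime=nvidia --rm nvidia/cuda:10.1-base nvidia-smi | driver_version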

That’s it. Easy.

Troubleshooting

If you get an error like this:

$ sudo apt-get install -y nvidia-docker2
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 nvidia-docker2 : Depends: docker-ce (= 5:18.09.5~3-0~ubuntu-xenial) but 18.06.0~ce~3-0~ubuntu is to be installed or
                           docker-ee (= 5:18.09.5~3-0~ubuntu-xenial) but it is not installable
E: Unable to correct problems, you have held broken packages.

nvidia-docker2 depends on one exact docker-ce version, so you may need to pin versions. Remove the installed Docker ($ sudo apt remove docker-ce), then reinstall it, forcing the version named in the error (here 5:18.09.5~3-0~ubuntu-xenial):

$ sudo apt-get install docker-ce=5:18.09.5~3-0~ubuntu-xenial docker-ce-cli=5:18.09.5~3-0~ubuntu-xenial containerd.io

$ sudo apt-get install -y nvidia-docker2=2.0.3+docker18.09.5-3 nvidia-container-runtime=2.0.0+docker18.09.5-3
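
The version to pin is the one inside the parentheses on the "Depends: docker-ce" line of the error. If your error names a different version, this sketch extracts it for you. required_docker_ce is a hypothetical helper, and it assumes apt words the message the way it does above.

```shell
# Hypothetical helper: print the docker-ce version that nvidia-docker2
# demands, read from apt's "unmet dependencies" message on stdin.
required_docker_ce() {
    # Capture everything between "docker-ce (= " and the closing ")".
    sed -n 's/.*Depends: docker-ce (= \([^)]*\)).*/\1/p'
}
```

Usage: sudo apt-get install -y nvidia-docker2 2>&1 | required_docker_ce
To see which docker-ce versions your repositories actually offer, run apt-cache madison docker-ce.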

