Set Up an NVIDIA DevContainer with GPU Support for Tensorflow/Keras on Windows

Say Goodbye to Complex CUDA Installations and Frustrating Python Environments

Feb 11, 2023

Note: I’ve made an update for Tensorflow 2.15. See the end of the post for details.

Articles on how to use NVIDIA graphics cards from a CUDA container have been around for a few years. However, they all end by running the nvidia-smi utility and saying: “look, the GPU shows up, so it’s working”.

As someone who often plays computer games, I’m a long-time Windows user, since there’s not much to play on macOS or Linux. I did install and use CUDA support for Tensorflow on Windows before, but the latest versions of Tensorflow and AutoKeras have started to cause more and more issues on Windows. So a while ago I set out to move the environment to a DevContainer.

Truth be told, setting up a CUDA DevContainer is no different from installing CUDA on a Linux machine. The real key is that using the official CUDA image (and the correct one at that) is only half of the solution: you still have to install a Python environment as well as cuDNN. It just seems that no one has asked this question before.

Anyway, here’s how I did it.

But of course, since I only have one computer with a GPU now, I can’t test it elsewhere. Please let me (and other people) know if you have successfully tested it too!
You can also find the (updated) example repository here: https://github.com/alankrantas/cuda-cudnn-gpu-devcontainer

Prerequisites

An amd64 (x64) machine with a CUDA-compatible NVIDIA GPU card
Docker engine (and set up .wslconfig to give WSL 2 more cores and memory than the default)
The latest NVIDIA driver for the graphics card
NVIDIA Container Toolkit (which is already included in Windows’ Docker Desktop; Linux users have to install it)
VS Code with the Dev Containers extension installed
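
As a rough illustration of the .wslconfig item above, the sketch below renders a [wsl2] section with the memory and processors keys. The values 8GB and 4 are placeholders rather than recommendations, so adjust them for your machine. The file normally lives in your Windows user profile folder, and WSL needs a restart (wsl --shutdown) before changes apply.

```python
import configparser
import io


def render_wslconfig(memory: str = "8GB", processors: int = 4) -> str:
    """Render the text of a .wslconfig file with a [wsl2] section.

    The default values are placeholders, not tuned recommendations.
    """
    config = configparser.ConfigParser()
    config["wsl2"] = {"memory": memory, "processors": str(processors)}
    buffer = io.StringIO()
    # Write "key=value" without spaces around the delimiter.
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


wslconfig_text = render_wslconfig()
print(wslconfig_text)
```

Copy the printed text into .wslconfig by hand, or write it to a path of your choice.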

Set Up the DevContainer

Create the following files under your project directory (after opening the directory in VS Code):

.devcontainer/devcontainer.json

This is the DevContainer definition. It uses a CUDA developer (devel) image (not base or runtime), which supports AMD64 and ARM64 and has CUDA installed. It runs a script to install everything else, installs the VS Code extensions, and finally runs nvidia-smi once the container has started.

// For format details, see https://aka.ms/devcontainer.json. For config options, see the
// README at: https://github.com/devcontainers/templates/tree/main/src/python
{
  "name": "CUDA",
  // Or use a Dockerfile or Docker Compose file. More info: https://containers.dev/guide/dockerfile
  "image": "nvidia/cuda:11.8.0-devel-ubuntu22.04", // https://hub.docker.com/r/nvidia/cuda/tags
  "runArgs": [
    "--gpus=all"
  ],
  "remoteEnv": {
    "PATH": "${containerEnv:PATH}:/usr/local/cuda/bin",
    "LD_LIBRARY_PATH": "${containerEnv:LD_LIBRARY_PATH}:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64",
    "XLA_FLAGS": "--xla_gpu_cuda_data_dir=/usr/local/cuda"
  },
  // Features to add to the dev container. More info: https://containers.dev/features.
  // "features": {}
  // Use 'forwardPorts' to make a list of ports inside the container available locally.
  // "forwardPorts": [],
  // Use 'postCreateCommand' to run commands after the container is created.
  "updateContentCommand": "bash .devcontainer/install-dev-tools.sh",
  "postCreateCommand": [
    "nvidia-smi"
  ],
  // Configure tool-specific properties.
  "customizations": {
    "vscode": {
      "extensions": [
        "ms-python.python",
        "ms-toolsai.jupyter",
        "ms-toolsai.vscode-jupyter-cell-tags",
        "ms-toolsai.jupyter-keymap",
        "ms-toolsai.jupyter-renderers",
        "ms-toolsai.vscode-jupyter-slideshow",
        "ms-python.vscode-pylance"
      ]
    }
  }
  // Uncomment to connect as root instead. More info: https://aka.ms/dev-containers-non-root.
  // "remoteUser": "root"
}
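
To double-check that the remoteEnv entries above actually reached the container, something like the following can be run in a DevContainer terminal. It is only a sketch: the helper name missing_entries is made up for illustration, and it merely checks that the expected directories appear in PATH and LD_LIBRARY_PATH.

```python
import os
from typing import Mapping

# Directories that the remoteEnv section above is meant to add.
EXPECTED = {
    "PATH": ["/usr/local/cuda/bin"],
    "LD_LIBRARY_PATH": [
        "/usr/local/cuda/lib64",
        "/usr/local/cuda/extras/CUPTI/lib64",
    ],
}


def missing_entries(env: Mapping[str, str]) -> dict:
    """Return {variable: [missing directories]} for the given environment."""
    missing = {}
    for name, wanted in EXPECTED.items():
        present = env.get(name, "").split(":")
        absent = [path for path in wanted if path not in present]
        if absent:
            missing[name] = absent
    return missing


# Prints the missing directories, or a short confirmation if none are missing.
print(missing_entries(os.environ) or "CUDA paths look fine")
```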

.devcontainer/install-dev-tools.sh

This is the script for installing basic Linux tools, Python 3, Python packages and cuDNN. The downloaded file is removed afterwards so it won’t linger in your local directory.

# see https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#package-manager-ubuntu-install for latest cuDNN version
cudnn_ver="8.9.7.*-1+cuda11.8"
# update system
apt update
apt upgrade -y
# install Linux tools and Python 3
apt install -y software-properties-common wget curl python3-dev python3-pip python3-wheel python3-setuptools
# install Python packages
python3 -m pip install --upgrade pip
pip3 install --user -r .devcontainer/requirements.txt
# update CUDA Linux GPG repository key
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.0-1_all.deb
dpkg -i cuda-keyring_1.0-1_all.deb
rm cuda-keyring_1.0-1_all.deb
# install cuDNN
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /" -y
apt update
apt install -y "libcudnn8=${cudnn_ver}" "libcudnn8-dev=${cudnn_ver}"
# install additional recommended packages
apt install -y zlib1g g++ freeglut3-dev libx11-dev libxmu-dev libxi-dev libglu1-mesa libglu1-mesa-dev libfreeimage-dev

See the cuDNN documentation to find the latest libcudnn8 and libcudnn8-dev versions (and their corresponding CUDA versions, like cuda11.8).
I still haven’t managed to install TensorRT, so I’ll just skip it here.
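
Since the cuDNN pin has to match the CUDA version of the image, a small consistency check can catch a mismatch before you rebuild. The sketch below is illustrative only: the function names are invented, and the regular expressions assume the tag and pin formats used in this post.

```python
import re


def cuda_version_from_image(image: str) -> str:
    """Extract 'major.minor' from a tag such as nvidia/cuda:11.8.0-devel-ubuntu22.04."""
    match = re.search(r":(\d+)\.(\d+)", image)
    if match is None:
        raise ValueError(f"no CUDA version found in {image!r}")
    return f"{match.group(1)}.{match.group(2)}"


def cuda_version_from_cudnn_pin(pin: str) -> str:
    """Extract 'major.minor' from a pin such as 8.9.7.*-1+cuda11.8."""
    match = re.search(r"\+cuda(\d+)\.(\d+)", pin)
    if match is None:
        raise ValueError(f"no CUDA version found in {pin!r}")
    return f"{match.group(1)}.{match.group(2)}"


image = "nvidia/cuda:11.8.0-devel-ubuntu22.04"  # from devcontainer.json
cudnn_ver = "8.9.7.*-1+cuda11.8"                # from install-dev-tools.sh

versions_match = cuda_version_from_image(image) == cuda_version_from_cudnn_pin(cudnn_ver)
print("CUDA versions match" if versions_match else "CUDA version mismatch")
```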

.devcontainer/requirements.txt

This contains all the third-party Python packages you wish to install. Modify the list as you like.

numpy
scikit-learn
matplotlib
tensorflow==2.11
autokeras
ipykernel

Note: Tensorflow also installs Numpy and Pandas, and AutoKeras actually installs Numpy, Pandas, scikit-learn, Tensorflow, etc.
In the DevContainer I got Python 3.10.6 and Tensorflow 2.11.0. Tensorflow supports GPUs natively on Linux (but no longer on Windows!).
The ipykernel package is for the Jupyter Notebook extension. If you don’t install it here, VS Code will prompt you to install it the first time you try to execute a cell in a notebook.
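
To see which versions pip actually resolved inside the container, a short report along these lines may help. It is a sketch that uses only the standard library; the package list simply mirrors requirements.txt above, and the helper name installed_versions is made up.

```python
from importlib import metadata

# Mirrors the requirements.txt shown above.
PACKAGES = ["numpy", "scikit-learn", "matplotlib", "tensorflow", "autokeras", "ipykernel"]


def installed_versions(names):
    """Map each package name to its installed version, or None if it is absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```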

Start the DevContainer

In VS Code, press F1 or Ctrl + Shift + P to bring up the Command Palette, then type and select “Dev Containers: Reopen in Container”. VS Code will start downloading the CUDA image, run the script to install everything, and finish by opening the directory in the DevContainer.

The DevContainer then runs nvidia-smi to show which GPU the container can see. Note that this works even without setting up cuDNN or any environment variables.

[860685 ms] Start: Run in container: nvidia-smi
Sat Feb 11 17:16:53 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.05    Driver Version: 528.24       CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
| N/A   46C    P0    30W / 110W |      0MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Done. Press any key to close the terminal.
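
If you would rather read the same information from a script than from the table above, nvidia-smi also has a CSV query mode. The sketch below is one possible wrapper, not part of the original setup: the helper names are invented, and the sample line at the bottom is assembled from the values shown in this post rather than captured from a real run.

```python
import shutil
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total,driver_version",
    "--format=csv,noheader",
]


def parse_gpu_line(line: str) -> dict:
    """Split one CSV line from nvidia-smi into named fields."""
    name, memory, driver = [field.strip() for field in line.split(",")]
    return {"name": name, "memory_total": memory, "driver_version": driver}


def query_gpus() -> list:
    """Return one dict per GPU, or [] when nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return []
    # Keep only lines with exactly three comma-separated fields.
    lines = [line for line in result.stdout.splitlines() if line.count(",") == 2]
    return [parse_gpu_line(line) for line in lines]


print(query_gpus())

# Sample line built from values in this post, for illustration only.
sample = parse_gpu_line("NVIDIA GeForce RTX 3070 Ti Laptop GPU, 8192 MiB, 528.24")
print(sample["name"])
```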

After the image is built, you can reopen the project in the DevContainer at any time, as long as you have not deleted the container from Docker.

(You can select “Dev Containers: Open Folder Locally” to close the DevContainer, or rebuild the container with “Dev Containers: Rebuild Container” if you have modified the setup files.)

Access GPU with Tensorflow

While in the DevContainer, if I open a new terminal and enter

python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

Some error or warning messages are shown, but at the end I can see that my GPU is indeed detected by Tensorflow as well:

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
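
The same check can be wrapped in a small function for use at the top of a script or notebook. This is only a sketch: visible_gpus is a made-up helper name, and it falls back to an empty list when Tensorflow is not installed instead of raising an error.

```python
def visible_gpus() -> list:
    """Return the names of the GPUs Tensorflow can see, or [] if Tensorflow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        print("Tensorflow is not installed in this environment")
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
print(f"{len(gpus)} GPU(s) visible: {gpus}")
```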

Since the DevContainer also installed Jupyter Notebook extensions, you can open both .py and .ipynb files without using Anaconda.

Here’s a short AutoKeras test script for you (requires Numpy, scikit-learn, Tensorflow and AutoKeras), which trains a pre-defined CNN model on the famous MNIST handwritten digit dataset:

import autokeras as ak
from tensorflow.keras.datasets import mnist
from sklearn.metrics import classification_report

(x_train, y_train), (x_test, y_test) = mnist.load_data()

clf = ak.ImageClassifier(max_trials=1, overwrite=True)
clf.fit(x_train, y_train)

loss, accuracy = clf.evaluate(x_test, y_test)
predicted = clf.predict(x_test).flatten().astype('uint8')

print(f'Prediction loss: {loss:.4f}')
print(f'Prediction accuracy: {accuracy:.4f}')
print(classification_report(y_test, predicted))

When the Keras deep learning model starts training, you should see the following messages, indicating that cuDNN is loaded correctly and Tensorflow is utilizing the GPU:

2023-02-11 18:42:24.936185: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:428] Loaded cuDNN version 8800
...
2023-02-11 18:42:38.281573: I tensorflow/compiler/xla/service/service.cc:173] XLA service 0x564a6464b1a0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2023-02-11 18:42:38.281764: I tensorflow/compiler/xla/service/service.cc:181]   StreamExecutor device (0): NVIDIA GeForce RTX 3070 Ti Laptop GPU, Compute Capability 8.6
...
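
The number in the "Loaded cuDNN version" message is a packed integer. Assuming the cuDNN 8.x encoding (major * 1000 + minor * 100 + patch), it can be decoded as in the sketch below; the helper name is made up, and newer cuDNN major versions may use a different encoding.

```python
import re


def cudnn_version_from_log(line: str):
    """Decode 'Loaded cuDNN version NNNN' into 'major.minor.patch', or None if absent."""
    match = re.search(r"Loaded cuDNN version (\d+)", line)
    if match is None:
        return None
    number = int(match.group(1))
    # Assumed cuDNN 8.x encoding: major * 1000 + minor * 100 + patch.
    major, rest = divmod(number, 1000)
    minor, patch = divmod(rest, 100)
    return f"{major}.{minor}.{patch}"


log_line = "cuda_dnn.cc:428] Loaded cuDNN version 8800"
print(cudnn_version_from_log(log_line))  # 8.8.0
```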

I’m using a gaming laptop with a built-in RTX 3070 Ti 8GB card.

After a short while (ignore the error messages, which are removed from the result below) we’ll have the result:

Trial 1 Complete [00h 02m 56s]
val_loss: 0.03709220513701439

Best val_loss So Far: 0.03709220513701439
Total elapsed time: 00h 02m 56s

Epoch 1/9
1875/1875 [==============================] - 10s 5ms/step - loss: 0.1622 - accuracy: 0.9506
Epoch 2/9
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0744 - accuracy: 0.9773
Epoch 3/9
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0603 - accuracy: 0.9807
Epoch 4/9
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0511 - accuracy: 0.9838
Epoch 5/9
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0475 - accuracy: 0.9851
Epoch 6/9
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0409 - accuracy: 0.9875
Epoch 7/9
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0365 - accuracy: 0.9879
Epoch 8/9
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0365 - accuracy: 0.9880
Epoch 9/9
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0320 - accuracy: 0.9898

313/313 [==============================] - 1s 2ms/step
313/313 [==============================] - 0s 1ms/step

Prediction loss: 0.0407
Prediction accuracy: 0.9884

              precision    recall  f1-score   support

           0       0.98      0.99      0.99       980
           1       0.99      1.00      0.99      1135
           2       0.98      0.99      0.99      1032
           3       1.00      0.99      0.99      1010
           4       0.99      0.99      0.99       982
           5       0.99      0.99      0.99       892
           6       0.99      0.99      0.99       958
           7       0.99      0.98      0.98      1028
           8       0.99      0.98      0.99       974
           9       0.99      0.98      0.99      1009

    accuracy                           0.99     10000
   macro avg       0.99      0.99      0.99     10000
weighted avg       0.99      0.99      0.99     10000

Post Note: Updating to Tensorflow 2.15

The example above works for Tensorflow 2.11, but newer versions support installing CUDA and cuDNN together with Tensorflow, using the following syntax:

pip3 install --extra-index-url https://pypi.nvidia.com tensorflow[and-cuda]==2.15

This is actually a better solution, since Tensorflow now installs compatible CUDA and cuDNN versions for you via pip, and we don’t have to use the NVIDIA CUDA images. Any normal Linux image will do (and is in fact smaller), as long as you can install Python on it.
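
To see which CUDA and cuDNN versions the pip-installed Tensorflow was built against, tf.sysconfig.get_build_info() can be queried. The sketch below is hedged: the three keys it reads are the ones I would expect on a CUDA build, so it uses .get() and reports None for any that are absent, and it returns an empty dict when Tensorflow is not installed.

```python
def tensorflow_build_info() -> dict:
    """Return selected build details of the installed Tensorflow, or {} if it is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return {}
    info = tf.sysconfig.get_build_info()
    # These key names are assumptions; missing keys simply come back as None.
    wanted = ("is_cuda_build", "cuda_version", "cudnn_version")
    return {key: info.get(key) for key in wanted}


build_info = tensorflow_build_info()
print(build_info or "Tensorflow is not installed in this environment")
```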

I’ve updated my GitHub repo for the new version and have successfully tested it on my machine.