For trying out some LSTM machine learning algorithms with my TradeFrame Algorithmic Trading Library, I wanted to install LibTorch with NVIDIA/CUDA support for hardware-accelerated learning.
I used Installing C++ Distributions of PyTorch as a starting point. However, their example is CPU based, and my desire was for a CUDA based installation. This meant going to the CUDA Zone and starting the download process. My configuration options were: Linux, x86_64, Debian, 12, deb (local).
Using the "deb (local)" with a complete file seemed to be the only way to ensure all components were available.
The steps, as of this writing, were:
wget https://developer.download.nvidia.com/compute/cuda/12.9.0/local_installers/cuda-repo-debian12-12-9-local_12.9.0-575.51.03-1_amd64.deb
sudo dpkg -i cuda-repo-debian12-12-9-local_12.9.0-575.51.03-1_amd64.deb
sudo cp /var/cuda-repo-debian12-12-9-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-9
Since I have the NVIDIA binary video driver installed, I chose the non-nouveau flavour of the CUDA drivers. This step can be run prior to the previous steps.
sudo apt-get install -y cuda-drivers
There is also the NVIDIA CUDA Installation Guide for Linux for further information.
The PyTorch LibTorch library can be downloaded from PyTorch Start Locally. Choose the C++/Java option with CUDA 12.8 (as of this writing); an appropriate link is then presented. Download and expand the file into a development directory. LibTorch does not yet have a build for CUDA 12.9, so the CUDA 12.8 build is referenced here.
The most recent can be found at https://download.pytorch.org/libtorch/nightly/cu128/libtorch-shared-with-deps-latest.zip.
It is probably advisable NOT to use the Debian package pytorch-cuda, as it may be out of date.
To test out the installation, I then created a subdirectory containing a couple of files. The first is the test code example-app.cpp:
#include <torch/torch.h>
#include <iostream>

int main() {
  torch::Tensor tensor = torch::rand({2, 3});
  std::cout << tensor << std::endl;
}
The second file is the CMakeLists.txt file. This is my version:
cmake_minimum_required(VERSION 3.18 FATAL_ERROR)
cmake_policy(SET CMP0104 NEW)
cmake_policy(SET CMP0105 NEW)

project(example-app)

find_package(Torch REQUIRED)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${TORCH_CXX_FLAGS}")
set(CMAKE_CUDA_STANDARD 17)

add_executable(example-app example-app.cpp)
target_link_libraries(example-app "${TORCH_LIBRARIES}")
set_property(TARGET example-app PROPERTY CXX_STANDARD 17)
Then to build the example:
mkdir build
cd build
cmake \
  -DCMAKE_PREFIX_PATH=~/data/projects/libs-build/libtorch \
  -DCMAKE_CUDA_ARCHITECTURES=native \
  -DCMAKE_BUILD_TYPE=DEBUG \
  -DCMAKE_CUDA_COMPILER=/etc/alternatives/cuda/bin/nvcc \
  -Dnvtx3_dir=/usr/local/cuda/targets/x86_64-linux/include/nvtx3 \
  ..
make
./example-app
Notes:
- CMAKE_PREFIX_PATH points to the directory of your expanded LibTorch download
- CMAKE_CUDA_ARCHITECTURES=native lets the build process determine the specific GPU for which to build
- CMAKE_BUILD_TYPE can be DEBUG or RELEASE
- CMAKE_CUDA_COMPILER needs to be set; the entries under /etc/alternatives are soft links to the desired version (as installed by the CUDA installation)
- nvtx3_dir is required, as the current LibTorch library still seems to refer to nvtx rather than nvtx3
If you get output along the lines of:
-- Automatic GPU detection failed. Building for common architectures.
-- Autodetected CUDA architecture(s): 5.0;8.0;8.6;8.9;9.0;9.0a;10.0;10.0a;10.1a;12.0;12.0a
even though you may have the NVIDIA driver installed:
$ dpkg -l | grep nvidia-driv
ii  nvidia-driver             575.51.03-1  amd64  NVIDIA metapackage
ii  nvidia-driver-cuda        575.51.03-1  amd64  NVIDIA driver CUDA integration components
ii  nvidia-driver-libs:amd64  575.51.03-1  amd64  NVIDIA metapackage (OpenGL/GLX/EGL/GLES libraries)
See if the nouveau driver is loaded:
$ lsmod |grep nouveau
If so, run these commands to enable the NVIDIA driver, blacklist the nouveau driver, and then reboot:
sudo mv /etc/modprobe.d/nvidia.conf.dpkg-new /etc/modprobe.d/nvidia.conf
sudo update-initramfs -u
My system has two RTX 4070 cards, which can be verified with lshw (an extract showing the important parts; note that the nvidia driver is properly listed):
$ sudo lshw -c video
  *-display
       product: Arrow Lake-S [Intel Graphics]
       configuration: depth=32 driver=i915 latency=0 mode=3840x2160 resolution=3840,2160 visual=truecolor xres=3840 yres=2160
  *-display
       product: AD103 [GeForce RTX 4070]
       configuration: driver=nvidia latency=0
  *-display
       product: AD103 [GeForce RTX 4070]
       configuration: driver=nvidia latency=0
Therefore, the output of my cmake process will include gpu specific selections:
-- Autodetected CUDA architecture(s): 8.9 8.9
-- Added CUDA NVCC flags for: -gencode;arch=compute_89,code=sm_89
And running the generated binary results in valid output:
$ ./example-app
 0.7141  0.9744  0.3179
 0.7794  0.9281  0.7529
[ CPUFloatType{2,3} ]