Installing NVIDIA Driver

Last updated: 2024-03-27 11:47:40

    Overview

    A GPU instance needs the required infrastructure software installed before use. For an NVIDIA GPU instance, the following software packages are required:
    Hardware driver for the GPU
    Libraries required by upper-level applications
    To use NVIDIA GPU instances for general computing tasks, you must install a Tesla driver and a Compute Unified Device Architecture (CUDA) driver. This document describes only how to install a Tesla driver. For more information on the CUDA driver, see Installing CUDA Driver.

    Directions

    Installing an NVIDIA Tesla driver on a Linux instance

    You can use a shell script to install the driver on a Linux instance. This method is applicable to all Linux distributions, including CentOS and Ubuntu. The package commands in the steps below (rpm and yum) are for CentOS; on Ubuntu, use the equivalent dpkg and apt commands.
    When you install an NVIDIA Tesla driver on Linux, the installer compiles a kernel module. You must therefore install gcc and the packages required to compile Linux kernel modules, such as kernel-devel-$(uname -r), in advance.
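    The package name kernel-devel-$(uname -r) above combines a fixed prefix with the release string of the running kernel. A minimal sketch of that expansion follows; the helper name kernel_devel_pkg is hypothetical and not part of any tool, and the install line is shown only as a comment for a yum-based system.

```shell
#!/bin/sh
# Build the kernel-devel package name for a given kernel release.
# With no argument, the release of the running kernel (uname -r) is used.
# kernel_devel_pkg is a hypothetical helper name used for illustration.
kernel_devel_pkg() {
    printf 'kernel-devel-%s\n' "${1:-$(uname -r)}"
}

# Example on a yum-based system (not run here):
#   sudo yum install -y gcc "$(kernel_devel_pkg)"
```

    Pinning the package to the running kernel release avoids installing headers for a newer kernel that is not yet booted.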
    1. Run the following command to check whether dkms has been installed in the operating system:
    rpm -qa | grep -i dkms
    If the output lists a dkms package, as shown in the following figure, dkms has been installed.
    
    
    
    If dkms is not installed, run the following command to install dkms:
    sudo yum install -y dkms
    2. Go to NVIDIA Driver Downloads or visit http://www.nvidia.com/Download/Find.aspx.
    3. Configure the GPU type and operating system, and click SEARCH to find the driver you need to download, as shown in the following figure. The following uses Tesla V100 as an example.
    Note:
    You can set Operating System to Linux 64-bit to download the shell setup file. If you set Operating System to a specific Linux distribution, the installation files for that distribution are downloaded instead.
    
    
    
    4. Select the required version to go to the driver download page, and click DOWNLOAD, as shown in the following figure.
    
    
    
    5. You can skip the page for entering personal information. If the following page appears, right-click AGREE & DOWNLOAD and select Copy link address.
    
    
    
    6. Log in to the GPU instance as described in Log into Linux Instance Using Standard Login Method. You can also use another login method.
    7. Run the wget command to download the installation package using the URL copied in Step 5, as shown in the following figure.
    
    
    
    You can also download the installation package to your local computer and upload it to the GPU instance.
    8. Grant execute permission to the installation package. For example, run the following command for the NVIDIA-Linux-x86_64-418.126.02.run file:
    chmod +x NVIDIA-Linux-x86_64-418.126.02.run
    9. Run the following commands in sequence to check whether kernel-devel and gcc have been installed in the operating system:
    rpm -qa | grep kernel-devel
    rpm -qa | grep gcc
    If the output lists kernel-devel and gcc packages, as shown in the following figure, both have been installed.
    
    
    
    If kernel-devel and gcc are not installed, run the following command to install them:
    sudo yum install -y gcc kernel-devel
    Note:
    If the kernel version has been upgraded, you must upgrade kernel-devel to the same version.
    10. Run the following command and follow the installer prompts to install the driver:
    sudo sh NVIDIA-Linux-x86_64-418.126.02.run
    11. After the installation is completed, run the following command to verify the installation:
    nvidia-smi
    If GPU information similar to that shown in the following figure is returned, the installation is successful.
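    Beyond reading the nvidia-smi table by eye, the check in Step 11 can be scripted by comparing the version in the installer file name with the version the driver reports. The sketch below assumes the file name follows the NVIDIA-Linux-x86_64-<version>.run pattern used in this document; the function names are hypothetical, and the functions are only defined here, not called.

```shell
#!/bin/sh
# Sketch: compare the version in the installer file name with the version
# reported by the loaded driver. All function names are hypothetical.

# Extract "418.126.02" from "NVIDIA-Linux-x86_64-418.126.02.run".
driver_version_from_file() {
    name=${1##*/}                        # drop any leading directory
    name=${name#NVIDIA-Linux-x86_64-}    # drop the fixed prefix
    printf '%s\n' "${name%.run}"         # drop the .run suffix
}

# Ask the running driver for its version (first GPU only).
installed_driver_version() {
    nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1
}

# Return 0 when the installed driver version matches the installer file.
verify_driver() {
    expected=$(driver_version_from_file "$1")
    actual=$(installed_driver_version)
    [ -n "$actual" ] && [ "$expected" = "$actual" ]
}

# Example on a GPU instance (not run here):
#   verify_driver NVIDIA-Linux-x86_64-418.126.02.run && echo "driver OK"
```

    If nvidia-smi is missing or cannot reach the driver, installed_driver_version prints nothing and verify_driver returns a non-zero status.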
    
    
    

    Installing an NVIDIA Tesla driver on a Windows instance

    3. Configure the GPU type and operating system, and click SEARCH to find the driver you need to download, as shown in the following figure. The following uses Tesla V100 as an example.
    
    
    
    4. Go to the directory where the downloaded installation package is located, double-click the package to install the driver as instructed, and restart the GPU instance if prompted. After the installation is completed, open Device Manager to check whether the GPU works properly.

    Reasons for installation failures

    If nvidia-smi does not run properly, the driver has not been installed correctly. Common reasons include:
    1. The packages required to compile the kernel module, such as gcc and kernel-devel, are not installed in the operating system.
    2. Multiple kernel versions are installed in the operating system. Because DKMS is configured incorrectly, the driver compiles a kernel module for a version other than the running kernel, so the module fails to install.
    3. The kernel was upgraded after the driver was installed, so the previously built kernel module no longer matches the running kernel.
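    All three reasons come down to a mismatch between the running kernel and the packages or modules built for it. As a first diagnostic, the running kernel release can be compared with the installed kernel-devel packages. The sketch below assumes an rpm-based system where rpm -q kernel-devel prints names in its default name-version-release.arch form; the function name is hypothetical.

```shell
#!/bin/sh
# Return 0 if the list of installed kernel-devel packages (one per line,
# as printed by "rpm -q kernel-devel") contains the package for the given
# kernel release. has_matching_kernel_devel is a hypothetical helper name.
has_matching_kernel_devel() {
    release=$1
    installed=$2
    # -x: match the whole line, -F: treat the pattern as a fixed string
    printf '%s\n' "$installed" | grep -qxF "kernel-devel-$release"
}

# Example on an rpm-based system (not run here):
#   if has_matching_kernel_devel "$(uname -r)" "$(rpm -q kernel-devel)"; then
#       echo "kernel-devel matches the running kernel"
#   else
#       echo "kernel-devel is missing or was installed for another kernel"
#   fi
```

    If the check fails, installing the kernel-devel package for the running kernel and rerunning the driver installer is the usual next step; on systems that use DKMS, the dkms status command shows which kernels each module has been built for.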