Enabling GPU Scheduling for a Cluster

Last updated: 2019-07-19 17:32:12

Operation Scenario

If your business involves scenarios such as deep learning or high-performance computing, you can use TKE's GPU feature to quickly run GPU containers. To activate the GPU feature, submit a ticket to apply.
This document describes how to enable GPU scheduling for a cluster.

Prerequisites

You are logged in to the TKE console.

Considerations

  • GPU scheduling is supported only if the cluster's Kubernetes version is 1.8 or later.
  • GPUs are not shared among containers. A container can request one or more GPUs. However, it cannot request a portion of one GPU.
  • It is recommended to use the GPU feature together with affinity scheduling.
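
The whole-GPU constraint above can be illustrated with a pod spec sketch. A container requests GPUs through an integer resource limit; the `nvidia.com/gpu` resource name follows the standard Kubernetes device-plugin convention, and the pod name and image tag here are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod-example          # illustrative name
spec:
  containers:
    - name: cuda-container
      image: nvidia/cuda:10.0-base   # illustrative image
      resources:
        limits:
          nvidia.com/gpu: 1      # must be a whole number; fractional values such as 0.5 are rejected
```

Note that GPUs are declared under `limits`; Kubernetes does not support requesting a portion of a GPU.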

Directions

Adding a GPU Node to a Cluster

There are two ways to add a GPU node:

Creating a GPU CVM Instance

  1. In the left sidebar, click Clusters to go to the cluster management page.
  2. In the row of the cluster for which to create a GPU CVM instance, click Create a node.
  3. On the "Select a model" page, set "Instance family" to "GPU model" and select "GPU compute type" for "Instance type".
  4. Follow the on-screen prompts to complete the creation.
    On the "CVM configuration" page, TKE automatically performs initialization steps such as GPU driver installation based on the selected model, so you do not need to worry about the base image.

Adding an Existing GPU CVM Instance

  1. In the left sidebar, click Clusters to go to the cluster management page.
  2. In the row of the cluster for which to add an existing GPU CVM instance, click Add an existing node.
  3. On the "Select a node" page, select the existing GPU node and click Next.
  4. Follow the on-screen prompts to complete the addition.

    On the "CVM configuration" page, TKE automatically performs initialization steps such as GPU driver installation based on the selected model, so you do not need to worry about the base image.

Creating a GPU Service Container

There are two ways to create a GPU service container:

Creating in the Console

  1. In the left sidebar, click Clusters to go to the cluster management page.
  2. Click the ID/name of the cluster in which to create the workload to enter its management page.
  3. Under "Workload", select a workload type to go to the corresponding information page. For example, select "Workload" > "DaemonSet" to go to the DaemonSet information page.
  4. Click Create to go to the "Create a workload" page.
  5. Set the workload name, namespace, and other parameters as instructed. In "GPU limit", set the GPU quantity limit.
  6. Click Create a workload to complete the creation.

Creating Through a kubectl Command

You can also set the GPU field in the workload's YAML file and create the workload with a kubectl command.
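
As a sketch of that approach, declare the GPU limit in the container spec and apply the manifest with kubectl. The `nvidia.com/gpu` resource name follows the standard Kubernetes device-plugin convention; the workload name, labels, and image are illustrative:

```yaml
# gpu-deployment.yaml (illustrative manifest)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-workload
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu-workload
  template:
    metadata:
      labels:
        app: gpu-workload
    spec:
      containers:
        - name: cuda
          image: nvidia/cuda:10.0-base
          resources:
            limits:
              nvidia.com/gpu: 2   # request two whole GPUs for this container
```

Create the workload by running `kubectl apply -f gpu-deployment.yaml` against the cluster. The scheduler will place the pod only on a node with enough unallocated GPUs.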