
Using AVX-512 Instructions to Accelerate AI Applications on CVM

Last updated: 2024-01-06 17:49:55

    Overview

    The fifth-generation Tencent Cloud CVM instances (including S6, S5, M5, C4, IT5, D3, etc.) all come with the 2nd generation Intel® Xeon® scalable processor (Cascade Lake). These instances provide more instruction sets and features that can accelerate artificial intelligence (AI) applications. Integrated hardware enhancements such as Advanced Vector Extensions 512 (AVX-512) boost the parallel computing performance of AI inference and produce better deep learning results.
    This document describes how to use AVX-512 on S5 and M5 CVM instances to accelerate AI applications.
    Tencent Cloud provides various types of CVMs for different application development needs. The Standard S6, Standard S5, and Memory Optimized M5 instance types come with the 2nd generation Intel® Xeon® processor and support Intel® DL Boost, making them suitable for machine learning or deep learning. The recommended configurations are as follows:
    Scenario | Instance Specifications
    Deep learning training platform | 84vCPU Standard S5 or 48vCPU Memory Optimized M5
    Deep learning inference platform | 8/16/24/32/48vCPU Standard S5 or Memory Optimized M5
    Deep learning training or inference platform | 48vCPU Standard S5 or 24vCPU Memory Optimized M5

    Advantages

    Running the workloads for machine learning or deep learning on Intel® Xeon® scalable processors has the following advantages:
    Suitable for processing 3D-CNN topologies used in scenarios such as big-memory workloads, medical imaging, GAN, seismic analysis, gene sequencing, etc.
    Flexible core allocation with the numactl command, making these instances suitable for small-scale online inference.
    Powerful ecosystem to directly perform distributed training on large clusters, without the need for a large-scale architecture containing additional large-capacity storage and expensive caching mechanisms.
    Support for many workloads (such as HPC, BigData, and AI) in a single cluster to deliver better TCO.
    Support for SIMD acceleration to meet the computing requirements of various deep learning applications.
    The same infrastructure for direct training and inference.
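    Before relying on AVX-512, it can help to confirm that the instance actually exposes the instruction set. A minimal check, assuming a Linux guest OS (the avx512f flag denotes the AVX-512 Foundation subset; Cascade Lake instances typically also report flags such as avx512dq, avx512bw, avx512vl, and the DL Boost flag avx512_vnni):

```shell
# Check whether the vCPU exposes AVX-512 Foundation (avx512f).
# /proc/cpuinfo lists the CPU feature flags reported by the kernel.
if grep -q avx512f /proc/cpuinfo; then
    echo "AVX-512 supported"
else
    echo "AVX-512 not available on this CPU"
fi
```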

    Directions

    Creating an instance

    Create an instance as instructed in Creating Instances via CVM Purchase Page. Select a recommended model that suits your actual use case.
    
    
    Note:
    For more information on instance specifications, see Instance Types.

    Logging in to the instance

    Deploying a platform

    Deploy an AI platform as instructed below to perform the machine learning or deep learning task:
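    As a rough sketch of such a deployment, one common option is an Intel-optimized build of TensorFlow (the intel-tensorflow PyPI package), whose MKL-DNN kernels can use AVX-512 on these instances. The thread-affinity values below are illustrative assumptions, not tuned recommendations; adjust them to the instance's vCPU count:

```shell
# Hypothetical deployment sketch: install an AVX-512-enabled TensorFlow build
pip install intel-tensorflow

# Example MKL/OpenMP settings for inference (values are illustrative)
export OMP_NUM_THREADS=8
export KMP_AFFINITY=granularity=fine,compact,1,0
export KMP_BLOCKTIME=1
```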
    