Using Alluxio in Tencent Cloud

Last updated: 2021-04-09 11:05:51

    Overview

    Tencent Cloud EMR comes with a ready-to-use Alluxio service that provides distributed memory-level caching to accelerate data access and simplify data management. You can use the configuration delivery feature, via the EMR console or APIs, to configure multi-level caching and manage metadata. In addition, EMR offers one-stop monitoring and alarming.

    Preparations

    • Tencent Cloud EMR Hadoop Standard v2.1.0 or above
    • Tencent Cloud EMR Hadoop TianQiong v1.0 or above

    For specific Alluxio versions supported in EMR, see Component Version.

    Creating an Alluxio-based EMR Cluster

    This section describes how to create a ready-to-use Alluxio-based EMR cluster. You can create an EMR cluster via the purchase page or API.

    Creating a cluster via the purchase page

    Go to the EMR purchase page, choose an Alluxio-supported version, and select the Alluxio component in Optional Components.

    Select other options as needed to meet your business needs. For reference, see Creating EMR Cluster.

    Creating a cluster via API

    You can also create an Alluxio-based big data cluster through the Tencent Cloud EMR API. For details, see DescribeClusterNodes.

    Basic Configurations

    When you create an EMR cluster containing the Alluxio component, HDFS will be mounted to Alluxio and memory will be used for single-level (level 0) storage by default. You can use the configuration delivery feature to change the storage mode to multi-level storage or make other optimizations.

    Some configurations take effect only after you restart the Alluxio service following delivery.

    For more details on configuration delivery and restarting policies, see Configuration Management and Restarting Services.
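
    As an illustration of what a multi-level storage configuration might contain, the sketch below writes a two-tier (memory plus SSD) fragment to a scratch file so it can be reviewed before the values are delivered through the EMR console. The property names follow the open-source Alluxio 2.x tiered storage settings and should be checked against the Alluxio version in your cluster; the directory paths, quotas, and output file name are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch only: generate a candidate two-tier storage fragment for review.
# All paths and quotas are hypothetical; adjust them to your nodes.
FRAGMENT_FILE="${TMPDIR:-/tmp}/alluxio-tiered-example.properties"

cat > "$FRAGMENT_FILE" <<'EOF'
# Number of storage tiers on each worker
alluxio.worker.tieredstore.levels=2
# Level 0: memory (the single-level default described above)
alluxio.worker.tieredstore.level0.alias=MEM
alluxio.worker.tieredstore.level0.dirs.path=/mnt/ramdisk
alluxio.worker.tieredstore.level0.dirs.quota=16GB
# Level 1: local SSD (hypothetical mount point)
alluxio.worker.tieredstore.level1.alias=SSD
alluxio.worker.tieredstore.level1.dirs.path=/data/alluxio-ssd
alluxio.worker.tieredstore.level1.dirs.quota=200GB
EOF

# Count the tier alias lines as a quick sanity check against "levels".
TIER_COUNT=$(grep -c '^alluxio\.worker\.tieredstore\.level[0-9]*\.alias=' "$FRAGMENT_FILE")
echo "Wrote $TIER_COUNT tier definitions to $FRAGMENT_FILE"
```

    The script only produces a file for review; the values themselves would still be applied through configuration delivery, followed by a restart of the Alluxio service where required.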

    Storage and compute separation based on Alluxio acceleration

    Tencent Cloud EMR provides the compute and storage separation capability based on Tencent Cloud COS. By default, when directly accessing the data in COS, applications do not have node-level data locality or cross-application caching. Alluxio acceleration helps alleviate these issues.

    The JAR package that Alluxio depends on to use COS as an under file system (UFS) is deployed on Tencent Cloud EMR clusters by default. You only need to grant the EMR cluster permission to access COS and then mount COS to Alluxio.

    Authorization

    If COS access is not enabled for the current cluster, go to CAM console > Roles to grant the permission. After authorization, EMR nodes can access data in COS using temporary keys.

    Mounting

    Log in to any node in the EMR cluster and mount COS to Alluxio.

    bin/alluxio fs mount <alluxio-path> <source-path>
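
    A minimal sketch of filling in the two placeholders follows. The Alluxio path `/cos`, the bucket name, and the `cosn://` scheme are assumptions for illustration; use the bucket and scheme that your cluster's COS UFS is actually configured for. The script only prints the command, so it can be checked before being run from the Alluxio installation directory.

```shell
#!/bin/sh
# Hypothetical values: replace with your own Alluxio path and COS location.
ALLUXIO_PATH="/cos"
SOURCE_PATH="cosn://examplebucket-1250000000/warehouse"

# Assemble the mount command in the form shown above.
MOUNT_CMD="bin/alluxio fs mount ${ALLUXIO_PATH} ${SOURCE_PATH}"
echo "Would run: ${MOUNT_CMD}"

# To apply it, run the printed command from the Alluxio installation
# directory on an EMR node. Afterwards, the mount can be inspected with:
#   bin/alluxio fs mount              (no arguments: lists mount points)
#   bin/alluxio fs ls "${ALLUXIO_PATH}"
```

    Printing the command first is a deliberate choice: a mistyped bucket or path is easier to catch before the mount is created than after.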

    For more information on using Alluxio in Tencent Cloud EMR, see Alluxio Development Documentation.