Strengths

Last updated: 2020-05-28 16:00:17

Compared with self-built Hadoop clusters, Tencent Cloud EMR provides simpler, more stable, and more reliable Hadoop services.

In addition to Hadoop clusters, Druid and ClickHouse big data clusters are also supported, giving you more choices of big data architecture.

Flexibility

  • A secure and reliable Hadoop cluster can be created in just a few minutes to run mainstream open-source big data computing frameworks such as Hive, Spark, Presto, Impala, ClickHouse, Druid, and Flink, meeting your needs in scenarios such as interactive BI, data warehousing, and real-time computation.
  • Existing EMR clusters can be scaled elastically and quickly, and in-cloud computing resources can be scheduled in real time to respond to rapid changes in your business data, reducing the high cost of reserving IT hardware.

Reliability

  • The master node is designed with disaster recovery in mind: if it fails, a slave node is started within seconds to keep big data services available.
  • A comprehensive monitoring system is in place and can send SMS messages within seconds when exceptions occur in cluster components or tasks.
  • Hive metadata can be stored in MetaDB with a metadata reliability of 99.9996%.
  • Petabytes of high-persistence data stored in COS can be analyzed.
  • The recycle bin feature is enabled for clusters by default so that you can restore devices deleted by mistake.

Security

  • VPCs provide convenient network isolation, allowing you to plan the network policy for your managed Hadoop clusters. Network ACLs and security groups can be created to filter traffic at the subnet and server levels, helping meet your network security needs.
  • The Tencent Cloud security reinforcement service provides an integrated security solution for EMR clusters, ranging from network protection and intrusion detection to vulnerability protection.
  • Kerberos authentication can be enabled for clusters to ensure secure access.

Ease of Use

  • Different cluster versions can be created to analyze the same data in COS according to actual business needs.
  • Petabytes of data stored in data nodes or COS can be analyzed with the aid of out-of-the-box community components such as Hue and Oozie, eliminating concerns over knowledge migration costs.
  • A full-featured, intuitive, and easy-to-use monitoring system is provided to present nearly 1,000 cluster-level and component-level monitoring metrics on the monitoring overview page.
  • In-cloud clusters consisting of multiple models are supported, so that you can easily scale out or distribute configurations to heterogeneously configured clusters and handle business analysis challenges with higher-spec hardware.

Reduced Costs

  • EMR allows elastic scaling of your managed Hadoop cluster based on your business curve, reducing high hardware costs.
  • It comes with a rich set of OPS tools that greatly improve efficiency, letting you focus on your business without having to repeatedly build infrastructure for monitoring, security, and OPS.
  • Warm and cold data can be stored on COS/CHDFS, effectively reducing costs by 28%–50%.
  • With unified Hive metadatabases and COS buckets, you can implement a cross-cluster architecture for analyzing the same dataset and create or terminate clusters as needed, reducing the cluster scaling costs.