Benefits

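For each benefit below, the Elastic MapReduce (EMR) points are listed first, followed by the corresponding drawbacks of a self-built Hadoop cluster.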
Flexibility
  Elastic MapReduce:
  • A secure and reliable Hadoop cluster can be built in minutes, ready to run open-source computing frameworks such as Hive, Spark and Presto.
  • Existing EMR clusters can be scaled elastically to balance ever-changing data analytics requirements against high IT hardware costs.
  Self-built Hadoop:
  • It usually takes weeks or even months to plan and build a Hadoop cluster that is meant to serve the next six months to one year.
  • The OPS team struggles to keep up with rapid business changes and cannot scale the cluster elastically.
Reliability
  Elastic MapReduce:
  • The availability of big data services is ensured by master nodes designed for disaster recovery and by slave nodes that can be brought up in seconds.
  • EMR includes a complete monitoring system that sends SMS notifications within seconds when exceptions occur in cluster components or tasks.
  • Hive metadata stored in CDB can reach 99.9996% reliability.
  • Petabytes of highly durable data stored in COS can be analyzed.
  Self-built Hadoop:
  • A self-built Hadoop cluster with a complete support system requires a great deal of manual upkeep, capital and time to ensure service continuity and data reliability.
Security
  Elastic MapReduce:
  • Network policies for your managed Hadoop cluster can be configured with VPC-based security isolation, which supports network ACLs and security groups. Data traffic can also be filtered at the subnet and server levels to meet your network security needs.
  • Tencent Cloud-grade security reinforcement provides integrated security services for EMR clusters, including network protection, intrusion detection and vulnerability prevention.
  Self-built Hadoop:
  • Building a big data security system takes a great deal of time and effort, and even a minor security oversight can lead to serious business risks.
Ease of Use
  Elastic MapReduce:
  • Clusters of different versions can be created to analyze the same data in COS, depending on your business needs.
  • With out-of-the-box open-source components such as Hue and Oozie, petabytes of data stored in the data nodes or COS can be analyzed without worrying about knowledge transfer costs.
  Self-built Hadoop:
  • In the early stages, many Hadoop component conflicts need to be resolved; later on, it takes time and effort to build infrastructure for security, monitoring and alarming.
  • Getting technical support through email, forums and other channels usually takes weeks or even months.
Reduced Costs
  Elastic MapReduce:
  • With EMR, your managed Hadoop clusters can be scaled to match actual business needs, avoiding high hardware costs.
  • A wide variety of OPS tools greatly improves OPS efficiency, freeing engineers from the recurring work of building monitoring, security and OPS infrastructure so that they can focus on adding value to the business.
  Self-built Hadoop:
  • The Hadoop cluster has to be sized for peak traffic, resulting in high hardware costs.
  • The complex big data OPS and management environment keeps OPS engineers from working efficiently.
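
The cost difference between sizing for peak traffic and scaling with demand can be illustrated with a little arithmetic. The following is a minimal sketch, not a pricing model: the function names and the hourly demand figures are hypothetical, and it assumes that cost is proportional to node-hours and that the cluster can be resized every hour.

```python
from typing import Sequence


def fixed_node_hours(hourly_demand: Sequence[int]) -> int:
    """Node-hours used when the cluster is sized once for the busiest hour."""
    if not hourly_demand:
        return 0
    return max(hourly_demand) * len(hourly_demand)


def elastic_node_hours(hourly_demand: Sequence[int], min_nodes: int = 1) -> int:
    """Node-hours used when the cluster is resized each hour to match demand.

    min_nodes is the smallest size the cluster is allowed to shrink to.
    """
    return sum(max(nodes, min_nodes) for nodes in hourly_demand)


if __name__ == "__main__":
    # Hypothetical day: 3 nodes suffice for 20 hours, 10 nodes are needed
    # during a 4-hour batch window.
    demand = [3] * 20 + [10] * 4

    fixed = fixed_node_hours(demand)
    elastic = elastic_node_hours(demand, min_nodes=3)

    print(f"peak-sized cluster: {fixed} node-hours")
    print(f"elastic cluster:    {elastic} node-hours")
    print(f"reduction:          {1 - elastic / fixed:.0%}")
```

With these made-up numbers, the peak-sized cluster consumes 240 node-hours per day and the elastic one 100. The actual saving depends entirely on how spiky the real workload is.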
 

Features

Elastic Scaling
Cluster creation in minutes

A secure, stable cloud-hosted Hadoop cluster can be created in the console in a few minutes.

Cluster scaling in minutes

An existing EMR cluster can be seamlessly scaled up or down in just minutes to accommodate the rapid changes in Internet-based businesses.

API support

EMR clusters can be created, scaled and terminated through APIs with ease.
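
The create, scale and terminate lifecycle might be scripted along the following lines. This is only a sketch: `InMemoryClusterClient` and its method names are hypothetical and merely track state locally, so they do not correspond to the actual EMR API actions or SDK. Consult the EMR API reference for the real request names and parameters.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict


@dataclass
class Cluster:
    cluster_id: str
    node_count: int
    state: str = "RUNNING"


class InMemoryClusterClient:
    """Hypothetical stand-in for an EMR API client; no network calls are made."""

    def __init__(self) -> None:
        self._clusters: Dict[str, Cluster] = {}
        self._next_id = count(1)

    def create_cluster(self, node_count: int) -> Cluster:
        self._check_size(node_count)
        cluster = Cluster(f"demo-{next(self._next_id)}", node_count)
        self._clusters[cluster.cluster_id] = cluster
        return cluster

    def scale_cluster(self, cluster_id: str, node_count: int) -> Cluster:
        # Scaling up and scaling down are the same call with a new target size.
        self._check_size(node_count)
        cluster = self._running(cluster_id)
        cluster.node_count = node_count
        return cluster

    def terminate_cluster(self, cluster_id: str) -> Cluster:
        cluster = self._running(cluster_id)
        cluster.state = "TERMINATED"
        cluster.node_count = 0
        return cluster

    def _running(self, cluster_id: str) -> Cluster:
        cluster = self._clusters.get(cluster_id)
        if cluster is None or cluster.state != "RUNNING":
            raise KeyError(f"no running cluster with id {cluster_id!r}")
        return cluster

    @staticmethod
    def _check_size(node_count: int) -> None:
        if node_count < 1:
            raise ValueError("node_count must be at least 1")


if __name__ == "__main__":
    client = InMemoryClusterClient()
    cluster = client.create_cluster(node_count=3)      # create
    client.scale_cluster(cluster.cluster_id, 10)       # scale up for a busy period
    client.scale_cluster(cluster.cluster_id, 3)        # scale back down
    client.terminate_cluster(cluster.cluster_id)       # terminate when done
    print(cluster)
```

Keeping the lifecycle behind a small client interface like this makes it straightforward to swap the in-memory stub for calls to the real API later.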

Separated Computation and Storage
OPS System Support
 

Scenarios

Offline Data Analytics

Once large volumes of logs from the servers of gaming platforms, web applications and mobile apps have been synced to EMR data nodes or COS, tools such as Hue, Hive, Spark and Presto can help you quickly gain deep insights into the data.

Tools such as Sqoop can also be used to load data scattered across CDBs or other storage engines and to sync the analysis results back to CDBs, where they can feed data visualization products such as RayData.

Streaming Data Processing
COS Data Analytics