Elastic MapReduce (EMR) provides three types of clusters for you to choose from based on your business needs.
Billing mode for EMR clusters:
For information on node types, see Node Type Description.
EMR offers a wide variety of CVM models, including EMR Standard, EMR Compute, EMR High IO, EMR MEM-optimized, and EMR Big Data. If you need the CPM model, please submit a ticket to us.
Note:
- A high availability (HA) Hadoop or Druid cluster contains at least eight nodes: two master nodes, three common nodes, and three core nodes. A non-HA Hadoop or Druid cluster uses single-replica storage, so it can be used for testing but is not recommended in a production environment. It contains at least three nodes: one master node and at least two core nodes.
- An HA ClickHouse cluster contains at least five nodes: two core nodes and three common nodes. A non-HA ClickHouse cluster uses single-replica storage, so it can be used for testing but is not recommended in a production environment. It contains at least one core node.
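The minimum node counts in the note above can be checked before you create a cluster. The following is a minimal sketch, not part of any EMR SDK: the `MIN_NODES` table and the `check_node_counts` helper are hypothetical names, and the sketch assumes the counts in the note are lower bounds for each node type.

```python
# Minimum node counts per (cluster type, HA enabled), transcribed from the note above.
# Assumption: each figure is treated as a lower bound for that node type.
MIN_NODES = {
    ("hadoop", True): {"master": 2, "common": 3, "core": 3},
    ("hadoop", False): {"master": 1, "core": 2},
    ("druid", True): {"master": 2, "common": 3, "core": 3},
    ("druid", False): {"master": 1, "core": 2},
    ("clickhouse", True): {"core": 2, "common": 3},
    ("clickhouse", False): {"core": 1},
}


def check_node_counts(cluster_type, ha, counts):
    """Return a list of problems; an empty list means the plan meets the minimums."""
    required = MIN_NODES[(cluster_type.lower(), ha)]
    problems = []
    for node_type, minimum in required.items():
        planned = counts.get(node_type, 0)
        if planned < minimum:
            problems.append(f"{node_type}: need at least {minimum}, got {planned}")
    return problems


if __name__ == "__main__":
    # An HA ClickHouse plan that is one core node short.
    print(check_node_counts("clickhouse", True, {"core": 1, "common": 3}))
```

Node types that the note does not mention for a given cluster type (for example, task or router nodes) are not checked by this sketch.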
You can choose a model based on your business needs and budget.
Node Type | Cluster Type | Recommended Specification |
---|---|---|
Master | Hadoop | For master nodes, we recommend you select an instance specification with a large memory size (at least 8 GB) and use cloud disks for high stability. |
Master | ClickHouse | - |
Master | Druid | For master nodes, we recommend you select an instance specification with a large memory size (at least 16 GB) and use SSD disks for better IO performance. |
Core | Hadoop | |
Core | ClickHouse | For core nodes, we recommend you select a model with high CPU and a large memory size. Because data may be lost if a local disk is corrupted, cloud disks are recommended. |
Core | Druid | For core nodes, we recommend you select an instance specification with a large memory size (at least 16 GB) and use SSD disks for better IO performance. |
Task | Hadoop | |
Task | ClickHouse | - |
Task | Druid | |
Common | Hadoop | Common nodes are mainly used as ZooKeeper nodes. We recommend you select the specification of 2 cores, 4 GB memory, and 100 GB cloud disk capacity to meet the requirements. |
Common | ClickHouse | For common nodes, the CPU and memory configuration should be at least 4 cores and 16 GB. |
Common | Druid | Common nodes are mainly used as ZooKeeper nodes. We recommend you select the specification of 2 cores, 4 GB memory, and 100 GB cloud disk capacity to meet the requirements. |
Router | Hadoop | Router nodes are mainly used to relieve the load of master nodes and as a task submitter. Therefore, we recommend you select a model with a large memory size, preferably not lower than the specification of master nodes. |
Router | ClickHouse | - |
Router | Druid | Router nodes are mainly used to relieve the load of master nodes and as a task submitter. Therefore, we recommend you select a model with a large memory size, preferably not lower than the specification of master nodes. |
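The numeric recommendations in the table above can be kept as a small lookup when you compare candidate models. This is a rough sketch under stated assumptions: `RECOMMENDED` and `below_recommendation` are hypothetical names, only rows that state explicit numbers are included, and the figures for Hadoop and Druid common nodes are treated as minimums.

```python
# (node type, cluster type) -> recommended figures taken from the table above.
# Rows without explicit numbers (for example, router or task nodes) are omitted.
RECOMMENDED = {
    ("master", "hadoop"): {"memory_gb": 8},
    ("master", "druid"): {"memory_gb": 16},
    ("core", "druid"): {"memory_gb": 16},
    ("common", "hadoop"): {"cpu_cores": 2, "memory_gb": 4, "disk_gb": 100},
    ("common", "clickhouse"): {"cpu_cores": 4, "memory_gb": 16},
    ("common", "druid"): {"cpu_cores": 2, "memory_gb": 4, "disk_gb": 100},
}


def below_recommendation(node_type, cluster_type, spec):
    """Map each field that falls short to a (planned, recommended) pair."""
    recommended = RECOMMENDED.get((node_type.lower(), cluster_type.lower()))
    if recommended is None:
        # No numeric recommendation is recorded for this combination.
        return {}
    return {
        field: (spec.get(field, 0), minimum)
        for field, minimum in recommended.items()
        if spec.get(field, 0) < minimum
    }


if __name__ == "__main__":
    # A Druid master node planned with only 8 GB of memory.
    print(below_recommendation("master", "druid", {"memory_gb": 8}))
```

An empty result means either that the specification meets the recorded figures or that the table gives no numbers for that combination, so it is not a substitute for reading the row itself.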
To ensure network security, the EMR cluster is placed in a VPC, and a security group policy is added to the VPC. To make the Hadoop WebUI easy to access, a public IP is enabled for one of the master nodes, and that node is billed by traffic. A public IP is not enabled for router nodes by default, but you can bind an EIP to a router node in the CVM console to give it a public IP.
Note:
- A public IP is enabled for master nodes when a cluster is created. You can disable it as needed.
- Enabling a public IP for master nodes is mainly for SSH login and component WebUI access.
- Master nodes with a public IP enabled are billed by traffic with a bandwidth of up to 5 Mbps. You can adjust the network on the console after creating a cluster.