Elastic MapReduce (EMR) provides six types of clusters for you to choose from based on your business needs.
Billing mode for EMR clusters:
Note: When you shut down a node of a pay-as-you-go EMR cluster in the CVM console, choose the shutdown mode carefully, because EMR nodes do not support the "no charges when shut down" mode.
EMR offers a wide variety of CVM models, including EMR Standard, EMR Compute, EMR High IO, EMR MEM Optimized, and EMR Big Data. If you need the CPM model, submit a ticket to us.
You can choose a model based on your business needs and budget.
EMR offers five node types. Which of them are available depends on the cluster type.
Cluster Type | Use Case | Node Type | Recommended Specification |
---|---|---|---|
Hadoop | Default use case | Master | Master node: We recommend you select an instance specification with a large memory size (at least 8 GB) and use cloud disks for high stability. |
Hadoop | Default use case | Core | |
Hadoop | Default use case | Task | |
Hadoop | Default use case | Common | Common node: It is mainly used as ZooKeeper nodes. You need to select a specification of at least 2 cores, 4 GB memory, and 100 GB cloud disk capacity to meet the requirements. |
Hadoop | Default use case | Router | Router node: It is mainly used to relieve the load of master nodes and as a task submitter. Therefore, we recommend you select a model with a large memory size, preferably not lower than the specification of master nodes. |
Hadoop | ZooKeeper | Common | Common node: It is mainly used as ZooKeeper nodes. You need to select a specification of at least 2 cores, 4 GB memory, and 100 GB cloud disk capacity to meet the requirements. |
Hadoop | HBase | Master | Master node: We recommend you select an instance specification with a large memory size (at least 8 GB) and use cloud disks for high stability. |
Hadoop | HBase | Core | |
Hadoop | HBase | Task | |
Hadoop | HBase | Common | Common node: It is mainly used as ZooKeeper nodes. You need to select a specification of at least 2 cores, 4 GB memory, and 100 GB cloud disk capacity to meet the requirements. |
Hadoop | HBase | Router | Router node: It is mainly used to relieve the load of master nodes and as a task submitter. Therefore, we recommend you select a model with a large memory size, preferably not lower than the specification of master nodes. |
Hadoop | Kudu | Master | Master node: We recommend you select an instance specification with a large memory size (at least 8 GB) and use cloud disks for high stability. |
Hadoop | Kudu | Core | |
Hadoop | Kudu | Task | |
Hadoop | Kudu | Common | Common node: It is mainly used as ZooKeeper nodes. You need to select a specification of at least 2 cores, 4 GB memory, and 100 GB cloud disk capacity to meet the requirements. |
Hadoop | Kudu | Router | Router node: It is mainly used to relieve the load of master nodes and as a task submitter. Therefore, we recommend you select a model with a large memory size, preferably not lower than the specification of master nodes. |
Hadoop | Presto | Master | Master node: We recommend you select an instance specification with a large memory size (at least 8 GB) and use cloud disks for high stability. |
Hadoop | Presto | Core | |
Hadoop | Presto | Task | |
Hadoop | Presto | Common | Common node: It is mainly used as ZooKeeper nodes. You need to select a specification of at least 2 cores, 4 GB memory, and 100 GB cloud disk capacity to meet the requirements. |
Hadoop | Presto | Router | Router node: It is mainly used to relieve the load of master nodes and as a task submitter. Therefore, we recommend you select a model with a large memory size, preferably not lower than the specification of master nodes. |
ClickHouse | Default use case | Core | Core node: We recommend you select a model with high CPU and a large memory size. Because data may be lost if a local disk is corrupted, cloud disks are recommended. |
ClickHouse | Default use case | Common | Common node: The CPU and memory configuration should be at least 4 cores and 16 GB. |
Kafka | Default use case | Core | Core node: We recommend you select a model with high CPU and a large memory size. Because data may be lost if a local disk is corrupted, cloud disks are recommended. |
Kafka | Default use case | Common | Common node: The CPU and memory configuration should be at least 4 cores and 16 GB. |
Doris | Default use case | Master | Master node: We recommend you select an instance specification with a large memory size (at least 8 GB) and store all the metadata of master nodes in the memory. |
Doris | Default use case | Core | Core node: We recommend you select an instance specification with a large memory size (at least 8 GB) and use cloud SSD for better IO performance and stability. |
Doris | Default use case | Router | Router node: The frontend module is deployed here for high read/write availability. Therefore, we recommend you select a model with a large memory size, preferably not less than that of master nodes. |
Druid | Default use case | Master | Master node: We recommend you select an instance specification with a large memory size (at least 16 GB) and use SSD disks for better IO performance. |
Druid | Default use case | Core | Core node: We recommend you select an instance specification with a large memory size (at least 8 GB) and use cloud SSD for better IO performance and stability. |
Druid | Default use case | Task | |
Druid | Default use case | Common | Common node: It is mainly used as ZooKeeper nodes. We recommend you select the specification of 2 cores, 4 GB memory, and 100 GB cloud disk capacity to meet the requirements. |
Druid | Default use case | Router | Router node: It is mainly used to relieve the load of master nodes and as a task submitter. Therefore, we recommend you select a model with a large memory size, preferably not lower than the specification of master nodes. |
StarRocks | Default use case | Master | Master node: We recommend you select an instance specification with a large memory size (at least 8 GB) and store all the metadata of master nodes in the memory. |
StarRocks | Default use case | Core | Core node: We recommend you select an instance specification with a large memory size (at least 8 GB) and use cloud SSD for better IO performance and stability. |
StarRocks | Default use case | Router | Router node: The frontend module is deployed here for high read/write availability. Therefore, we recommend you select a model with a large memory size, preferably not less than that of master nodes. |
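
The numeric minimums in the table above can be checked before you submit a cluster configuration. The following is a minimal sketch in Python, not an EMR API: `NodeSpec`, `MINIMUMS`, and `shortfalls` are hypothetical names introduced for illustration, and only a few rows of the table are transcribed. A value of 0 means the table states no minimum for that field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Hypothetical container for a node specification (not an EMR type)."""
    cores: int
    memory_gb: int
    disk_gb: int = 0


# A few minimums transcribed from the table, keyed by (cluster type, node type).
# 0 means the table gives no minimum for that field.
MINIMUMS = {
    ("Hadoop", "Master"): NodeSpec(cores=0, memory_gb=8),
    ("Hadoop", "Common"): NodeSpec(cores=2, memory_gb=4, disk_gb=100),
    ("ClickHouse", "Common"): NodeSpec(cores=4, memory_gb=16),
    ("Kafka", "Common"): NodeSpec(cores=4, memory_gb=16),
    ("Druid", "Master"): NodeSpec(cores=0, memory_gb=16),
}


def shortfalls(cluster_type, node_type, spec):
    """Return a list of fields where spec is below the transcribed minimum."""
    minimum = MINIMUMS.get((cluster_type, node_type))
    if minimum is None:
        # No minimum transcribed for this combination; nothing to check.
        return []
    problems = []
    for field in ("cores", "memory_gb", "disk_gb"):
        have, need = getattr(spec, field), getattr(minimum, field)
        if have < need:
            problems.append(f"{field}: {have} < {need}")
    return problems


if __name__ == "__main__":
    # A common node with too small a disk for the Hadoop minimum.
    print(shortfalls("Hadoop", "Common", NodeSpec(cores=2, memory_gb=4, disk_gb=50)))
```

An empty list means the specification meets every minimum that was transcribed. It does not mean the specification is suitable for your workload.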
Note:
- Different cluster types have different requirements for the node specification. Currently, the system automatically recommends the configuration that meets the cluster's requirements by default. You can adjust the model specification based on your business needs, and the recommended model is for reference only.
- Core nodes cannot be elastically scaled. If your architecture does not use COS, core nodes are responsible for processing cluster compute and storage tasks, and three-replica backup is enabled by default. When estimating the data disk capacity, you need to consider the capacity for storing three replicas. In this case, the Big Data model is recommended.
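
The three-replica estimate above comes down to simple arithmetic. The following Python sketch illustrates it under stated assumptions: the function name, the even spread of data across core nodes, and the optional `headroom_percent` margin are illustrative and do not come from EMR.

```python
def disk_gb_per_core_node(data_gb, core_nodes, replicas=3, headroom_percent=0):
    """Estimate the data disk capacity (GB) each core node needs.

    Assumes data is spread evenly across core nodes. headroom_percent is an
    optional safety margin added by this sketch, not an EMR setting.
    """
    if core_nodes <= 0:
        raise ValueError("core_nodes must be positive")
    # Total stored volume = raw data x replica count x (1 + headroom).
    total_scaled = data_gb * replicas * (100 + headroom_percent)
    divisor = 100 * core_nodes
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-total_scaled // divisor)


if __name__ == "__main__":
    # 10,000 GB of raw data on 5 core nodes with the default three replicas.
    print(disk_gb_per_core_node(10_000, 5))
    # The same data with a 20% margin for growth.
    print(disk_gb_per_core_node(10_000, 5, headroom_percent=20))
```

With 10,000 GB of raw data and five core nodes, three replicas occupy 30,000 GB in total, or 6,000 GB per node before any margin is added.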
To ensure network security, the EMR cluster is placed in a VPC, and a security group policy is added to the VPC. In addition, to allow easy access to the Hadoop WebUI, a public IP is enabled for one of the master nodes, and that node is billed by traffic. A public IP is not enabled for router nodes by default. However, you can bind an EIP to a router node in the CVM console to give it a public IP.
Note:
- A public IP is enabled for master nodes when a cluster is created. You can disable it as needed.
- Enabling a public IP for master nodes is mainly for SSH login and component WebUI access.
- Master nodes with a public IP enabled are billed by traffic, with a bandwidth of up to 5 Mbps. You can adjust the network settings in the console after creating a cluster.