Last updated: 2020-02-05 02:03:55
EMR offers five types of nodes:
|Node Type||Description||HA Quantity||Non-HA Quantity|
|Master||Processes such as NameNode, ResourceManager, and HMaster are deployed here.||2||1|
|Core||Processes such as DataNode, NodeManager, and RegionServer are deployed here.||≥ 3||≥ 2|
|Task||Processes such as NodeManager and PrestoWorker are deployed here.||The number of task nodes can be changed at any time to achieve elastic scalability of the cluster. The minimum value is 0.|
|Common||Distributed coordinator components such as ZooKeeper and JournalNode are deployed here.||≥ 3||0|
|Router||Hadoop packages, including software programs and processes such as Hive, Hue, and Spark, are deployed here.||The number of router nodes can be changed at any time. The minimum value is 0.|
- A master node is a management node that keeps cluster scheduling working properly.
- A core node is a compute and storage node. All your data in HDFS is stored on core nodes. Therefore, to keep your data safe, core nodes cannot be scaled in once they have been scaled out.
- A task node is a pure compute node and does not store any data. The data it processes comes from core nodes or COS. It is therefore often used as an elastic node and can be scaled in or out at any time.
- A router node is used to share the load of a master node or to act as the task submitter of the cluster. It can be scaled in or out at any time.
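
The quantity and scaling rules above can be sketched as a small planning check. This is only an illustration written for this page, not part of any EMR SDK: the names `LIMITS`, `validate_layout`, and `can_resize` are hypothetical. It also assumes that counts given in the table without "≥" (master nodes, and common nodes in non-HA mode) are exact values, and that master and common nodes cannot be resized, since the text describes no scaling for them.

```python
from __future__ import annotations

# Allowed (min, max) node counts, transcribed from the table above.
# max=None means the table states no upper bound.
# Assumption: values listed without ">=" are treated as exact counts.
LIMITS = {
    True: {   # HA cluster
        "master": (2, 2),
        "core": (3, None),
        "task": (0, None),
        "common": (3, None),
        "router": (0, None),
    },
    False: {  # non-HA cluster
        "master": (1, 1),
        "core": (2, None),
        "task": (0, None),
        "common": (0, 0),
        "router": (0, None),
    },
}

RESIZABLE_BOTH_WAYS = {"task", "router"}  # can be scaled in or out at any time
SCALE_OUT_ONLY = {"core"}                 # holds HDFS data, so no scale-in


def validate_layout(counts: dict[str, int], ha: bool) -> list[str]:
    """Return a list of problems; an empty list means the layout fits the table."""
    problems = []
    for node_type, (low, high) in LIMITS[ha].items():
        n = counts.get(node_type, 0)
        if n < low or (high is not None and n > high):
            bound = f"exactly {low}" if high == low else f"at least {low}"
            problems.append(f"{node_type}: got {n}, expected {bound}")
    return problems


def can_resize(node_type: str, current: int, target: int) -> bool:
    """Check a node-count change against the scaling rules in the list above."""
    if target == current:
        return True
    if node_type in RESIZABLE_BOTH_WAYS:
        return target >= 0
    if node_type in SCALE_OUT_ONLY:
        return target > current
    # Assumption: master and common nodes are treated as fixed.
    return False


if __name__ == "__main__":
    print(validate_layout({"master": 2, "core": 3, "common": 3}, ha=True))
    print(validate_layout({"master": 1, "core": 1}, ha=False))
    print(can_resize("core", 3, 2), can_resize("task", 5, 0))
```

A check like this might run before a cluster creation or resize request is submitted, so that a plan which tries to scale in core nodes is rejected early.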