Failed Operations on EMR Master Node Due to Low Configuration

Last updated: 2021-09-23 09:26:33

How do I fix failed operations on the EMR master node due to low configuration?

Symptoms

Because the master node's specification is too low, Hive or Spark jobs submitted to it fail with errors or are killed outright.

Cause analysis

The master node does not have enough memory, so applications running on it are killed when the node runs out of memory (OOM).
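
One way to check for this cause is to look for OOM-killer events in the kernel log of the master node. The sketch below assumes the processes are killed by the Linux kernel OOM killer (the source does not say whether the kill comes from the kernel or from a JVM error), and `find_oom_kills` is a hypothetical helper name.

```shell
# Filter kernel log lines on stdin down to OOM-killer events.
# Assumption: the kills come from the Linux kernel, which logs
# "Out of memory" and "oom-killer" messages when it kills a process.
find_oom_kills() {
  grep -iE 'out of memory|oom-killer'
}
```

On the master node, the helper could be used as `dmesg | find_oom_kills`; reading the kernel log may require root privileges.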

Solution

Too many services are deployed on the EMR master node, so it often becomes the bottleneck of the entire cluster. The master node cannot be scaled out; it can only be upgraded to a higher specification, as described below:
- First, find the node where the standby NameNode resides in the cluster. In the commands below, replace 10.0.0.9 with the IP address of your standby node.
  - Run the following command on the standby NameNode to enter safe mode.
```
# Enter safe mode
hdfs dfsadmin -fs 10.0.0.9:4007 -safemode enter
```
  - Run the following command on the standby NameNode to save the metadata.
```
# Save the metadata
hdfs dfsadmin -fs 10.0.0.9:4007 -saveNamespace
```
  - Run the following command on the standby NameNode to exit safe mode.
```
# Exit safe mode
hdfs dfsadmin -fs 10.0.0.9:4007 -safemode leave
```
- Then, in the EMR Console (or the CVM Console for a legacy cluster), upgrade the active node.
- Upgrade the standby node so that the active and standby master nodes have the same configuration.
  If your cluster is not a high-availability cluster, it will be unavailable for a while during the upgrade.
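
The three safe mode commands above can be chained in a small helper so that the NameNode is not left in safe mode if saving the metadata fails. This is a minimal sketch, not an official procedure: `checkpoint_standby` and the `HDFS` variable are hypothetical names, and the address format follows the commands above.

```shell
# Command used to call HDFS; override it (for example with "echo")
# to preview the commands without running them.
HDFS="${HDFS:-hdfs}"

# Enter safe mode, save the metadata, then leave safe mode on the
# standby NameNode at the given address (IP:port).
checkpoint_standby() {
  addr="$1"
  "$HDFS" dfsadmin -fs "$addr" -safemode enter || return 1
  "$HDFS" dfsadmin -fs "$addr" -saveNamespace
  save_status=$?
  # Always try to leave safe mode, even if the save failed.
  "$HDFS" dfsadmin -fs "$addr" -safemode leave || return 1
  return "$save_status"
}
```

On the standby NameNode it would be called as `checkpoint_standby 10.0.0.9:4007`, with the IP address replaced by your own. Setting `HDFS=echo` first prints the commands instead of executing them.
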
Spark submits jobs in client mode by default, so the driver runs on the master node. To move the driver off the master node, submit jobs in cluster mode instead.
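
In Spark on YARN, the mode in which the driver runs inside the cluster rather than on the submitting node is selected with `--deploy-mode cluster`. The wrapper below is a sketch under that assumption; `submit_in_cluster_mode` and the `SPARK_SUBMIT` variable are hypothetical names, and the class and JAR in the usage note are placeholders.

```shell
# Command used to submit jobs; override it (for example with "echo")
# to preview the command line without submitting anything.
SPARK_SUBMIT="${SPARK_SUBMIT:-spark-submit}"

# Submit a job to YARN with the driver running inside the cluster.
# All arguments are passed through to spark-submit unchanged.
submit_in_cluster_mode() {
  "$SPARK_SUBMIT" --master yarn --deploy-mode cluster "$@"
}
```

A call might look like `submit_in_cluster_mode --class com.example.MyJob /path/to/my-job.jar`. The same default can also be set once with the `spark.submit.deployMode` property in spark-defaults.conf, if that suits your cluster better than a per-job flag.
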
For Hive, enable a router node, migrate HiveServer2 to it, and then disable the Hive component on the master node. For detailed directions, see Migrating HiveServer2 to Router.
Disable components that are rarely used on the master node, or migrate Hue to the router node.
Directions for migrating Hue to the router node:
- In the EMR Console, add a router node on the Cloud Hardware Management page and select the Hue component.
- After the scale-out, disable the original Hue component on the master node, keep the one on the router node, bind a public EIP to the router node, and allow the required source addresses and ports in the security group.

Preset values of memory size for master node components in EMR cluster and recommendations

List of heap memories of common components

| Component | Process | Configuration File | Configuration Item | Default Heap Memory (MB) |
| --- | --- | --- | --- | --- |
| HDFS | NameNode | hadoop-env.sh | NNHeapsize | 4,096 |
| YARN | ResourceManager | yarn-env.sh | Heapsize | 2,000 |
| Hive | HiveServer2 | hive-env.sh | HS2Heapsize | 4,096 |
| HBase | HMaster | hbase-env.sh | Heapsize | 1,024 |
| Presto | Coordinator | jvm.config | Maximum JVM | 3,072 |
| Spark | spark-driver | spark-defaults.conf | spark.driver.memory | 1,024 |
| Oozie | Oozie | - | - | 1,024 |
| Storm | Nimbus | - | - | 1,024 |

Suggested preset values for components

| Component | Suggested Heap Memory Size |
| --- | --- |
| HDFS (NameNode) | Minimum heap memory = 250 × number of files + 290 × number of directories + 368 × number of blocks |
| YARN (ResourceManager) | It can be increased as needed |
| Hive (HiveServer2) | It can be increased as needed |
| HBase (HMaster) | The master node only receives DDL requests and performs load balancing. The default size of 1 GB is generally sufficient |
| Presto (Coordinator) | Use the default value |
| Spark (spark-driver) | It can be increased as needed |
| Oozie (oozie) | Use the default value |
| Storm (Nimbus) | Use the default value |
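
The NameNode formula above can be worked through with shell arithmetic. This sketch assumes the coefficients are bytes per object, so the result is in bytes (the table does not state the unit), and the namespace counts are hypothetical; `nn_min_heap_bytes` and `bytes_to_mb` are illustrative helper names.

```shell
# Minimum NameNode heap from the formula above.
# Arguments: number of files, number of directories, number of blocks.
# Assumption: the coefficients are bytes per object, so the result is bytes.
nn_min_heap_bytes() {
  echo $(( 250 * $1 + 290 * $2 + 368 * $3 ))
}

# Convert bytes to MB (1 MB = 1024 * 1024 bytes), rounding up.
bytes_to_mb() {
  echo $(( ($1 + 1048575) / 1048576 ))
}

# Hypothetical namespace: 10 million files, 1 million directories,
# 12 million blocks.
bytes=$(nn_min_heap_bytes 10000000 1000000 12000000)
echo "$(bytes_to_mb "$bytes") MB"   # prints "6873 MB"
```

Under these assumptions the example namespace needs more than the 4,096 MB default, which suggests raising NNHeapsize. The file, directory, and block counts for a real cluster would have to come from your own HDFS statistics.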

Suggested idle memory size for servers: 10–20% of the total memory size.
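
As a rough illustration of the idle memory suggestion, the sketch below computes how much memory is left for component heaps after reserving a share of the total, and compares it with the sum of the default heap sizes listed above. The 32 GB master node is a hypothetical example, `heap_budget_mb` is an illustrative helper name, and the comparison assumes every listed component is deployed on the master node.

```shell
# Memory (in MB) available for component heaps after reserving a
# percentage of total memory as idle headroom.
# Arguments: total memory in MB, reserve percentage (10 to 20 suggested).
heap_budget_mb() {
  echo $(( $1 * (100 - $2) / 100 ))
}

# Sum of the default heap sizes from the table above (in MB):
# NameNode, ResourceManager, HiveServer2, HMaster, Presto coordinator,
# Spark driver, Oozie, Nimbus.
default_heaps_mb=$(( 4096 + 2000 + 4096 + 1024 + 3072 + 1024 + 1024 + 1024 ))

# Hypothetical master node with 32 GB (32768 MB) of memory, 20% reserved.
budget=$(heap_budget_mb 32768 20)
if [ "$default_heaps_mb" -gt "$budget" ]; then
  echo "default heaps (${default_heaps_mb} MB) exceed budget (${budget} MB)"
else
  echo "default heaps (${default_heaps_mb} MB) fit within budget (${budget} MB)"
fi
```

Heap size is only part of each process's memory footprint, so this comparison is a lower bound rather than a guarantee.
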
You can deploy EMR components in independent mode or hybrid mode as needed.
- Independent deployment: it is suitable for HDFS clusters for storage, HBase clusters for analysis of massive amounts of data, and Spark clusters for job computation.
- Hybrid deployment: multiple components can be deployed in a cluster in this mode, which is suitable for testing clusters or scenarios where the business volume is not high or resource preemption is negligible.
