Hadoop Best Practices

Last updated: 2020-03-25 10:46:47

Hadoop consists of the distributed file system HDFS, the resource scheduling framework YARN, and the iterative computing framework MR (MapReduce). Tencent Cloud's Hadoop version is integrated with COS, so you can access COS by running the hadoop fs command line and thereby separate compute from storage. Below are some best practices:

  1. HDFS
    In both high-availability (HA) and non-HA clusters, do not format the namenode; otherwise, your data will be lost permanently. Tencent Cloud shall not be responsible under any circumstances for any loss of data caused by formatting the namenode.
  2. YARN
    The fair scheduler is enabled by default, and you can change the scheduler based on your actual needs.
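
As an illustration of accessing COS through the hadoop fs command line, here is a minimal Python sketch that builds and optionally runs a listing command. It assumes COS paths are addressed with the cosn:// URI scheme, which this page does not state. The bucket name and the helper functions build_cos_ls_command and list_cos_path are hypothetical names introduced for the example.

```python
import shutil
import subprocess


def build_cos_ls_command(bucket, path=""):
    """Build the argument list for listing a COS path through `hadoop fs`."""
    # Assumption: COS is exposed to Hadoop under the cosn:// scheme.
    uri = f"cosn://{bucket}/{path.lstrip('/')}"
    return ["hadoop", "fs", "-ls", uri]


def list_cos_path(bucket, path=""):
    """Run the listing if the hadoop CLI is on PATH; otherwise return None."""
    if shutil.which("hadoop") is None:
        return None
    cmd = build_cos_ls_command(bucket, path)
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    # "examplebucket-1250000000" is a hypothetical bucket name.
    print(" ".join(build_cos_ls_command("examplebucket-1250000000", "/data/")))
```

Building the command as an argument list rather than a single string avoids shell quoting problems, and only a read-only listing is issued, so the sketch never modifies data in HDFS or COS.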
