Hue is an open-source web UI for Apache Hadoop that evolved from Cloudera Desktop. Cloudera eventually contributed it to the Hadoop project of the Apache Software Foundation. Hue is built on Django, a Python web framework.
With Hue, you can work with a Hadoop cluster from a web-based console in your browser. For example, you can manipulate HDFS data, run MapReduce jobs, execute Hive SQL statements, and browse HBase databases.
To use the Hue component to manage workflows, first log in to the Hue Console.
Because the default component account at startup in EMR is Hadoop, create a Hadoop account after you log in to the Hue Console with the root account for the first time. Submit all subsequent jobs with the Hadoop account.
Hue's Beeswax app provides user-friendly and convenient Hive query capabilities, enabling you to select different Hive databases, write HQL statements, submit query tasks, and view results with ease.
You can use HBase Browser to query, modify, and display data from tables in an HBase cluster.
Hue's web UI makes it easy to view files and folders in HDFS and perform operations such as creation, download, upload, copy, modification, and deletion.
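For comparison, the same kinds of file operations are also exposed by Hadoop's WebHDFS REST API. The sketch below only builds the HTTP method and URL for a few operations and sends nothing. The NameNode address, port, and user name are placeholder assumptions, not values from this document, and `webhdfs_request` is a hypothetical helper written for illustration.

```python
from urllib.parse import quote, urlencode

# HTTP method used by each WebHDFS operation covered in this sketch.
WEBHDFS_METHODS = {
    "LISTSTATUS": "GET",   # list a directory
    "OPEN": "GET",         # read (download) a file
    "MKDIRS": "PUT",       # create a directory
    "RENAME": "PUT",       # move or rename a path
    "DELETE": "DELETE",    # delete a file or directory
}


def webhdfs_request(base_url, path, op, user, **params):
    """Return (http_method, url) for one WebHDFS operation; nothing is sent."""
    op = op.upper()
    if op not in WEBHDFS_METHODS:
        raise ValueError(f"unsupported operation: {op}")
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    # `user.name` identifies the caller on clusters without Kerberos.
    query = urlencode({"op": op, "user.name": user, **params})
    url = f"{base_url.rstrip('/')}/webhdfs/v1{quote(path)}?{query}"
    return WEBHDFS_METHODS[op], url


if __name__ == "__main__":
    # Hypothetical NameNode address and user; replace with your cluster's values.
    base = "http://namenode.example.com:50070"
    print(webhdfs_request(base, "/user/hadoop", "LISTSTATUS", "hadoop"))
    print(webhdfs_request(base, "/user/hadoop/tmp", "DELETE", "hadoop", recursive="true"))
```

Building the request separately from sending it makes it easy to check what would be called before touching real data.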
    create database if not exists hive_sample;
    show databases;
    use hive_sample;
    show tables;
    create table if not exists hive_sample (a int, b string);
    show tables;
    insert into hive_sample select 1, "a";
    select * from hive_sample;

Save the content above as a file named hive_sample.sql. The Hive workflow also requires a hive-site.xml configuration file, which can be found on the cluster node where the Hive component is installed. Upload the Hive script file and hive-site.xml to a directory in HDFS, such as
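As a rough sketch of the preparation step, the snippet below writes the sample statements to hive_sample.sql and builds the `hdfs dfs -put` commands that would upload the script and hive-site.xml. It only prints the commands and does not run them. The local hive-site.xml path and the HDFS target directory are placeholders chosen for illustration, not values from this document, and `write_script` and `upload_commands` are hypothetical helper names.

```python
from pathlib import Path

# The sample statements from this section, one per line.
HIVE_SAMPLE_SQL = """\
create database if not exists hive_sample;
show databases;
use hive_sample;
show tables;
create table if not exists hive_sample (a int, b string);
show tables;
insert into hive_sample select 1, "a";
select * from hive_sample;
"""


def write_script(directory="."):
    """Write hive_sample.sql into `directory` and return its path."""
    script = Path(directory) / "hive_sample.sql"
    script.write_text(HIVE_SAMPLE_SQL, encoding="utf-8")
    return script


def upload_commands(local_files, hdfs_dir):
    """Build (but do not run) one `hdfs dfs -put` command per local file."""
    # -f overwrites an existing file with the same name in the target directory.
    return [["hdfs", "dfs", "-put", "-f", str(f), hdfs_dir] for f in local_files]


if __name__ == "__main__":
    script = write_script()
    # Both paths below are placeholders, not values from this document.
    hive_site = "/path/to/hive-site.xml"
    for cmd in upload_commands([script, hive_site], "/path/in/hdfs"):
        print(" ".join(cmd))
```

Each command is returned as an argument list, so it can be passed to `subprocess.run` on a node where the `hdfs` client is available.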
This document uses a Hive 1 installation as an example, for which the configuration parameter should be HiveServer1. Errors will be reported if other Hive versions are deployed at the same time, or if the configuration parameters of other Hive versions are used.