Simple SQL Data Analysis

Last updated: 2019-04-23 17:33:36

PDF

Operation Scenario

This document describes how to use the Sparkling Notebook feature to implement simple SQL data queries and visual data analysis. For more information about data development, see Data Development.

Prerequisites

Before performing data analysis, please make sure that you have created a Sparkling cluster as instructed in Creating a Cluster and imported data into the cluster as instructed in Data Import.

Directions

Go to Cluster Management and click Workspace in the left pane to enter the data development page.

1. Creating a Notebook

Click + in the upper left corner of the workspace and select Create a notebook to create a notebook.

2. Finding a Database and Data table

a. After entering the following command on the command line, press Shift + Enter or click "Run" in the upper right corner to run the command line to list the names of databases contained in the current cluster.

show databases


b. Enter the following command to go to the default database.

use default


c. Enter the following command to list the names of data tables contained in the default database. As you can see, the previously imported data table "new_table" already exists in the Sparkling cluster.

show tables

3. Running a Simple SQL Statement

Run the following command to view the data information in new_table, where the pt column is a list of timestamps added when the data is imported into the Sparkling cluster. The default value is 00:00 the day before the data import date.

select * from new_table

4. Analyzing Data Visually

Run the following command to get the number of retrieved rows grouped by "enabled" and create a pie chart of the results as seen in the figure below:

select enabled,count(1) from new_table group by enabled


Run the following command to get the number of retrieved rows grouped by "type" and create a histogram as seen in the figure below. You can click Settings there to set the values of keys, groups, and values.

select type,count(1) from new_table group by type