Glossary

Last updated: 2019-04-24 10:39:38

Sparkling Cluster

In exclusive mode, a Sparkling cluster is the core component of the Sparkling Data Warehouse Suite. It consists of master nodes and worker nodes. Worker nodes are divided into core nodes and elastic compute nodes.

Core Node

A core node is the basic storage and computation unit of a cluster, and each cluster has at least one. A core node includes both the storage engine and the computation engine. Core nodes can be scaled up manually or automatically, but they cannot be scaled down. Cluster capacity and performance can increase linearly with the number of core nodes.

Elastic Compute Node

Sparkling clusters support the separation of storage and computation, which makes it easier to schedule different types of computing workloads. An elastic compute node includes only the computation engine and can be scaled up or down, manually or automatically, as needed.
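
The scaling rules for the two worker node types can be illustrated with a minimal sketch. This is not a Sparkling API: the class `ClusterModel` and its methods `resize_core` and `resize_elastic` are hypothetical names introduced only to show the constraints described in this glossary (at least one core node, core nodes scale up only, elastic compute nodes scale in both directions).

```python
from dataclasses import dataclass


@dataclass
class ClusterModel:
    """Hypothetical model of a cluster's worker nodes (not a Sparkling API)."""

    core_nodes: int = 1     # every cluster has at least one core node
    elastic_nodes: int = 0  # compute-only nodes, optional

    def __post_init__(self) -> None:
        if self.core_nodes < 1:
            raise ValueError("a cluster needs at least one core node")
        if self.elastic_nodes < 0:
            raise ValueError("elastic node count cannot be negative")

    def resize_core(self, target: int) -> None:
        # Per the glossary, core nodes can be scaled up but not down.
        if target < self.core_nodes:
            raise ValueError("core nodes can be scaled up but not down")
        self.core_nodes = target

    def resize_elastic(self, target: int) -> None:
        # Elastic compute nodes can be scaled up or down as needed.
        if target < 0:
            raise ValueError("elastic node count cannot be negative")
        self.elastic_nodes = target
```

In this sketch, `ClusterModel(core_nodes=2).resize_core(4)` succeeds, a later `resize_core(3)` raises `ValueError`, and `resize_elastic` accepts any non-negative target, including a lower one.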

Data Studio

Sparkling includes two core components: the Data Studio data engineering and data science platform, and the Sparkling data warehouse. Data Studio is a visual console for Sparkling users. It supports a rich set of data operations, such as integration, management, development, analysis, visualization, modeling, governance, ETL, computation, and processing, as well as cluster management and task scheduling.

Apache Spark

Apache Spark is a top-level project of the Apache Software Foundation. It is a new-generation big data processing framework built around speed, ease of use, and sophisticated analytics, and it has become one of the mainstream distributed big data processing frameworks.