In exclusive mode, a Sparkling cluster is the core component of the Sparkling Data Warehouse Suite and consists of master nodes and worker nodes. Worker nodes are divided into core nodes and elastic compute nodes.
A core node is the basic storage and computation unit of a cluster. Each cluster has at least one core node. A core node includes both the storage engine and the computation engine. Core nodes can be scaled up manually or automatically, but they cannot be scaled down. As core nodes are added, cluster capacity and performance can increase linearly.
Sparkling clusters separate storage from computation, which makes it easier to schedule different types of computing workloads. An elastic compute node includes only the computation engine and can be scaled up or down, manually or automatically, as needed.
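The scaling rules for core nodes and elastic compute nodes can be illustrated with a small model. The following Python sketch is only an illustration of the constraints described above, not part of any Sparkling API or SDK. The class name `ClusterModel` and its methods are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class ClusterModel:
    """Toy model of the worker-node layout (hypothetical, not a Sparkling API)."""
    core_nodes: int = 1      # storage + computation engines; at least one per cluster
    elastic_nodes: int = 0   # computation engine only

    def __post_init__(self) -> None:
        if self.core_nodes < 1:
            raise ValueError("a cluster needs at least one core node")
        if self.elastic_nodes < 0:
            raise ValueError("elastic node count cannot be negative")

    def resize_core(self, target: int) -> None:
        # Core nodes can be scaled up but not down.
        if target < self.core_nodes:
            raise ValueError("core nodes can be scaled up but not down")
        self.core_nodes = target

    def resize_elastic(self, target: int) -> None:
        # Elastic compute nodes can be scaled in either direction.
        if target < 0:
            raise ValueError("elastic node count cannot be negative")
        self.elastic_nodes = target


cluster = ClusterModel(core_nodes=2)
cluster.resize_core(4)       # allowed: scale up
cluster.resize_elastic(8)    # allowed: scale up for a heavy workload
cluster.resize_elastic(0)    # allowed: release compute when idle
```

In this model, a call such as `cluster.resize_core(1)` raises `ValueError`, which mirrors the rule that core nodes cannot be scaled down.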
Sparkling includes two core components: the Data Studio data engineering and data science platform, and the Sparkling data warehouse. Data Studio is the visual console for Sparkling users. It supports a wide range of data operations, including integration, management, development, analysis, visualization, modeling, governance, ETL, computation, and processing, as well as cluster management and task scheduling.
Apache Spark, a top-level project of the Apache Software Foundation, is a big data processing framework built around speed, ease of use, and sophisticated analytics. It has become one of the most widely used distributed big data processing frameworks.