Sparkling Data Warehouse Suite

An easy-to-use, fully managed, high-performance and highly elastic distributed cloud data warehousing suite for petabyte-scale data

Overview

Tencent Sparkling Data Warehouse Suite provides you with a fully managed, easy-to-use and high-performance petabyte-level cloud data warehousing solution. Based on the Apache Spark framework, Sparkling enables you to create an enterprise-grade distributed cloud data warehouse with thousands of nodes in a few minutes and scale it flexibly as needed. Sparkling features Data Studio, a one-stop big data development and data science platform for cluster management, data integration, metadata management, workflow development, data processing and result visualization. It integrates deeply with Business Intelligence for application data mart construction, offline processing of massive amounts of data, data modeling, ad hoc query analysis, data mining and visual exploration. In addition, its cross-data-source joint analysis feature lets you analyze data held in sources such as COS and CDB, so that you can focus on mining and exploring the value of your data.

Benefits

Elastic Scaling

Sparkling scales elastically. Compute and storage are separated, and the worker nodes of a cluster are divided into core nodes and elastic compute nodes. Using the Tencent Cloud Console or cloud API, you can quickly scale out to a large number of nodes, manually or automatically, and scale compute and storage resources up or down. Elastic compute nodes can also be scaled in automatically as business load changes.
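The split between core nodes and elastic compute nodes can be illustrated with a minimal sketch. It is not Sparkling's actual scaling policy: the function names, the `tasks_per_node` capacity figure and the `max_elastic` cap are all assumptions made for illustration. The idea shown is only that core nodes stay fixed while the elastic pool grows or shrinks with load.

```python
import math

def target_elastic_nodes(core_nodes, pending_tasks, tasks_per_node=4, max_elastic=100):
    """Return how many elastic compute nodes a hypothetical policy would want.

    Core nodes absorb load first; only the overflow is sent to elastic nodes.
    All parameters are illustrative assumptions, not Sparkling settings.
    """
    core_capacity = core_nodes * tasks_per_node
    overflow = max(0, pending_tasks - core_capacity)
    return min(math.ceil(overflow / tasks_per_node), max_elastic)

def scaling_delta(current_elastic, target_elastic):
    """Positive result means scale out, negative means scale in."""
    return target_elastic - current_elastic

# Busy period: 3 core nodes cover 12 tasks, so 8 tasks overflow to 2 elastic nodes.
busy_target = target_elastic_nodes(core_nodes=3, pending_tasks=20)
# Quiet period: the core nodes are enough, so the elastic pool can shrink to zero.
quiet_target = target_elastic_nodes(core_nodes=3, pending_tasks=10)

print(scaling_delta(0, busy_target))             # scale-out step
print(scaling_delta(busy_target, quiet_target))  # scale-in step
```

In this sketch only the elastic pool is ever resized, which mirrors the description above of automated scale-in applying to elastic compute nodes.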

Ease of Use

Sparkling includes Data Studio, a one-stop data engineering and data science platform that provides visual tools for cluster management and monitoring, data integration, metadata management, ETL, data processing and computation, data analysis and visualization, and workflow task management and collaboration. This eliminates cumbersome operations and parameter tuning work for the underlying infrastructure and data warehouse kernel. Sparkling is fully compatible with the ANSI SQL:2003 standard, so you can build enterprise-class data warehouses using standard SQL.
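As an illustration of what standard SQL at the 2003 level looks like, the sketch below runs a window-function query, a feature introduced in SQL:2003. It uses Python's built-in `sqlite3` module purely as a stand-in engine, not Sparkling, and assumes a SQLite build of version 3.25 or later (the first with window functions). The `sales` table and its rows are made up for the example.

```python
import sqlite3

# In-memory stand-in database; the table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, month TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("north", "2024-01", 100),
        ("north", "2024-02", 150),
        ("south", "2024-01", 80),
        ("south", "2024-02", 120),
    ],
)

# Running total per region, written with a SQL:2003 window function.
QUERY = """
SELECT region, month, amount,
       SUM(amount) OVER (PARTITION BY region ORDER BY month) AS running_total
FROM sales
ORDER BY region, month
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Because the query uses only standard syntax, the same statement should carry over to any engine that implements this part of the standard.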

Seamless Integration

Sparkling can extend its storage to COS cloud storage to achieve unlimited storage capacity. It supports high-speed data import from a wide variety of tools and data sources, such as traditional relational databases, CKafka and key-value databases, enabling unified analysis of multi-source cloud data.

Excellent Performance

Based on the Apache Spark ecosystem and leveraging technologies such as distributed multi-level caching, index optimization, off-heap memory management, high-performance columnar storage and cost-based optimization (CBO), Sparkling supports high-performance parallel data loading and access as well as multi-dimensional data exploration, with batch processing efficiency several times higher than that of traditional databases.

Security and Reliability

Sparkling stores three replicas of your data and uses master-slave nodes, enabling transparent failover and disaster recovery backup. User clusters are deployed separately and support VPC isolation, offering multiple layers of data access security. User actions are logged for auditing purposes to protect the security of your data.

Serverless Architecture

To enable ad hoc analysis and pay-as-you-go pricing, Sparkling is designed to be serverless and can be used out of the box with zero deployment and operations costs. You do not have to purchase or manage clusters; instead, you simply pay for what you use. This pricing model is cost-effective while still guaranteeing performance and security.

Features

Sparkling provides a fast, fully managed petabyte-level data warehousing solution that enables you to analyze and process massive amounts of data economically and efficiently.

Cluster Management

The exclusive usage mode of Sparkling provides cluster management and monitoring modules that support cluster creation, automated scaling, cluster configuration, start/stop, and intelligent resource monitoring and alarming. Day-to-day operations and cluster performance tuning can be performed through the cluster management function.

Scenarios

Global Data Asset Management

Sparkling meets the needs of industries such as gaming, finance, retail and industrial engineering by providing a tool to centrally manage and analyze management and business data covering user behavior, staffing, procurement, sales, assets and the supply chain. This produces a comprehensive view of global data that helps you understand overall operational conditions and make rapid and accurate decisions.

Analysis of Large Volumes of Logs and Targeted Marketing

Featuring a log standardization and normalization mechanism, Sparkling enables you to conveniently analyze petabytes of structured or semi-structured data such as user behavior and system logs, generate cookie-based consumer profiles and personalize recommendations to users. This significantly improves the efficiency of targeted marketing. Moreover, it supports real-time data access and in-depth integration with COS.

Efficient Data-based Decision-making

With its easy-to-use machine learning framework, interactive collaborative programming environment, and real-time data query and analysis capabilities, Sparkling provides data scientists with powerful tools for data modeling. It also enables business managers to refine corporate operations and gain deeper business insight.

Developer Resources