Big Data Instances

Last updated: 2024-01-06 17:26:45

What is a big data instance?

Big data instances are designed for scenarios such as Hadoop distributed computing, massive log processing, distributed file systems, and large data warehouses. This CVM instance type mainly addresses the computing and storage needs of massive business data in the cloud.

Which industry customers and business scenarios are big data instances applicable to?

Big data instances suit customers in the Internet, gaming, finance, and other industries that require big data computing, storage, and analysis, as well as business scenarios that require massive data storage and offline computing. They meet the storage, capacity, and private network bandwidth requirements of distributed computing businesses represented by Hadoop. In addition, because distributed computing services such as Hadoop already provide a highly available architecture, big data instances use local storage, which keeps the total cost close to that of a self-built Hadoop cluster in an IDC while still providing massive storage capacity and high performance.

Features

A single instance has a throughput of up to 2.8 GB/s. Local HDDs are ideal for throughput-intensive storage. With stable, high sequential read/write throughput, big data instances are designed for Hadoop distributed computing, massive log processing, large data warehouses, and similar business scenarios.
Local storage has a unit price as low as 1/10 of that of S2 instances, making the total cost close to that of a self-built Hadoop cluster in an IDC while ensuring massive storage capacity and high performance. Big data instances deliver optimal cost-efficiency for big data scenarios.
Read/write latency is as low as 2-5 ms. Big data instances are high-performance models suitable for enterprise developers.
Pay-as-you-go billing is supported.

Specifications

Note:
For more information on instance specifications, see the “Big Data Family” section in Instance Types.

Notes on local data storage

Big data instances use local disks as data disks, which carries a risk of data loss (for example, when the host crashes). If your application cannot guarantee data reliability on its own, we recommend choosing an instance type that can use cloud disks as data disks.
The table below shows the local disk data status after different operations on an instance with local disks.
| Operation | Local Disk Data Status | Description |
| --- | --- | --- |
| Log in to an instance to restart it, restart an instance on the console, or forcibly restart an instance | Retained | Local disk storage as well as the data is retained. |
| Log in to an instance to shut it down, shut down an instance on the console, or forcibly shut down an instance | Retained | Local disk storage as well as the data is retained. |
| Terminate (instance) on the console | Erased | Local disk storage is erased. No data is retained. |
Note:
Do not store business data that needs to be retained for a long time on a local disk. Back up data in advance and use a highly available architecture. We recommend you store the data on a CBS disk for long-term retention.
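As an illustration only, the retention rules above could be encoded as a small lookup that an operations script consults before acting on an instance. This is a minimal sketch: the operation labels (`restart`, `shutdown`, `terminate`) and the helper name `requires_backup_first` are hypothetical and are not CVM API action names.

```python
from enum import Enum


class DiskData(Enum):
    RETAINED = "retained"
    ERASED = "erased"


# Local disk data status per operation, as stated in the table above.
# The keys are illustrative labels, not API action names.
LOCAL_DISK_DATA_AFTER = {
    "restart": DiskData.RETAINED,    # in-guest, console, or forced restart
    "shutdown": DiskData.RETAINED,   # in-guest, console, or forced shutdown
    "terminate": DiskData.ERASED,    # terminate on the console
}


def requires_backup_first(operation: str) -> bool:
    """Return True if the operation erases local disk data."""
    try:
        return LOCAL_DISK_DATA_AFTER[operation] is DiskData.ERASED
    except KeyError:
        raise ValueError(f"unknown operation: {operation!r}") from None


if __name__ == "__main__":
    for op in LOCAL_DISK_DATA_AFTER:
        print(op, "-> back up first" if requires_backup_first(op) else "-> data retained")
```

Note that the table covers planned operations only. A host crash or disk failure can still lose data, so the backup recommendation applies regardless of what this lookup returns.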

How can I purchase a local disk for big data instances?

Local disks can only be purchased when a big data instance is created. The instance specifications determine the number and capacity of local disks you can purchase.

Does the local storage of big data instances support snapshots?

No.

Does a big data instance support configuration adjustment and failover?

No. Big data instances store massive amounts of data on local HDDs used as data disks. To prevent data loss when the host crashes or a local disk is damaged, we recommend a redundancy policy, for example, a file system that supports redundancy and fault tolerance (such as HDFS or MapR-FS). We also recommend regularly backing up data to a persistent storage system such as Tencent Cloud Object Storage (COS). For more information, see Cloud Object Storage. If a local disk is damaged, you need to shut down the CVM instance so that we can replace the disk. If the CVM instance crashes, we will notify you and perform the repair.
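Redundancy at the file system level trades raw capacity for fault tolerance, which is worth estimating when sizing a cluster. The sketch below assumes HDFS-style full replication with a factor of 3 (the HDFS default for `dfs.replication`). The node count, disk count, and disk size in the example are hypothetical and are not taken from any instance specification.

```python
def usable_capacity_tb(nodes: int, disks_per_node: int, disk_tb: float,
                       replication: int = 3) -> float:
    """Approximate usable capacity when every block is stored `replication` times."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    raw_tb = nodes * disks_per_node * disk_tb
    return raw_tb / replication


def tolerated_copy_losses(replication: int) -> int:
    """Number of copies of a block that can be lost while one copy survives."""
    return replication - 1


if __name__ == "__main__":
    # Hypothetical cluster: 10 nodes, 12 local disks each, 4 TB per disk.
    print(usable_capacity_tb(10, 12, 4.0))   # usable capacity in TB with 3 replicas
    print(tolerated_copy_losses(3))          # copies that may be lost per block
```

The estimate ignores file system overhead and space reserved for the operating system or temporary data, so treat the result as an upper bound.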

What is the difference between the Big Data family and High IO I2 instances?

High IO I2 instances feature ultra-high IOPS and are designed for business scenarios that require low latency and high random I/O. They are suitable for high-performance databases (relational databases, NoSQL, etc.). Big data instances are designed for business scenarios that require high sequential read/write throughput and low-cost massive data storage. This family offers high storage cost-efficiency and high private network bandwidth.

What is the disk throughput of big data instances?

The local disks of D2 big data instances provide the following sequential read/write throughput:
For a single disk, sequential read throughput and sequential write throughput are each 220+ MB/s (block size of 128 KB and queue depth of 32).
For the whole CVM instance, throughput can reach up to 2.8 GB/s (block size of 128 KB and queue depth of 32).
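To relate the per-disk and per-instance figures, the arithmetic can be sketched as follows. This is a rough estimate under stated assumptions: throughput is assumed to scale linearly with the number of disks until the instance limit is reached, 1 GB is taken as 1000 MB, and 220 MB/s is used as the per-disk lower bound. The disk counts and dataset sizes passed in are hypothetical.

```python
import math

# Figures quoted above for D2 instances (128 KB blocks, queue depth 32).
PER_DISK_MBPS = 220        # per-disk sequential read or write (lower bound of "220+")
INSTANCE_CAP_MBPS = 2800   # 2.8 GB/s per instance, assuming 1 GB = 1000 MB


def aggregate_throughput_mbps(disk_count: int) -> int:
    """Estimated throughput with all disks driven in parallel, capped per instance."""
    return min(disk_count * PER_DISK_MBPS, INSTANCE_CAP_MBPS)


def disks_to_reach_cap() -> int:
    """Smallest disk count whose combined throughput reaches the instance limit."""
    return math.ceil(INSTANCE_CAP_MBPS / PER_DISK_MBPS)


def scan_seconds(dataset_gb: float, disk_count: int) -> float:
    """Estimated time to read a dataset sequentially across all disks."""
    return dataset_gb * 1000 / aggregate_throughput_mbps(disk_count)


if __name__ == "__main__":
    print(disks_to_reach_cap())           # disks needed to reach the instance limit
    print(aggregate_throughput_mbps(12))  # MB/s with 12 disks
    print(scan_seconds(1000, 12))         # seconds to scan 1000 GB with 12 disks
```

Real workloads rarely sustain the benchmark block size and queue depth, so measured throughput may be lower than these estimates.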

What is the difference between the local disk of big data instances and CBS?

Cloud Block Storage (CBS) is a highly available, highly reliable, low-cost, and customizable block storage service. It can be used as an independent, scalable disk for CVM instances. It stores data at the block level and employs a three-copy distributed mechanism to ensure data reliability for CVM instances, meeting the requirements of different use cases. The local disks of big data instances are designed for business scenarios that require high sequential read/write throughput on massive local data sets, such as Hadoop distributed computing, large-scale parallel computing, and data warehouses.
