Overview

The volume of online data, especially order and transaction data generated in scenarios like online retail and payments, is growing rapidly. Most of this data is rarely accessed or updated after it is written to disk, and it turns into cold data. Cold data takes up storage space intended for the databases of online businesses, resulting in a significant waste of hardware resources and high IT costs for enterprises. By using intelligent archive database tools backed by the massive storage capacity of the OceanBase Database kernel, enterprises can easily archive cold data, release valuable storage resources, and take full control of the data over its entire lifecycle.

Challenges

Rapid Data Growth

Online data, especially order and transaction data generated in scenarios such as online retail and payments, is growing fast in volume, and most of it is rarely accessed or updated again some time after it is written to disk.

Low Efficiency, High Costs

Cold data takes up solid-state storage space intended for the databases of online businesses, resulting in a serious waste of hardware resources and high IT costs for enterprises. The low data storage efficiency of online databases drags down query efficiency and makes it hard to transform and scale up IT systems.

Risky Conventional Solutions

In most conventional solutions, developers or database administrators archive data by using scripts or simple synchronization tools. However, the concurrency and efficiency of the archiving tasks are hard to control. This affects the performance of online databases and can even cause accidental deletion of production data.

Complex O&M Management

The applicable archiving interval and restrictions vary based on the databases or tables of different business modules. Maintaining the execution logic of a large number of scheduled tasks requires a lot of time and effort.

Architecture

The archive database platform based on OceanBase Database can be built on commodity hardware. The platform allows users to configure archiving tasks on graphical pages and supports automatic archiving interval management. Users can set up automatic canary execution of data migration, verification, and deletion tasks with a few clicks. Various features, such as Out-of-Memory (OOM) prevention, intelligent throttling, and multi-granularity traffic management, are provided to ensure stability. This architecture achieves truly intelligent O&M management of data archiving tasks, as verified in core business scenarios within Ant Group. A single archive database for payment transactions stores more than 6 PB of data on hundreds of cost-effective large-capacity mechanical disks, with disk usage automatically balanced. These disks have been running smoothly for years, leading to huge savings in machine resource investment.
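The migrate, verify, then delete ordering described above can be illustrated with a minimal sketch. It is not the platform's implementation: SQLite connections stand in for the online and archive databases, and the `orders` table, its columns, and the `archive_batch` helper are hypothetical names chosen for illustration.

```python
import sqlite3


def archive_batch(src, dst, cutoff, batch_size=500):
    """Copy one batch of rows older than `cutoff`, verify it, then delete it from the source.

    `src` and `dst` are sqlite3 connections standing in for the online and
    archive databases (an assumption for this sketch). Returns the number of
    rows archived, or 0 when nothing is left to archive.
    """
    rows = src.execute(
        "SELECT id, created_at, payload FROM orders "
        "WHERE created_at < ? ORDER BY id LIMIT ?",
        (cutoff, batch_size),
    ).fetchall()
    if not rows:
        return 0

    # 1. Migrate: write the batch to the archive. INSERT OR REPLACE keeps a retry idempotent.
    dst.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    dst.commit()

    # 2. Verify: read the rows back from the archive and compare with the source copy.
    ids = [row[0] for row in rows]
    marks = ",".join("?" * len(ids))
    archived = dst.execute(
        f"SELECT id, created_at, payload FROM orders WHERE id IN ({marks}) ORDER BY id",
        ids,
    ).fetchall()
    if archived != rows:
        raise RuntimeError("verification failed; source rows left untouched")

    # 3. Delete: only verified rows are removed from the online database.
    src.execute(f"DELETE FROM orders WHERE id IN ({marks})", ids)
    src.commit()
    return len(rows)
```

Calling `archive_batch` in a loop until it returns 0 drains all rows older than the cutoff. Because deletion happens only after verification succeeds, an interrupted run can simply be repeated without losing source data.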

Benefits

Visual Management

Users can create and run a task, check the task progress, suspend and resume a task, and perform other basic operations by using a GUI provided by OceanBase Database.

Intelligent O&M Management

OceanBase Database uses the token bucket algorithm for throttling and provides value-added features, such as resumable data transmission and automatic task scheduling. OceanBase Database is also capable of self-healing. For example, it can automatically replace failed nodes, scale resources, and prevent OOM issues. O&M tasks are executed without human intervention.
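For readers unfamiliar with token bucket throttling, the following is a minimal generic sketch of the algorithm. It does not reflect OceanBase Database internals; the `TokenBucket` class, its parameters, and the injectable `clock` are assumptions made for illustration.

```python
import time


class TokenBucket:
    """Generic token bucket: `rate` tokens refill per second, up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.clock = clock        # injectable time source, useful for testing
        self.tokens = capacity    # start with a full bucket
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last refill, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self, n=1):
        """Take `n` tokens if available; return False so the caller can back off."""
        self._refill()
        if n <= self.tokens:
            self.tokens -= n
            return True
        return False
```

An archiving worker could call `try_acquire(len(batch))` before each batch and sleep briefly when it returns False. This caps the sustained row rate at `rate` while still allowing short bursts of up to `capacity` rows.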

Cost-effectiveness

OceanBase Database can use large-capacity SATA disks and compresses data at a high compression ratio for compact storage. A single node can store up to 400 TB of data, a large capacity by the standards of conventional database archiving solutions.

Large Storage Capacity

OceanBase Database allows users to streamline their online business systems and reduce the costs of data archiving. An archive database cluster of OceanBase Database can serve as a large-capacity relational database to support tasks that require querying large amounts of cold data, such as data monitoring, logging, auditing, and verification tasks.