In this tutorial, you will import 1 GiB of TPC-H sample data into OceanBase and compare the tenant's disk usage before and after the import to see OceanBase Database's high compression in practice.
Concepts
OceanBase Database combines an asymmetric read-write architecture with data encoding and storage compression to achieve a high compression ratio, which reduces storage costs.
Asymmetric Read-Write Architecture Storage Compression Technology
By designing read and write blocks asymmetrically, efficient compression algorithms can be applied. Due to the structure of the LSM-Tree, which uses a read-write splitting design and row-level fine-grained record updates, data changes are kept in memory and written to disk in batches. This approach achieves the write performance of in-memory databases with the storage cost of disk databases while eliminating the disk random write bottleneck and storage fragmentation problems of traditional B+Tree structures. This results in higher data write performance and greater data compression opportunities.
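The write path described above can be illustrated with a toy model. The following is a minimal sketch, not OceanBase's implementation: the class name `TinyLSM`, the `memtable_limit` parameter, and the in-memory "runs" standing in for on-disk blocks are all assumptions made for illustration. It shows how updates accumulate in memory and are written out in sorted, immutable batches, which is what gives a compressor large, read-only blocks to work on.

```python
from bisect import bisect_left


class TinyLSM:
    """Toy LSM-style store (hypothetical): in-memory writes, batched sorted flushes."""

    def __init__(self, memtable_limit=3):
        self.memtable_limit = memtable_limit  # flush threshold, chosen arbitrarily
        self.memtable = {}                    # mutable, row-level updates land here
        self.runs = []                        # immutable sorted batches, newest last

    def put(self, key, value):
        # Writes only touch memory; no random write to a "disk" structure.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the whole memtable out as one sorted, read-only batch.
        if not self.memtable:
            return
        items = sorted(self.memtable.items())
        keys = [k for k, _ in items]
        values = [v for _, v in items]
        self.runs.append((keys, values))
        self.memtable = {}

    def get(self, key):
        # Newest data wins: check memory first, then batches from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for keys, values in reversed(self.runs):
            i = bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return values[i]
        return None


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)  # second write fills the memtable and triggers a flush
db.put("a", 3)  # newer version stays in memory and shadows the flushed one
print(db.get("a"), db.get("b"), len(db.runs))
```

Because flushed batches are never modified in place, a real engine can afford to spend more effort encoding and compressing them than a structure that must support in-place updates.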
Data Encoding Storage Compression Technology
OceanBase uses a hybrid row-column storage format where disk data blocks are organized by columns, and has developed a set of encoding methods that combine row and column storage. By using dictionary, delta, and prefix encoding algorithms on rows and columns before applying general compression algorithms, OceanBase achieves higher compression rates.
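To make the three encodings named above concrete, here is a minimal sketch of generic dictionary, delta, and prefix encoding applied to one column at a time. These are textbook versions written for illustration; the function names and sample values are assumptions, and OceanBase's actual on-disk formats are not shown here and will differ.

```python
from os.path import commonprefix


def dictionary_encode(values):
    """Replace each value with its index in a sorted list of distinct values."""
    dictionary = sorted(set(values))
    index = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [index[v] for v in values]


def delta_encode(numbers):
    """Keep the first number, then store each later number as a difference."""
    if not numbers:
        return []
    return [numbers[0]] + [b - a for a, b in zip(numbers, numbers[1:])]


def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def prefix_encode(strings):
    """Store the shared leading characters once, plus each remaining suffix."""
    prefix = commonprefix(strings)
    return prefix, [s[len(prefix):] for s in strings]


# Hypothetical column values, loosely in the spirit of an orders table.
nations = ["FR", "US", "FR", "DE"]
order_keys = [1000, 1001, 1003, 1006]
clerks = ["order-0001", "order-0002", "order-0103"]

print(dictionary_encode(nations))
print(delta_encode(order_keys))
print(prefix_encode(clerks))
```

Each encoding turns repetitive column data into small integers or short suffixes, so a general-purpose compressor applied afterward has less, and more regular, data to process.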
Prerequisites
You have completed the tasks in Get Started with OceanBase Cloud, or you already have a MySQL-mode tenant with a corresponding database and account in your environment.
Procedure
Select your MySQL tenant.
Click to navigate to Tenant Workspace.
Check your current tenant disk usage.
Click to navigate to Load Data.
Click Import Sample Data.
In the pop-up box, select the test scenario as TPC-H.
Click Data volume and select 1 GiB of sample data from the drop-down menu.
Select default_database or create a new database, and click Import.
After the import is successful, click to navigate to Tenant Workspace and check the tenant disk usage.
High Compression Feature Analysis
The sample data is approximately 1 GiB (for this comparison, 1 GB and 1 GiB are treated as roughly equal). Before the import, the tenant disk usage is 0.1 GiB. After the import, the tenant disk usage is 0.2 GiB, an increase of about 0.1 GiB. In this example, about 1 GiB of source data therefore occupies only about 0.1 GiB on disk. OceanBase Database's high compression feature can reduce your storage costs by 70% to 90%.
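The arithmetic behind this comparison can be written out as a short calculation. This is a sketch using the figures quoted above; the function name `storage_saving` is hypothetical, and because the console values are rounded to 0.1 GiB, the result is only a coarse estimate.

```python
def storage_saving(raw_gib, usage_before_gib, usage_after_gib):
    """Return (stored GiB, compression ratio, fraction of space saved)."""
    stored = usage_after_gib - usage_before_gib  # disk space the import added
    ratio = raw_gib / stored                     # raw size : stored size
    saving = 1 - stored / raw_gib                # share of raw size not stored
    return stored, ratio, saving


# Figures from this tutorial: 1 GiB imported, usage went from 0.1 GiB to 0.2 GiB.
stored, ratio, saving = storage_saving(1.0, 0.1, 0.2)
print(f"stored about {stored:.1f} GiB, ratio about {ratio:.0f}:1, saving about {saving:.0%}")
```

With these inputs the saving comes out at roughly 90%, the upper end of the 70% to 90% range. Substitute the values from your own Tenant Workspace to estimate the saving for your import.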