In database systems, data is often compressed to different degrees to reduce storage costs. In analytical processing (AP)-oriented column store databases, data is encoded and compressed by column to improve query performance. However, in most compression algorithms, a higher compression ratio indicates more complex calculation and slower compression/decompression. In traditional B-tree-based databases, data compression during writes may increase the computing load of the CPU and affect the write performance. However, in the LSM-tree architecture of OceanBase Database, data can be compressed only during compactions, which does not affect data writes. Therefore, compression methods with higher compression ratios can be used. The superb compression capabilities of OceanBase Database are proven in application scenarios.
In OceanBase Database, when memory usage of a MemTable reaches a specified threshold or during daily compaction, a minor or major compaction is triggered to store data in the MemTable to the disk and compact the data into static data in an SSTable. Compared with MemTables, SSTables store more data, including more cold data. When SSTables are continuously generated during compactions, OceanBase Database compresses and encodes data in the SSTables to save storage space on the disk. This reduces I/O operations when you query data in the SSTables. In an SSTable, data is structured by block. Fixed-length macroblocks of 2 MB ease the management of storage space. Variable-length microblocks in macroblocks facilitate data compression. Microblocks are stored in the flat or encoding format. OceanBase Database compresses and encodes data by microblock. A microblock in the encoding format is constructed through row padding, encoding, general compression (optional), and encryption (optional). Then, the microblock is stored on the disk and written to a fixed-length macroblock. Encoding and general compression in this process are two data compression methods in OceanBase Database.
General compression
General compression is a process in which a compression algorithm is used to directly compress data blocks regardless of the data structure. This compression method encodes data blocks based on the characteristics of binary data to reduce the redundancy of stored data. Compressed data cannot be randomly accessed. Both compression and decompression are performed on data blocks. OceanBase Database supports four compression algorithms: zlib, snappy, lz4, and zstd. The compression level of the zstd and lz4 algorithms is 1. The compression level of the zlib algorithm is 6. The snappy algorithm uses the default compression level. The four algorithms are used to compress microblocks of 16 KB during compression testing in OceanBase Database. During the testing, the snappy and lz4 algorithms achieve high compression speeds with low compression ratios, and the zlib and zstd algorithms achieved high compression ratios with low compression speeds. The lz4 and snappy algorithms have similar compression ratios, but the lz4 algorithm has higher compression and decompression speeds. The zstd and zlib algorithms have similar compression ratios, but the zstd algorithm has higher compression and decompression speeds. In MySQL mode, OceanBase Database allows you to use one of the preceding compression algorithms. In Oracle mode, OceanBase Database is compatible with the compression algorithms of Oracle Database and allows you to use either the lz4 or zstd algorithm.
Encoding
Based on general compression, OceanBase Database provides a hybrid row-column storage encoding method for databases. In contrast to general compression, encoding is a process in which a compression algorithm compresses data blocks based on the format and semantics of data in data blocks. OceanBase Database is a relational database in which data is organized by table. Each column of a table represents a fixed type of data. Therefore, data in the same column is similar in a logical sense. In specific scenarios, data in adjacent rows of a business table are also similar. Therefore, you can compress and store data by column to improve compression performance. To compress data by column, OceanBase Database introduces microblocks in the encoding format. Unlike microblocks in the flat format in which all data is serialized by row, microblocks in the encoding format are stored in hybrid row-column storage mode. Logically, a microblock still stores a set of row data, but the data is encoded by column. Fixed-length encoded data is stored in the column store area of the microblock, and variable-length data is stored in the variable-length area by row. Data in microblocks in encoding format can be randomly accessed. If you want to read a specific row of data in a microblock, you can decode only the row of data. This prevents specific decompression algorithms from decompressing the entire data block when you want to read only a part of data in the data block. To reduce projection overheads, you can also decode only specified columns during vectorized execution.
OceanBase Database supports a variety of encoding formats for compression by column, such as dictionary encoding, run-length encoding (RLE), and delta encoding that are common in column store databases. If a column stores fixed-length values, such as timestamps and bigint values, and data in the microblock is distributed within one value range, delta encoding can achieve high compression performance. Delta encoding stores only the differences between the values of each row and the minimum value in the microblock, and performs bit-packing to reduce the amount of data that is stored. If data in a microblock has small cardinality, dictionary encoding and RLE can construct a dictionary in the microblock and store the references of each row to compress the data. In an extreme case, a column in a microblock may store basically the same data. In this case, OceanBase Database adopts constant encoding to store only constants and values in the microblock that are not equal to the constants. This further increases the compression ratio.
In addition to these common encoding formats, OceanBase Database provides encoding formats for strings. If a column of data has the same prefix, OceanBase Database adopts prefix encoding to store the prefix and the suffix of each row. If a column of data is fixed-length strings with multiple same bytes, OceanBase Database adopts fixed-length string diff encoding to store the differences between the pattern string and each row. If the character cardinality of a column of strings in a microblock is less than 16, OceanBase Database uses a hexadecimal number to represent every character in each string and performs hex encoding. These string-related encoding formats can achieve high compression performance for long business IDs and formatted string data.
In business tables, data in the same column is similar, and data in different columns can also be related. Therefore, OceanBase Database introduces span-column encoding. If most of the data in two columns is the same, OceanBase Database adopts column equal encoding to make one column a reference of another column. If a column stores the prefixes of another column of data, OceanBase Database adopts column prefix encoding to store only the column and the suffixes of another column of data. Span-column encoding reduces data redundancy in tables. It can be used to achieve high compression performance for duplicate timestamps and composite columns, which increases the overall compression ratio of macroblocks. However, span-column encoding and decoding are complex. During encoding, OceanBase Database must check whether data in different columns meets encoding rules. During decoding, OceanBase Database must access referenced column data based on the reference relationship, process the data, and then decode the data. Compared with other encoding formats, span-column encoding is not as suitable to CPUs. Different columns may be referenced in a cascaded manner, which requires special processing.
In addition to encoding by column, OceanBase Database allows you to compress a column of data by using multiple encoding methods. For example, you can use hex encoding together with other string encoding methods. However, this makes encoding and decoding more complex. The column store and row store of null values vary with different encoding methods. However, most encoding methods use a null bitmap to indicate whether data of a row corresponding to a column is null. Encoding formats supported by OceanBase Database are related to both table schemas and data characteristics such as the value range in a microblock. Therefore, a database administrator (DBA) cannot specify column-based encoding when designing the table data model to achieve optimal compression performance. To increase the compression ratio, OceanBase Database adaptively detects a more suitable encoding method during compactions to encode data. To detect m encoding methods for n columns of data, it is assumed that OceanBase Database must encode data m × n times. Then, OceanBase Database can determine the optimal encoding method for each column. This process becomes more complex if span-column encoding is introduced. Therefore, OceanBase Database optimizes the encoding method selection algorithm to improve data encoding efficiency during compactions.
In OceanBase Database V3.2 and later, encoding supports vectorized execution and filter pushdown. This allows you to filter encoded data based on encoding characteristics, which reduces overheads and improves filtering efficiency for specific encoding methods. OceanBase Database also supports Single Instruction, Multiple Data (SIMD) filtering for fixed-length data stored by column based on the Advanced Vector Extensions 2.0 (AVX2) instruction set. In addition, data stored by column in microblocks is decoded by column in vectorized execution, which facilitates caching and branch prediction.
There are issues associated with encoding, of which overheads on encoding and decoding are the most significant. Other issues include computing load on CPUs during compactions, overheads on CPUs caused by complex decoding during progressive iterations in queries, and overheads of the decoder itself. OceanBase Database is optimized to resolve these issues. For example, the decoder is cached in the memory together with related data. However, additional performance overheads caused by decoding are inevitable for specific complex encoding formats. OceanBase Database aims to achieve a balance between query performance, memory usage, and storage costs.
Modify compression options
OceanBase Database allows you to configure table-level compression and encoding methods by using DDL operations. To modify compression options for a table that has generated SSTables, you must progressively rewrite all microblock data through progressive compactions. This prevents excessively high I/O write load in a single compaction. You can set the progressive_merge_num option to specify the number of progressive compactions.
Specify compression options when you create a table
MySQL mode
create table xxx row_format = $value compression = $value;Oracle mode
create table xxx $value;
Modify compression options for a table
MySQL mode
alter table xxx [set] row_format = $value compression = $value;Oracle mode
alter table xxx [move] $value;
Valid values of the compression option in MySQL mode are as follows:
none
lz4_1.0
snappy_1.0
zlib_1.0
zstd_1.0
zstd_1.3.8
lz4_1.9.1
The following table describes the valid values of the row_format option in MySQL mode.
| Value | Microblock format |
|---|---|
| redundant | flat |
| compact | flat |
| dynamic | encoding |
| compressed | encoding |
| condensed | selective encoding: a subset of the encoding format. The selective encoding format supports only RAW, DICT, and CONST encoding and performs only byte packing on data. |
| default | The value is equivalent to dynamic. |
The following table describes the valid values of compression options in Oracle mode.
| Value | General compression | Microblock format |
|---|---|---|
| nocompress | none | flat |
| compress basic | lz4_1.0 | flat |
| compress for oltp | zstd_1.3.8 | flat |
| compress for query | lz4_1.0 | encoding |
| compress for archive | zstd_1.3.8 | encoding |
| compress for query low | lz4_1.0 | The selective encoding format is a subset of the encoding format and is more query-friendly. |