Data encoding and compression

2025-01-02 01:58:40  Updated

OceanBase Database supports MySQL and Oracle tenants. To facilitate the use in MySQL or Oracle mode, OceanBase Database provides multiple configuration methods for you to use data encoding and data compression.

OceanBase Database allows you to configure the encoding and compression of data in tables by executing DDL statements.

Configure data compression and encoding in MySQL mode

When you create or modify a table in MySQL mode, you can set the ROW_FORMAT and COMPRESSION parameters to specify the encoding format and compression algorithm.

The syntax is as follows:

  • Specify the encoding format and compression algorithm when you create the table

    CREATE TABLE table_name table_definition_list 
    ROW_FORMAT [=] row_format COMPRESSION [=] 'compression';
    

    For more information about the CREATE TABLE statement, see CREATE TABLE.

  • Change the encoding format and compression algorithm when you modify the table

    ALTER TABLE table_name [alter_table_action_list] [SET] ROW_FORMAT [=]row_format COMPRESSION [=]  'compression';
    

    For more information about the ALTER TABLE statement, see ALTER TABLE.

Valid values of the ROW_FORMAT parameter are as follows:

  • REDUNDANT and COMPACT: Data is not encoded and is saved in the flat format.

  • DYNAMIC and COMPRESSED: Data is encoded and saved in the encoding format. The default value is DYNAMIC.

  • CONDENSED: Data is encoded and saved in the selective encoding format.

    The selective encoding format is a subset of the encoding format and is more query-friendly.

Valid values of the COMPRESSION parameter are as follows:

Note

The zlib compression algorithm of different minor versions may generate different compression results for the same data. To avoid replica restore failures during an upgrade, the zlib_1.0 compression algorithm is no longer supported in OceanBase Database V4.2.2 and later.

  • none: Data is not compressed.

  • lz4_1.0: Data is compressed by using the lz4_1.0 algorithm.

  • snappy_1.0: Data is compressed by using the snappy_1.0 algorithm.

  • zstd_1.0: Data is compressed by using the zstd_1.0 algorithm.

  • zstd_1.3.8: Data is compressed by using the zstd_1.3.8 algorithm. By default, the zstd_1.3.8 algorithm is used.

  • lz4_1.9.1: Data is compressed by using the lz4_1.9.1 algorithm.

Here are some examples:

  • Specify the encoding format and compression algorithm when you create the table

    Execute the following statement to set ROW_FORMAT to CONDENSED and COMPRESSION to zstd_1.0 when you create a table:

    obclient> CREATE TABLE test (
     id INT NOT NULL,
     fname VARCHAR(30),
     lname VARCHAR(30),
     hired DATE NOT NULL DEFAULT '1970-01-01',
     separated DATE NOT NULL DEFAULT '9999-12-31',
     job_code INT NOT NULL,
     store_id INT NOT NULL
    ) ROW_FORMAT=CONDENSED COMPRESSION='zstd_1.0';
    
  • Change the encoding format and compression algorithm when you modify the table

    Execute the following statement to change the value of ROW_FORMAT to CONDENSED and the value of COMPRESSION to lz4_1.0:

    obclient> ALTER TABLE test SET ROW_FORMAT = CONDENSED COMPRESSION = 'lz4_1.0';
    

Configure data compression and encoding in Oracle mode

When you create or modify a table in Oracle mode, you can specify the data encoding format and compression algorithm.

The syntax is as follows:

  • Specify the encoding format and compression algorithm when you create the table

    CREATE TABLE table_name table_definition_list compression;
    

    For more information about the CREATE TABLE statement, see CREATE TABLE.

  • Change the encoding format and compression algorithm when you modify the table

    ALTER TABLE table_name [alter_table_action_list] compression;
    

    For more information about the ALTER TABLE statement, see ALTER TABLE.

Valid values of the COMPRESSION parameter:

  • NOCOMPRESS: Data is not encoded or compressed, and is saved in the flat format.

  • COMPRESS BASIC: Data is not encoded but is saved in the flat format and compressed by using the lz4_1.0 algorithm.

  • COMPRESS FOR OLTP: Data is not encoded but is saved in the flat format and compressed by using the zstd_1.3.8 algorithm.

  • COMPRESS FOR QUERY: Data is encoded, saved in the encoding format, and compressed by using the lz4_1.0 algorithm.

  • COMPRESS FOR ARCHIVE: Data is encoded, saved in the encoding format, and compressed by using the zstd_1.3.8 algorithm. By default, COMPRESS FOR ARCHIVE is used.

  • COMPRESS FOR QUERY LOW: Data is encoded, saved in the selective encoding format, and compressed by using the lz4_1.0 algorithm.

    The selective encoding format is a subset of the encoding format and is more query-friendly.

Here are some examples:

  • Enable encoding and specify the compression algorithm when you create the table

    Execute the following statement to enable encoding and set COMPRESSION to COMPRESS FOR QUERY to use the lz4_1.0 algorithm when you create a table:

    obclient> CREATE TABLE test (
    id number NOT NULL,
    fname VARCHAR2(30),
    lname VARCHAR2(30),
    hired DATE NOT NULL DEFAULT sysdate,
    separated DATE,
    job_code INT NOT NULL,
    store_id INT NOT NULL
    ) COMPRESS FOR QUERY;
    
  • Enable encoding and change the compression algorithm when you modify the table

    Execute the following statement to enable encoding and change the value of COMPRESSION to COMPRESS FOR ARCHIVE to use the zstd_1.3.8 algorithm:

    obclient> ALTER TABLE test COMPRESS FOR ARCHIVE;
    

Change the compression algorithm for SSTables

To change the compression algorithm for SSTables without causing a huge volume of I/O writes in major compactions, you need to launch a progressive compaction to rewrite all the data microblocks. The number of rounds of progressive compactions can be determined by specifying the progressive_merge_num parameter in the ALTER TABLE statement.

To do so, perform the following steps:

  1. Log in to the database as a tenant administrator.

  2. Execute the following statement to set the number of rounds of progressive compactions.

    The progressive_merge_num parameter specifies the number of rounds of progressive compactions on a table. The default value is 0, which indicates incremental compaction. If you set the parameter to 1, a full compaction is performed.

    To set two rounds of progressive compactions, execute the following statement:

    obclient> ALTER TABLE t1 SET progressive_merge_num=2;
    

    After the parameter is set, data is progressively rewritten during major compactions each day.

Note

To complete data rewriting at the earliest opportunity, you can set the value of the progressive_merge_num parameter to 1, and launch a major compaction. For more information about how to manually initiate a major compaction, see Manually initiate a major compaction.

  1. Change the compression algorithm.

    For more information about how to change the compression algorithm in MySQL mode, see Configure data compression and encoding in MySQL mode in this topic.

    For more information about how to change the compression algorithm in Oracle mode, see Configure data compression and encoding in Oracle mode in this topic.

Contact Us