Partition types |V3.2.4|OceanBase Database| docs|Distributed Database

Partition types

Last Updated：2023-10-24 09:23:04 Updated

Partitioning strategies

OceanBase Database provides multiple partitioning strategies that specify how data is distributed across partitions.

The basic strategies supported by OceanBase Database are range, list, and hash partitioning.

A single-level partitioning strategy uses only one data distribution method. For example, during single-level partitioning, you can use list partitioning or range partitioning, but you cannot use them at the same time.

In composite partitioning, a table is partitioned by using a data distribution method and then subpartitioned by using another data distribution method. For example, assume that you have a table that contains the create_time and user_id columns. You can use range partitioning on the create_time column and use hash partitioning on the user_id column to create subpartitions.

Range partitioning

Range partitioning is the most commonly used partitioning method. It is often used when you want to query tables with dates. In range partitioning, the database maps rows to partitions based on ranges of partition key values.

Each partition has a VALUES LESS THAN clause, which specifies a noninclusive upper bound for the partition. Any values of the partition key equal to or higher than this upper bound are added to the next higher partition. All partitions, except the first, have an implicit lower bound specified by the VALUES LESS THAN clause on the previous partition.

You can define a MAXVALUE literal for the highest partition. MAXVALUE represents a virtual infinite value that sorts higher than all other values for the partition key, including the null value.

Hash partitioning

In hash partitioning, the database maps rows to partitions based on a hashing algorithm that the database applies to the user-specified partition key. The destination of a row is determined based on the number of hash partitions calculated by using the internal hash function applied to the row. The hashing algorithm is designed to evenly distribute rows across partitions when the number of partitions is to the power of two. This algorithm ensures that each partition contains almost the same number of rows.

If you want to evenly distribute data across nodes, we recommend that you use the hash partitioning method. Hash partitioning is an easy-to-use alternative to range partitioning and is suitable when you want to partition a table that does not contain historical data or a table for which you cannot specify a partition key.

You can also use hash partitioning in online transaction processing (OLTP) systems with high update contention. Hash partitioning allows you to divide modifications to a single table into modifications to different partitions.

List partitioning

In list partitioning, the database uses a list of discrete values as the partition key for each partition. The partition key consists of one or more columns.

You can use list partitioning to specify how individual rows map to specific partitions.

List partitioning allows you to group and organize related sets of data when the key used to identify them is not conveniently ordered.

Composite partitioning

You can use the range, list, and hash partitioning methods to divide a partition into subpartitions.

In composite partitioning, a table is partitioned by using a data distribution method and then subpartitioned by using another data distribution method. Composite partitioning is a combination of two basic partitioning methods. All subpartitions of a specified partition represent a logical subset of data in the partition.

Composite partitioning provides the following benefits:

You can perform partition pruning on one or two dimensions based on the SQL statement to improve query performance.
You can perform partial partition-wise and full partition-wise joins for queries on either dimension.
You can perform parallel backup and recovery on a table.
Compared with single-level partitioning, composite partitioning generates more partitions. This can be beneficial for the efficient parallel execution of queries.
You can implement a rolling window to support historical data and still partition on another dimension if many statements can benefit from partition pruning or partition-wise joins.
You can store data based on the partition key. For example, you can store data for a specific product type in a read-only and compressed format, and keep data of other product types uncompressed.

The following figure shows a graphical view of the range-hash and range-list composite partitioning methods.

Partition types