OceanBase supports a variety of partitioning strategies:
RANGE partitioning
RANGE COLUMNS partitioning
LIST partitioning
LIST COLUMNS partitioning
HASH partitioning
Composite partitioning
RANGE partitioning
RANGE partitioning maps data to partitions based on ranges of partitioning key values that you set up for each partition when you define the partitioned table. It is the most common type of partitioning and is often used with dates. For example, you can partition business log tables by day, week, or month. Basic syntax for RANGE partitioning:
CREATE TABLE table_name (
column_name1 column_type
[, column_nameN column_type]
) PARTITION BY RANGE ( expr(column_name1) )
(
PARTITION p0 VALUES LESS THAN ( expr )
[, PARTITION pN VALUES LESS THAN (expr ) ]
[, PARTITION pX VALUES LESS THAN (maxvalue) ]
);
The following rules apply when you use RANGE partitioning:
The expr expression in the PARTITION BY RANGE (expr) clause must return an integer value.
Each partition has a VALUES LESS THAN clause, which specifies a non-inclusive upper bound literal for the partitions. Values of the partitioning key equal to or higher than this literal are added to the next higher partition.
All partitions, except the first, have an implicit lower bound, which is the upper bound of the previous partition.
A MAXVALUE literal can and can only be defined for the last partition. MAXVALUE represents a virtual infinite value that is always greater than other possible values for the partitioning key, including the NULL value.
Example: Creating a RANGE partitioned table
CREATE TABLE t_log_part_by_range (
log_id bigint NOT NULL
, log_value varchar(50)
, log_date timestamp NOT NULL
) PARTITION BY RANGE(UNIX_TIMESTAMP(log_date))
(
PARTITION M202001 VALUES LESS THAN(UNIX_TIMESTAMP('2020/02/01'))
, PARTITION M202002 VALUES LESS THAN(UNIX_TIMESTAMP('2020/03/01'))
, PARTITION M202003 VALUES LESS THAN(UNIX_TIMESTAMP('2020/04/01'))
, PARTITION M202004 VALUES LESS THAN(UNIX_TIMESTAMP('2020/05/01'))
, PARTITION M202005 VALUES LESS THAN(UNIX_TIMESTAMP('2020/06/01'))
, PARTITION M202006 VALUES LESS THAN(UNIX_TIMESTAMP('2020/07/01'))
, PARTITION M202007 VALUES LESS THAN(UNIX_TIMESTAMP('2020/08/01'))
, PARTITION M202008 VALUES LESS THAN(UNIX_TIMESTAMP('2020/09/01'))
, PARTITION M202009 VALUES LESS THAN(UNIX_TIMESTAMP('2020/10/01'))
, PARTITION M202010 VALUES LESS THAN(UNIX_TIMESTAMP('2020/11/01'))
, PARTITION M202011 VALUES LESS THAN(UNIX_TIMESTAMP('2020/12/01'))
, PARTITION M202012 VALUES LESS THAN(UNIX_TIMESTAMP('2021/01/01'))
);
You can add and delete range partitions. If the MAXVALUE is specified in the last range partition, you cannot add another partition.
RANGE partitioning requires the partitioning key expression to return an integer value. To partition a table by RANGE based on a time column, the column must be a TIMESTAMP column and you must use the UNIX_TIMESTAMP function to convert timestamps to numeric values. This can also be implemented by RANGE COLUMNS partitioning, which does not require the results of the partitioning key expression to be integer values.
RANGE COLUMNS partitioning
RANGE COLUMNS partitioning is similar to RANGE partitioning, but they are different in the following aspects:
The results of RANGE COLUMNS partitioning do not have to be integer columns and all column types are accepted.
RANGE COLUMNS does not accept expressions.
RANGE COLUMNS partitioning accepts a list of one or more columns (column vectors).
Basic syntax for RANGE COLUMNS partitioning:
CREATE TABLE table_name (
column_name1 column_type
[, column_nameN column_type]
) PARTITION BY RANGE ( column_name1 [, column_name2] )
(
PARTITION p0 VALUES LESS THAN ( expr )
[, PARTITION pN VALUES LESS THAN (expr ) ]
[, PARTITION pX VALUES LESS THAN (maxvalue) ]
);
Example: Creating a RANGE COLUMNS partition
CREATE TABLE t_log_part_by_range_columns (
log_id bigint NOT NULL
, log_value varchar(50)
, log_date date NOT NULL
) PARTITION BY RANGE COLUMNS(log_date)
(
PARTITION M202001 VALUES LESS THAN('2020/02/01')
, PARTITION M202002 VALUES LESS THAN('2020/03/01')
, PARTITION M202003 VALUES LESS THAN('2020/04/01')
, PARTITION M202004 VALUES LESS THAN('2020/05/01')
, PARTITION M202005 VALUES LESS THAN('2020/06/01')
, PARTITION M202006 VALUES LESS THAN('2020/07/01')
, PARTITION M202007 VALUES LESS THAN('2020/08/01')
, PARTITION M202008 VALUES LESS THAN('2020/09/01')
, PARTITION M202009 VALUES LESS THAN('2020/10/01')
, PARTITION M202010 VALUES LESS THAN('2020/11/01')
, PARTITION M202011 VALUES LESS THAN('2020/12/01')
, PARTITION M202012 VALUES LESS THAN('2021/01/01')
, PARTITION MMAX VALUES LESS THAN MAXVALUE
);
LIST partitioning
Unlike RANGE partitioning and HASH partitioning, LIST partitioning enables you to explicitly control how rows map to partitions by specifying a list of discrete values for the partitioning key in the description for each partition. The advantage of LIST partitioning is that you can partition unordered and unrelated data. Basic syntax for LIST partitioning:
CREATE TABLE table_name (
column_name1 column_type
[, column_nameN column_type]
) PARTITION BY LIST ( expr(column_name1) )
(
PARTITION p0 VALUES IN ( v01 [, v0N] )
[, PARTITION pN VALUES IN ( vN1 [, vNN] ) ]
[, PARTITION pX VALUES IN (default) ]
);
The following rules apply when you use LIST partitioning:
The partitioning expression must return an integer value.
The partitioning expression can reference only one column, instead of a list of multiple columns (column vectors).
Example: Creating a LIST partitioned table
CREATE TABLE t_part_by_list (
c1 BIGINT PRIMARY KEY
, c2 VARCHAR(50)
) PARTITION BY list(c1)
(
PARTITION p0 VALUES IN (1, 2, 3)
, PARTITION p1 VALUES IN (5, 6)
, PARTITION p2 VALUES IN (DEFAULT)
);
You can add more LIST partitions, specifying a new non-duplicate table. You can also delete a partition.
LIST COLUMNS partitioning
LIST COLUMNS partitioning is similar to LIST partitioning, but they are different in the following aspects:
The use of expressions is not accepted for LIST COLUMNS partitioning.
LIST COLUMNS partitioning enables the use of multiple columns as partition keys (column vectors)
Basic syntax for LIST COLUMNS partitioning:
CREATE TABLE table_name (
column_name1 column_type
[, column_nameN column_type]
) PARTITION BY LIST COLUMNS ( column_name1 [, column_nameN ] )
(
PARTITION p0 VALUES IN ( v01 [, v0N] )
[, PARTITION pN VALUES IN ( vN1 [, vNN] ) ]
[, PARTITION pX VALUES IN (default) ]
);
Example: Creating a LIST COLUMNS partition
CREATE TABLE t2 (
id varchar(64),
type varchar(16),
info varchar(512),
gmt_create datetime(6),
gmt_modified datetime(6),
partition_id varchar(2) GENERATED ALWAYS AS (substr(`id`,19,20)) VIRTUAL,
PRIMARY KEY (id)
) partition by list columns(partition_id)
(partition p0 values in ('00','01'),
partition p1 values in ('02','03'),
partition p2 values in (default));
HASH partitioning
HASH partitioning applies to scenarios where RANGE partitioning or LIST partitioning cannot be used. Hash partitioning enables easy partitioning of data by distributing records over partitions based on a hash function on the partitioning key. HASH partitioning is a better choice in the following cases:
You cannot identify an obvious partitioning key for the data.
The sizes of range partitions differ substantially or are difficult to balance manually.
RANGE partitioning can cause the data to be undesirably clustered.
Performance features such as parallel DML, partition pruning, and partition-wise joins are important.
Example: Creating a HASH partitioned table
CREATE TABLE ware(
w_id int
, w_ytd number(12,2)
, w_tax number(4,4)
, w_name varchar(10)
, w_street_1 varchar(20)
, w_street_2 varchar(20)
, w_city varchar(20)
, w_state char(2)
, w_zip char(9)
, primary key(w_id)
) PARTITION by hash(w_id) partitions 60;
Hash partitioning does not support partition deletion.
Composite partitioning
Composite partitioning partitions a table using one partitioning strategy and partitions each partition using a different partitioning strategy. It is suitable for business tables containing large amounts of data. Composite partitioning enables you to take advantage of multiple partitioning strategies.
You can use the SUBPARTITION TEMPLATE clause to specify subpartition settings.
Example: Creating a composite partitioned table
CREATE TABLE t_ordr_part_by_hash_range (o_w_id int
, o_d_id int
, o_id int
, o_c_id int
, o_carrier_id int
, o_ol_cnt int
, o_all_local int
, o_entry_d TIMESTAMP NOT NULL
, index idx_ordr(o_w_id, o_d_id, o_c_id, o_id) LOCAL
, primary key ( o_w_id, o_d_id, o_id, o_entry_d )
)
PARTITION BY hash(o_w_id)
SUBPARTITION BY RANGE(UNIX_TIMESTAMP(o_entry_d))
SUBPARTITION template
(
SUBPARTITION M202001 VALUES LESS THAN(UNIX_TIMESTAMP('2020/02/01'))
, SUBPARTITION M202002 VALUES LESS THAN(UNIX_TIMESTAMP('2020/03/01'))
, SUBPARTITION M202003 VALUES LESS THAN(UNIX_TIMESTAMP('2020/04/01'))
, SUBPARTITION M202004 VALUES LESS THAN(UNIX_TIMESTAMP('2020/05/01'))
, SUBPARTITION M202005 VALUES LESS THAN(UNIX_TIMESTAMP('2020/06/01'))
, SUBPARTITION M202006 VALUES LESS THAN(UNIX_TIMESTAMP('2020/07/01'))
, SUBPARTITION M202007 VALUES LESS THAN(UNIX_TIMESTAMP('2020/08/01'))
, SUBPARTITION M202008 VALUES LESS THAN(UNIX_TIMESTAMP('2020/09/01'))
, SUBPARTITION M202009 VALUES LESS THAN(UNIX_TIMESTAMP('2020/10/01'))
, SUBPARTITION M202010 VALUES LESS THAN(UNIX_TIMESTAMP('2020/11/01'))
, SUBPARTITION M202011 VALUES LESS THAN(UNIX_TIMESTAMP('2020/12/01'))
, SUBPARTITION M202012 VALUES LESS THAN(UNIX_TIMESTAMP('2021/01/01'))
, SUBPARTITION MMAX VALUES LESS THAN MAXVALUE
)
partitions 16;
CREATE TABLE t_log_part_by_range_hash (
log_id int NOT NULL
, log_value varchar(50)
, log_date TIMESTAMP NOT NULL
, PRIMARY key(log_id, log_date)
) PARTITION BY RANGE(UNIX_TIMESTAMP(log_date))
SUBPARTITION BY HASH(log_id) SUBPARTITIONS 16
(
PARTITION M202001 VALUES LESS THAN(UNIX_TIMESTAMP('2020/02/01'))
, PARTITION M202002 VALUES LESS THAN(UNIX_TIMESTAMP('2020/03/01'))
, PARTITION M202003 VALUES LESS THAN(UNIX_TIMESTAMP('2020/04/01'))
, PARTITION M202004 VALUES LESS THAN(UNIX_TIMESTAMP('2020/05/01'))
, PARTITION M202005 VALUES LESS THAN(UNIX_TIMESTAMP('2020/06/01'))
, PARTITION M202006 VALUES LESS THAN(UNIX_TIMESTAMP('2020/07/01'))
, PARTITION M202007 VALUES LESS THAN(UNIX_TIMESTAMP('2020/08/01'))
, PARTITION M202008 VALUES LESS THAN(UNIX_TIMESTAMP('2020/09/01'))
, PARTITION M202009 VALUES LESS THAN(UNIX_TIMESTAMP('2020/10/01'))
, PARTITION M202010 VALUES LESS THAN(UNIX_TIMESTAMP('2020/11/01'))
, PARTITION M202011 VALUES LESS THAN(UNIX_TIMESTAMP('2020/12/01'))
, PARTITION M202012 VALUES LESS THAN(UNIX_TIMESTAMP('2021/01/01'))
, PARTITION MMAX VALUES LESS THAN MAXVALUE
);