In OceanBase Database, partitioning allows you to decompose a table into multiple smaller and more manageable parts called partitions based on specific rules. Each partition is an independent object with its own name and may have its own storage features. This topic describes partitions and the benefits of partitioning.
Overview
An application that accesses a database logically accesses only one table or one index. However, the table may consist of dozens of physical partitions. Each partition is an independent object and can be independently accessed or accessed as part of the table. The partitions are completely transparent to the application and do not affect the business logic of the application.
From the perspective of an application, only one schema object exists. No modification to SQL statements is required to access partitioned tables. Partitioning is useful for many types of database applications, especially those that manage large amounts of data.
A partitioned table consists of one or more partitions, which are managed separately and can operate independently of other partitions. A table can be partitioned or non-partitioned. Even if a partitioned table consists of only one partition, this table is different from a non-partitioned table. You cannot add partitions to a non-partitioned table.
A partitioned table consists of one or more table partition segments. OceanBase Database stores the data of each table partition in its own SSTable, and each SSTable contains part of table data.
Benefits
Higher availability
The unavailability of a partition does not necessarily mean that the entire table is unavailable. The query optimizer automatically removes unreferenced partitions from the query plan. Therefore, queries are not affected when the partitions are unavailable.
Easier management of database objects
A partitioned object has pieces that can be managed collectively or separately. DDL statements can manipulate partitions rather than entire tables or indexes. Therefore, you can decompose resource-intensive tasks such as the recreation of an index or table. For example, you can move only one partition at a time. If an issue occurs, you need only to redo the partition move rather than the table move. In addition, you can execute a
TRUNCATEstatement on a partition to avoid unnecessary deletion of a large amount data.Reduced contention for shared resources in OLTP systems
In online transaction processing (OLTP) scenarios, partitioning can reduce contention for shared resources. For example, a DML operation is performed on many partitions rather than one table.
Enhanced query performance in data warehouses
In analytical processing (AP) scenarios, partitioning can speed up the processing of ad hoc queries. Partitioning keys can implement filtering. For example, when sales data is partitioned by sales time and you want to query the sales data for a quarter, you need to query only one or several partitions rather than the entire table.
Better load balancing results
OceanBase Database stores data and implements load balancing by partition. Different partitions can be stored on different OBServers in an OceanBase cluster. Therefore, different partitions of a partitioned table can be distributed on different OBServers so that the data of a table can be evenly distributed across the entire OceanBase cluster.
Partitioning types
OceanBase Database in Oracle mode supports the following partitioning methods:
RANGE partitioning
LIST partitioning
HASH partitioning
Composite partitioning (subpartitioning)
RANGE partitioning
RANGE partitioning maps data to partitions based on ranges of partitioning key values that you set up for each partition when you define the partitioned table. It is the most common partitioning type and is often used with dates. For example, you can partition business log tables by day, week, or month.
LIST partitioning
Unlike RANGE partitioning and HASH partitioning, LIST partitioning enables you to explicitly control how rows map to partitions by specifying a list of discrete values for the partitioning key in the description for each partition. The advantage of LIST partitioning is that you can partition unordered and unrelated data.
HASH partitioning
HASH partitioning applies to scenarios where RANGE partitioning or LIST partitioning cannot be used. HASH partitioning enables easy partitioning of data by distributing records over partitions based on a HASH function on the partitioning key. HASH partitioning is a better choice in the following cases:
You cannot identify an obvious partitioning key for the data.
The sizes of RANGE partitions differ substantially or are difficult to balance manually.
RANGE partitioning can cause the data to be undesirably clustered.
Performance features such as parallel DML, partition pruning, and partition-wise joins are important.
Subpartitioning
Subpartitioning partitions a table using one partitioning strategy and partitions each partition using a different partitioning strategy. It is suitable for business tables containing large amounts of data. Composite partitioning gives full play to the advantages of the two partitioning strategies that you use in combination.