Partitioned indexes|V4.3.3| docs|Distributed Database

Partitioned indexes

Last Updated：2024-11-11 07:37:15 Updated

In a database, users can improve query performance by creating indexes. When the data volume in a table grows, partitioning can be used to divide the table into multiple shards, distributing the data shards across the entire cluster using load balancing to enhance the overall capability of the cluster. This topic introduces local indexes, global indexes, and unique indexes.

Index types

OceanBase Database supports the following types of partitioned indexes:

Local partitioned index: A local index is created by specifying the keyword LOCAL during index creation. Local indexes do not require specifying partitioning rules and inherit the partitioning attributes of the primary table. They are also modified along with the partition operations of the primary table.
Global partitioned index: A global index is created by specifying the keyword GLOBAL during index creation. Global indexes can be partitioned according to the partitioning rules.
Unique index: A unique index ensures that the indexed key values are unique. It is indicated by the keyword UNIQUE.
Prefixed index: A prefixed index is created when the partitioning key of the partitioned index table is a left prefix of the index column. This is different from the prefixed index for long strings in MySQL. For example, if the index table idx is created on columns c1 and c2, and the partitioning key of this index table is c1, it is called a prefixed index. If the partition key is c2 or another column, it is called a non-prefixed index.
Non-prefixed index: In contrast to a prefixed index for partitions, if the partitioned index is not a prefixed index, it is referred to as a non-prefixed index.

Local partitioned indexes

Local partitioned indexes are divided into local prefixed indexes and local non-prefixed indexes.

Local prefixed indexes

If a local index is partitioned on a left prefix of the index columns and the subpartioning key is included in the index key, it is a local prefixed index. Local prefixed indexes can be unique or nonunique.

A query with a specified index key can use a local prefixed index to uniquely locate an index partition. Local prefixed indexes are suitable for scenarios with small result sets and partition pruning requirements.

For example, assume that the local index idx(c1,c2,c3) is created for Table A that is partitioned on the c1 column. The index is always equipartitioned with the primary table, so the local index idx(c1,c2,c3) is also partitioned on the c1 column. In this case, this index is a local prefixed index. A query with a specified index key can uniquely locate an index partition based on the value of the partitioning key column c1. This significantly reduces the number of index partitions to be scanned.

Local non-prefixed indexes

If a local partitioned index is not a local prefixed index, it is a local non-prefixed index. Likely, the partitioning key of the primary table of the index is not on a left prefix of the index columns, or the subpartioning key is not included in the index key.

If the partitioning key columns are not a subset of the index columns, the local non-prefixed index cannot be unique. A query with a specified index key cannot locate the target index partition by using a local non-prefixed index. In this case, all index partitions are scanned before the result is returned. Therefore, local non-prefixed indexes are suitable for scenarios where you want to implement parallel execution to access a large amount of data.

For example, assume that index idx(c1,c2,c3) is created for Table A that is partitioned on the c4 column. This index is a local non-prefixed index based on the definition. A query with a specified index key cannot use the local non-prefixed index to locate the target index partition based on the partitioning key. In this case, all partitions are scanned before the result is returned. Parallel execution is suitable for such a scenario.

Global partitioned indexes

Global partitioned indexes have their own independent partition definitions and do not need to be the same as the primary table. Additionally, the partitions of a global index table can be split and merged. In general, if the partitioning scheme of the primary table and the global index is exactly the same, except for non-prefixed indexes that require uniqueness, it is recommended to define indexes as local indexes. Managing and maintaining global indexes incurs significantly higher costs compared with local indexes. Moreover, in terms of query cost and partition pruning, global indexes with the same partitioning scheme as the primary table have the same effect as local indexes.

Global prefixed indexes

A global partitioned index is prefixed if it is partitioned on a left prefix of the index columns.

Global prefixed indexes can be unique or nonunique.

Global prefixed indexes are useful when they are partitioned by range and are not useful when partitioned by hash. If the index is hash partitioned, you will most likely specify the index key in your query. If the partitioning key columns are a subset of the index key columns, the target index partition can be located based on the index key values that you specified, regardless of whether the index is prefixed. If the partitioning key columns are not a subset of the index key columns, in other words, if you do not specify all partitioning key values, the query that uses a hash-partitioned index must scan all index partitions. However, a query that uses a range-partitioned index always allows for index partition pruning.

Global non-prefixed indexes

OceanBase Database does not support global non-prefixed indexes, because they do not help improve query efficiency.

For example, assume that global index idx(c1,c2) is created for Table A and is partitioned on the c2 column. Then, index idx(c1,c2) is a global non-prefixed index. In this case, partition pruning can be performed only when all index key values are specified in a user query. Otherwise, all index partitions are scanned before the result is returned. Therefore, you have every reason to partition the index on the c1 column. This allows the query to eliminate unneeded partitions based on c1, the prefix of the index.

Unique partitioned indexes

Unique partitioned indexes can be global or local.

For a local index to be unique, the index partitioning key columns must be a subset of the index key columns. For example, assume that a unique index idx(c1,c2,c3) is created for Table A and is partitioned on the c1 and c2 columns. In this case, rows with the same values in the c1 and c2 columns are distributed in the same partition. This ensures that no identical rows exist in the same partition. If index idx(c1,c2,c3) is partitioned on the c2 and c4 columns, the partitioning key columns are not a subset of the index key columns. You cannot create a local index to ensure that the values in the index key columns are unique among partitions. In this case, a local unique index cannot be created.

Index selection

We recommend that you create indexes that best suit your business requirements in aspects such as user query modes, index management, performance, and availability.

If you want a unique index and the index key columns include all partitioning key columns of the primary table, create a local index. Otherwise, create a global index.
If the partitioning key columns are a subset of the index key columns, you can create a local index.
If the index is equipartitioned with the primary table, we recommend that you create a local index.
If you are concerned about the management costs of the index partitions and the partitions of the primary table are frequently dropped, we recommend that you avoid creating a global index. Frequent dropping of partitions significantly changes the global index. The changes are irreversible and may cause the index to become unavailable.
If the partitioning key values of the index are always specified in user queries, create an index on columns other than the partitioning key columns. This reduces the maintenance and storage costs. If the partitioning key values of the index are not specified in user queries, create a prefixed index that is suitable for scenarios that have a small amount of data and require partition pruning. Non-prefixed indexes are suitable for scenarios that have a large amount of data and support the parallel execution of queries.

Partitioning method selection

For a single index table that contains a huge amount of data, you can partition the index to facilitate parallel queries among partitions and balance the load.

If the index key is specified in most user queries, you can use the hash or range partitioning method. The two partitioning methods are the same in terms of the number of partitions to be scanned and the performance of parallel execution. However, if you have data hotspots in your table, hash partitioning is preferred because it can better prevent data hotspots.
For range queries where the index key is specified, range partitioning is more suitable than hash partitioning in terms of the number of partitions to be scanned. However, hash partitioning can provide better performance when the result set is huge because it allows parallel queries.

Selection of indexes for queries

If partition pruning is required, a prefixed index is preferred. You can specify the partitioning key, a left prefix of the index columns, in the filtering condition of a query to facilitate partition pruning and minimize the number of partitions to be scanned.
For a scenario that involves a large amount of data, you can use a non-prefixed index to improve the query performance, because it allows parallel execution of range queries on the partitioning key columns. A local non-prefixed index allows the parallel scan of all partitions, whereas a local prefixed index eliminates unneeded partitions. Queries that use local prefixed indexes can process a large amount of data by scanning a limited number of partitions, but they may take a longer time than queries that allow parallel execution.