As data volumes in databases grow, you can partition a table into multiple shards and distribute the shards across the cluster to improve the overall service capability of the cluster. You can also create indexes to speed up queries. This topic explains local, global, and unique indexes.
Index types
OceanBase Database supports the following index types:
Local indexes: An index created with the
LOCALkeyword specified is a local index. Local indexes do not require the specification of a partitioning rule. It inherits the partitioning attribute from the primary table and will change accordingly when the partitioning operation is performed on the primary table.Global indexes: An index created with the
GLOBALkeyword specified is a global index. Global indexes can be partitioned according to the specified partitioning rule.Unique indexes: An index with the
UNIQUEkeyword specified is a unique index.Prefix indexes: In a partitioned index table, if the partitioning key is the leftmost prefix of the index column, the index is a prefix index. This is different from a prefix index in MySQL. For example, an index table named
idxis built on columnsc1andc2. If the partitioning key of the index table isc1, the index is a prefix index. If the partitioning key of the index table isc2or another column, the index is a non-prefix index.Non-prefix indexes: An index that is not a prefix index is a non-prefix index.
Local indexes
Local indexes are further divided into prefix local indexes and non-prefix local indexes based on the partitioning key.
Prefix local indexes
A local index that is a prefix index is an index whose partitioning key is the leftmost prefix of the index column. A prefix local index can be a unique index or a non-unique index.
A query with a specified index key can be uniquely located in one index partition with a prefix local index. This is suitable for scenarios with a small result set but requiring partition pruning.
For example, a local index idx(c1,c2,c3) is created on a table named A. The primary table is partitioned by using c1. Based on the characteristics of the local index, the index table and the primary table have the same partitioning key. Therefore, the idx table is also partitioned by using c1. By definition, this index is a prefix local index. When a query contains a specified index key, the query result can be obtained by using the partitioning key c1, greatly reducing the number of index partitions to be accessed.
Non-prefix local indexes
A local index that is not a prefix index is a non-prefix local index. This means that either the partitioning key of the index table is not the leftmost prefix of the index column or the index does not contain a partitioning key.
A non-prefix local index cannot be a unique index if the partitioning key is not a subset of the index key. A query with a specified index key cannot be uniquely located in an index partition with a non-prefix local index. Instead, the query needs to access all index partitions. Therefore, it is suitable for scenarios with a large amount of data and high concurrency.
For example, a local index idx(c1,c2,c3) is created on a table named A. The primary table is partitioned by using c4. By definition, this index is a non-prefix local index. When a query contains a specified index key, the query result cannot be obtained by using the partitioning key. Therefore, the query needs to access all index partitions for the result set. In this scenario, concurrency can play an important role.
Global indexes
A global index has its own partitioning definition, which does not necessarily need to be the same as that of the primary table. Partitions of a global index can be split or merged. Generally, if the partitioning methods of the primary table and the global index are the same, unique global indexes are recommended to be defined as local indexes. This is because maintaining a global index incurs much higher partition management overheads than maintaining a local index, and global indexes and local indexes have the same query costs and partition pruning effects assuming that they have the same partitioning method.
Prefix global indexes
A global index that has the same partitioning attribute as the primary table is a prefix global index.
A prefix global index can be a unique index or a non-unique index.
A prefix global index is meaningful only when the RANGE partitioning method is used, but not when the HASH partitioning method is used. This is because if the user chooses the HASH partitioning method, the user's query pattern is most likely to be exact matching by index key. In this case, whether an index is a prefix index is meaningless. The index keys in different partitions can be uniquely identified by the index keys specified by the user; if the user does not specify all index keys, the HASH-partitioned index needs to access all partitions, whereas the RANGE-partitioned index can prune partitions to a certain extent.
Non-prefix global indexes
OceanBase Database does not support non-prefix global indexes. Therefore, non-prefix global indexes do not provide much benefit for query optimization.
For example, a global index idx(c1,c2) is created on a table named A. The global index idx is partitioned by using c2. In this case, the global index idx is a non-prefix global index. In this setup, partition pruning can be achieved only when all index keys are specified in a query. Otherwise, the system needs to scan all index partitions. Therefore, there is no reason to use c2 as the partitioning key instead of c1. Moreover, prefix filtering can be used to prune partitions when c1 is used as the partitioning key.
Unique indexes
You can define unique indexes as global indexes or local indexes.
To define a local unique index, the index key must cover the partitioning key. For example, a unique index idx(c1,c2,c3) is created on a table named A. The index table idx is partitioned by using (c1,c2). This setup ensures that the same (c1,c2) can be uniquely identified in one partition. The unique index is maintained within a single partition. If the index table idx is partitioned by using (c2,c4), the index does not contain a partitioning key. In this case, a local unique index cannot be created because the unique index keys cannot be uniquely identified between partitions.
Index creation strategies
When you create an index, consider your query patterns, index management, performance, and availability needs to choose the most suitable index strategy for your business.
If you need a unique index and the index key covers all partitioning keys, define the index as a local index. Otherwise, define the index as a global index.
If the partitioning keys of the primary table are a subset of the index keys, define the index as a local index.
If the partitioning attribute of the primary table is the same as that of the index, we recommend that you define the index as a local index.
If you are more concerned about the costs of index partition management and the partitions of the primary table are frequently pruned, we recommend that you do not create a global index. This is because frequent partition pruning of the primary table will result in significant changes, difficulty in restoration, and even unavailability of the global index.
If your queries always specify all index partition keys, you can create an index on other columns without partitioning to reduce the maintenance and storage costs; prefix indexes are more suitable for partition pruning in scenarios with small data volumes, whereas non-prefix indexes generally perform better in scenarios with large data volumes.
Partition selection strategies
When you create a partitioned index, if a single index table contains a large amount of data, you need to divide the data into several partitions to facilitate parallel query execution and load balancing.
If your queries are mostly single-point queries with the index key, both HASH and RANGE partitioning have similar costs in terms of the number of partitions accessed and the level of concurrency. However, HASH partitioning can better avoid hot spots if data has hot spots.
If your queries are mostly range queries with the index key, RANGE partitioning allows access to fewer partitions, whereas HASH partitioning supports higher concurrency. Therefore, HASH partitioning is more advantageous in scenarios with large result sets, whereas RANGE partitioning is more advantageous in scenarios with small result sets.
Query-time index selection strategies
If you want to perform partition pruning, a prefix index is a better choice. With the prefix index, you can specify the partition prefix in the query filter condition to prune partitions and reduce the amount of data that needs to be read from the index partitions.
If you prioritize throughput and a large amount of data is to be accessed, a non-prefix index is a better choice. Non-prefix indexes allow parallel access to partitions for range queries on the partitioning key. Local non-prefix indexes can concurrently access all partitions, whereas local prefix indexes prune partitions. As a result, a large data set is processed by a few partitions, which can result in higher response times compared to parallel queries.