In a database, you can create indexes to speed up queries. As the volume of data in a table increases, you can divide the table into several shards and distribute the shards across the cluster through load balancing to enhance the overall service capability of the cluster. This topic explains local indexes, global indexes, and unique indexes.
Index types
OceanBase Database supports the following index types:
Local index: An index created with the
LOCALkeyword specified is a local index. Local indexes do not require partitioning strategies to be specified. They have the same partition attribute as the primary table and will follow the partitioning operations of the primary table.Global index: An index created with the
GLOBALkeyword specified is a global index. Global indexes can be partitioned based on specified partitioning strategies.Unique index: A unique index is created with the
UNIQUEkeyword. The values of the index key must be unique.Prefix index: In a partitioned index table, if the partitioning key is the leftmost prefix of the index key, the index is a prefix index. This is different from a prefix index in MySQL. For example, an index table
idxis created based on columnsc1andc2. If the partitioning key of the index table isc1, the index is a prefix index; if the partitioning key of the index table isc2or another column, the index is a non-prefix index.Non-prefix index: An index that is not a prefix index is a non-prefix index.
Local indexes
Local indexes are further divided into local prefix indexes and local non-prefix indexes based on the partitioning key.
Local prefix indexes
A local prefix index is a unique or non-unique index in which the partitioning key is the leftmost prefix of the index key. Local prefix indexes can be partitioned based on the same partitioning strategy as the primary table or a different one.
A query with a specified index key can be uniquely located in one index partition with a local prefix index, which is suitable for scenarios with small result sets but require partition pruning.
For example, a local index idx(c1,c2,c3) is created on table A. The primary table is partitioned based on c1. According to the characteristics of the local index, the index table and the primary table have the same partitioning strategy. Therefore, the index idx is also partitioned based on c1. By definition, this index is a local prefix index. When a query contains a specified index key, the query result can be obtained by using the partitioning key c1, which greatly reduces the number of index partitions to be accessed.
Local non-prefix indexes
A local non-prefix index is an index that is not a local prefix index. The partitioning key of the index is not the leftmost prefix of the index key, or the index does not contain the partitioning key.
If the partitioning key is not a subset of the index key, a local non-prefix index cannot be a unique index. A query with a specified index key cannot be uniquely located in a specific index partition with a local non-prefix index, but needs to access all index partitions. Therefore, it is suitable for scenarios with large data volumes and high concurrency.
For example, a local index idx(c1,c2,c3) is created on table A. The primary table is partitioned based on c4. By definition, this index is a local non-prefix index. When a query contains a specified index key, the query result can be obtained only by accessing all index partitions, regardless of whether the partitioning key is specified in the query. Concurrency control can play an important role in such scenarios.
Global indexes
A global index has its own independent partitioning definition, which does not necessarily need to be the same as that of the primary table. Partitions of a global index can be split or merged. Generally, if the partitioning strategies of the primary table and the global index are the same, unique global indexes are recommended to be defined as local indexes. This is because maintaining a global index incurs much higher partition management overheads than maintaining a local index, and global indexes and local indexes have the same query costs and partition pruning effects when they have the same partitioning strategy.
Global prefix indexes
A global prefix index is a global index whose first field is the partitioning key of the table. In other words, the partitioning key of the table is the leftmost prefix of the index key of the global index.
A global prefix index can be a unique index or a non-unique index.
A global prefix index has meaning only when the RANGE partitioning strategy is used, but not when the HASH partitioning strategy is used. This is because if the user chooses the HASH partitioning strategy, the user's query pattern is most likely to be a point query by index key. In this case, whether an index is a prefix index is meaningless. The index keys in different partitions can be the same, and the user can calculate the index partition where the query data is located based on the specified index key. If the user does not specify all index keys, the HASH partitioned index needs to access all partition data, whereas the RANGE partitioned index can prune partitions.
Global non-prefix indexes
OceanBase Database does not support global non-prefix indexes. Therefore, global non-prefix indexes have no significance for query optimization.
For example, a global index idx(c1,c2) is created on table A. The index idx is partitioned based on c2. In this case, the index idx is a global non-prefix index. In this scenario, partition pruning can be achieved only when the user specifies all index keys. Otherwise, the user's query result can be obtained only by scanning all index partitions. Therefore, there is no reason for the user to partition the index by using c2 rather than by using c1, because the latter partitioning strategy allows partition pruning.
Unique indexes
A unique index can be a global index or a local index.
To create a local unique index, the index key must cover the partitioning key. For example, a unique index idx(c1,c2,c3) is created on table A. The index table idx is partitioned based on (c1,c2). This partitioning strategy ensures that the same (c1,c2) can enter the same partition. The unique index is maintained within a single partition. If the index table idx is partitioned based on (c2,c4), the index does not contain the partitioning key. In this case, the unique index cannot guarantee the uniqueness of index keys between partitions. Therefore, such a unique index cannot be created.
Index creation strategies
When you create an index, consider your query patterns, index management, performance, and availability needs to choose the most suitable index strategy for your business.
If you need a unique index and the index key covers all partitioning keys, define the index as a local index. Otherwise, define the index as a global index.
If the partitioning keys of the primary table are a subset of the index keys, define the index as a local index.
If the partitioning attribute of the primary table is the same as that of the index, we recommend that you define the index as a local index.
If the cost of managing index partitions is a concern and the partitions of the primary table are frequently coarsened, we recommend that you do not create a global index. This is because coarsening of the partitions of the primary table causes significant changes in the partitions of the global index, making the global index difficult to restore and potentially unavailable.
If your queries always specify all index partitioning keys, you only need to create an index on other columns without including the partitioning keys to reduce the maintenance and storage costs; prefix indexes are more suitable for partition pruning and scenarios with small data volumes, whereas non-prefix indexes generally apply to scenarios with large data volumes and parallel partition access.
Partition selection strategy
When you create a partitioned index, if the data volume of a single index table is too large, you need to divide the table into several partitions to facilitate parallel execution of queries and load balancing.
If your queries are mostly single-point queries by index key, the query costs for hash-partitioned and range-partitioned indexes are similar in terms of the number of partitions accessed and the level of concurrency. However, hash partitioning can better avoid hot spots when data has hot spots.
If your queries are mostly range queries by index key, range partitioning is more advantageous than hash partitioning in terms of the number of partitions accessed. However, hash partitioning can better utilize concurrent queries when the result set is large. Therefore, hash partitioning has greater advantages over range partitioning in concurrent queries with a large result set.
Query-time index selection strategy
Prefix indexes are more suitable for partition pruning. If you want to maximize partition pruning, specify the prefix of the partition in the query filter condition to reduce the number of index partitions that need to be accessed.
Non-prefix indexes are more suitable for scenarios with large data volumes. When you access a large amount of data, a non-prefix index can improve performance through parallel partition access. Local non-prefix indexes can concurrently access all partitions, whereas local prefix indexes prune partitions. As a result, a few partitions need to process a large amount of data, which can result in longer response times compared to concurrent queries.