As the data volume of a table in a database increases, you can partition the table into several shards and distribute the shards across the cluster to improve the overall service capability of the cluster. You can create indexes on the tables to speed up queries. This topic will introduce local indexes, global indexes, and unique indexes.
Index types
OceanBase Database supports the following index types:
Local partitioned index: A local partitioned index is created by specifying the
LOCALkeyword when creating an index. A local partitioned index does not require a partitioning rule. It has the same partition attribute as the primary table and is affected by any partition operations on the primary table.Global partitioned index: A global partitioned index is created by specifying the
GLOBALkeyword when creating an index. A global partitioned index can be partitioned based on a specified partitioning rule.Unique index: A unique index is created by specifying the
UNIQUEkeyword. The values of the index key are unique.Prefix index: A prefix index is an index where the partitioning key of the index table is the leftmost prefix of the index column. For example, an index table
idxis created based on columnsc1,c2, andc3. If the partitioning key of the index table isc1, the index is a prefix index. If the partitioning key of the index table isc2or another column, the index is a non-prefix index.Non-prefix index: A non-prefix index is an index that is not a prefix index.
Local indexes
Local indexes are further classified into local prefix indexes and local non-prefix indexes based on the partitioning key.
Local prefix indexes
A local prefix index is a unique or non-unique index where the partitioning key of the index is the leftmost prefix of the index column. Local prefix indexes can be partitioned based on a specified partitioning rule.
A query with a specified index key can uniquely locate one index partition. Therefore, local prefix indexes are suitable for scenarios with small result sets and require partition pruning.
For example, a local index idx(c1,c2,c3) is created on table A. The primary table is partitioned based on the c1 column. Based on the characteristics of the local index, the index table and the primary table have the same partitioning rule. Therefore, the idx index is also partitioned based on the c1 column. As defined, the idx index is a local prefix index. When a query request contains a specified index key, the system can use the partitioning key c1 to uniquely locate one index partition, thereby reducing the number of partitions to be accessed.
Local non-prefix indexes
A local non-prefix index is an index that is not a local prefix index. This means that the partitioning key of the index column is not a subset of the partitioning key of the index, or the index does not contain the partitioning key.
If the partitioning key is not a subset of the index key, a local non-prefix index cannot be a unique index. A query with a specified index key can access all index partitions. Therefore, local non-prefix indexes are suitable for scenarios with large data volumes and require high concurrency.
For example, a local index idx(c1,c2,c3) is created on table A. The primary table is partitioned based on the c4 column. Based on the definition, the idx index is a local non-prefix index. When a query request contains a specified index key, the system cannot use the partitioning key to locate an index partition. Therefore, the system must access all index partitions to return the query result. In this scenario, concurrency can help improve performance.
Global indexes
A global index has its own independent partitioning definition, which does not necessarily need to be the same as that of the primary table. Partitions of a global index can be split or merged. Generally, if the partitioning methods of the primary table and the global index are the same, unique global indexes apart from unique local indexes are recommended to be defined as local indexes. This is because maintaining a global index incurs higher partition management overheads than maintaining a local index, and global indexes and local indexes have similar query costs and partition pruning effects.
Global prefix indexes
A global prefix index is a global index where the first field is the partitioning key of the table. In other words, a global prefix index is created by partitioning the index based on the same partitioning rule as the table.
A global prefix index can be a unique index or a non-unique index.
A global prefix index makes sense only for a table partitioned by RANGE. For a table partitioned by HASH, a global prefix index is meaningless because a HASH-partitioned table is usually queried through exact-point queries by index key. Whether an index is a prefix index does not matter for partition pruning. If an index is partitioned by RANGE, partition pruning can be performed. If an index is partitioned by HASH, all partitions need to be accessed.
Global non-prefix indexes
OceanBase Database does not support global non-prefix indexes. Therefore, global non-prefix indexes have no significance for query optimization.
For example, a global index idx(c1,c2) is created on table A. The index idx is partitioned based on the c2 column. In this case, the idx index is a global non-prefix index. In this scenario, partition pruning can be performed only when all index keys are specified. Otherwise, all partitions of the index must be scanned. Therefore, there is no reason to use c2 for partitioning instead of c1, as partition pruning can be achieved with the c1 key but not with the c2 key.
Unique indexes
A unique index can be a global index or a local index.
A unique local index can be created only if the index key covers the partitioning key. For example, a unique index idx(c1,c2,c3) is created on table A. The index table idx is partitioned based on (c1,c2). This partitioning rule ensures that the same (c1,c2) values will definitely fall into the same partition. Therefore, only within a single partition is the uniqueness maintained. If the index table idx is partitioned based on (c2,c4), the index does not contain the partitioning key. In this case, a unique local index cannot be created because the index key cannot guarantee the uniqueness of partitions.
Index creation strategies
When you create an index, consider your query patterns, index management, performance, and availability requirements to choose the most suitable index strategy for your business.
If you need a unique index and the index key covers all partitioning keys, define it as a local index. Otherwise, define it as a global index.
If the partitioning keys of the primary table are a subset of the index keys, you can define a local index.
If the partitioning attribute of the primary table is the same as that of the index, we recommend that you define a local index.
If the cost of managing index partitions is a concern to you and the partitions of the primary table are frequently coalesced, we recommend that you do not create a global index. This is because frequent coalescence of table partitions causes significant changes in global indexes, making the indexes difficult to restore or render unavailable.
If your queries always specify all index partitioning keys, you can create an index on other columns without partitioning, thereby reducing the maintenance and storage costs; prefix indexes are suitable for scenarios with small data volumes where partition pruning can be performed, and non-prefix indexes generally suit scenarios with large data volumes where parallel partition access is beneficial.
Partition selection strategies
When you partition an index, if the index table contains a large volume of data, partitioning is necessary for query parallelism and load balancing.
If your queries are mostly single-point queries by index key, the query costs for hash-partitioned and range-partitioned indexes are similar in terms of the number of partitions accessed and the level of concurrency. However, hash partitioning can better avoid hot spots when data contains hot spots.
If your queries are mostly range queries by index key, range partitioning allows access to fewer partitions than hash partitioning. However, hash partitioning can better utilize concurrent queries when the result set is large. Therefore, hash partitioning has greater advantages over range partitioning in concurrent queries with large result sets.
Query-time index selection strategies
If you want to perform partition pruning, a prefix index is a better choice. With a prefix index, you can specify the partition prefix in the query filter condition to prune partitions and reduce the amount of data that needs to be read from indexes.
If users focus on throughput and the amount of data being accessed is relatively large, choosing a non-prefix index can improve performance. Range queries on partition keys can be addressed through partitioning and parallel processing. Local non-prefix indexes allow concurrent access to all partitions, whereas local prefix indexes perform partition pruning, meaning a small number of partitions handle large data sets, and the response time may be worse than concurrent queries.