Search index|V4.6.0|OceanBase Database| docs|Distributed Database

Search index

Last Updated：2026-05-07 11:26:24 Updated

This topic describes the search index, including its capabilities, prerequisites, supported data types and predicates, common DDL operations, and query examples.

A search index can be built on a single column or multiple columns to create a unified inverted index structure. It can establish index entries for multiple paths within a JSON column or for the entire value of a scalar column. A single index can cover multiple paths and support various predicates. Compared to creating separate function indexes for each access path or using multi-value indexes with partial array element expansion, search indexes emphasize building a single index that covers multiple paths, reducing the number of indexes and maintenance costs, and making it easier to handle the variable paths and diverse query patterns in semi-structured data.

Overview

Typical scenarios: Used together with hybrid search or with multi-column search indexes on analytical wide tables, and with index merge queries, to improve performance.
- AI hybrid search scenarios: Full-text tokenization and keyword retrieval are handled by the full-text index, while the search index focuses on structured and semi-structured conditions within complex types (which can also be used for simple scalar conditions). The two types of indexes can coexist in the same table and be used together with vector indexes.
- AP analytical scenarios: When search indexes are built on wide tables and multiple columns, the optimizer can automatically identify the best index combinations and scan paths using index merge capabilities, improving performance.
DDL operations: DDL operations for search indexes are the same as for regular indexes, supporting creation, deletion, renaming, and visibility changes.

Limitations

Tenants: This feature is only available for MySQL mode tenants.
Table type: The target table must be a ORGANIZATION = HEAP table.
Column-level configuration: Column-level WITH (...) only applies to JSON columns.
JSON path length: The encoded JSON path must be no longer than 2KB. If the JSON document has very long keys or deeply nested structures, the DML operation may fail with ERROR 1071 due to path length limits. It is recommended to avoid excessively long keys in the JSON document.
Array types: Only single-level arrays are supported; nested arrays are not supported.
Partitioned tables: Only local search indexes (LOCAL) are supported for partitioned tables.
Index columns: Accelerated LIKE pattern matching on index columns is not supported.
Generated columns: Search indexes cannot be created on generated columns (GENERATED); the STORING clause is not supported.

How to use

Supported data types

Search indexes are primarily used to accelerate queries on scalar columns and JSON/one-level array columns. The supported data types are as follows:

Category	Supported types
Integer types	`TINYINT`, `SMALLINT`, `MEDIUMINT`, `INT`, `BIGINT`, and their `UNSIGNED` variants
Floating-point/decimal types	`FLOAT`, `DOUBLE`, `DECIMAL`, and `NUMBER`
String types	`VARCHAR` and `CHAR`
Text types	`TINYTEXT`, `TEXT`, `MEDIUMTEXT`, and `LONGTEXT`
Binary types	`BINARY` and `VARBINARY`
Binary large objects	`TINYBLOB`, `BLOB`, `MEDIUMBLOB`, and `LONGBLOB`
Date and time types	`DATE`, `TIME`, `DATETIME`, `TIMESTAMP`, and `YEAR`
JSON type	`JSON`
Array types	`ARRAY(element_type)` (only one-level arrays, such as `ARRAY(INT)` and `ARRAY(VARCHAR(256))`)

The following limitations apply:

Search indexes cannot be created on columns of the BIT, ENUM, SET, or MAP type.
For PICK columns, the json_number field supports integer and floating-point/decimal types, and the json_string field supports string types.

Syntax and parameters

Search indexes can be created when creating a table, after the table is created, or appended to an existing table. The specific syntax and common parameters for each method are as follows:

Create when creating a table

Create after the table is created

CREATE TABLE table_name (
  column_name type
  [, column_name type ...]

  SEARCH INDEX index_name (
    column_name [WITH (option_list)]
    [, column_name [WITH (option_list)] ...]
  )
) ORGANIZATION = HEAP;

option_list:
    [INCLUDE_PATHS = ('path1'[, 'path2']...)]
    [| EXCLUDE_PATHS = ('path1'[, 'path2']...)]
    [| INCLUDE_TYPES = (type1[, type2]...)]

CREATE SEARCH INDEX index_name ON table_name (
    column_name [WITH (option_list)]
    [, column_name [WITH (option_list)] ...]
) [LOCAL];

option_list:
    [INCLUDE_PATHS = ('path1'[, 'path2']...)]
    [| EXCLUDE_PATHS = ('path1'[, 'path2']...)]
    [| INCLUDE_TYPES = (type1[, type2]...)]

You can also use the ALTER TABLE statement to create a search index:

ALTER TABLE table_name ADD SEARCH INDEX index_name (
  column_name [WITH (option_list)]
  [, column_name [WITH (option_list)] ...]
);

option_list:
    [INCLUDE_PATHS = ('path1'[, 'path2']...)]
    [| EXCLUDE_PATHS = ('path1'[, 'path2']...)]
    [| INCLUDE_TYPES = (type1[, type2]...)]

| Parameter | Description | Value range | |------------- ----|---------|---------| | option_list ... | A list of column-level options that control the JSON paths involved in index construction for the column. | | INCLUDE_PATHS / EXCLUDE_PATHS | A column-level path whitelist or blacklist. You can specify only one of these options. These options control the JSON paths involved in index construction for the column. By default, all paths are involved in index construction. The paths can be configured in two ways:

Simple deterministic paths. For example, $.key and $.a. In this case, only scalar and array data under $.key and $.a are indexed, and data under $.a.b is not indexed.
Prefix-matching paths. For example, $.a.*. In this case, JSON data under $.a is indexed, and data under $.a.b is also indexed. $.* matches all paths.

| | | INCLUDE_TYPES | The set of data types involved in index construction (values vary by actual version). For example, INCLUDE_TYPES = (JSON_STRING) indicates that only path indexes of the JSON_STRING type are constructed. By default, all types are involved in index construction. | JSON_STRING, JSON_NUMBER | | LOCAL | An optional parameter that indicates to create a local index. For partitioned tables, you must create a local search index when creating or appending a search index. |

Other operations such as deletion, renaming, and visibility changes are the same as for regular indexes. For more information, see the relevant documentation at the end of this topic.

Supported filtering queries

Note

Whether a search index is used for optimization is determined by the optimizer and the EXPLAIN execution plan. This topic only describes the typical range of filtering queries that can be accelerated.

Scalar

For scalar columns (numerical, string, and time), the following filtering queries can be accelerated:

Equality: col = value
Range: col > / >= / < / <= and BETWEEN
IN (list)
IS NULL / IS NOT NULL
AND / OR combinations: If multiple columns are defined in the same search index or combined with other indexes, Index Merge may be used.

Recommendation: For high selectivity and ordered range access, retain B-tree indexes. For low selectivity, OR combinations, or coexistence of JSON and arrays, search indexes with Index Merge are usually more efficient.

JSON

After a search index is created, you can directly query the internal fields of a JSON column without full table scans.

Category	Description
Value by path	In a JSON column, extract the specified scalar value by path, and then use the `JSON_EXTRACT()` function or `->` operator to compare it with a value by using equality or range comparison operators (`>`, `>=`, `<`, `<=`, `BETWEEN`, etc.). For example, `col->'$.price' > 100`.
`MEMBER OF`	Determines whether a value is an element in a JSON array. The right operand can be a column or a path expression. For example, `MEMBER OF(col)`.
`JSON_CONTAINS`	Determines whether a JSON document contains a specified scalar value or subdocument. For example, `JSON_CONTAINS(c1->'$.tags', '"mobile"')`.
`JSON_OVERLAPS`	Determines whether two JSON arrays have at least one common element. For example, `JSON_OVERLAPS(col1, '[1, 2, 3]')`.
`IN`	You can use the `IN (...)` clause to match the scalar value extracted by path or the entire column. For example, `IN (col->'$.price', col2)`.
`PICK`	When the value of a JSON field may be of multiple types (such as both numeric and string values exist in the same path), you can use the `PICK` keyword to filter by type in queries. You can specify `PICK json_number` or `PICK json_string` after the path. You can also use `PICK` with comparison operators, `MEMBER OF`, `JSON_CONTAINS`, and `JSON_OVERLAPS`. `json_number` and `json_string` indicate numeric and string types, respectively. For more information, see the "Supported data types" section above.

Notice

When you use the JSON_VALUE() function with the PICK keyword, the query cannot be accelerated by a search index if you also use the RETURNING, ON ERROR, or ON EMPTY clause.

Recommendation: Specify paths explicitly in queries. Use the WITH clause to narrow the index scope when a document contains many fields. Use PICK for mixed types in the same path. Use full-text indexes for keyword searches.

One-dimensional array

For ARRAY(element_type) columns:

Function	Description
`ARRAY_CONTAINS(col, value)`	Whether the array contains the specified element.
`ARRAY_CONTAINS_ALL(col, array_literal)`	Whether the array contains all elements in the specified array.
`ARRAY_OVERLAPS(col, array_literal)`	Whether the array has at least one common element.

Recommendation: Use one-dimensional arrays for tags and multi-value attributes. Select the contains, contains all, or overlaps function based on the semantics. You can also use these functions with B-tree conditions on tables for Index Merge.

Confirm whether an index is used

You can use the EXPLAIN statement to view the execution plan and confirm whether the query actually uses a search index. For example:

EXPLAIN SELECT * FROM products WHERE info->'$.price' > 100 AND category = 'electronics';

If the execution plan contains the name of a search index (such as idx_products), the query actually uses the search index.

Scenario examples

Here are some common scenario examples for reference.

Scalar column queries

The following example demonstrates how to create a search index on integer and string columns and perform various queries:

Create a sample table and search index, and insert test data

CREATE TABLE orders (
order_id INT,
user_id INT,
amount DOUBLE,
status VARCHAR(64),
SEARCH INDEX idx_orders (user_id, amount, status)
) ORGANIZATION = HEAP;

INSERT INTO orders VALUES (1, 100, 99.5, 'paid');
INSERT INTO orders VALUES (2, 101, 200.0, 'pending');
INSERT INTO orders VALUES (3, 100, 50.0, 'paid');
INSERT INTO orders VALUES (4, 102, 150.0, 'shipped');
INSERT INTO orders VALUES (5, 100, 300.0, 'refund');

Perform various queries

-- Equality query: Find all orders where user_id = 100
SELECT * FROM orders WHERE user_id = 100;
-- Expected result: three rows with order_id = 1, 3, and 5

-- Range query: Find orders with an amount between 80 and 200
SELECT * FROM orders WHERE amount > 80 AND amount < 200;
-- Expected result: two rows with order_id = 1 and 4

-- IN query: Find orders with status 'paid' or 'shipped'
SELECT * FROM orders WHERE status IN ('paid', 'shipped');
-- Expected result: three rows with order_id = 1, 3, and 4

-- Combined conditions
SELECT * FROM orders WHERE user_id = 100 AND status = 'paid';
-- Expected result: two rows with order_id = 1 and 3

Confirm that the index is being used
```
EXPLAIN SELECT * FROM orders WHERE user_id = 100;
```
The execution plan should include the search index name idx_orders.

JSON queries

Create a sample table and search index, and insert test data

CREATE TABLE products (
id INT, -- Scalar column
info JSON, -- JSON column
SEARCH INDEX idx_products (info)
) ORGANIZATION = HEAP;

INSERT INTO products VALUES
(1, '{"name": "Phone", "price": 999, "brand": "Apple", "tags": ["electronics", "mobile"]}'),
(2, '{"name": "Book", "price": 29, "brand": "O\'Reilly", "tags": ["education"]}'),
(3, '{"name": "Laptop", "price": 1999, "brand": "Dell", "tags": ["electronics", "computer"]}'),
(4, '{"name": "Pen", "price": 5, "brand": "Pilot", "tags": ["stationery"]}');

Perform various queries

-- Path comparison: Price between 10 and 1000
SELECT * FROM products WHERE info->'$.price' BETWEEN 10 AND 1000;
-- Expected result: Book and Phone

-- MEMBER OF query: Find products with "electronics" in tags
SELECT * FROM products WHERE '"electronics"' MEMBER OF (info->'$.tags');
-- Expected result: Phone and Laptop

-- JSON_CONTAINS query: Tags include both "electronics" and "mobile"
SELECT * FROM products WHERE JSON_CONTAINS(info->'$.tags', '["electronics", "mobile"]');
-- Expected result: Phone

Confirm that the index is being used
```
EXPLAIN SELECT * FROM products WHERE info->'$.price' BETWEEN 10 AND 1000;
```
The execution plan should include the search index name idx_products.

JSON filter queries

The following example demonstrates how to use the WITH clause to limit the index scope in a user profile scenario.

Create a sample table and search index, and insert test data

CREATE TABLE users (
id INT,
profile JSON,
SEARCH INDEX idx_profile (profile WITH (INCLUDE_PATHS = ('$.name', '$.address.*')))
) ORGANIZATION = HEAP;

INSERT INTO users VALUES
(1, '{"name": "Zhang San", "age": 30, "address": {"city": "Beijing", "zip": 100000}, "debug": "xxx"}'),
(2, '{"name": "Li Si", "age": 25, "address": {"city": "Shanghai", "zip": 200000}, "debug": "yyy"}');

Perform various queries

-- Query that uses the index (within INCLUDE_PATHS)
SELECT * FROM users WHERE JSON_EXTRACT(profile, '$.name') = '"Zhang San"';
-- Expected result: index is used

SELECT * FROM users WHERE JSON_EXTRACT(profile, '$.address.city') = '"Beijing"';
-- Expected result: index is used

-- Query that does not use the index (outside INCLUDE_PATHS)
SELECT * FROM users WHERE JSON_EXTRACT(profile, '$.age') = 30;
-- Expected result: full table scan (age path is not indexed)

Confirm that the index is being used
```
EXPLAIN SELECT * FROM users WHERE JSON_EXTRACT(profile, '$.age') = 30;
```
The execution plan should not include the search index name idx_profile, indicating a full table scan is being performed.

Index merge

The following example shows how to use index merge. The search index can work with B-tree indexes and full-text indexes on a table.

Create a table with multiple indexes and insert test data.

CREATE TABLE products_v2 (
id INT PRIMARY KEY,
name VARCHAR(128),
category INT,
price INT,
tags JSON,
INDEX idx_category(category), -- B-tree index
INDEX idx_price(price), -- B-tree index
FULLTEXT INDEX ft_name(name), -- Full-text index
SEARCH INDEX idx_tags(tags) -- Search index
) ORGANIZATION HEAP;

INSERT INTO products_v2 VALUES
(1, 'OceanBase Database', 1, 0, '{"type": "database", "license": "open-source"}'),
(2, 'MySQL Server', 1, 0, '{"type": "database", "license": "open-source"}'),
(3, 'Premium Support', 2, 999, '{"type": "service"}');

Run various queries.

-- category uses the B-tree index, tags uses the search index, and the results are intersected.
SELECT * FROM products_v2
WHERE category = 1 AND json_extract(tags, '$.license') = '"open-source"';

-- Full-text search matches "database" or tags with type "service"
SELECT * FROM products_v2
WHERE MATCH(name) AGAINST ("database") OR json_extract(tags, '$.type') = '"service"';

-- Complex combination: B-tree AND (B-tree OR search index)
SELECT * FROM products_v2
WHERE category = 1 AND (price = 0 OR json_extract(tags, '$.type') = '"service"');

References

This feature is mainly used in hybrid search scenarios. For more information about the scenarios, see Hybrid search.
For more information about the principles and concepts, see Search index.
For more information about the syntax and parameters, see CREATE INDEX.
For more information about the syntax and parameters, see CREATE TABLE.
For more information about the syntax and parameters, see ALTER TABLE.

OceanBase

Customer Stories

Documentation

Search index

Overview

Limitations

How to use

Supported data types

Syntax and parameters

Supported filtering queries

Note

Scalar

JSON

Notice

One-dimensional array

Confirm whether an index is used

Scenario examples

Scalar column queries

JSON queries

JSON filter queries

Index merge

References