This topic describes how to use the hybrid search (PL interface) of OceanBase Database to construct an internal SQL statement to combine full-text search and vector search, and then return the search results in JSON format. This way, you can efficiently perform multidimensional searches and improve the accuracy and usability of queries.
Hybrid search combines vector-based semantic search and full-text search based on full-text indexes. It provides more accurate and comprehensive search results by integrating the two methods. Vector search is good at semantic approximate matching but is not good at exact matching of keywords, numbers, and proper nouns. Full-text search can effectively compensate for this limitation. Therefore, hybrid search has become one of the key features of vector databases and is widely used in various products.
Usage
The hybrid search feature is provided through the new system package DBMS_HYBRID_SEARCH, which contains two subfunctions:
| Member Function Name | Description |
|---|---|
DBMS_HYBRID_SEARCH.SEARCH |
Returns search results in JSON format, sorted by relevance. |
DBMS_HYBRID_SEARCH.GET_SQL |
Returns the actual executed SQL statement as a string. |
For detailed syntax and parameter descriptions, see DBMS_HYBRID_SEARCH.
Considerations
- At present, the hybrid search PL interface can only be used on heap tables. The examples in this topic are based on heap tables.
- When using hybrid search, we recommend that you specify the
_sourcefield to define the range of columns to be returned. If you do not specify this field, all columns in the table will be returned by default. If the table contains vector columns, this will significantly increase the response time (RT).
Use cases and examples
Create sample tables and insert data
To demonstrate the hybrid search feature, this section will create several sample tables and insert data, which will be used in the search examples in the following sections.
-
productstable: A basic product information table for demonstrating scalar search. It contains product ID, name, description, brand, category, tags, price, stock quantity, release date, whether it is on sale, and a vector fieldvec.CREATE TABLE products ( `product_id` varchar(50) DEFAULT NULL, `product_name` varchar(255) DEFAULT NULL, `description` text DEFAULT NULL, `brand` varchar(100) DEFAULT NULL, `category` varchar(100) DEFAULT NULL, `tags` varchar(255) DEFAULT NULL, `price` decimal(10,2) DEFAULT NULL, `stock_quantity` int(11) DEFAULT NULL, `release_date` datetime DEFAULT NULL, `is_on_sale` tinyint(1) DEFAULT NULL, `vec` VECTOR(4) DEFAULT NULL ) ORGANIZATION HEAP;Insert data.
INSERT INTO products VALUES ('prod-001', 'Gamer-Pro Mechanical Keyboard', 'A responsive mechanical keyboard with customizable RGB lighting for the ultimate gaming experience.', 'GamerZone', 'Gaming', 'best-seller,gaming-gear,rgb', 149.00, 100, '2023-07-20 00:00:00.000000', 1, '[0.5,0.1,0.6,0.9]'), ('prod-002', 'Gamer-Pro Headset', 'High-fidelity gaming headset with a noise-cancelling microphone.', 'GamerZone', 'Gaming', 'best-seller,gaming-gear,audio', 149.00, 100, '2023-07-20 00:00:00.000000', 1, '[0.1,0.9,0.2,0]'), ('prod-003', 'Eco-Friendly Yoga Mat', 'A non-slip yoga mat made from sustainable and eco-friendly materials.', 'NatureFirst', 'Sports', 'eco-friendly,health', 49.99, 200, '2023-04-22 00:00:00.000000', 0, '[0.1,0.9,0.3,0]'); -
products_fulltexttable: Based on theproductstable, full-text indexes are created on theproduct_name,description, andtagscolumns for demonstrating full-text search.CREATE TABLE products_fulltext ( product_id VARCHAR(50), product_name VARCHAR(255), description TEXT, brand VARCHAR(100), category VARCHAR(100), tags VARCHAR(255), price DECIMAL(10, 2), stock_quantity INT, release_date DATETIME, is_on_sale TINYINT(1), vec vector(4), -- Create full-text indexes on columns that need full-text search FULLTEXT INDEX idx_product_name(product_name), FULLTEXT INDEX idx_description(description), FULLTEXT INDEX idx_tags(tags) ) ORGANIZATION HEAP;Insert data.
INSERT INTO products_fulltext VALUES ('prod-001', 'Gamer-Pro Mechanical Keyboard', 'A responsive mechanical keyboard with customizable RGB lighting for the ultimate gaming experience.', 'GamerZone', 'Gaming', 'best-seller,gaming-gear,rgb', 149.00, 100, '2023-07-20 00:00:00.000000', 1, '[0.5,0.1,0.6,0.9]'), ('prod-002', 'Gamer-Pro Headset', 'High-fidelity gaming headset with a noise-cancelling microphone.', 'GamerZone', 'Gaming', 'best-seller,gaming-gear,audio', 149.00, 100, '2023-07-20 00:00:00.000000', 1, '[0.1,0.9,0.2,0]'), ('prod-003', 'Eco-Friendly Yoga Mat', 'A non-slip yoga mat made from sustainable and eco-friendly materials.', 'NatureFirst', 'Sports', 'eco-friendly,health', 49.99, 200, '2023-04-22 00:00:00.000000', 0, '[0.1,0.9,0.3,0]'); -
doc_tabletable: A document table containing scalar columns, vector columns, and full-text indexed columns for demonstrating full-text search with scalar filters and hybrid search.CREATE TABLE doc_table( c1 INT, vector VECTOR(3), query VARCHAR(255), content VARCHAR(255), VECTOR INDEX idx1(vector) WITH (distance=l2, type=hnsw, lib=vsag), FULLTEXT INDEX idx2(query), FULLTEXT INDEX idx3(content) ) ORGANIZATION HEAP;Insert data.
INSERT INTO doc_table VALUES (1, '[1,2,3]', "hello world", "oceanbase Elasticsearch database"), (2, '[1,2,1]', "hello world, what is your name", "oceanbase mysql database"), (3, '[1,1,1]', "hello world, how are you", "oceanbase oracle database"), (4, '[1,3,1]', "real world, where are you from", "postgres oracle database"), (5, '[1,3,2]', "real world, how old are you", "redis oracle database"), (6, '[2,1,1]', "hello world, where are you from", "starrocks oceanbase database"); -
products_vectortable: Similar to theproductstable, but with a vector index explicitly created on theveccolumn for demonstrating pure vector search.CREATE TABLE products_vector ( `product_id` varchar(50) DEFAULT NULL, `product_name` varchar(255) DEFAULT NULL, `description` text DEFAULT NULL, `brand` varchar(100) DEFAULT NULL, `category` varchar(100) DEFAULT NULL, `tags` varchar(255) DEFAULT NULL, `price` decimal(10,2) DEFAULT NULL, `stock_quantity` int(11) DEFAULT NULL, `release_date` datetime DEFAULT NULL, `is_on_sale` tinyint(1) DEFAULT NULL, `vec` VECTOR(4) DEFAULT NULL, -- Create a vector index on the column that needs vector search VECTOR INDEX idx1(vec) WITH (distance=l2, type=hnsw, lib=vsag) ) ORGANIZATION HEAP;Insert data.
INSERT INTO products_vector VALUES ('prod-001', 'Gamer-Pro Mechanical Keyboard', 'A responsive mechanical keyboard with customizable RGB lighting for the ultimate gaming experience.', 'GamerZone', 'Gaming', 'best-seller,gaming-gear,rgb', 149.00, 100, '2023-07-20 00:00:00.000000', 1, '[0.5,0.1,0.6,0.9]'), ('prod-002', 'Gamer-Pro Headset', 'High-fidelity gaming headset with a noise-cancelling microphone.', 'GamerZone', 'Gaming', 'best-seller,gaming-gear,audio', 149.00, 100, '2023-07-20 00:00:00.000000', 1, '[0.1,0.9,0.2,0]'), ('prod-003', 'Eco-Friendly Yoga Mat', 'A non-slip yoga mat made from sustainable and eco-friendly materials.', 'NatureFirst', 'Sports', 'eco-friendly,health', 49.99, 200, '2023-04-22 00:00:00.000000', 0, '[0.1,0.9,0.3,0]'); -
products_multi_vectortable: A table containing multiple vector fields for demonstrating multi-vector search.CREATE TABLE products_multi_vector ( product_id VARCHAR(50), product_name VARCHAR(255), description TEXT, vec1 VECTOR(4), vec2 VECTOR(4), vec3 VECTOR(4), VECTOR INDEX idx1(vec1) WITH (distance=l2, type=hnsw, lib=vsag), VECTOR INDEX idx2(vec2) WITH (distance=l2, type=hnsw, lib=vsag), VECTOR INDEX idx3(vec3) WITH (distance=l2, type=hnsw, lib=vsag) ) ORGANIZATION HEAP;Insert data.
INSERT INTO products_multi_vector VALUES ('prod-001', 'Gamer-Pro Mechanical Keyboard', 'A responsive mechanical keyboard', '[0.5,0.1,0.6,0.9]', '[0.2,0.3,0.4,0.5]', '[0.1,0.2,0.3,0.4]'), ('prod-002', 'Gamer-Pro Headset', 'High-fidelity gaming headset', '[0.1,0.9,0.2,0]', '[0.3,0.4,0.5,0.6]', '[0.2,0.3,0.4,0.5]'), ('prod-003', 'Eco-Friendly Yoga Mat', 'A non-slip yoga mat', '[0.1,0.9,0.3,0]', '[0.4,0.5,0.6,0.7]', '[0.3,0.4,0.5,0.6]');
Scalar search
Some common use cases for scalar search include:
- E-commerce product filtering: Users want to view all products of a specific brand. For example, users want to view all products of the
GamerZonebrand. - Content management systems: Administrators need to filter articles or documents of a specific category. For example, find all articles by a specific author.
- User management systems: Find users of a specific status or role. For example, find all VIP users.
Here is an example:
Set the query parameters.
SET @parm = '{ "query": { "bool": { "must": [ {"term": {"brand": "GamerZone"}} ] } }, "_source": ["product_id", "product_name", "description", "brand", "category", "tags", "price", "stock_quantity", "release_date", "is_on_sale"] }';Search for all records where
brandis"GamerZone".SELECT json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products', @parm));The result is as follows:
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products', @parm)) | +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | [ { "tags": "best-seller,gaming-gear,rgb", "brand": "GamerZone", "price": 149.00, "_score": 1, "category": "Gaming", "is_on_sale": 1, "product_id": "prod-004", "description": "A responsive mechanical keyboard with customizable RGB lighting for the ultimate gaming experience.", "product_name": "Gamer-Pro Mechanical Keyboard", "release_date": "2023-07-20 00:00:00.000000", "stock_quantity": 100 }, { "tags": "best-seller,gaming-gear,audio", "brand": "GamerZone", "price": 149.00, "_score": 1, "category": "Gaming", "is_on_sale": 1, "product_id": "prod-009", "description": "High-fidelity gaming headset with a noise-cancelling microphone.", "product_name": "Gamer-Pro Headset", "release_date": "2023-07-20 00:00:00.000000", "stock_quantity": 100 } ] | +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set
Scalar range search
Some common use cases for scalar range search include:
- Price range filtering: E-commerce platforms filter products by price range. For example, find products priced between
[30~80]. - Time range query: Find orders or logs within a specific time period. For example, find orders from the last 30 days.
- Numerical range filtering: Filter items by numerical ranges such as ratings or stock quantities. For example, find products rated between
[4~5].
Here is an example:
Set the search parameters.
SET @parm = '{ "query": { "range" : { "price" : { "gte" : 30, "lte" : 80 } } }, "_source": ["product_id", "product_name", "description", "brand", "category", "tags", "price", "stock_quantity", "release_date", "is_on_sale"] }';Query all records where
priceis in the range[30~80].SELECT json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products', @parm));The result is as follows:
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products', @parm)) | +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | [ { "tags": "eco-friendly,health", "brand": "NatureFirst", "price": 49.99, "_score": true, "category": "Sports", "is_on_sale": 0, "product_id": "prod-003", "description": "A non-slip yoga mat made from sustainable and eco-friendly materials.", "product_name": "Eco-Friendly Yoga Mat", "release_date": "2023-04-22 00:00:00.000000", "stock_quantity": 200 } ] | +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set
Full-text search
The following are some typical use cases of full-text search:
- Document search: Search for documents that contain specific keywords in a large number of documents. For example, search for documents that contain the keyword
"how to use"in the FAQ. - Product search: Search for products by name or description. For example, search for products that contain the keyword
"OceanBase". - Knowledge base search: Search for related questions in the FAQ or help documentation. For example, search for answers to related questions in the knowledge base of a customer service system.
Here is an example:
Set the query parameters.
SET @query_str_with_mini = '{ "query": { "query_string": { "type": "best_fields", "fields": ["product_name^3", "description^2.5", "tags^1.5"], "query": "Gamer-Pro^2 keyboard^1.5 audio^1.2", "boost": 1.5 } }, "_source": ["product_id", "product_name", "description", "brand", "category", "tags", "price", "stock_quantity", "release_date", "is_on_sale"] }';Query records that contain the keywords
"Gamer-Pro","keyboard", and"audio"in theproduct_name,description, andtagsfields, and sort the records based on the weights of the specified fields and keywords.SELECT json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products_fulltext', @query_str_with_mini));The returned result is as follows:
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products_fulltext', @query_str_with_mini)) | +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | [ { "tags": "best-seller,gaming-gear,rgb", "brand": "GamerZone", "price": 149.00, "_score": 4.569735248749978, "category": "Gaming", "is_on_sale": 1, "product_id": "prod-001", "description": "A responsive mechanical keyboard with customizable RGB lighting for the ultimate gaming experience.", "product_name": "Gamer-Pro Mechanical Keyboard", "release_date": "2023-07-20 00:00:00.000000", "stock_quantity": 100 }, { "tags": "best-seller,gaming-gear,audio", "brand": "GamerZone", "price": 149.00, "_score": 1.7338881172399914, "category": "Gaming", "is_on_sale": 1, "product_id": "prod-002", "description": "High-fidelity gaming headset with a noise-cancelling microphone.", "product_name": "Gamer-Pro Headset", "release_date": "2023-07-20 00:00:00.000000", "stock_quantity": 100 } ] | +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set
Full-text search with scalar filtering conditions
The following are some typical use cases of full-text search with scalar filtering conditions:
- Precise search: Search for text under specific conditions. For example, search for specific keywords in published articles.
- Access control: Search for data within the scope of user permissions. For example, search for product information in orders within a specific time period in an order system.
- Category search: Search for keywords in a specific category. For example, search for specific user information in active users in a user system.
Here is an example:
Set the query parameters.
-- Filter conditions, specify the scalar filtering condition c1 >= 2 SET @query_str = '{ "query": { "bool" : { "must" : [ {"query_string": { "fields": ["query", "content"], "query": "hello what oceanbase mysql"} } ], "filter" : [ {"range": {"c1": {"gte" : 2}}} ] } }, "_source": ["c1", "query", "content"] }';Query all records where
c1is greater than or equal to 2.SELECT json_pretty(DBMS_HYBRID_SEARCH.SEARCH('doc_table', @query_str));The returned result is as follows:
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | json_pretty(DBMS_HYBRID_SEARCH.SEARCH('doc_table', @query_str)) | +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | [ { "c1": 2, "query": "hello world, what is your name", "_score": 2.170969786679347, "content": "oceanbase mysql database" }, { "c1": 3, "query": "hello world, how are you", "_score": 0.3503184713375797, "content": "oceanbase oracle database" }, { "c1": 6, "query": "hello world, where are you from", "_score": 0.3503184713375797, "content": "starrocks oceanbase database" } ] | +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set
Vector search
The following are some typical use cases of vector search:
- Semantic search: Search for relevant content based on semantic similarity. For example, search for related questions and answers in a knowledge base.
- Recommendation system: Recommend similar products based on user preferences. For example, recommend similar products on an e-commerce platform.
- Image search: Search for similar images based on image features. For example, search for similar images in an image library.
- Intelligent Q&A: Search for related questions and answers in a knowledge base based on semantic similarity. For example, search for related questions and answers in the knowledge base of a customer service system.
Here is an example:
Set the search parameters.
-- field specifies the vector field, k specifies the number of returned results (the k nearest results), and query_vector specifies the query vector. SET @parm = '{ "knn" : { "field": "vec", "k": 3, "query_vector": [0.5,0.1,0.6,0.9] }, "_source": ["product_id", "product_name", "description", "brand", "category", "tags", "price", "stock_quantity", "release_date", "is_on_sale"] }';Search for all records where
vecis similar to[0.5,0.1,0.6,0.9].SELECT json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products_vector', @parm));The returned result is as follows:
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products_vector', @parm)) | +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | [ { "tags": "best-seller,gaming-gear,rgb", "brand": "GamerZone", "price": 149.00, "_score": 1.0, "category": "Gaming", "is_on_sale": 1, "product_id": "prod-001", "description": "A responsive mechanical keyboard with customizable RGB lighting for the ultimate gaming experience.", "product_name": "Gamer-Pro Mechanical Keyboard", "release_date": "2023-07-20 00:00:00.000000", "stock_quantity": 100 }, { "tags": "eco-friendly,health", "brand": "NatureFirst", "price": 49.99, "_score": 0.43405784, "category": "Sports", "is_on_sale": 0, "product_id": "prod-003", "description": "A non-slip yoga mat made from sustainable and eco-friendly materials.", "product_name": "Eco-Friendly Yoga Mat", "release_date": "2023-04-22 00:00:00.000000", "stock_quantity": 200 }, { "tags": "best-seller,gaming-gear,audio", "brand": "GamerZone", "price": 149.00, "_score": 0.42910841, "category": "Gaming", "is_on_sale": 1, "product_id": "prod-002", "description": "High-fidelity gaming headset with a noise-cancelling microphone.", "product_name": "Gamer-Pro Headset", "release_date": "2023-07-20 00:00:00.000000", "stock_quantity": 100 } ] | +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set
Vector search with scalar filtering conditions
The following are some typical use cases of vector search with scalar filtering conditions:
- Precise search: Search for text under specific conditions. For example, search for specific keywords in published articles.
- Access control: Search for data within the scope of user permissions. For example, search for product information in orders within a specific time period in an order system.
- Category search: Search for keywords in a specific category. For example, search for specific user information in active users in a user system.
Here is an example:
Set the search parameters.
-- Specify the scalar filtering condition brand = "GamerZone" SET @parm = '{ "knn" : { "field": "vec", "k": 3, "query_vector": [0.1,0.5,0.3,0.7], "filter" : [ {"term" : {"brand": "GamerZone"} } ] }, "_source": ["product_id", "product_name", "description", "brand", "category", "tags", "price", "stock_quantity", "release_date", "is_on_sale"] }';Search for all records where
vecis similar to[0.1,0.5,0.3,0.7]andbrandis"GamerZone".SELECT json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products_vector', @parm));The returned result is as follows:
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products_vector', @parm)) | +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | [ { "tags": "best-seller,gaming-gear,rgb", "brand": "GamerZone", "price": 149.00, "_score": 0.59850837, "category": "Gaming", "is_on_sale": 1, "product_id": "prod-001", "description": "A responsive mechanical keyboard with customizable RGB lighting for the ultimate gaming experience.", "product_name": "Gamer-Pro Mechanical Keyboard", "release_date": "2023-07-20 00:00:00.000000", "stock_quantity": 100 }, { "tags": "best-seller,gaming-gear,audio", "brand": "GamerZone", "price": 149.00, "_score": 0.55175342, "category": "Gaming", "is_on_sale": 1, "product_id": "prod-002", "description": "High-fidelity gaming headset with a noise-cancelling microphone.", "product_name": "Gamer-Pro Headset", "release_date": "2023-07-20 00:00:00.000000", "stock_quantity": 100 } ] | +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set
Multi-vector search
Multi-vector search refers to searching across multiple vector indexes and returning the most similar records.
Here is an example:
Set the search parameters.
-- Specify 3 vector search paths, each specifying a vector index field, the number of results to return, and the query vector. SET @param_multi_knn = '{ "knn" : [{ "field": "vec1", "k": 5, "query_vector": [0.5,0.1,0.6,0.9] }, { "field": "vec2", "k": 5, "query_vector": [0.2,0.3,0.4,0.5] }, { "field": "vec3", "k": 5, "query_vector": [0.1,0.2,0.3,0.4] } ], "size" : 5, "_source": ["product_id", "product_name", "description"] }';Execute the query and return the query results.
SELECT json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products_multi_vector', @param_multi_knn));+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | json_pretty(DBMS_HYBRID_SEARCH.SEARCH('products_multi_vector', @param_multi_knn)) | +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | [ { "_score": 3.0, "product_id": "prod-001", "description": "A responsive mechanical keyboard", "product_name": "Gamer-Pro Mechanical Keyboard" }, { "_score": 2.0957750699999997, "product_id": "prod-002", "description": "High-fidelity gaming headset", "product_name": "Gamer-Pro Headset" }, { "_score": 1.86262927, "product_id": "prod-003", "description": "A non-slip yoga mat", "product_name": "Eco-Friendly Yoga Mat" } ] | +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set
Full-text and vector hybrid search
Some common use cases of full-text and vector hybrid search are as follows:
- Intelligent search: A comprehensive search that combines keywords and semantic understanding. For example, when a user enters
"I need a gaming keyboard", the system matches keywords such as"gaming"and"keyboard"and understands the semantic meaning of"gaming equipment". - Document search: In a large number of documents, both exact keyword matching and semantic understanding are supported. For example, when you search for
"database optimization", the system matches documents containing these keywords and finds content related to"performance tuning"and"query optimization". - Product recommendation: On an e-commerce platform, both product name search and demand description search are supported. For example, based on the user's description
"a laptop suitable for office work", the system matches keywords and understands the semantic demand of"business office work".
Here is an example:
Set the search parameters.
SET @parm = '{ "query": { "bool": { "should": [ {"match": {"query": "hi hello"}}, {"match": { "content": "oceanbase mysql" }} ] } }, "knn" : { "field": "vector", "k": 5, "query_vector": [1,2,3] }, "rank": { "rrf": { "rank_window_size": 10, "rank_constant": 60 } }, "_source" : ["query", "content"] }';Execute the query and return the query results.
SELECT json_pretty(DBMS_HYBRID_SEARCH.SEARCH('doc_table', @parm));The returned results are as follows:
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | json_pretty(DBMS_HYBRID_SEARCH.SEARCH('doc_table', @parm)) | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | [ { "query": "hello world", "_score": 0.0325, "content": "oceanbase Elasticsearch database" }, { "query": "hello world, what is your name", "_score": 0.0323, "content": "oceanbase mysql database" }, { "query": "hello world, how are you", "_score": 0.0315, "content": "oceanbase oracle database" }, { "query": "real world, how old are you", "_score": 0.0161, "content": "redis oracle database" }, { "query": "hello world, where are you from", "_score": 0.0159, "content": "starrocks oceanbase database" }, { "query": "real world, where are you from", "_score": 0.0156, "content": "postgres oracle database" } ] | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set
Full-text and vector RRF hybrid search
By default, the results of a full-text subquery and a vector subquery are combined using weighted hybrid methods. You can use the Rank syntax to configure the combination method to RRF (Reciprocal Rank Fusion) sorting hybrid. Some common use cases are as follows:
- Multi-dimensional sorting: When you need to consider the results of multiple search dimensions. For example, in an academic search system, when you search for a paper in a paper database, you need to consider both keyword matching and semantic relevance.
- Fairness requirements: When you need to ensure that the results of different search methods are reasonably displayed. For example, on an e-commerce platform, you need to consider both the text information such as the title and description of a product and the visual information such as the image and video of a product.
- Complex queries: When you need to search for complex scenarios involving multiple query conditions. For example, in a medical system, you need to consider both the symptoms of a patient and the patient's medical history and test results.
Here is an example:
Set the search parameters.
SET @rrf_query_param = '{
"query": {
"query_string": {
"fields": ["title", "author", "description"],
"query": "fiction American Dream"
}
},
"knn" : {
"field": "vector_embedding",
"k": 5,
"query_vector": [0.1, 0.2, 0.3, 0.4]
},
"rank" : {
"rrf" : {
"rank_window_size" : 10,
"rank_constant" : 60
}
},
"_source": ["title", "author", "description"]
}';
The RRF algorithm calculates the final relevance score by fusing the rankings of multiple subquery result sets. The calculation formula is as follows:
score = 0.0
for q in queries:
if d in result(q):
score += 1.0 / ( k + rank( result(q), d ) ) # K is the configured rank_constant
return score
Summary
The examples in this topic demonstrate the powerful application value of hybrid search:
- Intelligent search upgrade: Integrate semantic understanding into traditional keyword search to provide more accurate and user-intent-aligned search results.
- Optimize user experience: Support natural language queries to simplify operations and improve information retrieval efficiency.
- Enable diverse business scenarios: Widely applied in e-commerce, content management, knowledge bases, and intelligent customer service to cover everything from basic filtering to intelligent recommendations.
- Leverage technical advantages: Combine exact matching with semantic understanding to significantly improve the accuracy and comprehensiveness of search results.
Hybrid search is an ideal choice for handling large volumes of unstructured data and building intelligent search and recommendation systems.
See Also
- DBMS_HYBRID_SEARCH Subprograms Overview
