The GET_SQL function returns the SQL statement that is actually executed as a string.
Syntax
FUNCTION GET_SQL (IN table_name VARCHAR(65535),
IN search_params LONGTEXT)
RETURN LONGTEXT;
Parameters
| Parameter | Description | Required |
|---|---|---|
| table_name | The name of the table. This parameter supports heap tables and tables without a primary key. | Yes |
| search_params | The search parameters, which are a JSON string. | Yes |
search_params syntax
search_params is a JSON string. The syntax is described in this section. Please refer to the parameters and examples below for a better understanding.
Syntax
This section describes the BNF syntax symbols and their usage rules:
Optional parameters
[ ]in BNF indicates that multiple elements can be optional. For example,param_list = param [, param]*means thatparam_listcan contain 1 or moreparamelements.rank_featureandrank_expressionalso use[ ]to indicate that subparameters are optional.[, "boost" : boost_value]indicates that the boost subparameter is optional, and all boost subparameters are optional.
Arrays
[ ]in JSON structures represent arrays, such as[condition_list].
Selection relationships
|indicates a selection relationship. For example,param = "query" | "knn"means thatparamcan be either "query" or "knn".
Repeated elements
*indicates zero or more repetitions. For example,param_list = param [, param]*means thatparam_listcan contain 1 or moreparamelements.
JSON format requirements
- All JSON field names and string values must be enclosed in double quotation marks.
- Numerical values do not need to be enclosed in double quotation marks.
Syntax definition
This section provides a detailed explanation of the syntax of search_params. For more information about the parameters, see the table below.
Top-level parameters
-- Top-level keyword parameters
search_params = '{param_list}'
param_list = param [, param]*
-- The query and knn parameters are used for full-text/scalar and vector search, respectively. At least one of them must be specified. If both are specified, they are used for hybrid search.
param = "query" : {query_expression}
| "knn" : {knn_expression}
| "rank" : {rank_expression}
| "_source" : [field_list]
| "from" : number
| "size" : number
| "es_mode" : boolean
Query expression structure
query_expression = bool_query | single_term
-- bool_query syntax
bool_query = "bool" : {bool_condition_list}
bool_condition_list = bool_condition [, bool_condition]*
-- The boost subparameter is optional.
bool_condition = "must" : [condition_list]
| "should" : [condition_list]
| "must_not" : [condition_list]
| "filter" : [condition_list]
| "boost" : boost_value
-- Nested bool_queries are supported.
condition_list = query_expression [, query_expression]*
Single-term query syntax
single_term = range_query | match_query | term_query | query_string | multi_match | rank_feature
-- range_query syntax
range_query = "range" : {"field_name" : {range_condition_list}}
range_condition_list = range_condition [, range_condition]*
-- boost subparameter is optional
range_condition = "gte" : number
| "gt" : number
| "lte" : number
| "lt" : number
| "boost" : boost_value
-- match_query syntax
match_query = "match" : {"field_name" : {match_condition}}
-- boost subparameter is optional
match_condition = "query" : "string_value" [, "boost" : boost_value]
-- term_query syntax
term_query = "term" : {term_condition_list}
term_condition_list = term_condition [, term_condition]*
term_condition = "field_name" : scalar_value
| "field_name" : term_value_object
-- boost subparameter is optional
term_value_object = "value" : scalar_value [, "boost" : boost_value]
-- Full-text search expression syntax, including query_string and multi_match
query_string = "query_string" : {query_string_condition}
-- fields and query subparameters are required
query_string_condition = "fields" : [field_weight_list]
| "query" : "string_value"
| "boost" : boost_value
| "type" : ("best_fields" | "cross_fields" | "most_fields" | "phrase")
| "default_operator" : ("AND" | "OR")
| "minimum_should_match" : number
field_weight_list = field_weight [, field_weight]*
-- No space is allowed between field_name, ^, and number
field_weight = "field_name[^number]"
-- multi_match syntax, supported since V4.4.1 BP0 HotFix1
multi_match = "multi_match" : {multi_match_condition}
-- fields and query subparameters are required
multi_match_condition = "fields" : [field_weight_list]
| "query" : "string_value"
| "boost" : boost_value
| "type" : ("best_fields" | "cross_fields" | "most_fields" | "phrase")
| "operator" : ("AND" | "OR")
| "minimum_should_match" : number
field_weight_list = field_weight [, field_weight]*
-- No space is allowed between field_name, ^, and number
field_weight = "field_name[^number]"
-- Feature-based ranking expression syntax
rank_feature = "rank_feature" : {"field_name" : {rank_algorithm}}
rank_algorithm = "saturation" : {"pivot" : number[, "positive_score_impact" : boolean]}
| "sigmoid" : {"pivot" : number, "exponent" : number[, "positive_score_impact" : boolean]}
| "log" : {"scaling_factor" : number[, "positive_score_impact" : boolean]}
| "linear" : {["positive_score_impact" : boolean]}
Vector search expression
-- knn_expression syntax
knn_expression = "knn" : {knn_condition_list}
knn_condition_list = knn_condition [, knn_condition]*
-- field, k, and query_vector are required parameters
knn_condition = "field" : "field_name"
| "k" : number
| "query_vector" : [vector_values]
| "filter" : [condition_list]
| "similarity" : number
| "boost" : boost_value
vector_values = float [, float]*
condition_list = {condition [, condition]*}
condition = single_term
-- rank_expression syntax, RRF is supported from V4.4.1 BP0 HotFix1
rank_expression = "rank" : {rank_strategy}
rank_strategy = "rrf" : {rrf_params}
rrf_params = "rank_window_size" : number [, "rank_constant" : number]
Basic data types
-- Basic data types
field_name = "string_value"
field_list = field_name [, field_name]*
number = integer | decimal
boost_value = integer | float -- boost parameter value must be >= 0
boolean = true | false
scalar_value = "string_value" | number | boolean
Parameter details
The detailed parameters of search_params are described as follows:
| Expression type | Parameter name | Parameter description |
|---|---|---|
| Top-level keyword parameters | ||
| query | Can be used alone for full-text search and can be used together with the knn parameter for hybrid search. |
|
| knn | Can be used alone for vector search and can be used together with the query parameter for hybrid search. |
|
| rank (optional) | Specifies the sorting strategy for hybrid search. The RRF (Reciprocal Rank Fusion) algorithm is supported.
NoticeThis parameter is supported starting from V4.4.1 BP0 HotFix1. |
|
| _source (optional) | Specifies the columns to be returned in the search results. If not specified, all user-defined columns in the table are returned. | |
| from (optional) | Specifies the starting row for returning search results. If not specified, the default is the first row. It must be used together with the size parameter. |
|
| size (optional) | Specifies the maximum number of search results to return. If not specified, no limit is applied. | |
| es_mode | Specifies whether to convert full-text search to the new ESQL syntax. The default value is false. |
|
| bool | must | Must be satisfied. Scoring is required. When nested bool expressions are needed, the conditions within the bool expression are combined using AND logic by default. |
| should | Should be satisfied. Similar to OR. Scoring is required. When nested bool expressions are needed, the conditions within the bool expression are combined using AND logic by default. | |
| must_not | Must not be satisfied. Scoring is not required. Converted to a 'NOT' expression. Conditions within must_not are combined using AND logic. When nested bool expressions are needed, the conditions within the bool expression are combined using AND logic by default. | |
| filter | Must be satisfied. Scoring is not required. Converted to an 'AND' expression. When nested bool expressions are needed, the conditions within the bool expression are combined using AND logic by default. | |
| boost (optional) | Query weight. For more information, see the detailed description of the boost parameter below. | |
| rank_feature (parameters for calculating relevance scores) | pivot | A required parameter for the saturation and sigmoid algorithms. The default value is the geometric mean of the data in this list. |
| positive_score_impact (optional) | Specifies whether the field's impact on the final relevance is positive or negative. | |
| scaling_factor | A required parameter for the log scoring calculation formula. | |
| exponent | A required parameter for the sigmoid algorithm calculation formula. | |
| rank_feature (relevance score algorithms) | saturation | Default relevance score algorithm
|
| sigmoid |
|
|
| log |
|
|
| linear |
|
|
| single term (single-term search) | ||
| range | Range search. Used with gte, gt, lte, lt, and boost. fieldname is required. |
|
| match | Fuzzy match. Converted to the SQL 'match' expression. Used with boost. | |
| term | Exact match. Supports scalar values such as strings, numbers, and boolean values. Converted to the SQL '=' expression. Used with boost. | |
| query_string | Full-text match. Converted to a combination of multiple 'match' expressions in SQL. | |
| multi_match | Full-text match. Converted to a combination of multiple 'match' expressions in SQL. Similar to query_string, but single keywords do not support weights.
NoticeThis parameter is supported starting from V4.4.1 BP0 HotFix1. |
|
| fields | List of text search fields. Weights can be configured for each column. | |
| query | Retrieves a list of keywords, and you can configure the weight of each keyword. | |
| minimum_should_match (optional) | Controls the number of conditions that must be satisfied in the should or query_string clauses. If this parameter is not specified, the default value is 1. Note: If the bool expression contains must/filter clauses and this parameter is not specified, the default value will be 0, meaning that the should conditions do not need to be satisfied. |
|
| boost (optional) | The query weight. For more information, see the detailed description of the boost parameter below. | |
| type (optional) | Specifies the matching mode. Supported values: best_fields, cross_fields, most_fields, and phrase. The default value is best_fields if this parameter is not specified. |
|
| default_operator (optional) | A subfield of query_string that specifies the combination logic between multiple keywords. | |
| operator (optional) | A subfield of multi_match that specifies the combination logic between multiple keywords. | |
| knn (vector search) | ||
| field | The vector search field. | |
| k | The number of rows to return for vector search. | |
| query_vector | The search vector. | |
| filter (optional) | The filter condition. | |
| similarity (optional) | Specifies the filter condition for the vector distance. | |
| boost (optional) | The query weight. For more information, see the detailed description of the boost parameter below. | |
| rank (RRF sorting strategy) | rrf | The RRF (Reciprocal Rank Fusion) sorting strategy, which is used to integrate and sort multiple query results in hybrid search.
NoticeThis parameter is supported starting from V4.4.1 BP0 HotFix1. |
| rank_window_size (optional) | This parameter specifies the size of the result set returned for each query. A larger value indicates higher relevance but also higher performance overhead. The final sorted result set will be trimmed to the specified size.rank_window_size must meet the following conditions:
size parameter. |
|
| rank_constant (optional) | This parameter controls the impact of each document in the result set on the final sorting result. A larger value indicates a greater impact of documents ranked lower on the final result. The default value is 60. |
boost parameter details
The boost parameter specifies the weight of the query conditions in the final relevance calculation. The value must be ≥ 0, and the default value is 1 if not specified. In the syntax structure above, the bool, single_term (excluding rank_feature), and knn parameters support specifying the boost parameter. The usage of the boost parameter varies by search type:
Query-level boost
Specifies the weight for the entire query condition, for example:
{ "bool": { "must": [{"term": {"category": "Gaming"}}], "boost": 2.0 // Weight for the entire bool query } }Field-level boost
Specifies the weight for a specific field's query:
{ "query_string": { "fields": ["product_name", "description"], "query": "gaming keyboard", "boost": 1.5 // Weight for the entire query_string query } }Match value-level boost
Specifies the weight for a specific match value (only supported by match and term queries):
{ "match" : { "product_name": { "query" : "gaming keyboard", "boost" : 1.5 } } }Field weight syntax
In query_string and multi_match, you can use the
field_name^weightsyntax:{ "query_string": { "fields": ["product_name^2.0", "description^1.0"], "query": "gaming" } }
Examples
Create a table with one vector column, create a vector index for the vector column, and two varchar columns, and create full-text indexes for the two varchar columns.
CREATE TABLE doc_table(c1 INT, vector VECTOR(3), query VARCHAR(255), content VARCHAR(255),
VECTOR INDEX idx1(vector) WITH (distance=l2, type=hnsw, lib=vsag), FULLTEXT INDEX idx2(query),
FULLTEXT INDEX idx3(content));
Set the search parameters.
SET @parm = '{
"query": {
"bool": {
"should": [
{"match": {"query": "hi hello"}},
{"match": { "content": "oceanbase mysql" }}
],
"filter": [
{"term": { "content" : "postgres" }}
]
}
},
"knn" : {
"field": "vector",
"k": 5,
"query_vector": [1,2,3]
},
"_source" : ["query", "content", "_keyword_score", "_semantic_score"]
}';
Execute the query and return the query result.
SELECT DBMS_HYBRID_SEARCH.GET_SQL('doc_table', @parm);
The result is as follows:
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| dbms_hybrid_search.get_sql('doc_table', @parm) |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| SELECT query, content, _keyword_score, _semantic_score, (ifnull(_fts._keyword_score, 0) + ifnull(_vs._semantic_score, 0)) as _score FROM ((SELECT /*+ opt_param('hidden_column_visible', 'true') union_merge( doc_table idx2 idx3)*/__pk_increment, (match(query) against('hi hello' in boolean mode) + match(content) against('oceanbase mysql' in boolean mode)) as _keyword_score FROM wxj.doc_table WHERE (content = 'postgres') ORDER BY _keyword_score DESC) _fts right join (SELECT /*+ opt_param('hidden_column_visible', 'true') */l2_distance(vector, '[1, 2, 3]') as _distance, __pk_increment, query, content, round((1 / (1 + l2_distance(vector, '[1, 2, 3]'))), 8) as _semantic_score FROM wxj.doc_table ORDER BY _distance APPROXIMATE LIMIT 5) _vs on (_fts.__pk_increment = _vs.__pk_increment)) ORDER BY _score DESC |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+