In OceanBase Database, you can use dot notation and JSON functions to access JSON data. OceanBase Database recommends that you directly store JSON data in the JSON base type.
Access JSON data by using dot notation
The dot notation syntax is essentially a table alias, followed by a JSON column name, and then one or more field names, separated by periods (.) to indicate the fields.
table_alias.json_col_name.json_field
# or
.json_field followed by array_step.(field_name)
When using the dot notation syntax as a query parameter in SQL, the following constraints apply:
table_aliasmust be an alias for the table.json_col_namemust be valid JSON data (with theIS JSONconstraint or of the JSON type).array_stepcannot be used alone and must followjson_field, for example:.json_field[1,2].- The identifier length for each node must not exceed 128 bytes.
- The size of the query return value must not exceed 4k. If it does, the return value is
NULL.
In the following example, po is an alias for the j_purchaseorder table, po_doc is a JSON column or a VARCHAR2, BLOB, or CLOB column with the IS JSON constraint, and PONumber is the field name in the JSON data.
SELECT po.po_doc.PONumber FROM j_purchaseorder po;
The dot notation syntax can also be used as a parameter for a function and can call methods supported by JSON Path. Here is an example:
SELECT SUBSTR(po.po_doc.PONumber.string(),2,2) FROM j_purchaseorder po;
Access JSON data by using JSON path
Because JSON files are hierarchical, you can use a path expression to extract a part of a JSON document, modify a part of a JSON document, or specify a location in the document to operate on. JSON path matches JSON data by using JSON functions and conditions, and selects zero or more matching or satisfying JSON values. JSON path can use wildcards and array ranges, and the match is case-sensitive.
JSON Path syntax
The basic syntax of JSON Path consists of a context symbol ($) followed by zero or more object, array, and recursive steps. Each node can be followed by a filter expression or a function node. JSON Path (also known as a path expression) includes absolute path expressions or relative path expressions.
- An absolute path expression starts with a
$symbol and is followed by zero or more nodes (for example, in the expression $.filed_a.field_b,$,field_a, andfield_bare all nodes). - A relative path expression is denoted by an
@symbol and is similar to an absolute path expression, except that it uses only relative path expressions in the filter expression.
In a basic path expression, a single function node (Function Step) is optional and typically follows the item_method. If present, it is the last step in the path expression. An object node (Object Step) is a period (.) followed by a field or a wildcard (*), representing one or all fields.
An array node (Array Step) is a left bracket ([) followed by a star wildcard (*) and a right bracket (]). It represents all array elements, one or more specific array indices, or a specified range separated by commas (,).
In a path expression, array indexing specifies a single array position. Array indices can be integer literals (0, 1, 2, ...). Array positions and indices are zero-based, with the first array element having an index of 0 (specifying position 0). You can use the last keyword to reference the last element of any non-empty array. Array indexing can also be in the last - N format, where - represents subtraction. N is an integer literal (0, 1, 2, ...) that is not greater than "array size - 1". Ranges are specified as "N to M", for example, "3 to 1", "2 to 4", and "last-1 to last-2". For example, in the original data ["1", "2", "3", "4", "5", "6", "7", "8", "9"], the selected objects are [3 to 1, 2 to 4, last-1 to last-2, 0, 0], resulting in ["2", "3", "4", "3", "4", "5", "7", "8", "1", "1"].
A descendant node (Descendant Step) is denoted by two consecutive periods (..) followed by a field. It recursively descends into objects or arrays that match the preceding node and returns all collected field values. Here is an example:
obclient> SELECT JSON_QUERY('{ "a" : { "b" : { "z" : 1 }, "c" : [ 5, { "z" : 2 } ], "z" : 3 }, "z" : 4 }', '$.a..z' WITH ARRAY WRAPPER) FROM DUAL;
+----------------------------------------------------------------------------------------+
| JSON_QUERY('{"A":{"B":{"Z":1},"C":[5,{"Z":2}],"Z":3},"Z":4}','$.A..Z'WITHARRAYWRAPPER) |
+----------------------------------------------------------------------------------------+
| [3,1,2] |
+----------------------------------------------------------------------------------------+
1 row in set
A filter expression (Filter Expression, abbreviated as Filter) is denoted by a question mark (?) followed by a filter condition enclosed in parentheses (()). If the filter condition is met, it returns true. A filter condition (Filter Condition) uses a predicate (boolean function) as an argument. The filter condition can be in the following forms, where each cond, cond1, and cond2 represents a filter condition.
(cond): Parentheses are used for grouping, separating the filter conditioncondfrom other filter conditions before or after it.cond1 && cond2: Requires bothcond1andcond2to be satisfied.cond1 || cond2: Requires eithercond1orcond2or both to be satisfied.! (cond): The negation ofcond, meaningcondmust not be satisfied.exists(followed by a relative path expression): The target data exists under the specified conditions.- A comparison predicate has the following forms:
- A relative path expression, followed by a comparison predicate, followed by a JSON scalar value or a JSON variable.
- A JSON scalar value or a JSON variable, followed by a comparison predicate, followed by a relative path expression.
- A JSON scalar value, followed by a comparison predicate, followed by another JSON scalar value.
- The supported connection predicates are
&&,||, and!. The supported comparison predicates are>,>=,<,<=,==, and!=. For strings, the supported comparison predicates arehas substringandstarts with. The comparison predicateslike,like_regex, andeq_regexare not supported.
Here are some examples of accessing JSON data using JSON Path:
$.fruits # The value of the fruits field in an object
$.fruits[0] # The first element of an array object
$.fruits[0].name # The value of the name field in an object, which is the first element of the fruits array
$.fruits[*].name # The value of the name field in each object of the fruits array
$.*[*].name # The value of the name field in all array objects contained in the object
$.fruits[3, 7 to 10, 12] # The 4th, 8th to 11th, and 13th elements of the fruits array. The elements must be specified in ascending order.
$.fruits[3].price # The value of the price field in the 4th element of the fruits array
$.fruits[3].* # The value of the 4th element of the fruits array
$.fruits[3].price[0].year # The value of the year field in the 1st element of the price array in the 4th element of the fruits array.
$.fruits[3].price[0]?(@.year > 2016) # The value of the 1st element of the price array in the 4th element of the fruits array, if the following conditions are met: the value of the year field in the 1st element of the price array can be converted to a number and is greater than 2016
$.fruits[3].price[0]?(@.year.number() > 2016) # The same as the previous example, but the year field value must have a number() method (which returns a number, such as 2017, for strings and arrays like "2017") and the method's return value must be greater than 2016
$.fruits[3].price[0]?(@.year.numberOnly() > 2016) # The same as the previous example, but only the numberOnly() method of the year field value is used, excluding cases where the year value is a string number (such as "2017").
$.fruits[3]?(@.produce.city == "San Francisco") # The 4th element of the fruits array, provided it has a produce field whose value is an object with a city field whose value is the string "San Francisco".
JSON Path also supports a relaxed syntax. The difference between the relaxed and strict syntax is in how arrays are handled. In relaxed mode, accessed JSON data is automatically wrapped in array type encapsulation or the data encapsulation is removed.
For example, in relaxed mode, $.fruits is equivalent to $.fruits or $[*].fruits in strict mode. In relaxed mode, $.fruits[0].name is equivalent to $.fruit[0].name, $.fruit.name, $[*].fruits.name, or $[*].fruits[0].name in strict mode.
Function method (item_method)
The function method (item_method) of JSON Path is the final step in a path expression. It can convert the target JSON data into (possibly other) JSON data. However, queries using path expressions (with or without item_method) can return data in SQL data types that do not support JSON data. The following table lists the function methods supported in the current version.
| Method | Description |
|---|---|
| abs() | Returns the absolute value of a JSON data value of the numeric type. Corresponds to the SQL function ABS(). |
| boolean()/ | |
| booleanOnly() | Converts a JSON data value to an SQL VARCHAR2 value. |
| ceiling() | Rounds a JSON data value of the numeric type to the nearest integer. Corresponds to the SQL function CEIL(). |
| floor() | Rounds a JSON data value of the numeric type to the nearest integer. Corresponds to the SQL function FLOOR(). |
| double() | Converts a JSON data value of the STRING or NUMBER type to an SQL BINARY DOUBLE value. |
| type() | Returns the name of the JSON data type of the target data.
|
| size() | Returns the size of a JSON item. |
| number() /numberOnly() | Converts a JSON data value of the STRING or NUMBER type to an SQL NUMBER value. |
| string()/stringOnly() | Returns the string representation of the target JSON data. This representation is the same as the VARCHAR2 type returned by the RETURNING clause used in JSON functions. (For the STRING type, true is "true" and false is "false". If the value is null, "null" is returned. For a numeric value, the value is returned in its standard form.) Any errors during conversion are ignored. |
| length() | Returns the number of characters in the target JSON string as an SQL NUMBER value. |
| lower() | Returns the lowercase form of the target string, which is a JSON string. |
| upper() | Returns the uppercase form of the target string, which is a JSON string. |
| timestamp() | Converts a JSON data value of the STRING or NUMBER type to an SQL TIMESTAMP value. The string must be in ISO date format. |
| date() | Converts a JSON data value of the STRING or NUMBER type to an SQL DATE value. The string must be in ISO date format. |
In addition to JSON data types, a set of SQL functions can be used to create, query, and modify JSON values. For more information, see JSON functions.