In OceanBase Database, you can use dot notation and JSON functions to access JSON data. OceanBase Database recommends that you store JSON data in the JSON base type.
Access JSON data by using dot notation
The dot notation syntax is essentially a table alias, followed by a JSON column name, and then one or more field names, separated by periods (.) to indicate the fields.
table_alias.json_col_name. json_field
# or
.json_field followed by array_step.(field_name)
When using the dot notation syntax as a query parameter in SQL, the following constraints apply:
table_aliasmust be an alias for the table.json_col_namemust be valid JSON data (with theIS JSONconstraint or itself being of the JSON type).array_stepcannot exist alone and must followjson_field, for example:.json_field[1,2].- The identifier length for each node must not exceed 128 bytes.
- The size of the query return value must not exceed 4k; if it does, the return value is
NULL.
In the following example, po is an alias for the j_purchaseorder table, po_doc is a JSON column or a VARCHAR2, BLOB, or CLOB type with the IS JSON constraint, and PONumber is the field name in the JSON data.
SELECT po.po_doc.PONumber FROM j_purchaseorder po;
The dot notation syntax can also be used as a parameter for functions and can invoke methods supported by JSON Path. Here is an example:
SELECT SUBSTR(po.po_doc.PONumber.string(),2,2) FROM j_purchaseorder po;
Access JSON data by using JSON path
Because JSON files are hierarchical, you must use a path expression to extract parts of a JSON document, modify parts of a JSON document, or specify the location in the document to operate on. JSON Path matches JSON data by using JSON functions and conditions, and selects zero or more matching or satisfying JSON values. JSON Path can use wildcards and array ranges, and the matching is case-sensitive.
JSON Path syntax
The basic syntax of JSON Path consists of a context symbol ($) followed by zero or more object, array, and recursive steps. Each node can be followed by a filter expression or a function node. JSON Path (also known as a path expression) can be an absolute path expression or a relative path expression.
- An absolute path expression starts with a
$symbol and is followed by zero or more nodes. For example, in the expression$.filed_a.field_b,$,filed_a, andfiled_bare all nodes. - A relative path expression starts with an
@symbol and is similar to an absolute path expression, except that it only uses relative path expressions in the filter expression.
In a basic path expression, a single function node (Function Step) is optional and typically follows the item_method. If present, it is the last step in the path expression. An object node (Object Step) is a period (.) followed by a field or a wildcard (*), indicating a single field or all fields.
An array node (Array Step) is a left bracket ([) followed by a star wildcard (*) and a right bracket (]). It represents all array elements, one or more specific array indices, or a specified range separated by commas (,).
In a path expression, an array index (Array Indexing) specifies a single array position. The array index can be an integer literal (0, 1, 2, ...). Array positions and indices are zero-based, with the first array element having an index of 0 (specifying position 0). You can use the last index to reference the last element of any non-empty array. The array index can also be in the last - N format, where - represents subtraction. N is an integer literal (0, 1, 2, ...) that is not greater than the array size minus 1. The range is specified as "N to M", for example, "3 to 1", "2 to 4", or "last-1 to last-2". For example, in the original data ["1", "2", "3", "4", "5", "6", "7", "8", "9"], the selected objects are [3 to 1, 2 to 4, last-1 to last-2, 0, 0], and the result is ["2", "3", "4", "3", "4", "5", "7", "8", "1", "1"].
A descendant node (Descendant Step) is represented by two consecutive periods (..) followed by a field. It recursively descends into objects or arrays that match the preceding node and returns all collected field values. Here is an example:
obclient> SELECT JSON_QUERY('{ "a" : { "b" : { "z" : 1 }, "c" : [ 5, { "z" : 2 } ], "z" : 3 }, "z" : 4 }', '$.a..z' WITH ARRAY WRAPPER) FROM DUAL;
+----------------------------------------------------------------------------------------+
| JSON_QUERY('{"A":{"B":{"Z":1},"C":[5,{"Z":2}],"Z":3},"Z":4}','$.A..Z'WITHARRAYWRAPPER) |
+----------------------------------------------------------------------------------------+
| [3,1,2] |
+----------------------------------------------------------------------------------------+
1 row in set
A filter expression (Filter Expression, abbreviated as Filter) is represented by a question mark (?) followed by a filter condition enclosed in parentheses (()). If the filter condition is satisfied, it returns true. A filter condition (Filter Condition) uses a predicate (boolean function) as an argument. The filter condition can be in the following forms, where each cond, cond1, and cond2 represents a filter condition.
(cond): Parentheses are used for grouping, separating the filter conditioncondfrom other filter conditions before or after it.cond1 && cond2:andrequires bothcond1andcond2to be satisfied.cond1 || cond2: Requirescond1orcond2to be satisfied, or both.! (cond): The negation ofcond, meaningcondmust not be satisfied.exists(followed by a relative path expression): The target data exists under the specified condition.- The comparison predicate has the following forms:
- A relative path expression, followed by a comparison predicate, and then a JSON scalar value or JSON variable.
- A JSON scalar value or JSON variable, followed by a comparison predicate, and then a relative path expression.
- A JSON scalar value, followed by a comparison predicate, and then another JSON scalar value.
- The connection predicate supports
&&,||, and!. The comparison predicate supports>,>=,<,<=,==, and!=. For strings, the comparison predicate supportshas substringandstarts with. Other comparison predicates such aslike,like_regex, andeq_regexare not supported.
Here are some examples of accessing JSON data using JSON Path:
$.fruits # The value of the fruits field in the object.
$.fruits[0] # The first element of an array object.
$.fruits[0].name # The value of the name field in the object, which is the first element of the fruits array.
$.fruits[*].name # The value of the name field in each object of the fruits array.
$.*[*].name # The value of the name field in all array objects contained in the object.
$.fruits[3, 7 to 10, 12] # The 4th, 8th to 11th, and 13th elements of the fruits array. The elements must be specified in ascending order.
$.fruits[3].price # The value of the price field in the 4th element of the fruits array.
$.fruits[3].* # The value of the 4th element of the fruits array.
$.fruits[3].price[0].year # The value of the year field in the 1st element of the price array in the 4th element of the fruits array.
$.fruits[3].price[0]?(@.year > 2016) # The value of the 1st element of the price array in the 4th element of the fruits array, if the value of the year field in the 1st element of the price array can be converted to a number and satisfies '> 2016'.
$.fruits[3].price[0]?(@.year.number() > 2016) # Same as the previous example, but the year field value must have the number() method (which returns a number for strings and arrays, such as 2017, "2017", and returns 2017), and the method's return value must be greater than 2016.
$.fruits[3].price[0]?(@.year.numberOnly() > 2016) # Same as the previous example, but only the numberOnly() method of the year field value is used, excluding cases where the year value is a string number (such as "2017").
$.fruits[3]?(@.produce.city == "San Francisco") # The 4th element of the fruits array, provided it has a produce field whose value is an object with a city field whose value is the string "San Francisco".
JSON Path also supports a lenient syntax. The difference between the lenient and strict syntax is in how arrays are handled. In the lenient syntax, accessed JSON data is automatically wrapped in array type or unwrapped.
For example, in the lenient syntax, $.fruits is equivalent to $.fruits or $[*].fruits in the strict syntax. In the lenient syntax, $.fruits[0].name is equivalent to $.fruit[0].name, $.fruit.name, $[*].fruits.name, or $[*].fruits[0].name in the strict syntax.
Item methods of JSON Path
The item methods of JSON Path are the final step in a path expression. They can convert the target JSON data into (possibly other) JSON data. However, queries using path expressions (with or without item methods) can return data in SQL data types that do not support JSON data. The following table lists the item methods supported in the current version.
| Method | Description |
|---|---|
| abs() | Takes the absolute value of a JSON data value of the numeric type. This corresponds to the SQL function ABS(). |
| boolean()/ | |
| booleanOnly() | Converts a JSON data value to an SQL value of the VARCHAR2 type. |
| ceiling() | Rounds a JSON data value of the numeric type to the nearest integer. This corresponds to the SQL function CEIL(). |
| floor() | Rounds a JSON data value of the numeric type to the nearest integer. This corresponds to the SQL function FLOOR(). |
| double() | Converts a JSON data value to an SQL value of the BINARY DOUBLE type based on the target JSON type (STRING or NUMBER). |
| type() | Returns the name of the JSON data type of the target data:
|
| size() | Returns the size of a JSON item. |
| number() /numberOnly() | Converts a JSON data value to an SQL value of the NUMBER type based on the target JSON type (STRING or NUMBER). |
| string()/stringOnly() | Returns the string representation of the target JSON data. This representation is the same as the VARCHAR2 type returned by the RETURNING clause used in JSON functions. (For STRING type, true is represented as "true" and false as "false". If the value is null, "null" is returned. If the value is a number, it is returned in its standard form.) Errors are ignored during the conversion. |
| length() | Returns the number of characters in the target JSON string as an SQL NUMBER value. |
| lower() | Converts a JSON string value to lowercase. |
| upper() | Converts a JSON string value to uppercase. |
| timestamp() | Converts a JSON data value to an SQL TIMESTAMP value based on the target JSON type (STRING or NUMBER). The string must be in ISO date format. |
| date() | Converts a JSON data value to an SQL DATE value based on the target JSON type (STRING or NUMBER). The string must be in ISO date format. |
In addition to JSON data types, a set of SQL functions can be used to create, query, and modify JSON values. For more information, see JSON functions.