In OceanBase Database, you can use dot notation and JSON functions to access JSON data. OceanBase Database recommends that you store JSON data in the JSON data type.
Access JSON data by using dot notation
The syntax for dot notation is essentially an alias for a table, followed by a JSON column name and one or more field names, which are separated by periods (.):
table_alias.json_col_name.json_field
# or
.json_field followed by array_step.(field_name)
When using dot notation as a query parameter in SQL, the following constraints apply:
table_aliasmust be an alias for a table.json_col_namemust be valid JSON data (with theIS JSONconstraint, or it must be of the JSON type).array_stepcannot exist alone and must followjson_field, for example:.json_field[1,2].- The identifier length for each node must not exceed 128 bytes.
- The size of the query return value must not exceed 4k; if it does, the return value is
NULL.
In the following example, po is an alias for the j_purchaseorder table, po_doc is a JSON column or a VARCHAR2, BLOB, or CLOB column with the IS JSON constraint, and PONumber is the field name in the JSON data.
SELECT po.po_doc.PONumber FROM j_purchaseorder po;
Dot notation can also be used as a parameter for a function and can be used to call methods supported by JSON Path. Here is an example:
SELECT SUBSTR(po.po_doc.PONumber.string(),2,2) FROM j_purchaseorder po;
Access JSON data by using JSON path
Because JSON files are hierarchical, you can use a path expression to extract a part of a JSON document, modify a part of a JSON document, or specify a location in the document to operate on. JSON path matches JSON data by using JSON functions and conditions and selects zero or more matching or satisfying JSON values. JSON path can use wildcards and array ranges, and the matching is case-sensitive.
JSON Path syntax
The basic syntax of JSON Path consists of a context symbol ($) followed by zero or more object, array, and recursive steps. Each node can be followed by a filter expression or a function node. JSON Path (also known as a path expression) can contain an absolute path expression or a relative path expression.
- An absolute path expression starts with a $ symbol and is followed by zero or more nodes. For example, in the expression $.filed_a.field_b, the $, filed_a, and filed_b are all nodes.
- A relative path expression starts with an @ symbol. It is similar to an absolute path expression, but it uses only relative path expressions in the filter expression.
In a basic path expression, a single function node (Function Step) is optional and usually follows the item_method. If present, it is the last step in the path expression. An object node (Object Step) is a period (.) followed by a field or a wildcard (*), indicating a single field or all fields.
An array node (Array Step) is an opening bracket ([) followed by a star wildcard (*) and a closing bracket (]). It represents all array elements, one or more specific array indices, or a specified range separated by commas (,).
In a path expression, an array index (Array Indexing) specifies a single array position. Array indices can be integer literals (0, 1, 2, ...). Array positions and indices are zero-based, with the first array element having an index of 0. You can use the last keyword to reference the last element of any non-empty array. Array indices can also be specified in the last - N format, where - represents subtraction. N is an integer literal (0, 1, 2, ...) that is not greater than the array size minus 1. Ranges can be specified as "N to M", such as "3 to 1", "2 to 4", or "last-1 to last-2". For example, in the original data ["1", "2", "3", "4", "5", "6", "7", "8", "9"], the selected objects [3 to 1, 2 to 4, last-1 to last-2, 0, 0] result in ["2", "3", "4", "3", "4", "5", "7", "8", "1", "1"].
A descendant node (Descendant Step) is represented by two consecutive periods (..) followed by a field. It recursively traverses down to objects or arrays that match the previous node and returns all collected field values. Here is an example:
obclient> SELECT JSON_QUERY('{ "a" : { "b" : { "z" : 1 }, "c" : [ 5, { "z" : 2 } ], "z" : 3 }, "z" : 4 }', '$.a..z' WITH ARRAY WRAPPER) FROM DUAL;
+----------------------------------------------------------------------------------------+
| JSON_QUERY('{"A":{"B":{"Z":1},"C":[5,{"Z":2}],"Z":3},"Z":4}','$.A..Z'WITHARRAYWRAPPER) |
+----------------------------------------------------------------------------------------+
| [3,1,2] |
+----------------------------------------------------------------------------------------+
1 row in set
A filter expression (Filter) is a question mark (?) followed by a filter condition enclosed in parentheses (()). If the filter condition is met, it returns true. A filter condition (Filter Condition) uses a predicate (boolean function) as an argument. The filter condition can be in the following forms, where each cond, cond1, and cond2 represents a filter condition.
(cond): Parentheses are used for grouping, separating the filter conditioncondfrom other filter conditions before or after it.cond1 && cond2: Requires bothcond1andcond2to be satisfied.cond1 || cond2: Requires at least one ofcond1orcond2to be satisfied.! (cond): Negatescond, meaningcondmust not be satisfied.exists(followed by a relative path expression): Indicates that the target data exists under the specified conditions.- A comparison can be in the following forms:
- A relative path expression followed by a comparison predicate and a JSON scalar value or JSON variable.
- A JSON scalar value or JSON variable followed by a comparison predicate and a relative path expression.
- A JSON scalar value followed by a comparison predicate and another JSON scalar value.
- The supported logical operators for predicates are
&&,||, and!. The supported comparison operators are>,>=,<,<=,==, and!=. For strings, the supported comparison operators arehas substringandstarts with. The comparison operatorslike,like_regex, andeq_regexare not supported.
Here are some examples of accessing JSON data using JSON Path:
$.fruits # The value of the fruits field in the object.
$.fruits[0] # The first element of the array object.
$.fruits[0].name # The value of the name field in the object, which is the first element of the fruits array.
$.fruits[*].name # The value of the name field in each object of the fruits array.
$.*[*].name # The value of the name field in all array objects contained in the object.
$.fruits[3, 7 to 10, 12] # The 4th, 8th to 11th, and 13th elements of the fruits array. The elements must be specified in ascending order.
$.fruits[3].price # The value of the price field in the 4th element of the fruits array.
$.fruits[3].* # The value of the 4th element of the fruits array.
$.fruits[3].price[0].year # The value of the year field in the 1st element of the price array in the 4th element of the fruits array.
$.fruits[3].price[0]?(@.year > 2016) # The value of the 1st element of the price array in the 4th element of the fruits array, if the following conditions are met: the value of the year field in the 1st element of the price array can be converted to a number and is greater than 2016.
$.fruits[3].price[0]?(@.year.number() > 2016) # Same as the previous example, but the year field value must have a number() method that returns a number (e.g., 2017) when the value is a string or array, and the returned value must be greater than 2016.
$.fruits[3].price[0]?(@.year.numberOnly() > 2016) # Same as the previous example, but the numberOnly() method of the year field value is used, which excludes cases where the year value is a string representing a number (e.g., "2017").
$.fruits[3]?(@.produce.city == "San Francisco") # The 4th element of the fruits array, provided it has a produce field whose value is an object with a city field whose value is the string "San Francisco".
JSON Path also supports a relaxed syntax. The difference between relaxed and strict syntax is in how arrays are handled. In relaxed mode, JSON data accessed from an array is automatically wrapped in an array type or unwrapped from its data encapsulation.
For example, in relaxed mode, $.fruits is equivalent to $.fruits or $[*].fruits in strict mode. In relaxed mode, $.fruits[0].name is equivalent to $.fruit[0].name, $.fruit.name, $[*].fruits.name, or $[*].fruits[0].name in strict mode.
Item method for JSON path (item_method)
The item method for JSON path (item_method) is the last step in a path expression and can convert the target JSON data into (possibly other) JSON data. However, queries using path expressions (with or without item_method) can return SQL data types that do not support JSON data. The following table lists the supported item methods in the current version.
| Method | Description |
|---|---|
| abs() | Takes the absolute value of a JSON data number type, corresponding to the SQL function ABS(). |
| boolean()/ | |
| booleanOnly() | Converts the value of JSON data into an SQL value of the VARCHAR2 type. |
| ceiling() | Rounds a JSON data number type to the nearest integer, corresponding to the SQL function CEIL(). |
| floor() | Rounds a JSON data number type to the nearest integer, corresponding to the SQL function FLOOR(). |
| double() | Converts the target JSON data to the BINARY DOUBLE numeric type in SQL based on the target JSON type (STRING/NUMBER). |
| type() | Returns the name of the JSON data type of the target data:
|
| size() | The size of a JSON item. |
| number() /numberOnly() | Converts the target JSON data to the NUMBER numeric type in SQL based on the target JSON type (STRING/NUMBER). |
| string()/stringOnly() | Returns the string representation of the target JSON data. This representation is the same as the VARCHAR2 type returned by the RETURNING clause used in JSON functions. (For the STRING type, true is "true" and false is "false". If the value is null, "null" is returned. For a number, the value is returned in its standard form.) Errors during conversion are ignored. |
| length() | Calculates the number of characters in the target JSON string and returns it as an SQL NUMBER. |
| lower() | Converts a JSON string to lowercase. |
| upper() | Converts a JSON string to uppercase. |
| timestamp() | Converts the target JSON data to the TIMESTAMP type in SQL based on the target JSON type (STRING / NUMBER). The string must be in ISO date format. |
| date() | Converts the target JSON data to the DATE type in SQL based on the target JSON type (STRING / NUMBER). The string must be in ISO date format. |
In addition to JSON data types, a set of SQL functions can be used to create, query, and modify JSON values. For more information, see JSON functions.
