A JSON value must consist of JSON objects, arrays, strings, numbers, Boolean values (true or false), or nulls. The false, true, and null values must be in lowercase.
JSON text
A JSON text consists of structural characters, strings, numbers, and three literal names. Various delimiters, including spaces, tabs, line breaks, and carriage returns, are allowed before or after any structural character.
Begin array = left bracket ([)
Begin object = left brace ({)
End array = right bracket (])
End object = right brace (})
Name separator = colon (:)
Value separator = comma (,)
Objects
The structure of an object is represented as a pair of braces that enclose zero or more name/value pairs or members. The names within an object must be unique. A name is a string. Each name is followed by a colon to separate it from its value. A single comma separates a value from a name that follows. A single comma is added to separate two name/value pairs. Here is an example:
{ "NAME": "SAM", "Height": 175, "Weight": 100, "Registered" : false}
Arrays
The structure of an array is represented as a pair of brackets that enclose zero or more values. A value is also known as an element. Elements in an array are separated by commas (,), and values in the array can be different.
Here is an example:
["abc", 10, null, true, false]
Numbers
Decimal numbers are used. A number contains an integer component that is prefixed with an optional minus sign (-) and may be followed by a fractional part and/or an exponent part. Leading zeros are not allowed. The fractional part consists of a decimal point and one or more following digits. The exponent part starts with an uppercase or lowercase E, which is followed by an optional plus sign (+) or a minus sign (-). The letter E and the optional sign can be followed by one or more digits.
Here is an example:
[100, 0, -100, 100.11, -12.11, 10.22e2, -10.22e2]
Strings
A string starts and ends with a double quotation mark ("). All Unicode characters can be enclosed in double quotation marks, except for the following characters that must be escaped: quotation marks, backslashes, and control characters.
The JSON text must be encoded in UTF-8, UTF-16, or UTF-32. By default, the JSON text is encoded in UTF-8.
Here is an example:
{"Url": "http://www.example.com/image/481989943"}
Create a JSON value
OceanBase Database supports the following DDL operations for the JSON data type:
Creating tables with JSON columns.
Adding or deleting JSON columns.
Creating indexes on generated columns of the JSON data type.
Enabling semi-structured encoding when creating tables in MySQL tenants.
Modifying existing tables to enable semi-structured encoding in MySQL tenants.
Limitations
You can create multiple JSON columns in each table, but the following limitations apply:
A JSON column cannot be used as the
PRIMARY KEY,FOREIGN KEY, orUNIQUE KEY, but you can add theNOT NULLorCHECKconstraint.A JSON column cannot contain default values.
JSON columns cannot be used as partitioning keys.
The length of the JSON data cannot exceed the value of
LONGTEXT. The maximum depth of each JSON object or array is 99.
Examples
Create/Modify JSON columns
obclient> CREATE TABLE tbl1 (id INT PRIMARY KEY, docs JSON NOT NULL, docs1 JSON);
Query OK, 0 rows affected
obclient> ALTER TABLE tbl1 MODIFY docs JSON CHECK(docs <'{"a" : 100}');
Query OK, 0 rows affected
obclient> CREATE TABLE json_tab(
id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT COMMENT 'primary key',
json_info JSON COMMENT 'JSON data',
json_id INT GENERATED ALWAYS AS (json_info -> '$.id') COMMENT 'virtual JSON data field',
json_name VARCHAR(5) GENERATED ALWAYS AS (json_info -> '$.NAME'),
index json_info_id_idx (json_id)
)COMMENT 'json sample table';
Query OK, 0 rows affected
obclient> ALTER TABLE json_tab ADD COLUMN json_info1 JSON;
Query OK, 0 rows affected
obclient> ALTER TABLE json_tab ADD INDEX (json_name);
Query OK, 0 rows affected
obclient> ALTER TABLE json_tab DROP COLUMN json_info1;
Query OK, 0 rows affected
Create an index for the specified key by using a generated column
obclient> CREATE TABLE jn ( c JSON, g INT GENERATED ALWAYS AS (c->"$.id"));
Query OK, 0 rows affected
obclient> CREATE INDEX idx1 ON jn(g);
Query OK, 0 rows affected
Records: 0 Duplicates: 0 Warnings: 0
obclient> INSERT INTO jn (c) VALUES
('{"id": "1", "name": "Fred"}'), ('{"id": "2", "name": "Wilma"}'),
('{"id": "3", "name": "Barney"}'), ('{"id": "4", "name": "Betty"}');
Query OK, 4 rows affected
Records: 4 Duplicates: 0 Warnings: 0
obclient> SELECT c->>"$.name" AS name FROM jn WHERE g <= 2;
+-------+
| name |
+-------+
| Fred |
| Wilma |
+-------+
2 rows in set
obclient> EXPLAIN SELECT c->>"$.name" AS name FROM jn WHERE g <= 2\G
*************************** 1. row ***************************
Query Plan: =========================================
|ID|OPERATOR |NAME |EST. ROWS|COST|
-----------------------------------------
|0 |TABLE SCAN|jemp(idx1)|2 |92 |
=========================================
Outputs & filters:
-------------------------------------
0 - output([JSON_UNQUOTE(JSON_EXTRACT(jemp.c, '$.name'))]), filter(nil),
access([jemp.c]), partitions(p0)
1 row in set
Use semi-structured encoding
OceanBase Database allows you to enable semi-structured encoding when creating tables in MySQL-compatible tenants. This feature is controlled by the table-level parameter semistruct_encoding_type and requires the table to have row_format=COMPRESSED set. Otherwise, an error will occur. When semistruct_encoding_type='encoding', the table is treated as a semi-structured table, meaning all JSON columns in the table will use semi-structured encoding. If semistruct_encoding_type='', the table is treated as a structured table.
Enable semi-structured encoding
Example: Enable semi-structured encoding during table creationExample: Enable semi-structured encoding for an existing tableCREATE TABLE t1( j json ) row_format=COMPRESSED semistruct_encoding_type = 'encoding';For more syntax details, see CREATE TABLE.
CREATE TABLE t1(j json); ALTER TABLE t1 SET row_format=COMPRESSED semistruct_encoding_type = 'encoding';For more syntax details, see ALTER TABLE.
Disable semi-structured encoding.
The following example shows how to disable semi-structured encoding:
ALTER TABLE t1 SET row_format=COMPRESSED semistruct_encoding_type = '';Query the configuration of semi-structured encoding.
Use the
SHOW CREATE TABLEstatement to query the semi-structured encoding configuration. An example statement is as follows:SHOW CREATE TABLE t1;The returned result is as follows:
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | t1 | CREATE TABLE `t1` ( `j` json DEFAULT NULL ) DEFAULT CHARSET = utf8mb4 ROW_FORMAT = COMPRESSED COMPRESSION = 'zstd_1.3.8' REPLICA_NUM = 1 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE ENABLE_MACRO_BLOCK_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0 SEMISTRUCT_ENCODING_TYPE='ENCODING' | +-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in setWhen
semistruct_encoding_type='encoding', the query will display the parameter information, indicating that the semi-structured encoding feature is enabled.
Using semi-structured encoding can improve the performance of conditional filtering queries with the JSON_VALUE() function. Based on JSON semi-structured encoding technology, OceanBase optimizes the performance of filtering queries using the JSON_VALUE expression. Since JSON data is split into sub-columns, the system can directly filter based on the encoded sub-column data without needing to restore the complete JSON structure, significantly improving query efficiency.
An example query is as follows:
-- Query rows where the name field value is 'Devin'
SELECT * FROM t WHERE JSON_VALUE(j_doc, '$.name' RETURNING CHAR) = 'Devin';
Character set considerations are as follows:
OceanBase uses
utf8_binencoding for JSON.To ensure proper functionality of string-based filtering, it is recommended to set:
SET @@collation_server = 'utf8mb4_bin'; SET @@collation_connection = 'utf8mb4_bin';