JSON values must be objects (JSON objects), arrays, strings, numbers, boolean values (false/true), or null. The values false, true, and null must be in lowercase.
JSON text structure
The JSON text structure includes characters, strings, numbers, and three literal names. Various separators, including spaces, horizontal tabs, line breaks, and carriage returns, can be used before or after any structure character.
Start array = [ Left bracket
Start object = { Left brace
End array = ] Right bracket
End object = } Right brace
Name separator = : Colon
Value separator = , Comma
Object
The structure of an object is represented by a pair of braces containing 0 or more name/value pairs (or members). The names within an object should be unique. A name is a string, followed by a colon to separate the name from its value. A single comma separates multiple name/value pairs. Here is an example:
{ "NAME": "SAM", "Height": 175, "Weight": 100, "Registered" : false}
Array
The structure of an array is represented by a pair of brackets containing 0 or more values (also known as elements). Array elements are separated by commas, and the values in the array do not need to be the same.
Here is an example:
["abc", 10, null, true, false]
Number
Numbers are in decimal format and include an integer component, which may be prefixed with an optional minus sign (-). A fractional part and/or an exponent part may follow the integer component. Leading zeros are not allowed. The fractional part consists of a decimal point followed by one or more digits. The exponent part starts with an uppercase or lowercase letter E, followed by an optional plus sign (+) or minus sign (-). After the E and the optional sign, one or more digits follow.
Here is an example:
[100, 0, -100, 100.11, -12.11, 10.22e2, -10.22e2]
String
A string is enclosed in double quotation marks (""). All Unicode characters can be placed within the quotation marks, except for characters that must be escaped (including quotation marks, backslashes, and control characters).
JSON text should be encoded in UTF-8, UTF-16, or UTF-32. The default encoding is UTF-8.
Here is an example:
{"Url": "http://www.example.com/image/481989943"}
Create a JSON value
OceanBase Database supports the following DDL operations for JSON data types:
Create a table with a JSON column.
Add or drop a JSON column.
Create an index on a generated column of the JSON data type.
Enable semi-structured encoding when creating a table in a MySQL-compatible tenant.
Enable semi-structured encoding for an existing table in a MySQL-compatible tenant.
Limitations
You can create multiple JSON columns in a table, subject to the following limitations:
JSON columns cannot be defined as
PRIMARY KEY,FOREIGN KEY, orUNIQUE KEY, but you can addNOT NULLorCHECKconstraints to them.JSON columns cannot have default values.
JSON columns cannot be used as partitioning keys.
The length of JSON data cannot exceed the length of
LONGTEXT, and the maximum depth of each JSON object or array is 99.
Examples
Create or modify a JSON column
obclient> CREATE TABLE tbl1 (id INT PRIMARY KEY, docs JSON NOT NULL, docs1 JSON);
Query OK, 0 rows affected
obclient> ALTER TABLE tbl1 MODIFY docs JSON CHECK(docs <'{"a" : 100}');
Query OK, 0 rows affected
obclient> CREATE TABLE json_tab(
id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT COMMENT 'Primary key',
json_info JSON COMMENT 'JSON data',
json_id INT GENERATED ALWAYS AS (json_info -> '$.id') COMMENT 'Virtual column for JSON data',
json_name VARCHAR(5) GENERATED ALWAYS AS (json_info -> '$.NAME'),
index json_info_id_idx (json_id)
)COMMENT 'JSON sample table';
Query OK, 0 rows affected
obclient> ALTER TABLE json_tab ADD COLUMN json_info1 JSON;
Query OK, 0 rows affected
obclient> ALTER TABLE json_tab ADD INDEX (json_name);
Query OK, 0 rows affected
obclient> ALTER TABLE json_tab drop COLUMN json_info1;
Query OK, 0 rows affected
Create an index for a generated column
obclient> CREATE TABLE jn ( c JSON, g INT GENERATED ALWAYS AS (c->"$.id"));
Query OK, 0 rows affected
obclient> CREATE INDEX idx1 ON jn(g);
Query OK, 0 rows affected
Records: 0 Duplicates: 0 Warnings: 0
obclient> INSERT INTO jn (c) VALUES
('{"id": "1", "name": "Fred"}'), ('{"id": "2", "name": "Wilma"}'),
('{"id": "3", "name": "Barney"}'), ('{"id": "4", "name": "Betty"}');
Query OK, 4 rows affected
Records: 4 Duplicates: 0 Warnings: 0
obclient> SELECT c->>"$.name" AS name FROM jn WHERE g <= 2;
+-------+
| name |
+-------+
| Fred |
| Wilma |
+-------+
2 rows in set
obclient> EXPLAIN SELECT c->>"$.name" AS name FROM jn WHERE g <= 2\G
*************************** 1. row ***************************
Query Plan: =========================================
|ID|OPERATOR |NAME |EST. ROWS|COST|
-----------------------------------------
|0 |TABLE SCAN|jemp(idx1)|2 |92 |
=========================================
Outputs & filters:
-------------------------------------
0 - output([JSON_UNQUOTE(JSON_EXTRACT(jemp.c, '$.name'))]), filter(nil),
access([jemp.c]), partitions(p0)
1 row in set
Use semi-structured encoding
Notice
Starting from OceanBase Database V4.4.1, the table-level parameter SEMISTRUCT_ENCODING_TYPE is deprecated. Use SEMISTRUCT_PROPERTIES instead.
OceanBase Database allows you to enable semi-structured encoding when you create a table in a MySQL-compatible tenant. This feature is controlled by the table-level parameter SEMISTRUCT_PROPERTIES. You must also set the ROW_FORMAT=COMPRESSED parameter for the table. Otherwise, an error will be returned:
- When
SEMISTRUCT_PROPERTIES=(encoding_type=encoding), the table is considered a semi-structured table. In this case, semi-structured encoding is enabled for all JSON columns in the table. - When
SEMISTRUCT_PROPERTIES=(encoding_type=none), the table is considered a structured table. - You can also use the
freq_thresholdparameter to set the frequency threshold. For more information, see CREATE TABLE. - The
encoding_typeandfreq_thresholdparameters are supported only in Online DDL operations. They are not supported in Offline DDL operations.
Enable semi-structured encoding
Notice
Before you enable semi-structured encoding, make sure that the cluster parameter micro_block_merge_verify_level is set to the default value
2. Do not disable microblock merge verification.Example of enabling semi-structured encoding when you create a tableExample of enabling semi-structured encoding for an existing tableCREATE TABLE t1( j json) ROW_FORMAT=COMPRESSED SEMISTRUCT_PROPERTIES=(encoding_type=encoding, freq_threshold=50);For more information about the syntax, see CREATE TABLE.
CREATE TABLE t1(j json); ALTER TABLE t1 SET ROW_FORMAT=COMPRESSED SEMISTRUCT_PROPERTIES = (encoding_type=encoding, freq_threshold=50);For more information about the syntax, see ALTER TABLE.
Some limitations:
- If you modify the frequency threshold without enabling semi-structured encoding, no error will be returned, but the modification will not take effect.
- If you perform a direct load or lock the table, you cannot modify the
freq_thresholdparameter. - If you modify one subparameter, the other subparameters remain unchanged.
Disable semi-structured encoding
When
SEMISTRUCT_PROPERTIESis set to(encoding_type=none), semi-structured encoding is disabled. This operation does not affect existing data. It only takes effect on subsequent writes. Here is an example of disabling semi-structured encoding:ALTER TABLE t1 SET ROW_FORMAT=COMPRESSED SEMISTRUCT_PROPERTIES = (encoding_type=none);Query the semi-structured encoding configuration
You can query the semi-structured encoding configuration by using the
SHOW CREATE TABLEstatement. Here is an example:SHOW CREATE TABLE t1;The returned result is as follows:
+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | t1 | CREATE TABLE `t1` ( `j` json DEFAULT NULL ) ORGANIZATION INDEX DEFAULT CHARSET = utf8mb4 ROW_FORMAT = COMPRESSED COMPRESSION = 'zstd_1.3.8' REPLICA_NUM = 1 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE ENABLE_MACRO_BLOCK_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0 SEMISTRUCT_PROPERTIES=(ENCODING_TYPE=ENCODING, FREQ_THRESHOLD=50) | +-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in setWhen
SEMISTRUCT_PROPERTIES=(encoding_type=encoding), the configuration item information is displayed, indicating that semi-structured encoding is enabled.
Using semi-structured encoding can improve the performance of JSON_VALUE() queries with conditions. Based on the JSON semi-structured encoding technology, OceanBase Database optimizes the performance of JSON_VALUE expression-based condition filtering queries. Since the JSON data is split into subcolumns, the system can directly filter based on the encoded subcolumn data without reconstructing the complete JSON structure, significantly improving query efficiency.
Here is an example:
-- Query rows where the value of the name field is 'Devin'.
SELECT * FROM t WHERE JSON_VALUE(j_doc, '$.name' RETURNING CHAR) = 'Devin';
Character set considerations:
OceanBase Database uses the
utf8_binencoding for JSON data.To ensure that string whitelisting filtering works properly, we recommend that you set:
SET @@collation_server = 'utf8mb4_bin'; SET @@collation_connection='utf8mb4_bin';