A JSON value must be an object (JSON object), array, string, number, boolean value (false/true), or null. The values false, true, and null must be in lowercase.
JSON text structure
The JSON text structure includes characters, strings, numbers, and three literal names. Various separators, including spaces, tabs, line breaks, and carriage returns, can be used before or after any structural character.
Start array = [ Left bracket
Start object = { Left brace
End array = ] Right bracket
End object = } Right brace
Name separator = : Colon
Value separator = , Comma
Object
The structure of an object is represented by a pair of braces containing 0 or more name-value pairs (or members). The names within an object should be unique. A name is a string, and each name is followed by a colon to separate it from its value. A single comma is used to separate multiple name-value pairs. Example:
{ "NAME": "SAM", "Height": 175, "Weight": 100, "Registered" : false}
Array
The structure of an array is represented by a pair of brackets containing 0 or more values (also known as elements). Array elements are separated by commas, and the values in an array do not need to be the same.
Example:
["abc", 10, null, true, false]
Number
Numbers are represented in decimal format and include an integer component, which may be prefixed with an optional minus sign (-). A number can also have a fractional part and/or an exponent part. Leading zeros are not allowed. The fractional part is a decimal point followed by one or more digits. The exponent part starts with an uppercase or lowercase letter E, which can be followed by a plus sign (+) or a minus sign (-). After the E and the optional sign, one or more digits follow.
Example:
[100, 0, -100, 100.11, -12.11, 10.22e2, -10.22e2]
String
A string starts and ends with quotation marks (""). All Unicode characters can be placed within quotation marks, except for characters that must be escaped (including quotation marks, backslashes, and control characters).
JSON text should be encoded in UTF-8, UTF-16, or UTF-32. The default encoding is UTF-8.
Example:
{"Url": "http://www.example.com/image/481989943"}
Create a JSON value
OceanBase Database supports the following DDL operations on JSON data types:
Create a table with a JSON column.
Add or drop a JSON column.
Create an index on a generated column of the JSON data type.
Enable semi-structured encoding when creating a table in a MySQL tenant.
Enable semi-structured encoding for an existing table in a MySQL tenant.
Limitations
You can create multiple JSON columns in a table, subject to the following limitations:
JSON columns cannot be used as
PRIMARY KEY,FOREIGN KEY, orUNIQUE KEYcolumns, but you can addNOT NULLorCHECKconstraints to them.JSON columns cannot have default values.
JSON columns cannot be used as partitioning keys.
JSON data cannot exceed the length of
LONGTEXT, and the maximum depth of each JSON object or array is 99.
Examples
Create or modify a JSON column
obclient> CREATE TABLE tbl1 (id INT PRIMARY KEY, docs JSON NOT NULL, docs1 JSON);
Query OK, 0 rows affected
obclient> ALTER TABLE tbl1 MODIFY docs JSON CHECK(docs <'{"a" : 100}');
Query OK, 0 rows affected
obclient> CREATE TABLE json_tab(
id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT COMMENT 'Primary key',
json_info JSON COMMENT 'JSON data',
json_id INT GENERATED ALWAYS AS (json_info -> '$.id') COMMENT 'Virtual column for JSON data',
json_name VARCHAR(5) GENERATED ALWAYS AS (json_info -> '$.NAME'),
index json_info_id_idx (json_id)
)COMMENT 'JSON sample table';
Query OK, 0 rows affected
obclient> ALTER TABLE json_tab ADD COLUMN json_info1 JSON;
Query OK, 0 rows affected
obclient> ALTER TABLE json_tab ADD INDEX (json_name);
Query OK, 0 rows affected
obclient> ALTER TABLE json_tab drop COLUMN json_info1;
Query OK, 0 rows affected
Create an index for a generated column
obclient> CREATE TABLE jn ( c JSON, g INT GENERATED ALWAYS AS (c->"$.id"));
Query OK, 0 rows affected
obclient> CREATE INDEX idx1 ON jn(g);
Query OK, 0 rows affected
Records: 0 Duplicates: 0 Warnings: 0
obclient> INSERT INTO jn (c) VALUES
('{"id": "1", "name": "Fred"}'), ('{"id": "2", "name": "Wilma"}'),
('{"id": "3", "name": "Barney"}'), ('{"id": "4", "name": "Betty"}');
Query OK, 4 rows affected
Records: 4 Duplicates: 0 Warnings: 0
obclient> SELECT c->>"$.name" AS name FROM jn WHERE g <= 2;
+-------+
| name |
+-------+
| Fred |
| Wilma |
+-------+
2 rows in set
obclient> EXPLAIN SELECT c->>"$.name" AS name FROM jn WHERE g <= 2\G
*************************** 1. row ***************************
Query Plan: =========================================
|ID|OPERATOR |NAME |EST. ROWS|COST|
-----------------------------------------
|0 |TABLE SCAN|jemp(idx1)|2 |92 |
=========================================
Outputs & filters:
-------------------------------------
0 - output([JSON_UNQUOTE(JSON_EXTRACT(jemp.c, '$.name'))]), filter(nil),
access([jemp.c]), partitions(p0)
1 row in set
Use semi-structured encoding
Notice
Starting from OceanBase Database V4.4.1, the table-level parameter SEMISTRUCT_ENCODING_TYPE is deprecated. Instead, use SEMISTRUCT_PROPERTIES.
OceanBase Database allows you to enable semi-structured encoding when creating a table in a MySQL tenant. This is controlled by the table-level parameter SEMISTRUCT_PROPERTIES. Additionally, you must set the table's ROW_FORMAT=COMPRESSED. Otherwise, an error will occur:
- When
SEMISTRUCT_PROPERTIES=(encoding_type=encoding), the table is considered a semi-structured table, meaning that all JSON columns in the table will be enabled for semi-structured encoding. - When
SEMISTRUCT_PROPERTIES=(encoding_type=none), the table is considered a structured table. - You can also set the frequency threshold using the
freq_thresholdparameter. For more information about the syntax and parameters, see CREATE TABLE. - The
encoding_typeandfreq_thresholdparameters are supported only for Online DDL operations and are not supported for Offline DDL operations.
Enable semi-structured encoding
Notice
Before enabling semi-structured encoding, make sure that the cluster configuration item micro_block_merge_verify_level is set to the default value
2. Do not disable micro-block merge verification.Example of enabling semi-structured encoding when creating a tableExample of enabling semi-structured encoding for an existing tableCREATE TABLE t1( j json) ROW_FORMAT=COMPRESSED SEMISTRUCT_PROPERTIES=(encoding_type=encoding, freq_threshold=50);For more information about the syntax, see CREATE TABLE.
CREATE TABLE t1(j json); ALTER TABLE t1 SET ROW_FORMAT=COMPRESSED SEMISTRUCT_PROPERTIES = (encoding_type=encoding, freq_threshold=50);For more information about the syntax, see ALTER TABLE.
Some modification limitations:
- Modifying the frequency threshold does not take effect if semi-structured encoding is not enabled.
- The
freq_thresholdparameter cannot be modified during a direct load or when the table is locked. - Modifying one sub-configuration item does not affect the others.
Disable semi-structured encoding
When
SEMISTRUCT_PROPERTIESis set to(encoding_type=none), semi-structured encoding is disabled. This operation does not affect existing data; it only affects new data written after the change. Here is an example of disabling semi-structured encoding:ALTER TABLE t1 SET ROW_FORMAT=COMPRESSED SEMISTRUCT_PROPERTIES = (encoding_type=none);Query the semi-structured encoding configuration
Use the
SHOW CREATE TABLEstatement to query the semi-structured encoding configuration. Here is an example statement:SHOW CREATE TABLE t1;The result is as follows:
+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | t1 | CREATE TABLE `t1` ( `j` json DEFAULT NULL ) ORGANIZATION INDEX DEFAULT CHARSET = utf8mb4 ROW_FORMAT = COMPRESSED COMPRESSION = 'zstd_1.3.8' REPLICA_NUM = 1 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE ENABLE_MACRO_BLOCK_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0 SEMISTRUCT_PROPERTIES=(ENCODING_TYPE=ENCODING, FREQ_THRESHOLD=50) | +-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in setWhen
SEMISTRUCT_PROPERTIES=(encoding_type=encoding), the configuration item information is displayed, indicating that semi-structured encoding is enabled.
Using semi-structured encoding can improve the performance of JSON_VALUE() function for condition-based filtering queries. Based on JSON semi-structured encoding technology, OceanBase Database optimizes the performance of JSON_VALUE expression-based condition filtering queries. Since the JSON data is split into subcolumns, the system can directly filter based on the encoded subcolumn data without needing to reconstruct the full JSON structure, significantly improving query efficiency.
Here is an example query:
-- Query rows where the name field is 'Devin'
SELECT * FROM t WHERE JSON_VALUE(j_doc, '$.name' RETURNING CHAR) = 'Devin';
Character set considerations:
OceanBase Database uses
utf8_binencoding for JSON data.To ensure that string white-box filtering works correctly, we recommend that you set:
SET @@collation_server = 'utf8mb4_bin'; SET @@collation_connection='utf8mb4_bin';
