The default collation of OceanBase Database is utf8mb4_general_ci.
The following table describes the collations supported by OceanBase Database.
| Collation | Character set | Description |
|---|---|---|
| utf8mb4_general_ci | utf8mb4 | A general collation. |
| utf8mb4_bin | utf8mb4 | A binary collation. |
| utf8mb4_unicode_ci | utf8mb4 | A collation that is based on Unicode Collation Algorithm (UCA). |
| binary | binary | A binary collation. |
| gbk_chinese_ci | gbk | A collation for Chinese. |
| gbk_bin | gbk | A binary collation. |
| utf16_general_ci | utf16 | A general collation. |
| utf16_bin | utf16 | A binary collation. |
| utf16_unicode_ci | utf16 | A collation that is based on UCA. |
| gb18030_chinese_ci | gb18030 | A collation for Chinese. |
| gb18030_bin | gb18030 | A binary collation. |
| latin1_swedish_ci | latin1 | A collation for Swedish/Finnish. |
| latin1_bin | latin1 | A binary collation. |
| gb18030_2022_bin | gb18030_2022 | A binary collation. |
| gb18030_2022_chinese_ci | gb18030_2022 | A Pinyin collation for Chinese. The collation is case-insensitive. This is the default collation for this character set in MySQL mode. |
| gb18030_2022_chinese_cs | gb18030_2022 | A Pinyin collation for Chinese. The collation is case-sensitive. |
| gb18030_2022_radical_ci | gb18030_2022 | A radical stroke collation for Chinese. The collation is case-insensitive. |
| gb18030_2022_radical_cs | gb18030_2022 | A radical stroke collation for Chinese. The collation is case-sensitive. |
| gb18030_2022_stroke_ci | gb18030_2022 | A stroke collation for Chinese. The collation is case-insensitive. |
| gb18030_2022_stroke_cs | gb18030_2022 | A stroke collation for Chinese. The collation is case-sensitive. |
Note
- For any Unicode character set, operations run faster with a general collation (xxx_general_ci) than with a Unicode collation (xxx_unicode_ci).
- You cannot modify the collations.
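The following sketch illustrates the difference between a case-insensitive (_ci) collation and a binary (_bin) collation from the table above by comparing the same two strings under each. It assumes a MySQL-mode tenant and a utf8mb4 connection character set. The literals and column aliases are illustrative only.

```sql
-- Assumes MySQL mode and a utf8mb4 connection character set.
-- Case-insensitive collation: 'a' and 'A' are expected to compare as equal.
SELECT 'a' COLLATE utf8mb4_general_ci = 'A' AS equal_ci;

-- Binary collation: characters are compared by their code values,
-- so 'a' and 'A' are expected to compare as different.
SELECT 'a' COLLATE utf8mb4_bin = 'A' AS equal_bin;
```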
You can execute the SHOW COLLATION statement to display all available collations.
obclient> SHOW COLLATION;
+-------------------------+--------------+-----+---------+----------+---------+
| Collation | Charset | Id | Default | Compiled | Sortlen |
+-------------------------+--------------+-----+---------+----------+---------+
| utf8mb4_general_ci | utf8mb4 | 45 | Yes | Yes | 1 |
| utf8mb4_bin | utf8mb4 | 46 | | Yes | 1 |
| binary | binary | 63 | Yes | Yes | 1 |
| gbk_chinese_ci | gbk | 28 | Yes | Yes | 1 |
| gbk_bin | gbk | 87 | | Yes | 1 |
| utf16_general_ci | utf16 | 54 | Yes | Yes | 1 |
| utf16_bin | utf16 | 55 | | Yes | 1 |
| utf8mb4_unicode_ci | utf8mb4 | 224 | | Yes | 1 |
| utf16_unicode_ci | utf16 | 101 | | Yes | 1 |
| gb18030_chinese_ci | gb18030 | 248 | Yes | Yes | 1 |
| gb18030_bin | gb18030 | 249 | | Yes | 1 |
| latin1_swedish_ci | latin1 | 8 | Yes | Yes | 1 |
| latin1_bin | latin1 | 47 | | Yes | 1 |
| gb18030_2022_bin | gb18030_2022 | 216 | | Yes | 1 |
| gb18030_2022_chinese_ci | gb18030_2022 | 217 | Yes | Yes | 1 |
| gb18030_2022_chinese_cs | gb18030_2022 | 218 | | Yes | 1 |
| gb18030_2022_radical_ci | gb18030_2022 | 219 | | Yes | 1 |
| gb18030_2022_radical_cs | gb18030_2022 | 220 | | Yes | 1 |
| gb18030_2022_stroke_ci | gb18030_2022 | 221 | | Yes | 1 |
| gb18030_2022_stroke_cs | gb18030_2022 | 222 | | Yes | 1 |
+-------------------------+--------------+-----+---------+----------+---------+
20 rows in set
Note
OceanBase Database Community Edition currently does not support utf8mb4_unicode_ci and utf16_unicode_ci.
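When only one character set is of interest, the SHOW COLLATION output can be narrowed. The sketch below assumes that the MySQL-compatible LIKE and WHERE filters are available in your OceanBase Database version. Verify this against the statement reference for your release.

```sql
-- Assumption: MySQL-compatible filter clauses are accepted.
-- Show only collations whose names start with gb18030_2022.
SHOW COLLATION LIKE 'gb18030_2022%';

-- Show only collations that belong to the utf8mb4 character set.
SHOW COLLATION WHERE Charset = 'utf8mb4';
```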
General characteristics of collations are as follows:
- Two different character sets cannot use the same collation.
- Each character set has a default collation. You can execute the SHOW CHARACTER SET statement to view the default collation of each character set. The result of the SHOW COLLATION statement has a Default column that indicates whether a collation is the default collation of its character set. If yes, the value is Yes. Otherwise, the value is empty.
- The name of a collation starts with the name of the corresponding character set. In most cases, the name is followed by one or more suffixes that indicate other characteristics of the collation.
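The characteristics above can be sketched in a few statements. This is a minimal example that assumes a MySQL-mode tenant. The table names t_names and t_bad are hypothetical and used only for illustration.

```sql
-- List each character set together with its default collation.
SHOW CHARACTER SET;

-- Hypothetical table: choose a non-default collation that belongs to utf8mb4.
-- The collation name starts with the character set name (utf8mb4),
-- followed by the suffix _bin.
CREATE TABLE t_names (
  name VARCHAR(64)
) DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;

-- Expected to be rejected, because utf8mb4_bin belongs to utf8mb4, not gbk:
-- CREATE TABLE t_bad (name VARCHAR(64)) DEFAULT CHARACTER SET gbk COLLATE utf8mb4_bin;
```

Leaving the mismatched statement commented out keeps the sketch runnable as a whole. Uncomment it to observe the error in your own environment.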