Collations

2023-07-28 02:55:42  Updated

The default collation of OceanBase Database is utf8mb4_general_ci.

The following table describes the collations supported by OceanBase Database.

Collations Character set Description
utf8mb4_general_ci utf8mb4 A general collation.
utf8mb4_bin utf8mb4 A binary collation.
utf8mb4_unicode_ci utf8mb4 A collation that is based on Unicode Collation Algorithm (UCA).
binary binary A binary collation.
gbk_chinese_ci gbk A collation for Chinese.
gbk_bin gbk A binary collation.
utf16_general_ci utf16 A general collation.
utf16_bin utf16 A binary collation.
utf16_unicode_ci utf16 A collation that is based on Unicode Collation Algorithm (UCA).
gb18030_chinese_ci gb18030 A collation for Chinese.
gb18030_bin gb18030 A binary collation.
latin1_swedish_ci latin1 A collation for Swedish/Finnish.
latin1_bin latin1 A binary collation.

Note

  • For any Unicode character set, the operation is executed faster if the character set is sorted by a general collation xxx_general_ci instead of a Unicode collation xxx_unicode_ci.
  • You cannot modify the collations.

By default, you can execute the SHOW COLLATION statement to display all available collations.

obclient> SHOW COLLATION;
+--------------------+---------+-----+---------+----------+---------+
| Collation          | Charset | Id  | Default | Compiled | Sortlen |
+--------------------+---------+-----+---------+----------+---------+
| utf8mb4_general_ci | utf8mb4 |  45 | Yes     | Yes      |       1 |
| utf8mb4_bin        | utf8mb4 |  46 |         | Yes      |       1 |
| binary             | binary  |  63 | Yes     | Yes      |       1 |
| gbk_chinese_ci     | gbk     |  28 | Yes     | Yes      |       1 |
| gbk_bin            | gbk     |  87 |         | Yes      |       1 |
| utf16_general_ci   | utf16   |  54 | Yes     | Yes      |       1 |
| utf16_bin          | utf16   |  55 |         | Yes      |       1 |
| utf8mb4_unicode_ci | utf8mb4 | 224 |         | Yes      |       1 |
| utf16_unicode_ci   | utf16   | 101 |         | Yes      |       1 |
| gb18030_chinese_ci | gb18030 | 248 | Yes     | Yes      |       1 |
| gb18030_bin        | gb18030 | 249 |         | Yes      |       1 |
| latin1_swedish_ci  | latin1  |   8 | Yes     | Yes      |       1 |
| latin1_bin         | latin1  |  47 |         | Yes      |       1 |
+--------------------+---------+-----+---------+----------+---------+
11 rows in set

General characteristics of collations:

  • Two different character sets cannot use the same collation.

  • Each character set uses a default collation. You can execute the SHOW CHARACTER SET statement to indicate the default collation of each character set. The execution result of the SHOW COLLATION statement has a column that indicates whether a collation is the default collation of its character set. If yes, the value is Yes. Otherwise, the value is empty.

  • The name of a collation starts with the name of the corresponding character set. In most cases, the name is followed by one or more suffix items that indicate other characteristics of the collation.

Contact Us