Collations|V4.3.5|OceanBase Database| docs|Distributed Database

Collations

Last Updated: 2026-04-09 02:53:55

The default collation of OceanBase Database is utf8mb4_general_ci.
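To check which collation a session actually uses, one option is to inspect the collation-related system variables. This is a minimal sketch that assumes a MySQL-compatible tenant exposing the usual `collation_connection`, `collation_database`, and `collation_server` variables. The values returned depend on how the tenant and session are configured.

```sql
-- List the collation-related system variables visible to the current session.
SHOW VARIABLES LIKE 'collation%';
```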

The following table describes the collations supported by OceanBase Database.

Collation	Character set	Description
utf8mb4_general_ci	utf8mb4	Uses the general collation.
utf8mb4_bin	utf8mb4	Uses the binary collation.
utf8mb4_unicode_ci	utf8mb4	Uses the collation based on Unicode Collation Algorithm (UCA).
utf8mb4_unicode_520_ci	utf8mb4	Uses the collation of Unicode 5.2.0. The collation is case-insensitive.
utf8mb4_croatian_ci	utf8mb4	Uses the collation of the Croatian language. `utf8mb4_croatian_ci` is compatible with `utf8_croatian_ci`.
utf8mb4_czech_ci	utf8mb4	Uses the collation of the Czech language.
utf8mb4_0900_ai_ci	utf8mb4	Uses the collation and sort order of Unicode 9.0.0. The collation is accent-insensitive (ai) and case-insensitive (ci).
binary	binary	Uses the binary collation.
gbk_chinese_ci	gbk	Uses the collation of the Chinese language.
gbk_bin	gbk	Uses the binary collation.
utf16_general_ci	utf16	Uses the general collation.
utf16_bin	utf16	Uses the binary collation.
utf16_unicode_ci	utf16	Uses the collation based on UCA.
utf8mb4_german2_ci	utf8mb4	Uses the collation of the German language.
gb18030_chinese_ci	gb18030	Uses the collation of the Chinese language.
gb18030_bin	gb18030	Uses the binary collation.
latin1_swedish_ci	latin1	Uses the collation of the Swedish and Finnish languages.
latin1_german1_ci	latin1	Uses the collation of the latin1 character set for the German language environment.
latin1_danish_ci	latin1	Uses the collation of the latin1 character set for the Danish language environment.
latin1_german2_ci	latin1	Uses the collation for the German language environment, which is suitable for applications that need to compare characters based on dictionary order.
latin1_general_ci	latin1	Uses the collation for scenarios where case insensitivity and support for accent marks are required, such as in database designs of some European languages.
latin1_general_cs	latin1	Uses the case-sensitive collation for the general scenario, which supports multiple languages (such as Western European languages).
latin1_spanish_ci	latin1	Uses the collation of the Spanish language.
latin1_bin	latin1	Uses the binary collation of the latin1 character set.
gb18030_2022_bin	gb18030_2022	Uses the binary collation.
gb18030_2022_chinese_ci	gb18030_2022	Uses the collation based on pinyin. The collation is case-insensitive. It is the default collation of this character set in MySQL mode.
gb18030_2022_chinese_cs	gb18030_2022	Uses the collation based on pinyin. The collation is case-sensitive.
gb18030_2022_radical_ci	gb18030_2022	Uses the collation based on radicals and strokes. The collation is case-insensitive.
gb18030_2022_radical_cs	gb18030_2022	Uses the collation based on radicals and strokes. The collation is case-sensitive.
gb18030_2022_stroke_ci	gb18030_2022	Uses the collation based on strokes. The collation is case-insensitive.
gb18030_2022_stroke_cs	gb18030_2022	Uses the collation based on strokes. The collation is case-sensitive.
ascii_bin	ascii	Uses the binary collation, which compares and sorts characters by their binary values.
ascii_general_ci	ascii	Uses the general collation, which is case-insensitive and treats uppercase and lowercase letters as the same character.
tis620_bin	tis620	Uses the binary collation.
tis620_thai_ci	tis620	Uses the collation of the Thai language, which is case-insensitive.
sjis_japanese_ci	sjis	Uses the collation of the Japanese language.
dec8_swedish_ci	dec8	Uses the collation of the Swedish language.
gb2312_chinese_ci	gb2312	Uses the GB2312 character set. It sorts data regardless of case according to the Chinese collation rules. Note: In OceanBase Database V4.3.5, this collation is supported starting from V4.3.5 BP1.
gb2312_bin	gb2312	Uses the GB2312 character set. It sorts data in binary order and distinguishes between uppercase and lowercase letters. Note: In OceanBase Database V4.3.5, this collation is supported starting from V4.3.5 BP1.
ujis_japanese_ci	ujis	Uses the UJIS character set. It sorts data regardless of case according to the Japanese collation rules. Note: In OceanBase Database V4.3.5, this collation is supported starting from V4.3.5 BP1.
ujis_bin	ujis	Uses the UJIS character set. It sorts data in binary order and distinguishes between uppercase and lowercase letters. Note: In OceanBase Database V4.3.5, this collation is supported starting from V4.3.5 BP1.
euckr_korean_ci	euckr	Uses the EUCKR character set. It sorts data regardless of case according to the Korean collation rules. Note: In OceanBase Database V4.3.5, this collation is supported starting from V4.3.5 BP1.
euckr_bin	euckr	Uses the EUCKR character set. It sorts data in binary order and distinguishes between uppercase and lowercase letters. Note: In OceanBase Database V4.3.5, this collation is supported starting from V4.3.5 BP1.
eucjpms_japanese_ci	eucjpms	Uses the EUCJPMS character set. It sorts data regardless of case according to the Japanese collation rules. Note: In OceanBase Database V4.3.5, this collation is supported starting from V4.3.5 BP1.
eucjpms_bin	eucjpms	Uses the EUCJPMS character set. It sorts data in binary order and distinguishes between uppercase and lowercase letters. Note: In OceanBase Database V4.3.5, this collation is supported starting from V4.3.5 BP1.
cp932_japanese_ci	cp932	Uses the CP932 character set. It sorts data regardless of case according to the Japanese collation rules. Note: In OceanBase Database V4.3.5, this collation is supported starting from V4.3.5 BP1.
cp932_bin	cp932	Uses the CP932 character set. It sorts data in binary order and distinguishes between uppercase and lowercase letters. Note: In OceanBase Database V4.3.5, this collation is supported starting from V4.3.5 BP1.
cp850_general_ci	cp850	Uses the CP850 character set. It sorts data regardless of case according to general collation rules. Note: In OceanBase Database V4.3.5, this collation is supported starting from V4.3.5 BP1.
cp850_bin	cp850	Uses the CP850 character set. It sorts data in binary order and distinguishes between uppercase and lowercase letters. Note: In OceanBase Database V4.3.5, this collation is supported starting from V4.3.5 BP1.
hp8_english_ci	hp8	Uses the HP8 character set. It sorts data regardless of case according to the English collation rules. Note: In OceanBase Database V4.3.5, this collation is supported starting from V4.3.5 BP1.
hp8_bin	hp8	Uses the HP8 character set. It sorts data in binary order and distinguishes between uppercase and lowercase letters. Note: In OceanBase Database V4.3.5, this collation is supported starting from V4.3.5 BP1.
macroman_general_ci	macroman	Uses the MacRoman character set. It sorts data regardless of case according to general collation rules. Note: In OceanBase Database V4.3.5, this collation is supported starting from V4.3.5 BP1.
macroman_bin	macroman	Uses the MacRoman character set. It sorts data in binary order and distinguishes between uppercase and lowercase letters. Note: In OceanBase Database V4.3.5, this collation is supported starting from V4.3.5 BP1.
swe7_swedish_ci	swe7	Uses the SWE7 character set. It sorts data regardless of case according to the Swedish collation rules. Note: In OceanBase Database V4.3.5, this collation is supported starting from V4.3.5 BP1.
swe7_bin	swe7	Uses the SWE7 character set. It sorts data in binary order and distinguishes between uppercase and lowercase letters. Note: In OceanBase Database V4.3.5, this collation is supported starting from V4.3.5 BP1.

Note

For any Unicode character set, operations that use a general collation (xxx_general_ci) run faster than operations that use a Unicode collation (xxx_unicode_ci).
You cannot modify the collations.

You can execute the SHOW COLLATION statement to display all available collations.

obclient [(none)]> SHOW COLLATION;

The return result is as follows:

+----------------------------+--------------+------+---------+----------+---------+
| Collation                  | Charset      | Id   | Default | Compiled | Sortlen |
+----------------------------+--------------+------+---------+----------+---------+
| utf8mb4_general_ci         | utf8mb4      |   45 | Yes     | Yes      |       1 |
| utf8mb4_bin                | utf8mb4      |   46 |         | Yes      |       1 |
| binary                     | binary       |   63 | Yes     | Yes      |       1 |
| gbk_chinese_ci             | gbk          |   28 | Yes     | Yes      |       1 |
| gbk_bin                    | gbk          |   87 |         | Yes      |       1 |
| utf16_general_ci           | utf16        |   54 | Yes     | Yes      |       1 |
| utf16_bin                  | utf16        |   55 |         | Yes      |       1 |
| gb18030_chinese_ci         | gb18030      |  248 | Yes     | Yes      |       2 |
| gb18030_bin                | gb18030      |  249 |         | Yes      |       1 |
| latin1_swedish_ci          | latin1       |    8 | Yes     | Yes      |       1 |
| latin1_german1_ci          | latin1       |    5 |         | Yes      |       1 |
| latin1_danish_ci           | latin1       |   15 |         | Yes      |       1 |
| latin1_german2_ci          | latin1       |   31 |         | Yes      |       1 |
| latin1_general_ci          | latin1       |   48 |         | Yes      |       1 |
| latin1_general_cs          | latin1       |   49 |         | Yes      |       1 |
| latin1_spanish_ci          | latin1       |   94 |         | Yes      |       1 |
| latin1_bin                 | latin1       |   47 |         | Yes      |       1 |
| gb18030_2022_bin           | gb18030_2022 |  216 |         | Yes      |       1 |
| gb18030_2022_chinese_ci    | gb18030_2022 |  217 | Yes     | Yes      |       1 |
| gb18030_2022_chinese_cs    | gb18030_2022 |  218 |         | Yes      |       1 |
| gb18030_2022_radical_ci    | gb18030_2022 |  219 |         | Yes      |       1 |
| gb18030_2022_radical_cs    | gb18030_2022 |  220 |         | Yes      |       1 |
| gb18030_2022_stroke_ci     | gb18030_2022 |  221 |         | Yes      |       1 |
| gb18030_2022_stroke_cs     | gb18030_2022 |  222 |         | Yes      |       1 |
| gb2312_chinese_ci          | gb2312       |   24 | Yes     | Yes      |       1 |
| gb2312_bin                 | gb2312       |   86 |         | Yes      |       1 |
| ascii_general_ci           | ascii        |   11 | Yes     | Yes      |       1 |
| ascii_bin                  | ascii        |   65 |         | Yes      |       1 |
| tis620_thai_ci             | tis620       |   18 | Yes     | Yes      |       1 |
| tis620_bin                 | tis620       |   89 |         | Yes      |       1 |
| ujis_japanese_ci           | ujis         |   12 | Yes     | Yes      |       1 |
| ujis_bin                   | ujis         |   91 |         | Yes      |       1 |
| euckr_korean_ci            | euckr        |   19 | Yes     | Yes      |       1 |
| euckr_bin                  | euckr        |   85 |         | Yes      |       1 |
| eucjpms_japanese_ci        | eucjpms      |   97 | Yes     | Yes      |       1 |
| eucjpms_bin                | eucjpms      |   98 |         | Yes      |       1 |
| cp932_japanese_ci          | cp932        |   95 | Yes     | Yes      |       1 |
| cp932_bin                  | cp932        |   96 |         | Yes      |       1 |
| utf16le_general_ci         | utf16le      |   56 | Yes     | Yes      |       1 |
| utf16le_bin                | utf16le      |   62 |         | Yes      |       1 |
| sjis_japanese_ci           | sjis         |   13 | Yes     | Yes      |       1 |
| sjis_bin                   | sjis         |   88 |         | Yes      |       1 |
| big5_chinese_ci            | big5         |    1 | Yes     | Yes      |       1 |
| big5_bin                   | big5         |   84 |         | Yes      |       1 |
| hkscs_bin                  | hkscs        |  152 | Yes     | Yes      |       1 |
| hkscs31_bin                | hkscs31      |  153 | Yes     | Yes      |       1 |
| utf16_unicode_ci           | utf16        |  101 |         | Yes      |       8 |
| utf16_icelandic_ci         | utf16        |  102 |         | Yes      |       8 |
| utf16_latvian_ci           | utf16        |  103 |         | Yes      |       8 |
| utf16_romanian_ci          | utf16        |  104 |         | Yes      |       8 |
| utf16_slovenian_ci         | utf16        |  105 |         | Yes      |       8 |
| utf16_polish_ci            | utf16        |  106 |         | Yes      |       8 |
| utf16_estonian_ci          | utf16        |  107 |         | Yes      |       8 |
| utf16_spanish_ci           | utf16        |  108 |         | Yes      |       8 |
| utf16_swedish_ci           | utf16        |  109 |         | Yes      |       8 |
| utf16_turkish_ci           | utf16        |  110 |         | Yes      |       8 |
| utf16_czech_ci             | utf16        |  111 |         | Yes      |       8 |
| utf16_danish_ci            | utf16        |  112 |         | Yes      |       8 |
| utf16_lithuanian_ci        | utf16        |  113 |         | Yes      |       8 |
| utf16_slovak_ci            | utf16        |  114 |         | Yes      |       8 |
| utf16_spanish2_ci          | utf16        |  115 |         | Yes      |       8 |
| utf16_roman_ci             | utf16        |  116 |         | Yes      |       8 |
| utf16_persian_ci           | utf16        |  117 |         | Yes      |       8 |
| utf16_esperanto_ci         | utf16        |  118 |         | Yes      |       8 |
| utf16_hungarian_ci         | utf16        |  119 |         | Yes      |       8 |
| utf16_sinhala_ci           | utf16        |  120 |         | Yes      |       8 |
| utf16_german2_ci           | utf16        |  121 |         | Yes      |       8 |
| utf16_croatian_ci          | utf16        |  122 |         | Yes      |       8 |
| utf16_unicode_520_ci       | utf16        |  123 |         | Yes      |       8 |
| utf16_vietnamese_ci        | utf16        |  124 |         | Yes      |       8 |
| utf8mb4_unicode_ci         | utf8mb4      |  224 |         | Yes      |       8 |
| utf8mb4_icelandic_ci       | utf8mb4      |  225 |         | Yes      |       8 |
| utf8mb4_latvian_ci         | utf8mb4      |  226 |         | Yes      |       8 |
| utf8mb4_romanian_ci        | utf8mb4      |  227 |         | Yes      |       8 |
| utf8mb4_slovenian_ci       | utf8mb4      |  228 |         | Yes      |       8 |
| utf8mb4_polish_ci          | utf8mb4      |  229 |         | Yes      |       8 |
| utf8mb4_estonian_ci        | utf8mb4      |  230 |         | Yes      |       8 |
| utf8mb4_spanish_ci         | utf8mb4      |  231 |         | Yes      |       8 |
| utf8mb4_swedish_ci         | utf8mb4      |  232 |         | Yes      |       8 |
| utf8mb4_turkish_ci         | utf8mb4      |  233 |         | Yes      |       8 |
| utf8mb4_czech_ci           | utf8mb4      |  234 |         | Yes      |       8 |
| utf8mb4_danish_ci          | utf8mb4      |  235 |         | Yes      |       8 |
| utf8mb4_lithuanian_ci      | utf8mb4      |  236 |         | Yes      |       8 |
| utf8mb4_slovak_ci          | utf8mb4      |  237 |         | Yes      |       8 |
| utf8mb4_spanish2_ci        | utf8mb4      |  238 |         | Yes      |       8 |
| utf8mb4_roman_ci           | utf8mb4      |  239 |         | Yes      |       8 |
| utf8mb4_persian_ci         | utf8mb4      |  240 |         | Yes      |       8 |
| utf8mb4_esperanto_ci       | utf8mb4      |  241 |         | Yes      |       8 |
| utf8mb4_hungarian_ci       | utf8mb4      |  242 |         | Yes      |       8 |
| utf8mb4_sinhala_ci         | utf8mb4      |  243 |         | Yes      |       8 |
| utf8mb4_german2_ci         | utf8mb4      |  244 |         | Yes      |       8 |
| utf8mb4_croatian_ci        | utf8mb4      |  245 |         | Yes      |       8 |
| utf8mb4_unicode_520_ci     | utf8mb4      |  246 |         | Yes      |       8 |
| utf8mb4_vietnamese_ci      | utf8mb4      |  247 |         | Yes      |       8 |
| dec8_swedish_ci            | dec8         |    3 | Yes     | Yes      |       8 |
| dec8_bin                   | dec8         |   69 |         | Yes      |       8 |
| cp850_general_ci           | cp850        |    4 | Yes     | Yes      |       8 |
| cp850_bin                  | cp850        |   80 |         | Yes      |       8 |
| hp8_english_ci             | hp8          |    6 | Yes     | Yes      |       8 |
| hp8_bin                    | hp8          |   72 |         | Yes      |       8 |
| macroman_general_ci        | macroman     |   39 | Yes     | Yes      |       8 |
| macroman_bin               | macroman     |   53 |         | Yes      |       8 |
| swe7_swedish_ci            | swe7         |   10 | Yes     | Yes      |       8 |
| swe7_bin                   | swe7         |   82 |         | Yes      |       8 |
| utf8mb4_0900_ai_ci         | utf8mb4      |  255 |         | Yes      |       0 |
| utf8mb4_de_pb_0900_ai_ci   | utf8mb4      |  256 |         | Yes      |       0 |
| utf8mb4_is_0900_ai_ci      | utf8mb4      |  257 |         | Yes      |       0 |
| utf8mb4_lv_0900_ai_ci      | utf8mb4      |  258 |         | Yes      |       0 |
| utf8mb4_ro_0900_ai_ci      | utf8mb4      |  259 |         | Yes      |       0 |
| utf8mb4_sl_0900_ai_ci      | utf8mb4      |  260 |         | Yes      |       0 |
| utf8mb4_pl_0900_ai_ci      | utf8mb4      |  261 |         | Yes      |       0 |
| utf8mb4_et_0900_ai_ci      | utf8mb4      |  262 |         | Yes      |       0 |
| utf8mb4_es_0900_ai_ci      | utf8mb4      |  263 |         | Yes      |       0 |
| utf8mb4_sv_0900_ai_ci      | utf8mb4      |  264 |         | Yes      |       0 |
| utf8mb4_tr_0900_ai_ci      | utf8mb4      |  265 |         | Yes      |       0 |
| utf8mb4_cs_0900_ai_ci      | utf8mb4      |  266 |         | Yes      |       0 |
| utf8mb4_da_0900_ai_ci      | utf8mb4      |  267 |         | Yes      |       0 |
| utf8mb4_lt_0900_ai_ci      | utf8mb4      |  268 |         | Yes      |       0 |
| utf8mb4_sk_0900_ai_ci      | utf8mb4      |  269 |         | Yes      |       0 |
| utf8mb4_es_trad_0900_ai_ci | utf8mb4      |  270 |         | Yes      |       0 |
| utf8mb4_la_0900_ai_ci      | utf8mb4      |  271 |         | Yes      |       0 |
| utf8mb4_eo_0900_ai_ci      | utf8mb4      |  273 |         | Yes      |       0 |
| utf8mb4_hu_0900_ai_ci      | utf8mb4      |  274 |         | Yes      |       0 |
| utf8mb4_hr_0900_ai_ci      | utf8mb4      |  275 |         | Yes      |       0 |
| utf8mb4_vi_0900_ai_ci      | utf8mb4      |  277 |         | Yes      |       0 |
| utf8mb4_0900_as_cs         | utf8mb4      |  278 |         | Yes      |       0 |
| utf8mb4_de_pb_0900_as_cs   | utf8mb4      |  279 |         | Yes      |       0 |
| utf8mb4_is_0900_as_cs      | utf8mb4      |  280 |         | Yes      |       0 |
| utf8mb4_lv_0900_as_cs      | utf8mb4      |  281 |         | Yes      |       0 |
| utf8mb4_ro_0900_as_cs      | utf8mb4      |  282 |         | Yes      |       0 |
| utf8mb4_sl_0900_as_cs      | utf8mb4      |  283 |         | Yes      |       0 |
| utf8mb4_pl_0900_as_cs      | utf8mb4      |  284 |         | Yes      |       0 |
| utf8mb4_et_0900_as_cs      | utf8mb4      |  285 |         | Yes      |       0 |
| utf8mb4_es_0900_as_cs      | utf8mb4      |  286 |         | Yes      |       0 |
| utf8mb4_sv_0900_as_cs      | utf8mb4      |  287 |         | Yes      |       0 |
| utf8mb4_tr_0900_as_cs      | utf8mb4      |  288 |         | Yes      |       0 |
| utf8mb4_cs_0900_as_cs      | utf8mb4      |  289 |         | Yes      |       0 |
| utf8mb4_da_0900_as_cs      | utf8mb4      |  290 |         | Yes      |       0 |
| utf8mb4_lt_0900_as_cs      | utf8mb4      |  291 |         | Yes      |       0 |
| utf8mb4_sk_0900_as_cs      | utf8mb4      |  292 |         | Yes      |       0 |
| utf8mb4_es_trad_0900_as_cs | utf8mb4      |  293 |         | Yes      |       0 |
| utf8mb4_la_0900_as_cs      | utf8mb4      |  294 |         | Yes      |       0 |
| utf8mb4_eo_0900_as_cs      | utf8mb4      |  296 |         | Yes      |       0 |
| utf8mb4_hu_0900_as_cs      | utf8mb4      |  297 |         | Yes      |       0 |
| utf8mb4_hr_0900_as_cs      | utf8mb4      |  298 |         | Yes      |       0 |
| utf8mb4_vi_0900_as_cs      | utf8mb4      |  300 |         | Yes      |       0 |
| utf8mb4_ja_0900_as_cs      | utf8mb4      |  303 |         | Yes      |       0 |
| utf8mb4_ja_0900_as_cs_ks   | utf8mb4      |  304 |         | Yes      |      24 |
| utf8mb4_0900_as_ci         | utf8mb4      |  305 |         | Yes      |       0 |
| utf8mb4_ru_0900_ai_ci      | utf8mb4      |  306 |         | Yes      |       0 |
| utf8mb4_ru_0900_as_cs      | utf8mb4      |  307 |         | Yes      |       0 |
| utf8mb4_zh_0900_as_cs      | utf8mb4      |  308 |         | Yes      |       0 |
| utf8mb4_0900_bin           | utf8mb4      |  309 |         | Yes      |       1 |
| utf8mb4_nb_0900_ai_ci      | utf8mb4      |  310 |         | Yes      |       0 |
| utf8mb4_nb_0900_as_cs      | utf8mb4      |  311 |         | Yes      |       0 |
| utf8mb4_nn_0900_ai_ci      | utf8mb4      |  312 |         | Yes      |       0 |
| utf8mb4_nn_0900_as_cs      | utf8mb4      |  313 |         | Yes      |       0 |
| utf8mb4_sr_latn_0900_ai_ci | utf8mb4      |  314 |         | Yes      |       0 |
| utf8mb4_sr_latn_0900_as_cs | utf8mb4      |  315 |         | Yes      |       0 |
| utf8mb4_bs_0900_ai_ci      | utf8mb4      |  316 |         | Yes      |       0 |
| utf8mb4_bs_0900_as_cs      | utf8mb4      |  317 |         | Yes      |       0 |
| utf8mb4_bg_0900_ai_ci      | utf8mb4      |  318 |         | Yes      |       0 |
| utf8mb4_bg_0900_as_cs      | utf8mb4      |  319 |         | Yes      |       0 |
| utf8mb4_gl_0900_ai_ci      | utf8mb4      |  320 |         | Yes      |       0 |
| utf8mb4_gl_0900_as_cs      | utf8mb4      |  321 |         | Yes      |       0 |
| utf8mb4_mn_cyrl_0900_ai_ci | utf8mb4      |  322 |         | Yes      |       0 |
| utf8mb4_mn_cyrl_0900_as_cs | utf8mb4      |  323 |         | Yes      |       0 |
+----------------------------+--------------+------+---------+----------+---------+
167 rows in set
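
Because the full list is long, it can help to narrow the output. The statements below are a sketch that assumes the MySQL-compatible LIKE and WHERE clauses of SHOW COLLATION are available in your tenant. The patterns and the character set name are taken from the list above and are only examples.

```sql
-- List only the collations whose names start with gb18030_2022.
-- In a LIKE pattern, _ matches any single character and % matches any sequence.
SHOW COLLATION LIKE 'gb18030_2022%';

-- List every collation that belongs to the latin1 character set.
SHOW COLLATION WHERE Charset = 'latin1';
```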

General characteristics of collations are as follows:

Two different character sets cannot use the same collation.
Each character set has a default collation. You can execute the SHOW CHARACTER SET statement to view the default collation of each character set. The output of the SHOW COLLATION statement also has a Default column that indicates whether a collation is the default collation of its character set. If it is, the value is Yes. Otherwise, the value is empty.
The name of a collation starts with the name of the corresponding character set. In most cases, the name is followed by one or more suffix items that indicate other characteristics of the collation.
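
The characteristics above can be explored with a short session. This is a minimal sketch for a MySQL-compatible tenant. The table `collation_demo` and its column names are hypothetical and exist only for this illustration.

```sql
-- List each character set together with its default collation.
SHOW CHARACTER SET;

-- Hypothetical demo table: the same text stored under a
-- case-insensitive (_ci) collation and a binary (_bin) collation.
CREATE TABLE collation_demo (
  name_ci  VARCHAR(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,
  name_bin VARCHAR(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin
);

INSERT INTO collation_demo (name_ci, name_bin) VALUES ('OceanBase', 'OceanBase');

-- The _ci column ignores case when comparing; the _bin column compares encoded values.
SELECT name_ci  = 'oceanbase' AS ci_match,
       name_bin = 'oceanbase' AS bin_match
FROM collation_demo;

-- Clean up the demo table.
DROP TABLE collation_demo;
```

With these column collations, the first comparison is expected to match and the second is not, which shows how the suffix of a collation name reflects its comparison behavior.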
