Column-level character sets and collations

2026-02-03 08:48:37  Updated

Every column of the CHAR, VARCHAR, or TEXT type, or a synonym type has a column character set and a column collation.

Specify the character set and collation for a column

You can execute the CREATE TABLE or ALTER TABLE statement to specify the character set and collation for a column by using the following syntax:

CREATE TABLE table_name (
    column_name {CHAR | VARCHAR | TEXT} (column_length)
    [CHARACTER SET charset_name]
    [COLLATE collation_name]
);

ALTER TABLE table_name MODIFY
    col_name {CHAR | VARCHAR | TEXT} (column_length)
    CHARACTER SET [=] charset_name
    [COLLATE [=] collation_name]

Here are some examples:

obclient> CREATE TABLE t (     
     col1 VARCHAR(5)       
     CHARACTER SET utf8mb4       
     COLLATE utf8mb4_unicode_ci);
Query OK, 0 rows affected

obclient> ALTER TABLE t MODIFY
      col1 VARCHAR(5)
      CHARACTER SET gbk
      COLLATE gbk_chinese_ci;
Query OK, 0 rows affected

Select the character set and collation for a column

You can select the character set and collation for a column by using the following methods in OceanBase Database:

  • If you specify both the CHARACTER SET charset_name and COLLATE collation_name parameters, the charset_name character set and collation_name collation are used.

  • If you specify only the CHARACTER SET charset_name parameter, the specified charset_name character set and the associated default collation are used. To view the default collation for each character set, execute the SHOW CHARACTER SET statement.

  • If you specify only the COLLATE collation_name parameter, the specified collation_name collation and the associated character set are used.

  • If neither CHARACTER SET nor COLLATE is specified, the character set and collation for the table are used.

If the column-level character set and collation are not specified in the definition of each column, the table-level character set and collation are used by default.

In the MySQL mode of OceanBase Database, you can use ORDER BY after COLLATE to sort data records based on the collation specified for a column. You can also use COLLATE after ORDER BY to specify another collation in the same character set for sorting.

obclient> CREATE TABLE t(a VARCHAR(10) COLLATE gb18030_2022_radical_ci);
Query OK, 0 rows affected

obclient> INSERT INTO t VALUES('a'),('A'),('b'),('B');
Query OK, 4 rows affected
Records: 4  Duplicates: 0  Warnings: 0

obclient> SELECT a FROM t ORDER BY a;
+------+
| a    |
+------+
| a    |
| A    |
| b    |
| B    |
+------+
4 rows in set

obclient> SELECT a FROM t ORDER BY a COLLATE gb18030_2022_chinese_cs;
+------+
| a    |
+------+
| A    |
| B    |
| a    |
| b    |
+------+
4 rows in set

Conversion conditions for column-level character sets

To convert a binary or non-binary string column to a specified character set, execute the ALTER TABLE statement. To succeed, one of the following conditions must be met:

  • If the column contains a binary data type (BINARY, VARBINARY, or BLOB), all values it contains must be encoded by using the character set to which the column is converted. If you use binary columns to store data of multiple character sets, OceanBase Database cannot match the character sets with the respective values and the conversion fails.

  • If the column contains a non-binary data type (CHAR, VARCHAR, or TEXT), its content must be encoded by using the column-level character set. If the column content is encoded by using different character sets, you can convert the column to the binary data type in use and then to a non-binary column with the desired character set.

Contact Us