Every column of the CHAR, VARCHAR, or TEXT type, or a synonym type has a column character set and a column collation.
Specify the character set and collation for a column
You can execute the CREATE TABLE or ALTER TABLE statement to specify the character set and collation for a column by using the following syntax:
CREATE TABLE table_name (
column_name {CHAR | VARCHAR | TEXT} (column_length)
[CHARACTER SET charset_name]
[COLLATE collation_name]
);
ALTER TABLE table_name MODIFY
col_name {CHAR | VARCHAR | TEXT} (column_length)
CHARACTER SET [=] charset_name
[COLLATE [=] collation_name]
Here are some examples:
obclient> CREATE TABLE t (
col1 VARCHAR(5)
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci);
Query OK, 0 rows affected
obclient> ALTER TABLE t MODIFY
col1 VARCHAR(5)
CHARACTER SET gbk
COLLATE gbk_chinese_ci;
Query OK, 0 rows affected
Select the character set and collation for a column
You can select the character set and collation for a column by using the following methods in OceanBase Database:
If you specify both the
CHARACTER SET charset_nameandCOLLATE collation_nameparameters, thecharset_namecharacter set andcollation_namecollation are used.If you specify only the
CHARACTER SET charset_nameparameter, the specifiedcharset_namecharacter set and the associated default collation are used. To view the default collation for each character set, execute theSHOW CHARACTER SETstatement.If you specify only the
COLLATE collation_nameparameter, the specifiedcollation_namecollation and the associated character set are used.If neither
CHARACTER SETnorCOLLATEis specified, the character set and collation for the table are used.
If the column-level character set and collation are not specified in the definition of each column, the table-level character set and collation are used by default.
In the MySQL mode of OceanBase Database, you can use ORDER BY after COLLATE to sort data records based on the collation specified for a column. You can also use COLLATE after ORDER BY to specify another collation in the same character set for sorting.
obclient> CREATE TABLE t(a VARCHAR(10) COLLATE gb18030_2022_radical_ci);
Query OK, 0 rows affected
obclient> INSERT INTO t VALUES('a'),('A'),('b'),('B');
Query OK, 4 rows affected
Records: 4 Duplicates: 0 Warnings: 0
obclient> SELECT a FROM t ORDER BY a;
+------+
| a |
+------+
| a |
| A |
| b |
| B |
+------+
4 rows in set
obclient> SELECT a FROM t ORDER BY a COLLATE gb18030_2022_chinese_cs;
+------+
| a |
+------+
| A |
| B |
| a |
| b |
+------+
4 rows in set
Conversion conditions for column-level character sets
To convert a binary or non-binary string column to a specified character set, execute the ALTER TABLE statement. To succeed, one of the following conditions must be met:
If the column contains a binary data type (
BINARY,VARBINARY, orBLOB), all values it contains must be encoded by using the character set to which the column is converted. If you use binary columns to store data of multiple character sets, OceanBase Database cannot match the character sets with the respective values and the conversion fails.If the column contains a non-binary data type (
CHAR,VARCHAR, orTEXT), its content must be encoded by using the column-level character set. If the column content is encoded by using different character sets, you can convert the column to the binary data type in use and then to a non-binary column with the desired character set.