This topic describes the specifications for selecting character sets in OceanBase Database.
You can set character sets at the tenant, database, table, field, and session levels. OceanBase Database supports character sets such as UTF8MB4, GBK, GB18030, BINARY, and UTF16.
Note
- To support seamless migration, OceanBase Database treats
UTF8as a synonym ofUTF8MB4.- You cannot modify the database character set.
The following takes the GBK character set as an example:
Set the character set when you create a tenant.
Add
"charset=gbk"in the CREATE TENANT statement to set the character set, as shown in the following example:create tenant oracle replica_num = 1, resource_pool_list =('pool1'), charset = gbk set ob_tcp_invited_nodes = '%', ob_compatibility_mode = 'oracle', parallel_servers_target = 10, ob_sql_work_area_percentage = 20, secure_file_priv = "";Select GBK as the character set when you create a tenant in OCP.
Notice
- In Oracle mode, the character set is specified at the tenant level. In a tenant whose character set is set to GBK, the char, varchar2, and clob fields in all user tables are of the GBK character set. However, char, varchar2, and other fields in system tables are still of the UTF8 character set.
- In Oracle mode, the character set of a tenant cannot be modified. In other words, you cannot use the
ALTERstatement to modify the character set of a tenant, database, table, or column. However, in MySQL mode, the character set of a tenant can be modified.
Set the client (link) character set.
The client (link) character set is a character set used for the interaction between the client and the server.
The client sends SQL strings to the server for execution. The server then returns the execution results to the client. In this process, the server must know the character set used by the client to correctly parse and execute the SQL strings and return the results. In different environments, the client can be OBClient, JDBC, or OCI. The client character set is also called the link character set.
The tenant character set and the client character set are independent of each other.
A GBK tenant can be accessed by a GBK client or a UTF8 client. Example:
If the client character set is GBK, the server parses and executes the received SQL statements based on GBK.
If the client character set is UTF8, the server parses and executes the received SQL statements based on UTF8.
Configuration method
Permanent modification
set global character_set_client = gbk; set global character_set_connection = gbk; set global character_set_results = gbk;character_set_client: specifies the client character set.
character_set_connection: specifies the connection character set. In an Oracle tenant, set this parameter to the same value as that of character_set_client.
character_set_results: specifies the character set of the results returned by the server to the client.
Generally, the character set of the strings that the client sends to the server and the character set of the strings that the server returns to the client are the same. Therefore, in Oracle mode, the values of these three variables must be the same. In MySQL mode, these three variables are provided for flexible configuration. In general scenarios, you can set the three variables to the client character set.
Temporary modification (valid only for the current session):
Method 1:
set character_set_client = gbk; set character_set_connection = gbk; set character_set_results = gbk;Method 2:
set names gbk;
Set the client character set.
When you use the JDBC driver to connect to OceanBase Database, add
"characterEncoding=gbkto the URL to set up a GBK link, as shown in the following example:String url = "jdbc:oceanbase://xxx.xxx.xxx.xxx:xxxx?useSSL=false&useUnicode=true&characterEncoding=gbk&connectTimeout=30000&rewriteBatchedStatements=true";When you use OBClient to connect to OceanBase Database, we recommend that you use the
zh_CN.GB18030superset ofzh_CN.GBKfor the bash environment variables of the GBK link.Modify the bash environment variables:
export LANG=zh_CN.GB18030 export LC_ALL=zh_CN.GB18030Modify the character set settings of the terminal to set the character set of the current window to GBK. Follow the instructions on the terminal.
Notice
You must set the client and driver accordingly. Otherwise, garbled characters may be displayed.