This topic describes the specifications for selecting character sets in OceanBase Database.
You can set character sets at the tenant, database, table, field, and session levels. OceanBase Database supports character sets such as UTF8MB4, GBK, GB18030, BINARY, and UTF16.
Note
- To support seamless migration, OceanBase Database treats
UTF8as a synonym ofUTF8MB4. - Currently, the database character sets cannot be modified.
The following takes the GBK character set as an example:
Set the character set when you create a tenant.
Add
"charset=gbk"in theCREATE TENANTstatement to set the character set.create tenant oracle replica_num = 1, resource_pool_list =('pool1'), charset = gbk set ob_tcp_invited_nodes = '%', ob_compatibility_mode = 'oracle', parallel_servers_target = 10, ob_sql_work_area_percentage = 20, secure_file_priv = "";Select GBK as the character set when you create a tenant in OCP.
Notice
- In Oracle mode, the character set is specified at the tenant level. In a tenant whose character set is set to GBK, the char, varchar2, and clob fields in all user tables are of the GBK character set. However, char, varchar2, and other fields in system tables are still of the UTF8 character set.
- In Oracle mode, the character set of a tenant cannot be modified. In other words, you cannot use the
ALTERstatement to modify the character set of a tenant, database, table, or column. However, in MySQL mode, the character set of a tenant can be modified.
Set the client (link) character set.
The client (link) character set is a character set used for the interaction between the client and the server.
The client sends SQL strings to the server for execution. The server then returns the execution results to the client. In this process, the server must know the character set used by the client to correctly parse and execute the SQL strings and return the results. In different environments, the client can be OBClient, Java Database Connectivity (JDBC), or Oracle Call Interface (OCI). The client character set is also called the link character set.
The tenant character set and the client character set are independent of each other.
A GBK tenant can be accessed by a GBK client or a UTF8 client.
If the client character set is GBK, the server parses and executes the received SQL statements based on GBK.
If the client character set is UTF8, the server parses and executes the received SQL statements based on UTF8.
Configuration methods
Permanent modification
set global character_set_client = gbk; set global character_set_connection = gbk; set global character_set_results = gbk;where
character_set_clientspecifies the client character set.character_set_connectionspecifies the connection character set. In an Oracle tenant, set this parameter to the same value as that ofcharacter_set_client.character_set_resultsspecifies the character set of the results returned by the server to the client.Generally, the character set of the strings that the client sends to the server and the character set of the strings that the server returns to the client are the same. Therefore, in Oracle mode, the values of these three variables must be the same. In MySQL mode, these three variables are provided for flexible configuration. In general scenarios, you can set the three variables to the client character set.
Temporary modification (valid only for the current session):
Method 1:
set character_set_client = gbk; set character_set_connection = gbk; set character_set_results = gbk;Method 2:
set names gbk;
Set the client character set.
When you use the JDBC driver to connect to OceanBase Database, add
characterEncoding=gbkto the URL to set up a GBK link.String url = "jdbc:oceanbase://xxx.xxx.xxx.xxx:xxxx?useSSL=false&useUnicode=true&characterEncoding=gbk&connectTimeout=30000&rewriteBatchedStatements=true";When you use OBClient to connect to OceanBase Database, we recommend that you use the
zh_CN.GB18030superset ofzh_CN.GBKfor the bash environment variables of the GBK link.Modify the bash environment variables
export LANG=zh_CN.GB18030 export LC_ALL=zh_CN.GB18030Modify the character set settings of the terminal to set the character set of the current window to GBK. Follow the instructions on the terminal.
Notice
In addition to configuring the database (observer process) as GBK link, the client and driver also need to be configured accordingly. If the configuration is incorrect, garbled characters may be displayed.