Globalization support features (OceanBase Database V2.2.77)

Last updated: 2023-08-18 09:26:34

The globalization support features of OceanBase Database let you deploy multilingual applications that can run in any country. An application can display its user interface and process data according to the user's native language and locale preferences. OceanBase Database itself is currently available only with an English UI.

In the past, Oracle's globalization support was referred to as National Language Support (NLS), which is actually a subset of globalization support. Because OceanBase Database in Oracle mode is compatible with Oracle databases, the Oracle term NLS is retained. NLS allows you to choose a language and store data in a specific character set, and is implemented by using a series of NLS parameters.

Language support

OceanBase Database uses the UTF8MB4 character set to store data. This character set can store characters in most languages. However, the current OceanBase Database version supports only English for displaying locale information such as messages, currency, and date and time. The setting of the NLS_LANG parameter in the client environment does not affect OceanBase Database connections.

Territory support

The current OceanBase Database version supports only the AMERICA territory. The default local time format, date format, numeric format, and currency format depend on the territory setting.

Date and time formats

The current OceanBase Database version supports only the date and time formats specified for the AMERICA territory. The default date format is DD-MON-RR and the default timestamp format is DD-MON-RR HH.MI.SSXFF AM or DD-MON-RR HH.MI.SSXFF AM TZR. OceanBase Database allows you to change the date and time formats by modifying the session parameters or instance parameters.
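To make the default date format concrete, here is a minimal Python sketch of how a date looks when rendered as DD-MON-RR. It only approximates the output side of the format and does not model how RR infers the century when parsing input. The helper name format_dd_mon_rr and the MONTHS table are illustrative assumptions, not part of OceanBase Database.

```python
from datetime import date

# English month abbreviations, matching the AMERICA territory (illustrative table).
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")

def format_dd_mon_rr(d):
    """Render a date as DD-MON-RR: two-digit day, month abbreviation, two-digit year."""
    return f"{d.day:02d}-{MONTHS[d.month - 1]}-{d.year % 100:02d}"

example = format_dd_mon_rr(date(2023, 8, 18))
print(example)  # 18-AUG-23
```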

Calendar format

The current OceanBase Database version supports only the Gregorian calendar.

Numeric and currency formats

The numeric and currency formats vary from country to country. The current OceanBase Database version does not support setting the currency format. The default numeric delimiters supported are the period (.) and the comma (,).

Character string comparison and sorting

The current OceanBase Database version supports only binary comparison and sorting of strings. This behavior cannot be changed.
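As a rough illustration of what binary comparison means in practice, the following Python sketch orders strings by their encoded bytes rather than by linguistic rules. It assumes UTF-8 encoding, and the helper name binary_sort is hypothetical. It is not how the database implements sorting internally.

```python
def binary_sort(values, encoding="utf-8"):
    """Return the strings ordered by the raw bytes of their encoded form."""
    return sorted(values, key=lambda s: s.encode(encoding))

words = ["apple", "Banana", "cherry", "Zebra"]
result = binary_sort(words)
print(result)  # ['Banana', 'Zebra', 'apple', 'cherry']
```

Uppercase letters sort before lowercase ones here because their byte values are smaller (for example, "Z" is 0x5A and "a" is 0x61).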

Length semantics

To calculate the byte length of a string, you need to know how many bytes each character occupies in the character set.

In single-byte character sets, the number of bytes and the number of characters of a string are the same. In multi-byte character sets, one character contains one or more bytes. Especially in variable-width character sets, it is difficult to calculate the number of characters based on the byte length. The method of calculating the column length in bytes is called byte semantics, while the method of calculating the column length in characters is called character semantics.

Character semantics is useful for specifying the storage requirements of variable-width multibyte strings. For example, suppose that you need a VARCHAR2 column in a Unicode database (AL32UTF8) that can store up to five Chinese characters together with five English characters. With byte semantics, this column requires 15 bytes for the Chinese characters (3 bytes each) and 5 bytes for the English characters (1 byte each), for a total of 20 bytes. With character semantics, this column requires 10 characters.
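The arithmetic above can be checked with a short Python sketch. It uses Python's "utf-8" codec as a stand-in for AL32UTF8 (Oracle's name for UTF-8), and the sample strings are arbitrary choices for illustration.

```python
# Five Chinese characters and five English characters (sample values).
chinese = "中文字符集"
english = "ABCDE"
value = chinese + english

char_length = len(value)                  # length under character semantics
byte_length = len(value.encode("utf-8"))  # length under byte semantics

print(len(chinese.encode("utf-8")))  # 15 (3 bytes per Chinese character)
print(len(english.encode("utf-8")))  # 5 (1 byte per English character)
print(char_length, byte_length)      # 10 20
```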

OceanBase Database in Oracle mode uses the byte semantics by default.

Unicode and SQL national character data types

Unicode is an encoding system that defines characters in most languages across the world. In Unicode, each character has a unique code that is independent of platform, program, and language.

OceanBase Database in Oracle mode stores Unicode characters in two ways:

You can specify UTF8 as the character set when you create a tenant. This allows you to store UTF8-encoded characters in data of SQL character types, such as CHAR, VARCHAR2, CLOB, and LONG.
You can declare and use columns and variables of SQL national character types.

SQL national character types include NCHAR, NVARCHAR2, and NCLOB (which is not supported by the current OceanBase Database version). These character types are also called Unicode data types because they are used to store only Unicode data.

Unicode data uses a national character set, which is specified during instance creation. OceanBase Database supports two national character sets: UTF8 and AL16UTF16 (the default character set).

When you declare a column or variable as NCHAR or NVARCHAR2, the specified length is the maximum number of characters it can store, rather than the number of bytes. The length semantics setting does not affect national character types. Example:

obclient> create table t1(id number not null primary key, s1 varchar2(16), s2 nvarchar2(16));
Query OK, 0 rows affected (0.05 sec)

obclient> insert into t1 values(1,'中中中','中中中中中中');
Query OK, 1 row affected (0.00 sec)

obclient> commit;
Query OK, 0 rows affected (0.00 sec)

obclient> select id,s1,dump(s1,16) s1_d, s2,dump(s2,16) s2_d from t1\G
*************************** 1. row ***************************
  ID: 1
  S1: 中中中
S1_D: Typ=22 Len=6: d6,d0,d6,d0,d6,d0
  S2: 中中中中中中
S2_D: Typ=43 Len=12: 4e,2d,4e,2d,4e,2d,4e,2d,4e,2d,4e,2d
1 row in set (0.00 sec)
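The hexadecimal byte lists in the S1_D and S2_D columns can be decoded outside the database to see how many characters each value holds. The Python sketch below does this under two assumptions: the NVARCHAR2 bytes are big-endian UTF-16 (consistent with AL16UTF16), and the VARCHAR2 bytes are GBK, because d6,d0 is the GBK encoding of a single Chinese character. The text above does not state which character set the tenant used. The helper name decode_dump is hypothetical.

```python
# Byte lists copied from the DUMP output of the example query.
s1_dump = "d6,d0,d6,d0,d6,d0"
s2_dump = "4e,2d,4e,2d,4e,2d,4e,2d,4e,2d,4e,2d"

def decode_dump(dump, encoding):
    """Turn a comma-separated hex byte list into (raw bytes, decoded text)."""
    raw = bytes.fromhex(dump.replace(",", ""))
    return raw, raw.decode(encoding)

# Assumption: NVARCHAR2 data is stored as big-endian UTF-16.
s2_raw, s2_text = decode_dump(s2_dump, "utf-16-be")
# Assumption: the VARCHAR2 bytes in this example are GBK-encoded.
s1_raw, s1_text = decode_dump(s1_dump, "gbk")

print(len(s1_raw), len(s1_text))  # 6 3
print(len(s2_raw), len(s2_text))  # 12 6
```

Under these assumptions, the 12 bytes of the NVARCHAR2 value hold 6 characters, which fits within the declared length of 16 because that limit counts characters rather than bytes.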