The NCHAR data type stores fixed-length Unicode character data. The national character set, which is specified when the database is created, determines the maximum column length. When you create a table that contains an NCHAR column, you define the column length in characters: the width specification of an NCHAR column always indicates a number of characters. The maximum column length is 2,000 bytes.
If you want to use less space to store Chinese characters, choose the NCHAR data type.
When you use an NCHAR column to store values, the database automatically pads the values that are shorter than the specified length with spaces to the specified length. When you specify lengths, CHAR is used as the unit of measurement. You cannot specify other units.
Notice
You cannot insert a CHAR value into an NCHAR column or insert an NCHAR value into a CHAR column.
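
The blank-padding rule for fixed-length values can be illustrated with a small Python sketch. This is only a simulation of the behavior described in this topic, not database code: `pad_nchar` is a hypothetical helper, and raising an error for over-long values is an assumption made for the sketch, because this topic does not describe how the database handles them.

```python
def pad_nchar(value: str, size: int = 1) -> str:
    """Simulate fixed-length storage: right-pad with spaces up to `size` characters."""
    if len(value) > size:
        # Assumption for this sketch only: over-long values are rejected.
        raise ValueError(f"value has {len(value)} characters, column holds {size}")
    return value.ljust(size)


stored = pad_nchar("ab", 5)
print(repr(stored))  # 'ab   '
print(len(stored))   # 5

# The length is counted in characters, so two Chinese characters
# ("\u4e2d\u6587") are padded with three spaces as well.
print(len(pad_nchar("\u4e2d\u6587", 5)))  # 5
```

A value that already has the full length is stored unchanged, so `pad_nchar("abcde", 5)` returns the input as is.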
Syntax
NCHAR[(size)]
Parameters
| Parameter | Description |
|---|---|
| size | The length of a fixed-length character string. Based on the national character set, the maximum length is set to 2,000 bytes. By default, the minimum length of a fixed-length character string is one character. |
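
Because size is given in characters while the limit is given in bytes, the largest usable size depends on how many bytes the national character set needs for each character. The following sketch does the arithmetic under stated assumptions: the widths of 2 and 3 bytes per character are hypothetical examples, and `max_chars` is an illustrative helper, not a database function.

```python
MAX_NCHAR_BYTES = 2000  # byte limit stated in this topic


def max_chars(bytes_per_char: int) -> int:
    """Largest character count that fits in the byte limit for a given width."""
    return MAX_NCHAR_BYTES // bytes_per_char


print(max_chars(2))  # 1000 (assumed 2 bytes per character)
print(max_chars(3))  # 666  (assumed worst case of 3 bytes per character)
```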
More information
Unicode character set
Unicode is a character set that assigns a code point to each character. Code points can be serialized with several encoding methods, including UTF-8, UTF-16, and UTF-32. The encoding method determines how many bytes are required to store a character, so Chinese characters and English characters can take up different amounts of space depending on the encoding method that is used.
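
The effect of the encoding method on storage size can be checked with Python's built-in codecs. The sketch below is an illustration outside the database: `encoded_sizes` is a hypothetical helper, and the three sample characters were chosen for this example.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes needed to encode `ch` in each encoding."""
    # The "-be" codecs are used so that no byte order mark is prepended.
    return {
        "UTF-8": len(ch.encode("utf-8")),
        "UTF-16": len(ch.encode("utf-16-be")),
        "UTF-32": len(ch.encode("utf-32-be")),
    }


samples = [
    "A",           # U+0041, an ASCII letter
    "\u4e2d",      # U+4E2D, a common Chinese character
    "\U0001F600",  # U+1F600, a code point in a supplementary plane
]
for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
# U+0041 {'UTF-8': 1, 'UTF-16': 2, 'UTF-32': 4}
# U+4E2D {'UTF-8': 3, 'UTF-16': 2, 'UTF-32': 4}
# U+1F600 {'UTF-8': 4, 'UTF-16': 4, 'UTF-32': 4}
```

In this sample, the Chinese character needs three bytes in UTF-8 but only two in UTF-16, while the ASCII letter needs one byte in UTF-8 and two in UTF-16.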
Comparison of three encoding methods
| Encoding method | Number of bytes for encoding characters | BOM | Advantage | Disadvantage |
|---|---|---|---|---|
| UTF-8 | A variable-length encoding method that provides single-byte encoding for ASCII characters and multibyte encoding for non-ASCII characters. The minimum code unit is eight bits. | Optional: no BOM is required. If a byte stream starts with EF BB BF at the beginning of a text, the text is encoded in UTF-8. | An ideal Unicode encoding method: This method is fully compatible with ASCII encoding, requires no BOM, features strong self-synchronization and error correction capabilities for network transmission and communication, and provides high scalability. | The variable-length encoding makes internal processing of the program more difficult. |
| UTF-16 | Two or four bytes. The minimum code unit is 16 bits. | With BOM: UTF-16LE (little-endian) is represented by FF FE, and UTF-16BE (big-endian) is represented by FE FF. | The earliest Unicode encoding method that has been applied to various scenarios. This method is suitable for Unicode processing in memory, and is used to encode strings in APIs across multiple programming languages. | This method is not compatible with ASCII encoding, and has poor scalability. The encoding is complex when surrogate pairs are used to encode code points in the supplementary planes. |
| UTF-32 | A fixed length of four bytes. The minimum code unit is 32 bits. | With BOM: UTF-32LE (little-endian) is represented by FF FE 00 00, and UTF-32BE (big-endian) is represented by 00 00 FE FF. | A fixed-length encoding that is easy to read and easy for programs to process internally. This method provides a one-to-one mapping between Unicode code points and code units. | All characters are encoded in a fixed length of four bytes, which wastes storage space and bandwidth. This method is not compatible with ASCII encoding, has poor scalability, and is not used in most cases. |
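
A byte order mark (BOM) can be recognized by comparing the first bytes of a stream against the known marks. The sketch below uses the BOM constants from Python's standard `codecs` module. `detect_bom` is a hypothetical helper written for this illustration, and it is only a heuristic: a UTF-16LE text whose first character after the BOM is U+0000 would be reported as UTF-32LE.

```python
import codecs

# Longer marks are checked first, because the UTF-32LE mark (FF FE 00 00)
# begins with the UTF-16LE mark (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def detect_bom(data: bytes):
    """Return the encoding name indicated by a leading BOM, or None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


print(detect_bom("\u4e2d".encode("utf-8-sig")))                         # UTF-8
print(detect_bom(codecs.BOM_UTF16_LE + "\u4e2d".encode("utf-16-le")))   # UTF-16LE
print(detect_bom(b"plain ASCII"))                                       # None
```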
Database character set
- Used to store data of types such as CHAR, VARCHAR2, and CLOB
- Used to identify information such as table names, column names, and PL/SQL variables
- Used to store SQL and PL/SQL program units
National character set
- Used to store data of types such as NCHAR, NVARCHAR2, and NCLOB

The national character set is essentially an additional character set that is selected for ApsaraDB for OceanBase. It is mainly used to enhance the character processing capability of ApsaraDB for OceanBase. The CHAR data type uses the database character set, whereas the NCHAR data type uses the national character set and therefore provides an alternative to the database character set.