Create a table|V4.1.0| docs|Distributed Database

Create a table

Last Updated：2023-08-04 08:22:14 Updated

This topic describes the requirements and examples for creating a table.

About tables

A database table is a collection of two-dimensional arrays that represent and store the relationships between different objects in a database.

Tables in OceanBase Database can be classified into partitioned tables and non-partitioned tables based on the distribution of data.

A partitioned table contains multiple partitions. When you create a partitioned table, you must specify at least the partitioning columns and related partition information.
A non-partitioned table contains only one partition.

Preparations

Before you create a table, make sure that:

You have logged on to a MySQL tenant of OceanBase Database. For more information about how to connect to OceanBase Database, see Overview.

Note

You can query the oceanbase.DBA_OB_TENANTS view in the sys tenant to confirm the mode of the tenant to which you have logged on.
You have the CREATE privilege. For more information about how to view your privileges, see View user privileges. If you do not have the required privileges, contact the administrator. For more information, see Modify user privileges.
A database has been created. For more information, see Create a database.

Create a table

Use the CREATE TABLE statement to create a table and complete the table creation based on the content of this topic.

Define the table name

When you create a table, you must define a name for it. Observe the following requirements when you define a table name:

In MySQL mode of OceanBase Database, the name of each table must be unique in the database.
Use table names with meanings and avoid such names as table1.

Example 1: Create a table that stores order information.

obclient> CREATE TABLE ordr (c1 INT, c2 VARCHAR(50));
Query OK, 0 rows affected

Define columns

In a database, a column stores the values of a specific attribute. The name of the attribute is used as the column name. Apart from the column name, information about a column also includes the data type and maximum length (precision).

Observe the following requirements when you define columns for a table:

Select a proper data type for the data to be stored in columns.

For more information about the data types supported in MySQL mode of OceanBase Database, see Overview of data types.
For string data, we recommend that you use a variable-length string data type and specify the maximum length. Make sure that the maximum length specified is greater than the maximum number of characters to store. This is to avoid character truncation when the maximum length is exceeded.
Confirm whether primary key columns need to be defined for the table based on the requirements for [primary key columns](# Define primary key columns).
Confirm whether other constraints need to be added to columns based on the requirements for [other constraints](# Define other column constraints).
We recommend that you specify a default value for a column with the NOT NULL constraint. For a date or time column, you can set the current date or time of the database as the default value.

Define primary key columns

A PRIMARY KEY constraint imposes a primary key value rule on a key, which can be a column or set of columns. This rule ensures that the values in the group of one or more columns subject to the constraint uniquely identify the row. You can define only one PRIMARY KEY constraint on a table. The values of the primary key column or columns can be used as the unique identifiers of their corresponding rows. Therefore, each row can be named by a primary key value.

To specify a column as a primary key column, add the PRIMARY KEY keyword to the definition of this column. To define a PRIMARY KEY constraint on multiple columns, add the definition of the PRIMARY KEY constraint after the list of columns in the CREATE TABLE statement.

Observe the following requirements when you define primary key columns:

We recommend that you define one primary key column for a table. Each database table can contain a single collection of primary keys.

In OceanBase Database, users are not forced to define primary key columns for tables. However, primary key columns can help uniquely determine each row in a table and ensure that no duplicate rows exist. If no column can be used as the primary key, the system generates an auto-increment column as the hidden primary key column after table creation. For more information about auto-increment columns, see Define an auto-increment column for a table.

OceanBase Database allows you to add a primary key column for an existing table whose primary keys were not specified when the table was created.
The values of the collection of primary key columns are unique in the table.
A table can contain a maximum of 64 primary key columns, and the total length of data in the primary key columns cannot exceed 16 KB.
The values of a primary key column cannot be null or empty. You must enter values to a primary key column.
We recommend that a PRIMARY KEY constraint be explicitly identified by its name. For example, name a PRIMARY KEY constraint "PK_xxx".

Example 2: Define a PRIMARY KEY constraint on multiple columns.

obclient> CREATE TABLE test(c1 INT, c2 INT, CONSTRAINT PK_c1_c2 PRIMARY KEY(c1, c2));
Query OK, 0 rows affected

obclient> desc test;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| c1    | int(11) | NO   | PRI | NULL    |       |
| c2    | int(11) | NO   | PRI | NULL    |       |
+-------+---------+------+-----+---------+-------+
2 rows in set

In this example, c1 and c2 are defined as primary key columns and a PRIMARY KEY constraint named PK_c1_c2 is defined for them. The values of the c1 and c2 columns cannot be NULL or duplicate.

After you define primary key columns, you can delete them. For more information about operations related to PRIMARY KEY constraints, see Define column constraints.

Define other column constraints

Besides PRIMARY KEY constraints, OceanBase Database also supports NOT NULL, UNIQUE, FOREIGN KEY, and CHECK constraints. Constraints can simplify table queries, improve the query performance, and ensure that data is semantically valid.

The following describes constraints of different types:

NOT NULL constraint: The constrained column must not contain NULL values.

For a column with the NOT NULL constraint, you must specify the values for the column in the INSERT statement unless you have defined a non-null default value for the column.
UNIQUE constraint: The constrained column must not contain duplicate values. NULL values are not prohibited by this constraint type.
FOREIGN KEY constraint: The constrained column must contain only values from the primary key column of another table.

If you do not specify the constraint name when you create a FOREIGN KEY constraint, the system automatically allocates a constraint name in the table name_OBFK_creation timestamp format, for example, t1_OBFK_1627747200000000.

By default, OceanBase Database checks foreign keys. To disable or enable foreign key check, modify the foreign_key_checks tenant variable. For more information about the foreign_key_checks variable, see foreign_key_checks.
CHECK constraints: The values of a column in the table must conform to specified conditions.

You can define one or more CHECK constraints for a single column so that only specified values are allowed for the column. You can also define table-level CHECK constraints to apply the constraints to multiple columns. When you modify a table name, the CHECK constraint name remains unchanged. When you delete a table, the CHECK constraints applied to the table are also deleted.

If you do not specify the constraint name when you create a CHECK constraint, the system automatically allocates a constraint name in the table name_OBCHECK_creation timestamp format, for example: t1_OBCHECK_1629350823880271.

To constrain a single column, add a constraint keyword to the definition of this column. To constrain multiple columns, add the entire constraint definition after the list of columns in the CREATE TABLE statement.

Observe the following requirements when you define other column constraints:

We recommend that you add a NOT NULL constraint to fields without NULL values.
To reference the values in another table, use FOREIGN KEY constraints.
Composite primary keys cannot be used as foreign keys.
To avoid duplicate values in columns, use UNIQUE constraints.
Except NOT NULL constraints, we recommend that other constraints be explicitly identified by their names. For example, name UNIQUE constraints in the "UNI_xxx" format and FOREIGN KEY constraints in the "FK_xxx" format.

Example 3: Create a table named tbl1 and add a NOT NULL constraint on the col1 column.

obclient> CREATE TABLE tbl1(col1 INT NOT NULL,col2 INT);
Query OK, 0 rows affected

obclient> DESC tbl1;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| col1  | int(11) | NO   |     | NULL    |       |
| col2  | int(11) | YES  |     | NULL    |       |
+-------+---------+------+-----+---------+-------+
2 rows in set

In this example, when you insert data later, the values inserted into the col1 column cannot be NULL.

Example 4: Create a table named tbl2 and add a UNIQUE constraint on the col2 column.

obclient> CREATE TABLE tbl2(col1 INT UNIQUE,col2 INT);
Query OK, 0 rows affected

obclient> desc tbl2;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| col1  | int(11) | YES  | UNI | NULL    |       |
| col2  | int(11) | YES  |     | NULL    |       |
+-------+---------+------+-----+---------+-------+
2 rows in set

In this example, the col2 column must not contain duplicate values.

Example 5: Create a FOREIGN KEY constraint.

obclient> CREATE TABLE test(c1 INT, c2 INT, CONSTRAINT PK_c1 PRIMARY KEY(c1));
Query OK, 0 rows affected

obclient> CREATE TABLE tbl3(col1 INT PRIMARY KEY,col2 INT,CONSTRAINT FK_col2 FOREIGN KEY(col2) REFERENCES test(c1));
Query OK, 0 rows affected

obclient> SELECT * FROM information_schema.TABLE_CONSTRAINTS;
+--------------------+-------------------+-------------------------------+--------------+------------+-----------------+----------+
| CONSTRAINT_CATALOG | CONSTRAINT_SCHEMA | CONSTRAINT_NAME               | TABLE_SCHEMA | TABLE_NAME | CONSTRAINT_TYPE | ENFORCED |
+--------------------+-------------------+-------------------------------+--------------+------------+-----------------+----------+
| def                | xxx               | PRIMARY                       | xxx          | test       | PRIMARY KEY     | YES      |
| def                | xxx               | PRIMARY                       | xxx          | tbl3       | PRIMARY KEY     | YES      |
| def                | xxx               | FK_col2                       | xxx          | tbl3       | FOREIGN KEY     | YES      |
+--------------------+-------------------+-------------------------------+--------------+------------+-----------------+----------+
3 rows in set

In this example, the col2 column in the tbl3 table is associated with the c1 primary key column in the test table. After you create the constraint, you can query it in the information_schema.TABLE_CONSTRAINTS view.

Example 6: Create a table named tbl4 and add a CHECK constraint on the col1 column to specify that the values of this column be greater than 10.

obclient> CREATE TABLE tbl4(col1 INT CHECK(col1>10),col2 INT);
Query OK, 0 rows affected

obclient> INSERT INTO tbl4 VALUES(2,2);
ERROR 3819 (HY000): check constraint violated

obclient> INSERT INTO tbl4 VALUES(11,2);
Query OK, 1 row affected

In this example, after the CHECK constraint is added on the col1 column, an error is returned if any value inserted into the col1 column is not greater than 10.

For more information about how to define other column constraints, see Define column constraints.

Define an auto-increment column for a table

When you create a table in OceanBase Database, to ensure that the values of a numeric column are unique and incremental, you can set the type of the column to AUTO_INCREMENT, which indicates an auto-increment column.

An auto-increment column has three important attributes: start value, increment, and number of cached auto-increment values, which are respectively specified by using the following three tenant variables: auto_increment_offset, auto_increment_increment, and auto_increment_cache_size.

Variable	Description
auto_increment_cache_size	The number of cached auto-increment values. Value range: [1,100000000]. Default value: `1000000`. It is a global variable.
auto_increment_increment	The increment. Value range: [1, 65535]. Default value: `1`. It is a session-level variable.
auto_increment_offset	The start value of the auto-increment column. Value range: [1,65535]. Default value: `1`. It is a session-level variable.

You can change the values of the three variables as needed. For more information about how to change the values of system variables, see Overview of system variables.

Observe the following requirements when you define an auto-increment column:

AUTO_INCREMENT is an attribute of a data column that applies only to integer data columns.
AUTO_INCREMENT data columns must be configured with the NOT NULL attribute.
When you create a partitioned table in OceanBase Database, if an auto-increment column is used as a partitioning key, the values of the auto-increment column are globally unique but may not increment in a partition.

Assume that after you create an auto-increment column, you specify a value of the auto-increment column when you use the INSERT statement to insert data and the system variable SQL_MODE is not set to NO_AUTO_VALUE_ON_ZERO. If the specified value is 0, the system will insert the next value of the auto-increment column into the column. If the specified value is smaller than the current maximum value, the calculation of the next value of the auto-increment column is not affected. If the specified value is greater than the current maximum value, the auto-increment column uses the sum of the inserted value and the cached auto-increment value as the start value of the next increment operation.

Note

If the system variable `SQL_MODE` is set to `NO_AUTO_VALUE_ON_ZERO`, auto-increment values do not need to be generated for a column into which the value 0 is inserted.

Example 7: Create a table with an auto-increment column.

obclient> CREATE TABLE personal_info(id bigint NOT NULL AUTO_INCREMENT PRIMARY KEY, name varchar(50), gmt_create timestamp NOT NULL default current_timestamp);
Query OK, 0 rows affected

In this example, the id column is an auto-increment column. When you use the INSERT statement to insert data into this column, you do not need to specify values for the auto-increment column. The system automatically inserts values into this column.

obclient> INSERT INTO personal_info(name) VALUES('A'),('B'),('C');
Query OK, 3 rows affected
Records: 3  Duplicates: 0  Warnings: 0

obclient> SELECT * FROM personal_info;
+----+------+---------------------+
| id | name | gmt_create          |
+----+------+---------------------+
|  1 | A    | 2020-04-03 17:09:55 |
|  2 | B    | 2020-04-03 17:09:55 |
|  3 | C    | 2020-04-03 17:09:55 |
+----+------+---------------------+
3 rows in set

For more information about auto-increment columns, see Define an auto-increment column.

Select a partitioning strategy

When you create a table, you must clarify its partitioning strategy. If the table contains a large amount of data, we recommend that you partition the table. You must select a proper partitioning method based on the data to be stored in the table.

In MySQL mode of OceanBase Database, the following partitioning methods are supported:

RANGE and RANGE COLUMNS partitioning
LIST and LIST COLUMNS partitioning
HASH and KEY partitioning
Composite partitioning

Based on the partitioning dimension, partitioned tables are classified into partitioned tables and subpartitioned tables. Subpartitioned tables are secondary partitioned tables. Therefore, a partitioned table has one partitioning key and a subpartitioned table has two partitioning keys. A subpartitioned table can apply different partitioning strategies. In OceanBase Database, subpartitioned tables are classified into template-based subpartitioned tables and non-template-based subpartitioned tables.

RANGE and RANGE COLUMNS partitioning

In RANGE and RANGE COLUMNS partitioning, tables are partitioned based on the value range of the partitioning key of each partition. These partitioning methods apply when you need to perform range queries on the partitioning keys, for example, when partitions are defined based on the time field or price range.

Differences between RANGE partitioning and RANGE COLUMNS partitioning:

In RANGE partitioning, the partitioning key must be of the INT data type. For example, to partition a table based on the date field, you must use the YEAR() function to convert the date field into the INT type. In RANGE COLUMNS partitioning, the partitioning key does not need to be of an INT type. It can be of any data type.
In RANGE partitioning, the partitioning key can be an expression but not a column (column vector). For example, the partitioning key cannot be partition by range(c1, c2). In RANGE COLUMNS partitioning, the partitioning key cannot be an expression but can be multiple columns (column vectors).

In both partitioning methods, the VALUES LESS THAN(value) keyword is used to define each partition, where the values of the value field must be a group of consecutively incrementing values.

Example 8: Create a RANGE COLUMNS-partitioned table.

obclient> CREATE TABLE tb1_rc(col1 INT,col2 INT)
       PARTITION BY RANGE COLUMNS(col1)
        (PARTITION p0 VALUES LESS THAN(100),
         PARTITION p1 VALUES LESS THAN(200),
         PARTITION p2 VALUES LESS THAN(300)
        );
Query OK, 0 rows affected

In this example, the partitioning key can be of any data type. Therefore, the col1 column is used as the partitioning key. The tb1_rc table is partitioned based on the value ranges defined by 100, 200, and 300. p0, p1, and p2 are the partition names. Partition names must be unique in a table.

LIST and LIST COLUMNS partitioning

In LIST partitioning, a table is partitioned based on specific numeric values and the numeric values in the partitions do not overlap. The advantage of LIST partitioning is that you can partition unordered and unrelated data.

If you want to use multiple columns or a column of other data types as the partitioning key, you can choose LIST COLUMNS partitioning. LIST COLUMNS partitioning is a variant of LIST partitioning. During LIST COLUMNS partitioning, you can specify multiple partitioning keys. The partitioning keys can be of the following data types: INT, DATE, and DATETIME.

Differences between LIST partitioning and LIST COLUMNS partitioning:

The partitioning key must be of the INT data type in LIST partitioning, and can be of any type in LIST COLUMNS partitioning.
LIST partitioning supports only one partitioning key, which can be a column or an expression. In LIST COLUMNS partitioning, the partitioning key can be multiple columns (column vectors) but not an expression.

In both LIST and LIST COLUMNS partitioning, the VALUES IN(value_list) keyword is used to define each partition.

Example 9: Create a LIST-partitioned table.

obclient> CREATE TABLE tbl2_l (col1 INT,col2 DATE)
       PARTITION BY LIST(col1)
        (PARTITION p0 VALUES IN (100),
         PARTITION p1 VALUES IN (200)
        );
Query OK, 0 rows affected

In this example, the partitioning key in LIST partitioning must be of an INT type. Therefore, the col1 column is used as the partitioning key. The tbl2_l table is partitioned based on the value ranges defined by 100 and 200.

HASH and KEY partitioning

In HASH partitioning, you must specify the partitioning keys and the number of partitions. The system obtains an integer based on the HASH partitioning expression. Then the system performs a modulo operation on this integer and the number of partitions to determine the partition to which a specific row belongs.

KEY partitioning is similar to HASH partitioning. In both partitioning strategies, the partition to which a specific row belongs is determined by performing a modulo operation on the number of partitions. The difference is that in KEY partitioning, the system constructs a default internal hash function for the partitioning key and then performs the modulo operation. Therefore, you cannot determine the partition to which a row belongs through simple calculation.

Difference between KEY partitioning and HASH partitioning:

In HASH partitioning, the partitioning key must be of the INT data type. KEY partitioning does not have this requirement and the partitioning key can be a string column.
In HASH partitioning, the partitioning key can be an expression. In KEY partitioning, the partitioning key cannot be an expression.

Example 10: Create a HASH-partitioned table named tbl1_h.

obclient> CREATE TABLE tbl3_h(col1 INT,col2 VARCHAR(50))
    PARTITION BY HASH(col1) PARTITIONS 2;
Query OK, 0 rows affected

In this example, HASH partitioning requires the partitioning key to be of the INT data type. Therefore, the col1 column is used as the partitioning key to partition the tbl3_h table into two partitions. Partition names are not specified when the table is created. Therefore, the partitions are named by the system based on the naming rules. To be specific, the partitions are named as p0, p1, ..., and pn.

Composite partitioning (subpartitioning)

Composite partitioning partitions a table using one partitioning strategy and partitions each partition using a different partitioning strategy. It is suitable for business tables containing large amounts of data. Composite partitioning gives full play to the advantages of the two partitioning strategies that you use in combination.

Any of RANGE, RANGE COLUMNS, LIST, LIST COLUMNS, HASH, and KEY can be used as the subpartitioning strategy of a composite partitioned table. In OceanBase Database, subpartitioned tables are classified into template-based subpartitioned tables and non-template-based subpartitioned tables.

The following examples describe how to create a subpartitioned table. For more examples, see Create a partitioned table.

Example 11: Create a RANGE COLUMNS-RANGE-partitioned table by using a template.

obclient> CREATE TABLE tb1_m_rcr(col1 INT,col2 INT)
         PARTITION BY RANGE COLUMNS(col1)
         SUBPARTITION BY RANGE(col2)
         SUBPARTITION TEMPLATE
          (SUBPARTITION mp0 VALUES LESS THAN(3),
           SUBPARTITION mp1 VALUES LESS THAN(6),
           SUBPARTITION mp2 VALUES LESS THAN(9)
          )
          (PARTITION p0 VALUES LESS THAN(100),
           PARTITION p1 VALUES LESS THAN(200),
           PARTITION p2 VALUES LESS THAN(300)
          );
Query OK, 0 rows affected

obclient> SELECT table_name,partition_name,subpartition_name FROM information_schema.partitions;
+------------+----------------+-------------------+
| table_name | partition_name | subpartition_name |
+------------+----------------+-------------------+
| tb1_m_rcr  | p0             | p0smp0            |
| tb1_m_rcr  | p0             | p0smp1            |
| tb1_m_rcr  | p0             | p0smp2            |
| tb1_m_rcr  | p1             | p1smp0            |
| tb1_m_rcr  | p1             | p1smp1            |
| tb1_m_rcr  | p1             | p1smp2            |
| tb1_m_rcr  | p2             | p2smp0            |
| tb1_m_rcr  | p2             | p2smp1            |
| tb1_m_rcr  | p2             | p2smp2            |
+------------+----------------+-------------------+
9 rows in set

In this example, the partitioning key in RANGE partitioning must be of the INT data type. The col2 column can be used as the subpartitioning key, and the SUBPARTITION TEMPLATE keyword is used to create the subpartitioned table by using a template. When you use a template to create a subpartitioned table, the template defines subpartitions under each partition. In this case, subpartitions under each partition share the same definition. In this example, the table is partitioned by using RANGE partitioning and then subpartitioned by using LIST partitioning.

When you create a subpartitioned table by using a template, you do not need to separately specify the name of each subpartition after the subpartition definition is completed. The system names the subpartitions in the ($part_name)s($subpart_name) format based on the naming rules. In this example, the subpartitions in the p0 partition are named as p0smp0, p0smp1, and p0smp2.

Notice

If you use HASH or KEY partitioning to subpartition a table by specifying the number of subpartitions (such as SUBPARTITIONS 5), you do not need to specify the SUBPARTITION TEMPLATE keyword when you create a subpartitioned table by using a template.

Example 12: Create a LIST-LIST COLUMNS-subpartitioned table without using a template.

obclient> CREATE TABLE tbl2_f_llc(col1 INT,col2 DATE)
       PARTITION BY LIST(col1)
       SUBPARTITION BY LIST COLUMNS(col2)
        (PARTITION p0 VALUES IN(100)
           (SUBPARTITION sp0 VALUES IN('2021/04/01'),
            SUBPARTITION sp1 VALUES IN('2021/07/01'),
            SUBPARTITION sp2 VALUES IN('2021/10/01'),
            SUBPARTITION sp3 VALUES IN('2022/01/01')
           ),
         PARTITION p1 VALUES IN(200)
           (SUBPARTITION sp4 VALUES IN('2021/04/01'),
            SUBPARTITION sp5 VALUES IN('2021/07/01'),
            SUBPARTITION sp6 VALUES IN('2021/10/01'),
            SUBPARTITION sp7 VALUES IN('2022/01/01')
            )
         );
Query OK, 0 rows affected

In this example, a non-template-based subpartitioned table is created. You must define the subpartitions in each partition. The definitions of subpartitions in different partitions can be different or identical.

Example 13: Create a HASH-KEY-subpartitioned table without using a template.

obclient> CREATE TABLE tbl3_f_hk (col1 INT,col2 VARCHAR(50))
     PARTITION BY HASH(col1)
     SUBPARTITION BY KEY(col2)
      (PARTITION p1
       (SUBPARTITION sp0
       ,SUBPARTITION sp1
       ,SUBPARTITION sp2
       ,SUBPARTITION sp3
       ),
       PARTITION p2
       (SUBPARTITION sp4
       ,SUBPARTITION sp5
       ,SUBPARTITION sp6
       ,SUBPARTITION sp7
       )
      );
Query OK, 0 rows affected

In this example, KEY partitioning supports a string column as the partitioning key. Therefore, you can use the col2 column as the subpartitioning key. sp0, ..., and sp7 are subpartition names.

Create a table

About tables

Preparations

Create a table

Define the table name

Define columns

Define primary key columns

Define other column constraints

Define an auto-increment column for a table

Note

Select a partitioning strategy

Notice

What to do next