This topic describes how to create a table with some examples.
Overview
A table is a collection of two-dimensional arrays used to represent and store relationships between data objects. Through the proper design and use of database tables, data reliability, consistency, and query performance can be enhanced, effectively managing and utilizing the data within the database.
For more information about tables in OceanBase Database, see Overview of tables.
Prerequisites
Before you create a table, make sure that:
You have deployed an OceanBase cluster and created a MySQL tenant. For more information about how to deploy an OceanBase cluster, see Deployment overview.
You have connected to the MySQL tenant of OceanBase Database. For more information about how to connect to OceanBase Database, see Overview of connection methods.
You have created a database. For more information about how to create a database, see Create a database.
You have the
CREATEprivilege. For information about how to view your privileges, see View user privileges. If you do not have the privilege, contact the administrator. For more information about how to grant privileges, see Grant direct privileges.
Create a table by using a statement
You can use the CREATE TABLE statement to create a table.
Note
You can use the SHOW TABLES; statement to query information about tables in a database.
Define the table name
When you create a table, you must define a name for it. Observe the following requirements when you define a table name:
In the MySQL mode of OceanBase Database, the name of each table must be unique in the database.
The table name cannot exceed 64 characters in length.
We recommend that you define a meaningful table name and avoid such table names as
t1andtable1. For more information about the table naming conventions, see Table naming conventions.
Example 1: Create a table that stores order information.
Notice
Since no column information is added, the following SQL statement cannot be executed at the moment.
CREATE TABLE orders (...);
Define columns
In a database, a column stores the values of a specific attribute. The name of the attribute is used as the column name. Apart from the column name, information about a column also includes the data type and maximum length (precision).
Observe the following requirements when you define columns for a table:
Select a proper data type for the data to be stored in columns.
For more information about the data types supported in the MySQL mode of OceanBase Database, see Overview of data types.
For string data, we recommend that you use a variable-length string data type and specify the maximum length. Make sure that the maximum length specified is greater than the maximum number of characters to store. This is to avoid character truncation when the maximum length is exceeded.
Determine whether to define the
PRIMARY KEYconstraint on specific columns based on the requirements for thePRIMARY KEYconstraint.Determine whether to define other constraints on specific columns based on the requirement for other constraints.
When a column has the
NOT NULLconstraint, we recommend that you set a default value for the column. If the column is of a date or time data type, you can set the default value to the current time of the database.
Define a primary key column
A PRIMARY KEY constraint imposes a primary key value rule on a key, which can be a column or a set of columns. This rule ensures that each data row in a table can be uniquely identified by a key value. Only one PRIMARY KEY constraint can be defined on each database table. The values of the columns that make up this constraint (one or more columns) can serve as unique identifiers for each row, effectively naming each data row with this primary key value.
To set a column as a primary key column, append the PRIMARY KEY keyword to the definition of the column. To define a PRIMARY KEY constraint on multiple columns, add the PRIMARY KEY constraint after the list of all target columns in the CREATE TABLE statement.
Observe the following requirements when you define a primary key column:
We recommend that you define a primary key for each table. Each database table can have one primary key column set.
Although you are not forced to define a primary key for a table in OceanBase Database, we recommend that you do so to ensure that each row can be uniquely identified and that no duplicate rows exist in the table. If no existing columns can be used as the primary key, the system generates an auto-increment column as the hidden primary key after table creation. For more information about auto-increment columns, see Define an auto-increment column.
If you do not define a primary key column for a table when you create the table, you can add a primary key column to the table later.
The values of the primary key column set are unique in the table.
A primary key can contain at most 64 columns, and the total length of primary key data cannot exceed 16 KB.
A primary key cannot contain a null or empty value.
We recommend that you explicitly specify the name for a
PRIMARY KEYconstraint. For example, you can name thePRIMARY KEYconstraint "PK_xxx".
For more information about the PRIMARY KEY constraint, see PRIMARY KEY constraints.
Example 2: Define a PRIMARY KEY constraint on multiple columns.
obclient> CREATE TABLE test(c1 INT, c2 INT, CONSTRAINT PK_c1_c2 PRIMARY KEY(c1, c2));
Query OK, 0 rows affected
obclient> desc test;
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| c1 | int(11) | NO | PRI | NULL | |
| c2 | int(11) | NO | PRI | NULL | |
+-------+---------+------+-----+---------+-------+
2 rows in set
In this example, columns c1 and c2 are defined as primary key columns, and the constraint name is PK_c1_c2. Values in the c1 and c2 columns cannot be NULL or duplicate.
After you define a PRIMARY KEY constraint, you can drop it. For more information, see Define column constraints.
Define other column constraints
In addition to the PRIMARY KEY constraint, OceanBase Database also supports the NOT NULL, UNIQUE, FOREIGN KEY, and CHECK constraints. The constraints can simplify table queries, improve query performance, and ensure semantic validity of data.
Constraints of different types are described as follows:
NOT NULL: the constrained column cannot containNULLvalues.For a column with the
NOT NULLconstraint, you must specify the value in anINSERTstatement unless you have defined a default value for the column.UNIQUE: the constrained column cannot contain duplicate values, but can contain more than oneNULLvalue.FOREIGN KEY: the values in the constrained column come from a primary key column of another table.If you do not specify the foreign key name when you create a
FOREIGN KEYconstraint, the system automatically assigns a constraint name in the format oftable name_OBFK_creation timestamp, such ast1_OBFK_1627747200000000.By default, OceanBase Database checks foreign keys. To disable or enable foreign key check, modify the
foreign_key_checkstenant variable. For more information about theforeign_key_checksvariable, see foreign_key_checks.CHECK: the constrained column must meet specified conditions.You can define one or more
CHECKconstraints for a single column so that only specified values are allowed for the column. You can also define table-levelCHECKconstraints to apply the constraints to multiple columns. When you modify a table name, theCHECKconstraint name remains unchanged. When you drop a table, theCHECKconstraints applied to the table are also dropped.If you do not specify the constraint name when you create a
CHECKconstraint, the system automatically assigns a constraint name in the format oftable name_OBCHECK_creation timestamp, such ast1_OBCHECK_1629350823880271.
To constrain a single column, add a constraint keyword to the definition of the column. To constrain multiple columns, add the definition of the constraint to the end of the list of all columns in the CREATE TABLE statement.
Observe the following requirements when you define additional column constraints:
We recommend that you add the
NOT NULLconstraint to columns that do not contain NULL values.If you want to reference a value from another table, use the
FOREIGN KEYconstraint.Composite primary keys cannot be used as foreign keys.
If you want to avoid duplicate values in a column, use the
UNIQUEconstraint.We recommend that you specify a name for each constraint, except
NOT NULL. For example, you can name aUNIQUEconstraint "UNI_xxx" and aFOREIGN KEYconstraint "FK_xxx".
Example 3: Create a table named tbl1 and set a NOT NULL constraint on the col1 column.
obclient> CREATE TABLE tbl1(col1 INT NOT NULL,col2 INT);
Query OK, 0 rows affected
obclient> DESC tbl1;
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| col1 | int(11) | NO | | NULL | |
| col2 | int(11) | YES | | NULL | |
+-------+---------+------+-----+---------+-------+
2 rows in set
In this example, values subsequently inserted into the col1 column cannot be NULL.
Example 4: Create a table named tbl2 and define a UNIQUE constraint on the col1 column.
obclient> CREATE TABLE tbl2(col1 INT UNIQUE,col2 INT);
Query OK, 0 rows affected
obclient> desc tbl2;
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| col1 | int(11) | YES | UNI | NULL | |
| col2 | int(11) | YES | | NULL | |
+-------+---------+------+-----+---------+-------+
2 rows in set
In this example, the col1 column must not contain duplicate values.
Example 5: Set a FOREIGN KEY constraint on a column of the tbl3 table.
obclient> CREATE TABLE test(c1 INT, c2 INT, CONSTRAINT PK_c1 PRIMARY KEY(c1));
Query OK, 0 rows affected
obclient> CREATE TABLE tbl3(col1 INT PRIMARY KEY,col2 INT,CONSTRAINT FK_col2 FOREIGN KEY(col2) REFERENCES test(c1));
Query OK, 0 rows affected
obclient> SELECT * FROM information_schema.TABLE_CONSTRAINTS;
+--------------------+-------------------+-------------------------------+--------------+------------+-----------------+----------+
| CONSTRAINT_CATALOG | CONSTRAINT_SCHEMA | CONSTRAINT_NAME | TABLE_SCHEMA | TABLE_NAME | CONSTRAINT_TYPE | ENFORCED |
+--------------------+-------------------+-------------------------------+--------------+------------+-----------------+----------+
| def | xxx | PRIMARY | xxx | test | PRIMARY KEY | YES |
| def | xxx | PRIMARY | xxx | tbl3 | PRIMARY KEY | YES |
| def | xxx | FK_col2 | xxx | tbl3 | FOREIGN KEY | YES |
+--------------------+-------------------+-------------------------------+--------------+------------+-----------------+----------+
3 rows in set
In this example, the col2 column in the tbl3 table is associated with the c1 primary key column in the test table. You can view the constraint in the information_schema.TABLE_CONSTRAINTS view.
Example 6: Create a table named tbl4 and add a CHECK constraint on the col1 column to specify that the values of this column must be greater than 10.
obclient> CREATE TABLE tbl4(col1 INT CHECK(col1>10),col2 INT);
Query OK, 0 rows affected
obclient> INSERT INTO tbl4 VALUES(2,2);
ERROR 3819 (HY000): check constraint violated
obclient> INSERT INTO tbl4 VALUES(11,2);
Query OK, 1 row affected
In this example, after the CHECK constraint is added on the col1 column, an error is returned if any value inserted into the col1 column is not greater than 10.
For more information about how to define other column constraints, see Define column constraints.
Define an auto-increment column for a table
When you create a table in OceanBase Database, to ensure that the values of a numeric column are unique and incremental, you can set the type of the column to AUTO_INCREMENT, which indicates an auto-increment column.
An auto-increment column has three important attributes: start value, increment, and number of cached auto-increment values, which are respectively specified by the auto_increment_cache_size, auto_increment_increment, and auto_increment_offset tenant variables.
| Variable | Description |
|---|---|
| auto_increment_cache_size | The cache size. Value range: [1, 100000000]. Default value: 1000000. This variable takes effect globally. |
| auto_increment_increment | The increment step. Value range: [1, 65535]. Default value: 1. This variable takes effect at the session level. |
| auto_increment_offset | The start value of the AUTO_INCREMENT column. Value range: [1, 65535]. Default value: 1. This variable takes effect at the session level. |
You can change the values of the three variables as needed. For more information about how to change the values of system variables, see Overview.
Observe the following requirements when you define an auto-increment column:
AUTO_INCREMENTis an attribute of a data column that applies only to integer data columns.AUTO_INCREMENTdata columns must be configured with theNOT NULLattribute.When you create a partitioned table in OceanBase Database, if an auto-increment column is used as a partitioning key, the values of the auto-increment column are globally unique but may not increment in a partition.
Assume that after you create an auto-increment column, you specify a value for the auto-increment column when you use the INSERT statement to insert data and the system variable SQL_MODE is not set to NO_AUTO_VALUE_ON_ZERO. If the specified value is 0, the system will insert the next value of the auto-increment column into the column. If the specified value is smaller than the current maximum value, the calculation of the next value of the auto-increment column is not affected. If the specified value is greater than the current maximum value, the auto-increment column uses the sum of the inserted value and the cached auto-increment value as the start value of the next increment operation.
Note
If the system variable SQL_MODE is set to NO_AUTO_VALUE_ON_ZERO, auto-increment values do not need to be generated for a column into which the value 0 is inserted.
Example 7: Create a table with an auto-increment column.
obclient> CREATE TABLE personal_info(id bigint NOT NULL AUTO_INCREMENT PRIMARY KEY, name varchar(50), gmt_create timestamp NOT NULL default current_timestamp);
Query OK, 0 rows affected
In this example, the id column is an auto-increment column. When you use the INSERT statement to insert data into this column, you do not need to specify values for the auto-increment column. The system automatically inserts values into this column.
obclient> INSERT INTO personal_info(name) VALUES('A'),('B'),('C');
Query OK, 3 rows affected
Records: 3 Duplicates: 0 Warnings: 0
obclient> SELECT * FROM personal_info;
+----+------+---------------------+
| id | name | gmt_create |
+----+------+---------------------+
| 1 | A | 2020-04-03 17:09:55 |
| 2 | B | 2020-04-03 17:09:55 |
| 3 | C | 2020-04-03 17:09:55 |
+----+------+---------------------+
3 rows in set
For more information about auto-increment columns, see Define an auto-increment column.
Select a partitioning strategy
When you create a table, you must clarify its partitioning strategy. If the table contains a large amount of data, we recommend that you partition the table. You must select a proper partitioning method based on the data to be stored in the table.
In the MySQL mode of OceanBase Database, the max_partition_num parameter specifies the maximum number of partitions supported for a single table of a tenant. The default value is 8192.
In the MySQL mode of OceanBase Database, the following partitioning methods are supported:
RANGE and RANGE COLUMNS partitioning
LIST and LIST COLUMNS partitioning
HASH and KEY partitioning
Composite partitioning
Based on the partitioning dimension, partitioned tables are classified into partitioned tables and subpartitioned tables. Subpartitioned tables are secondary partitioned tables. Therefore, a partitioned table has one partitioning key and a subpartitioned table has two partitioning keys. A subpartitioned table can apply different partitioning strategies. In OceanBase Database, subpartitioned tables are classified into template-based subpartitioned tables and non-template-based subpartitioned tables.
For more information about partitions, see Overview of partitions.
RANGE and RANGE COLUMNS partitioning
In RANGE and RANGE COLUMNS partitioning, tables are partitioned based on the value range of the partitioning key of each partition. These partitioning methods apply when you need to perform range queries on the partitioning keys, for example, when partitions are defined based on the time field or price range.
Differences between RANGE partitioning and RANGE COLUMNS partitioning are as follows:
In RANGE partitioning, the partitioning key must be of the INT data type. For example, to partition a table based on the date field, you must use the
YEAR()function to convert the date field into the INT type. In RANGE COLUMNS partitioning, the partitioning key does not need to be of the INT type. It can be of any data type.In RANGE partitioning, the partitioning key can be an expression but not a column (column vector). For example, the partitioning key cannot be
partition by range(c1, c2). In RANGE COLUMNS partitioning, the partitioning key cannot be an expression but can be multiple columns (column vectors).
In both RANGE and RANGE COLUMNS partitioning, the VALUES LESS THAN(value) keyword is used to define each partition. The values of value must be consecutively incrementing values.
Example 8: Create a RANGE COLUMNS-partitioned table.
obclient> CREATE TABLE tb1_rc(col1 INT,col2 INT)
PARTITION BY RANGE COLUMNS(col1)
(PARTITION p0 VALUES LESS THAN(100),
PARTITION p1 VALUES LESS THAN(200),
PARTITION p2 VALUES LESS THAN(300)
);
Query OK, 0 rows affected
In this example, the partitioning key can be of any data type. Therefore, the col1 column is used as the partitioning key. The tb1_rc table is partitioned based on the value ranges defined by 100, 200, and 300. p0, p1, and p2 are the partition names, which can be customized. Partition names must be unique in a table.
LIST and LIST COLUMNS partitioning
In LIST partitioning, a table is partitioned based on specific numeric values and the numeric values in the partitions do not overlap. The advantage of LIST partitioning is that you can partition unordered and unrelated data.
If you want to use multiple columns or a column of other data types as the partitioning key, you can choose LIST COLUMNS partitioning. LIST COLUMNS partitioning is a variant of LIST partitioning. During LIST COLUMNS partitioning, you can specify multiple partitioning keys. The partitioning keys can be of the following data types: INT, DATE, and DATETIME.
Differences between LIST partitioning and LIST COLUMNS partitioning are as follows:
The partitioning key must be of the INT data type in LIST partitioning, and can be of any type in LIST COLUMNS partitioning.
LIST partitioning supports only one partitioning key, which can be a column or an expression. In LIST COLUMNS partitioning, the partitioning key can be multiple columns (column vectors) but not an expression.
In both LIST and LIST COLUMNS partitioning, the VALUES IN(value_list) keyword is used to define each partition.
Example 9: Create a LIST-partitioned table.
obclient> CREATE TABLE tbl2_l (col1 INT,col2 DATE)
PARTITION BY LIST(col1)
(PARTITION p0 VALUES IN (100),
PARTITION p1 VALUES IN (200)
);
Query OK, 0 rows affected
In this example, the partitioning key in LIST partitioning must be of an INT type. Therefore, the col1 column is used as the partitioning key. The tbl2_l table is partitioned based on the value ranges defined by 100 and 200.
HASH and KEY partitioning
In HASH partitioning, you must specify the partitioning keys and the number of partitions. The system obtains an integer based on the HASH partitioning expression. Then the system performs a modulo operation on this integer and the number of partitions to determine the partition to which a specific row belongs.
KEY partitioning is similar to HASH partitioning. In both partitioning strategies, the partition to which a specific row belongs is determined by performing a modulo operation on the number of partitions. The difference is that in KEY partitioning, the system constructs a default internal hash function for the partitioning key and then performs the modulo operation. Therefore, you cannot determine the partition to which a row belongs through simple calculation.
Differences between KEY partitioning and HASH partitioning are as follows:
In HASH partitioning, the partitioning key must be of the INT data type. KEY partitioning does not have this requirement and the partitioning key can be a string column.
In HASH partitioning, the partitioning key can be an expression. In KEY partitioning, the partitioning key cannot be an expression.
Example 10: Create a HASH-partitioned table named tbl3_h.
obclient> CREATE TABLE tbl3_h(col1 INT,col2 VARCHAR(50))
PARTITION BY HASH(col1) PARTITIONS 2;
Query OK, 0 rows affected
In this example, HASH partitioning requires the partitioning key to be of the INT data type. Therefore, the col1 column is used as the partitioning key to partition the tbl3_h table into two partitions. Partition names are not specified when the table is created. Therefore, the partitions are named by the system based on the naming rules. To be specific, the partitions are named as p0, p1, ..., and pn.
Composite partitioning (subpartitioning)
Composite partitioning partitions a table using one partitioning strategy and partitions each partition using a different partitioning strategy. It is suitable for business tables containing large amounts of data. Composite partitioning gives full play to the advantages of the two partitioning strategies that you use in combination.
Any of RANGE, RANGE COLUMNS, LIST, LIST COLUMNS, HASH, and KEY can be used as the subpartitioning strategy of a composite partitioned table. In OceanBase Database, subpartitioned tables are classified into template-based subpartitioned tables and non-template-based subpartitioned tables.
The following examples briefly show you how to create a subpartitioned table.
Example 11: Create a RANGE COLUMNS-RANGE-subpartitioned table by using a template.
obclient> CREATE TABLE tb1_m_rcr(col1 INT,col2 INT)
PARTITION BY RANGE COLUMNS(col1)
SUBPARTITION BY RANGE(col2)
SUBPARTITION TEMPLATE
(SUBPARTITION mp0 VALUES LESS THAN(3),
SUBPARTITION mp1 VALUES LESS THAN(6),
SUBPARTITION mp2 VALUES LESS THAN(9)
)
(PARTITION p0 VALUES LESS THAN(100),
PARTITION p1 VALUES LESS THAN(200),
PARTITION p2 VALUES LESS THAN(300)
);
Query OK, 0 rows affected
obclient> SELECT table_name,partition_name,subpartition_name FROM information_schema.partitions;
+------------+----------------+-------------------+
| table_name | partition_name | subpartition_name |
+------------+----------------+-------------------+
| tb1_m_rcr | p0 | p0smp0 |
| tb1_m_rcr | p0 | p0smp1 |
| tb1_m_rcr | p0 | p0smp2 |
| tb1_m_rcr | p1 | p1smp0 |
| tb1_m_rcr | p1 | p1smp1 |
| tb1_m_rcr | p1 | p1smp2 |
| tb1_m_rcr | p2 | p2smp0 |
| tb1_m_rcr | p2 | p2smp1 |
| tb1_m_rcr | p2 | p2smp2 |
+------------+----------------+-------------------+
9 rows in set
In this example, the partitioning key in RANGE partitioning must be of an INT type. Therefore, the col2 column can be used as the subpartitioning partitioning key, and the SUBPARTITION TEMPLATE keyword is used to create the subpartitioned table by using a template. When you use a template to create a subpartitioned table, the template defines subpartitions under each partition. In this case, subpartitions under each partition share the same definition. In this example, LIST subpartitioning is performed after RANGE partitioning.
When you create a subpartitioned table by using a template, you do not need to separately specify the name of each subpartition after the subpartition definition is completed. The system names the subpartitions in the format of ($part_name)s($subpart_name) based on the naming rules. In this example, the subpartitions in the p0 partition are named as p0smp0, p0smp1, and p0smp2.
Notice
If you use HASH or KEY partitioning to subpartition a table by specifying the number of subpartitions, such as SUBPARTITIONS 5, you do not need to specify the SUBPARTITION TEMPLATE keyword when you create a subpartitioned table by using a template.
Example 12: Create a LIST-LIST COLUMNS-subpartitioned table without using a template.
obclient> CREATE TABLE tbl2_f_llc(col1 INT,col2 DATE)
PARTITION BY LIST(col1)
SUBPARTITION BY LIST COLUMNS(col2)
(PARTITION p0 VALUES IN(100)
(SUBPARTITION sp0 VALUES IN('2021/04/01'),
SUBPARTITION sp1 VALUES IN('2021/07/01'),
SUBPARTITION sp2 VALUES IN('2021/10/01'),
SUBPARTITION sp3 VALUES IN('2022/01/01')
),
PARTITION p1 VALUES IN(200)
(SUBPARTITION sp4 VALUES IN('2021/04/01'),
SUBPARTITION sp5 VALUES IN('2021/07/01'),
SUBPARTITION sp6 VALUES IN('2021/10/01'),
SUBPARTITION sp7 VALUES IN('2022/01/01')
)
);
Query OK, 0 rows affected
In this example, a non-template-based subpartitioned table is created. You must define the subpartitions in each partition. The definitions of subpartitions in different partitions can be different or identical.
Example 13: Create a HASH-KEY-subpartitioned table without using a template.
obclient> CREATE TABLE tbl3_f_hk (col1 INT,col2 VARCHAR(50))
PARTITION BY HASH(col1)
SUBPARTITION BY KEY(col2)
(PARTITION p1
(SUBPARTITION sp0
,SUBPARTITION sp1
,SUBPARTITION sp2
,SUBPARTITION sp3
),
PARTITION p2
(SUBPARTITION sp4
,SUBPARTITION sp5
,SUBPARTITION sp6
,SUBPARTITION sp7
)
);
Query OK, 0 rows affected
In this example, KEY partitioning supports a string column as a partitioning key. Therefore, the col2 column can be used as the subpartitioning key. Here, sp0, ..., and sp7 are the specified subpartition names.
Create a replicated table
A replicated table is a special type of table in OceanBase Database. Such a table can read the latest modification of data from any "healthy" replica. Replicated tables are suitable for scenarios with low write frequency and high read frequency.
For more information, see Create a replicated table in Create a table.
Notice
Replicated tables can be created only in user tenants but not in the sys tenant.
The SQL syntax for creating a replicated table is as follows:
CREATE TABLE table_name column_definition DUPLICATE_SCOPE='none | cluster';
where
table_namespecifies the name of the table.column_definitionspecifies the column information of the table, including column definitions and primary key definitions.DUPLICATE_SCOPEspecifies the attribute of the replica table. Valid values include:none: The table is a normal table.cluster: The table is a replicated table. The leader must copy transactions to all full-featured and read-only replicas in the current tenant.
Example 14: Create a replica table named test_tbl14.
CREATE TABLE test_tbl14 (col1 INT,col2 INT) DUPLICATE_SCOPE= 'cluster';
What to do next
After creating a table, you can perform the following operations:
Use the
INSERTstatement to insert data into the table. For more information about how to insert data into a table, see Insert data.Create indexes on columns in the table to improve the query performance. For more information about how to create an index, see Create an index.
References
- For more information about how to query the attributes of a table, see Query the definition of a table.
- For more information about how to drop a table, see Drop a table.
- For more information about how to modify the schema of a table, see Modify a table.
- For more information about how to create a partitioned table, see Create a partitioned table.