You can use the CREATE TABLE statement to create a table.
This topic describes how to create a nonpartitioned table. For information about how to create and use a partitioned table, see Create a partitioned table.
Create a non-partitioned table
Creating a non-partitioned table means creating a table with only one partition.
The following example shows how to create a non-partitioned table:
obclient>CREATE TABLE table_name1(w_id int
, w_ytd decimal(12,2)
, w_tax decimal(4,4)
, w_name varchar(10)
, w_street_1 varchar(20)
, w_street_2 varchar(20)
, w_city varchar(20)
, w_state char(2)
, w_zip char(9)
, unique(w_name, w_city)
, primary key(w_id)
);
Query OK, 0 rows affected (0.09 sec)
obclient>CREATE TABLE table_name2 (c_w_id int NOT NULL
, c_d_id int NOT null
, c_id int NOT null
, c_discount decimal(4, 4)
, c_credit char(2)
, c_last varchar(16)
, c_first varchar(16)
, c_middle char(2)
, c_balance decimal(12, 2)
, c_ytd_payment decimal(12, 2)
, c_payment_cnt int
, c_credit_lim decimal(12, 2)
, c_street_1 varchar(20)
, c_street_2 varchar(20)
, c_city varchar(20)
, c_state char(2)
, c_zip char(9)
, c_phone char(16)
, c_since date
, c_delivery_cnt int
, c_data varchar(500)
, index icust(c_last, c_d_id, c_w_id, c_first, c_id)
, FOREIGN KEY (c_w_id) REFERENCES table_name1(w_id)
, primary key (c_w_id, c_d_id, c_id)
);
Query OK, 0 rows affected
The example creates two tables and defines constraints for the columns, including primary keys and foreign keys on different columns. For more information about primary keys and foreign keys, see Define the constraint type for a column.
When you create columns for a table, choose the appropriate data type. For more information about SQL data types, see Data types.
Note
To improve performance and facilitate later maintenance, we recommend that you define a primary key or unique key for a table. If no suitable column can be used as the primary key, you can omit the PRIMARY KEY clause when you create the table. After the table is created, the system automatically specifies an auto-increment column as the hidden primary key for the table. For more information about auto-increment columns, see Define an auto-increment column.
Create a replicated table
A replicated table is a special type of table in OceanBase Database. This table can read the latest data modifications from any healthy replica. For users who have low write requirements but high read latency and load balancing requirements, a replicated table is an excellent choice.
After a user creates a replicated table, a replica of the table is created on every OBServer node of the tenant. One of these replicas is selected as the leader, which accepts write requests, while the remaining replicas only accept read requests.
All replicas need to report their status to the leader, primarily their replay progress, which indicates the synchronization progress of data. Generally, the replay progress of a follower replica lags slightly behind that of the leader. As long as the lag does not exceed a certain threshold, the leader considers the replica to be "healthy" and can quickly replay the modifications from the leader. Once the leader considers a replica to be "healthy" for a certain period, it grants a lease to the follower. In simple terms, the leader trusts the follower to remain "healthy" and provide strongly consistent read services for a period of time. During this "trust" period, the leader confirms the replay progress of the follower before committing each transaction in the replicated table. After the follower replays the modifications of the transaction, the leader reports to the user that the transaction has been successfully committed. At this point, the user can read the modifications of the committed transaction from the follower.
The replicated table feature was introduced in OceanBase Database V3.x. However, due to significant changes in the architecture of OceanBase Database V4.x, the replicated table feature in V4.x has been adapted to the new single-node log stream architecture. Specifically, V4.x introduces a partition-based readable version number verification mechanism and a log stream-based lease granting mechanism to ensure the correctness of strongly consistent reads.
Additionally, the replicated table feature in V4.x has improved the ability to switch leaders without killing transactions. In V3.x, uncommitted transactions in replicated tables would fail when a leader switch was initiated by the user or the load balancer. However, in V4.x, these transactions can continue to execute after the leader switch. Compared to V3.x, the replicated table feature in V4.x also offers better write transaction performance and stronger disaster recovery capabilities, with a lower impact on read operations when a replica fails.
Limitations of replicated tables
Replicated tables:
Creation limitations: The sys and meta tenants do not support creating replicated tables because they do not have broadcast log streams.
Write performance is affected by the number of nodes: Since replicated tables require synchronization to all replicas, the more nodes, the greater the impact on write performance.
- Workaround: Avoid having both write and read operations on replicated tables within the same transaction. Pure write or pure read operations on replicated tables are fine.
Attribute conversion:
- Replicated tables and table groups are mutually exclusive. Modifying the table group attribute of a replicated table will result in an error. Additionally, when converting a regular table to a replicated table, if the regular table belongs to a table group, the attribute change command will also result in an error.
- Converting a replicated table depends on load balancing and transfer. Ensure that the relevant parameters are enabled.
Routing:
If a transaction contains write operations on a replicated table, subsequent queries on the replicated table may be randomly routed to a follower replica, which may be unreadable. The observer internally forwards the query to the leader replica of the replicated table, which may affect query performance.
- ODP V4.3.3 has adjusted the routing strategy. If there are write operations on a replicated table, subsequent queries will be routed to the leader replica of the replicated table.
Queries that join a replicated table with a regular table will be randomly routed based on the replicated table. For
JOINqueries, ODP routes based on the first table. This may route to a non-leader node of the regular table, resulting in a remote plan.When a regular table is converted to a replicated table, ODP is unaware of the change and cannot route based on the replicated table to distribute the load.
Within a transaction, if changes have been made to a replicated table, subsequent queries will generate an execution plan that selects the local replica. However, due to the changes, an error indicating that the replica is unreadable will be reported. The SQL retry will select the leader replica, and the plan cache will not hit, resulting in poor query performance.
Note
Within a transaction, if changes have been made to a replicated table, OceanBase Database will always prioritize selecting the leader replica of the replicated table for queries, rather than directly selecting the local replica. This ensures that OceanBase Database improves query efficiency and avoids selecting the wrong replica.
Broadcast log streams:
- Each user tenant can have at most one broadcast log stream.
- Attribute conversion between broadcast log streams and regular log streams is not supported.
- Broadcast log streams cannot be manually deleted and will be deleted along with the tenant.
Syntax for creating a replicated table
To create a replicated table, add the DUPLICATE_SCOPE option to the CREATE TABLE statement. Only user tenants can create replicated tables, and the sys tenant cannot create replicated tables. The SQL statement for creating a replicated table is as follows:
CREATE TABLE table_name column_definition DUPLICATE_SCOPE='none | cluster';
The DUPLICATE_SCOPE parameter specifies the attribute of the replicated table. The valid values are as follows:
none: Indicates that the table is a regular table.cluster: Indicates that the table is a replicated table, and the leader needs to replicate transactions to all F and R replicas of the current tenant.
If you do not specify DUPLICATE_SCOPE when creating a table, the default value is none.
CREATE TABLE dup_t1 (c1 int,c2 int) DUPLICATE_SCOPE= 'cluster';
When the first replicated table of a tenant is created, the system simultaneously creates a special log stream called a broadcast log stream. All subsequent replicated tables will be created on this broadcast log stream. The key difference between a broadcast log stream and a regular log stream is that a broadcast log stream automatically deploys a replica on every OBServer node within the tenant. This ensures that, under ideal conditions, a replicated table can provide strongly consistent reads from any OBServer node. You can use the following SQL statement to view the broadcast log stream where the replicated tables of a tenant are located:
SELECT * FROM oceanbase.DBA_OB_LS WHERE flag LIKE "%DUPLICATE%";
The query result is as follows.
+-------+--------+--------------+---------------+-------------+---------------------+----------+---------------------+---------------------+-----------+-----------+
| LS_ID | STATUS | PRIMARY_ZONE | UNIT_GROUP_ID | LS_GROUP_ID | CREATE_SCN | DROP_SCN | SYNC_SCN | READABLE_SCN | FLAG | UNIT_LIST |
+-------+--------+--------------+---------------+-------------+---------------------+----------+---------------------+---------------------+-----------+-----------+
| 1003 | NORMAL | z1;z2 | 0 | 0 | 1683267390195713284 | NULL | 1683337744205408139 | 1683337744205408139 | DUPLICATE | |
+-------+--------+--------------+---------------+-------------+---------------------+----------+---------------------+---------------------+-----------+-----------+
1 rows in set
In the example, the log stream with LS_ID 1003 is the broadcast log stream, and all replicated tables of the tenant are created on this log stream. For more information about broadcast log streams, see Replica introduction.
After a replicated table is created, you can perform insert and read/write operations on it, just like a regular table. However, for read requests, if you connect to the database using a proxy, the read request may be routed to any OBServer node for execution. If you connect to the database directly, the system will execute the read request on the directly connected OBServer node as long as the local replica is readable. For more information about database connection methods, see Overview of connection methods.
Create a new table by copying data from an existing table
Copy data from a table
You can use the CREATE TABLE AS SELECT statement to copy data from a table. However, the structure will not be completely consistent, and information such as constraints, indexes, default values, and partitions will be lost.
Example statement:
obclient>CREATE TABLE t1_copy AS SELECT * FROM t1;
Query OK, 3 rows affected
Copy the structure of a table
You can use the CREATE TABLE LIKE statement to copy the structure of a table, but you cannot copy the data.
Example statement:
obclient>CREATE TABLE t1_like like t1;
Query OK, 0 rows affected
Create a rowstore table
OceanBase Database allows you to create a rowstore table and convert a rowstore table to a columnstore table.
If the default_table_store_format parameter is set to row (the default value), a rowstore table is created by default. If the default_table_store_format parameter is not set to row, you can specify the WITH COLUMN GROUP(all columns) option to create a rowstore table.
For more information about converting a rowstore table to a columnstore table, see Change a table. For more information about creating a columnstore index, see Create an index.
You can specify the WITH COLUMN GROUP(all columns) option to create a rowstore table.
Here is an example:
CREATE TABLE tbl1_cg (col1 INT PRIMARY KEY, col2 VARCHAR(50)) WITH COLUMN GROUP(all columns);
Note
If you specify the WITH COLUMN GROUP(all columns) option to create a rowstore table, the table remains in the rowstore format even if you later execute the DROP COLUMN GROUP(all columns) statement to drop the column group.
Create a columnstore table
OceanBase Database allows you to create a columnstore table, convert a rowstore table to a columnstore table, and create a columnstore index. By default, a rowstore table is created when you create a table. You can specify the WITH COLUMN GROUP option to explicitly specify the columnstore or rowstore-columnstore redundancy mode.
For more information about converting a rowstore table to a columnstore table, see Change a table. For more information about creating a columnstore index, see Create an index.
You can specify the WITH COLUMN GROUP(all columns, each column) option to create a rowstore-columnstore redundancy table.
Here is an example:
CREATE TABLE tbl1_cg (col1 INT PRIMARY KEY, col2 VARCHAR(50)) WITH COLUMN GROUP(all columns, each column);
You can specify the WITH COLUMN GROUP(each column) option to create a columnstore table.
Here is an example:
CREATE TABLE tbl2_cg (col1 INT PRIMARY KEY, col2 VARCHAR(50)) WITH COLUMN GROUP(each column);
When you create and use a columnstore table, if you import a large amount of data, you need to perform a major compaction to improve the read performance and perform statistics collection to adjust the execution strategy.
Major compaction: After you import data in batches, we recommend that you perform a major compaction. This helps improve the read performance because the major compaction organizes fragmented data to be more contiguous in physical storage, thereby reducing disk I/O during reads. After you import data, trigger a major compaction in the tenant to ensure that all data is compacted to the baseline layer. For more information, see
MAJOR and MINOR.Statistics collection: After the major compaction is completed, we recommend that you collect statistics. This is important for the optimizer to generate effective query plans and execution strategies. Execute GATHER_SCHEMA_STATS to collect statistics for all tables. You can also monitor the statistics collection progress by querying the GV$OB_OPT_STAT_GATHER_MONITOR view.
Note that as the amount of data in a columnstore table increases, the speed of major compaction may decrease.
Specify the update model of a table
You can specify the update model of a table when you create it. The syntax is to add the MERGE_ENGINE table option to the CREATE TABLE statement. The SQL statement is as follows:
CREATE TABLE table_name column_definition
MERGE_ENGINE = {delete_insert | partial_update}
WITH COLUMN GROUP([all columns,] each column);
Note
After you specify the MERGE_ENGINE parameter when you create a table, you cannot modify the parameter.
The MERGE_ENGINE parameter specifies the update model of a table. Valid values: partial_update and delete_insert.
partial_update: The default value. This indicates that the existing update mode is retained.delete_insert: This indicates that themerge on writemode is used. In this mode, theupdateoperation is converted to a combination ofdeleteandinsertoperations, and all columns are written.
If you do not specify the MERGE_ENGINE option, the value of the default_table_merge_engine parameter is used.
Scenarios
The delete_insert update model is recommended for OLAP scenarios. The partial_update update model is recommended for other scenarios. For more information about configuration scenarios, see Best practices.
Here are some examples:
Create a columnstore table
mer_tbl1with themerge on writeupdate model.CREATE TABLE mer_tbl1 (col1 INT, col2 INT) MERGE_ENGINE = delete_insert WITH COLUMN GROUP(each column);Create a rowstore-columnstore redundancy table
mer_tbl2with themerge on writeupdate model.CREATE TABLE mer_tbl2 (col1 INT, col2 INT) MERGE_ENGINE = delete_insert WITH COLUMN GROUP(all columns, each column);
Create a heap-organized table
OceanBase Database supports two types of tables: index-organized tables (IOTs) and heap-organized tables.
Limitations
For a heap-organized table, the unique key local or the primary key of the primary key table must contain all the partitioning keys.
When you create an index on a heap-organized table, the index name must be different from the name of the first column of the primary key.
OceanBase Database does not support the following DDL operations on heap-organized tables:
- Primary key constraint operations
- Change a column to a primary key
- Change the type or length of a primary key column from a larger to a smaller value
- Add a primary key column
- Change a primary key column to an auto-increment column
- Drop a primary key column
- Split a partition
Syntax for specifying the table organization
To specify the table organization when you create a table, add the ORGANIZATION option to the CREATE TABLE statement. The syntax is as follows:
CREATE TABLE table_name column_definition ORGANIZATION [=] {INDEX | HEAP};
The ORGANIZATION option specifies the storage order of data rows in the table. Valid values are as follows:
INDEX: indicates that the table is an index-organized table.HEAP: indicates that the table is a heap-organized table.
If you do not specify the ORGANIZATION option, its value is the same as that of the default_table_organization parameter.
Here are some examples:
Specify the
ORGANIZATIONoption asHEAPwhen you create a table.CREATE TABLE ora_tbl1 (col1 INT, col2 INT) ORGANIZATION = HEAP;Do not specify the
ORGANIZATIONoption when you create a table.After you change the value of the tenant-level parameter
default_table_organization, you can create a table. For more information about thedefault_table_organizationparameter, see default_table_organization.Note
The
default_table_organizationparameter is supported only for MySQL user tenants of OceanBase Database in the current version. It is not supported for the sys (system) tenant and Oracle user tenants.Set the value of the
default_table_organizationparameter toHEAP.ALTER SYSTEM SET default_table_organization = 'HEAP';Create a table named
ora_tbl2.CREATE TABLE ora_tbl2(col1 INT, col2 INT);View the definition of the
ora_tbl2table.SHOW CREATE TABLE ora_tbl2;The returned result is as follows:
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | ora_tbl2 | CREATE TABLE `ora_tbl2` ( `col1` int(11) DEFAULT NULL, `col2` int(11) DEFAULT NULL ) ORGANIZATION HEAP DEFAULT CHARSET = utf8mb4 ROW_FORMAT = DYNAMIC COMPRESSION = 'zstd_1.3.8' REPLICA_NUM = 1 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE ENABLE_MACRO_BLOCK_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0 | +----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set
Create a temporary table
A temporary table is a special table at the session level. It is mainly used to temporarily store intermediate data.
Note
For OceanBase Database V4.4.x, the database supports creating temporary tables starting from V4.4.2.
Limitations
Routing of temporary tables: Since temporary tables must support load balancing like regular tables, they cannot be bound to a specific server by default. Consider disabling load balancing if needed.
In the serializable isolation level (Serializable), OceanBase Database does not support writing to temporary tables created within the same transaction.
If a temporary table with the same name as a regular table exists in the current session, it is not recommended to use the database-level statistics collection feature in this session, as the results may not meet expectations.
Temporary tables are cleaned up by background logic. To avoid significant impact on business operations, the cleanup speed is limited. Therefore, it is recommended to manually delete temporary tables when they are heavily used.
Temporary tables cannot be accessed via DBLink.
When connecting to an OBServer using ODP, you need to manually adjust the server_protocal, client_session_id_version, and proxy_id parameters.
Here are the steps:
Execute the following command to modify the communication protocol configuration of OBProxy.
obclient> ALTER proxyconfig SET server_protocol = 'OceanBase 2.0';Execute the following command to set the algorithm for generating the client session ID (Client Session ID) to version 2.
obclient> ALTER proxyconfig SET client_session_id_version = 2;Execute the following command to change the ODP ID to 1. Different ODPS should be set to different numbers to ensure that the generated Client Session IDs do not conflict.
obclient> ALTER proxyconfig SET proxy_id = 1;
Examples
Create a temporary table named tbl1:
obclient> CREATE TEMPORARY TABLE tbl1(col1 INT);