You can use the CREATE TABLE statement to create a table.
This topic describes how to create a non-partitioned table. For information about how to create and use a partitioned table, see Create a partitioned table.
Create a non-partitioned table
Creating a non-partitioned table means creating a table with only one partition.
Here is an example of creating a non-partitioned table:
obclient>CREATE TABLE table_name1(w_id int
, w_ytd decimal(12,2)
, w_tax decimal(4,4)
, w_name varchar(10)
, w_street_1 varchar(20)
, w_street_2 varchar(20)
, w_city varchar(20)
, w_state char(2)
, w_zip char(9)
, unique(w_name, w_city)
, primary key(w_id)
);
Query OK, 0 rows affected (0.09 sec)
obclient>CREATE TABLE table_name2 (c_w_id int NOT NULL
, c_d_id int NOT null
, c_id int NOT null
, c_discount decimal(4, 4)
, c_credit char(2)
, c_last varchar(16)
, c_first varchar(16)
, c_middle char(2)
, c_balance decimal(12, 2)
, c_ytd_payment decimal(12, 2)
, c_payment_cnt int
, c_credit_lim decimal(12, 2)
, c_street_1 varchar(20)
, c_street_2 varchar(20)
, c_city varchar(20)
, c_state char(2)
, c_zip char(9)
, c_phone char(16)
, c_since date
, c_delivery_cnt int
, c_data varchar(500)
, index icust(c_last, c_d_id, c_w_id, c_first, c_id)
, FOREIGN KEY (c_w_id) REFERENCES table_name1(w_id)
, primary key (c_w_id, c_d_id, c_id)
);
Query OK, 0 rows affected
In the example, two tables are created, and constraints are defined for the columns in the tables, including primary keys and foreign keys on different columns. For more information about primary keys and foreign keys, see Define the constraint type for a column.
When you create columns for a table, choose the correct data type. For more information about SQL data types, see Data types.
Note
To improve performance and facilitate later maintenance, we recommend that you define a primary key or a unique key for a table when you create it. If no suitable column can be used as the primary key, you can omit the primary key when you create the table. After the table is created, the system will automatically assign an auto-increment column as the hidden primary key for the table. For more information about auto-increment columns, see Define an auto-increment column.
Create a replicated table
A replicated table is a special type of table in OceanBase Database. You can read the latest modifications to data from any healthy replica of a replicated table. A replicated table is a good choice for users who have a low write frequency and high requirements for read latency and load balancing.
After you create a replicated table, a replica of the table is created on each OBServer node in the tenant where the table is created. One replica is selected as the leader to accept write requests, and the other replicas only accept read requests.
All replicas report their status to the leader, mainly the replay progress of each replica, which indicates the synchronization progress of data. Generally, the replay progress of a follower replica lags slightly behind that of the leader. If the lag does not exceed a threshold, the leader considers the replica to be healthy and can quickly replay the modifications on the leader. After the leader considers a replica to be healthy for a certain period of time, it grants a lease to the follower. In other words, the leader trusts the follower to remain healthy and provide strong-consistency read services for a certain period of time. During this trust period, the leader confirms the replay progress of the follower before committing each transaction on the replicated table. After the follower replays the modifications of the transaction, the leader reports to the user that the transaction is committed. At this point, the user can read the modifications of the transaction from the follower.
The replicated table feature was available in OceanBase Database V3.x. In OceanBase Database V4.x, the architecture of OceanBase Database has undergone significant changes. To adapt to the new architecture of a single log stream, V4.x replicated tables implement partition-based readable version number verification and log stream-based lease granting mechanisms to ensure the correctness of strong-consistency reads.
In addition, V4.x replicated tables support the ability to switch the leader without killing transactions. When a user or load balancer initiates a leader switch, uncommitted transactions on the replicated table can continue to run after the leader switch, unlike in V3.x. Compared with V3.x, V4.x replicated tables offer better write transaction performance and stronger disaster recovery capabilities. The impact of a replica failure on read operations is lower in V4.x than in V3.x.
Limitations on replicated tables
Replicated tables:
Limitations on creating replicated tables: You cannot create a replicated table in the sys or meta tenant. This is because these tenants do not have broadcast log streams.
Write performance is affected by the number of nodes: When you write data to a replicated table, the data must be synchronized to all replicas. Therefore, the more nodes you have, the more significant the impact on write performance.
- Workaround: Avoid writing and reading data from a replicated table in the same transaction. This is because a transaction that only writes data to a replicated table or only reads data from a replicated table is acceptable.
Attribute conversion:
- A replicated table and a table group are mutually exclusive. Modifying the table group attribute of a replicated table will result in an error. When converting a regular table to a replicated table, if the regular table belongs to a table group, the attribute conversion command will also result in an error.
- Converting a replicated table depends on load balancing and transfer. Ensure that the relevant configuration items are enabled.
Routing:
If a transaction contains write operations on a replicated table, subsequent queries on the replicated table may be randomly routed to a follower node. This can lead to issues where the follower node is not readable. The observer internally forwards the query to the leader node of the replicated table, which can affect query performance.
- In ODP V4.3.3, the routing strategy has been adjusted. If there are write operations on a replicated table, subsequent queries will be routed to the leader node of the replicated table.
Queries that join a replicated table with a regular table will be randomly routed based on the replicated table. For
JOINqueries, ODP routes based on the first table. This may route to a non-leader node of the regular table, leading to remote plans.When converting a regular table to a replicated table, ODP cannot detect this change and cannot route queries to the replicated table to balance the load.
Within a transaction, if a replicated table has been modified, subsequent queries on the replicated table will generate an execution plan that selects the local replica. However, due to the modification, an error indicating that the replica is not readable will be returned. The SQL retry will select the leader replica, and the plan cache will not hit, resulting in poor query performance.
Note
Starting from OceanBase Database V4.3.5 BP2, within a transaction, if a replicated table has been modified, the system will always prioritize selecting the leader replica of the replicated table during queries, rather than directly selecting the local replica. This ensures that OceanBase Database improves query efficiency and avoids selecting the wrong replica.
Broadcast log streams:
- Each user tenant can have at most one broadcast log stream.
- Attribute conversion between broadcast log streams and regular log streams is not supported.
- Manual deletion of broadcast log streams is not supported. They are deleted automatically when the tenant is deleted.
Syntax for creating a replicated table
To create a replicated table, add the DUPLICATE_SCOPE option to the CREATE TABLE statement. Only user tenants can create replicated tables, and the sys tenant cannot create replicated tables. The SQL statement for creating a replicated table is as follows:
CREATE TABLE table_name column_definition DUPLICATE_SCOPE='none | cluster';
The DUPLICATE_SCOPE parameter specifies the properties of the replicated table. Valid values are as follows:
none: indicates that the table is a regular table.cluster: indicates that the table is a replicated table, and the leader needs to replicate transactions to all F and R replicas in the current tenant.
If you do not specify the DUPLICATE_SCOPE parameter when you create a table, the default value is none.
CREATE TABLE dup_t1 (c1 int,c2 int) DUPLICATE_SCOPE= 'cluster';
When the first replicated table is created in a tenant, the system automatically creates a special log stream called a broadcast log stream. All subsequent replicated tables are created on this broadcast log stream. Unlike regular log streams, broadcast log streams automatically deploy replicas on each OBServer node within the tenant. This ensures that, under ideal conditions, replicated tables can provide strong consistency reads on any OBServer node. You can use the following SQL statement to view the broadcast log stream where the replicated tables are located:
SELECT * FROM oceanbase.DBA_OB_LS WHERE flag LIKE "%DUPLICATE%";
The query result is as follows.
+-------+--------+--------------+---------------+-------------+---------------------+----------+---------------------+---------------------+-----------+
| LS_ID | STATUS | PRIMARY_ZONE | UNIT_GROUP_ID | LS_GROUP_ID | CREATE_SCN | DROP_SCN | SYNC_SCN | READABLE_SCN | FLAG |
+-------+--------+--------------+---------------+-------------+---------------------+----------+---------------------+---------------------+-----------+
| 1003 | NORMAL | z1;z2 | 0 | 0 | 1683267390195713284 | NULL | 1683337744205408139 | 1683337744205408139 | DUPLICATE |
+-------+--------+--------------+---------------+-------------+---------------------+----------+---------------------+---------------------+-----------+
1 rows in set
In the example, the log stream with LS_ID of 1003 is the broadcast log stream, and all replicated tables in the tenant are created on this log stream. For more information about broadcast log streams, see Replica introduction.
After a replicated table is created, you can perform insert and read/write operations on it, just like a regular table. However, for read requests, if you connect to the database through a Proxy, the read request may be routed to any OBServer node for execution. If you connect to the database directly, the system will execute the read request on the directly connected OBServer node, provided that the local replica is readable. For more information about database connection methods, see Overview of connection methods.
Create a new table by copying data from an existing table
Copy table data
You can use the CREATE TABLE AS SELECT statement to copy table data. However, the structure is not consistent with the original table, and constraints, indexes, default values, partitions, and other information are lost.
Example:
obclient>CREATE TABLE t1_copy AS SELECT * FROM t1;
Query OK, 3 rows affected
Copy table structure
You can use the CREATE TABLE LIKE statement to copy table structure. However, table data cannot be copied.
Example:
obclient>CREATE TABLE t1_like like t1;
Query OK, 0 rows affected
Create a rowstore table
OceanBase Database supports creating rowstore tables and converting rowstore tables to columnstore tables.
When the default_table_store_format parameter is set to row (the default value), the default table type is rowstore. If the default_table_store_format parameter is not set to row, you can create a rowstore table by specifying the WITH COLUMN GROUP(all columns) option.
For more information about converting rowstore tables to columnstore tables, see Change a table. For more information about creating columnstore indexes, see Create an index.
You can create a rowstore table by specifying the WITH COLUMN GROUP(all columns) option.
Here is an example:
CREATE TABLE tbl1_cg (col1 INT PRIMARY KEY, col2 VARCHAR(50)) WITH COLUMN GROUP(all columns);
Note
If you specify the WITH COLUMN GROUP(all columns) option to create a rowstore table, the table remains a rowstore table even if you later execute the DROP COLUMN GROUP(all columns) statement to drop the column group.
Create a columnstore table
OceanBase Database allows you to create a columnstore table, convert a rowstore table to a columnstore table, and create a columnstore index. By default, OceanBase Database creates a rowstore table when you create a table. You can specify the WITH COLUMN GROUP option to explicitly set the table as a columnstore table or a rowstore-columnstore redundant table.
For more information about how to convert a rowstore table to a columnstore table, see Change a table. For more information about how to create a columnstore index, see Create an index.
You can specify WITH COLUMN GROUP(all columns, each column) to create a rowstore-columnstore redundant table.
Here is an example:
CREATE TABLE tbl1_cg (col1 INT PRIMARY KEY, col2 VARCHAR(50)) WITH COLUMN GROUP(all columns, each column);
You can specify WITH COLUMN GROUP(each column) to create a columnstore table.
Here is an example:
CREATE TABLE tbl2_cg (col1 INT PRIMARY KEY, col2 VARCHAR(50)) WITH COLUMN GROUP(each column);
When you create and use a columnstore table, if you import a large amount of data, you must perform a major compaction to improve the read performance and perform statistics collection to adjust the execution strategy.
Major compaction: After you import data in batches, we recommend that you perform a major compaction. This helps improve the read performance because the major compaction organizes fragmented data into more contiguous physical storage, thereby reducing disk I/O during reads. After you import data, trigger a major compaction in the tenant to ensure that all data is merged into the baseline layer. For more information, see
MAJOR AND MINOR.Statistics collection: After the major compaction is completed, we recommend that you collect statistics. This is important for the optimizer to generate effective query plans and execution strategies. Execute GATHER_SCHEMA_STATS to collect statistics for all tables. You can also monitor the progress of statistics collection by querying the GV$OB_OPT_STAT_GATHER_MONITOR view.
Note that as the data volume of the columnstore table increases, the speed of major compaction may decrease.
Specify the update model for a table
To specify the update model for a table when you create it, add the MERGE_ENGINE table option to the CREATE TABLE statement. The SQL statement is as follows:
CREATE TABLE table_name column_definition
MERGE_ENGINE = {delete_insert | partial_update}
WITH COLUMN GROUP([all columns,] each column);
Note
- After you specify the
MERGE_ENGINEparameter when you create a table, you cannot modify its value. - For OceanBase Database V4.3.5, the
MERGE_ENGINEparameter is supported starting from V4.3.5 BP3.
The MERGE_ENGINE parameter specifies the update model for the table. Valid values:
partial_update: The default value. The existing update mode is retained.delete_insert: Themerge on writemode is adopted. Theupdateoperation is converted to a combination ofdeleteandinsertoperations, and all columns are written.
If you do not specify the MERGE_ENGINE option, its value is the same as the value of the default_table_merge_engine parameter.
Scenarios
We recommend that you use delete_insert for OLAP scenarios and partial_update for other scenarios. For more information about configuration scenarios, see Best Practices.
Here are some examples:
Create a pure columnstore table
mer_tbl1with themerge on writeupdate model.CREATE TABLE mer_tbl1 (col1 INT, col2 INT) MERGE_ENGINE = delete_insert WITH COLUMN GROUP(each column);Create a rowstore-columnstore redundant table
mer_tbl2with themerge on writeupdate model.CREATE TABLE mer_tbl2 (col1 INT, col2 INT) MERGE_ENGINE = delete_insert WITH COLUMN GROUP(all columns, each column);
Create a heap-organized table
OceanBase Database supports index-organized tables and heap-organized tables.
Note
For OceanBase Database V4.3.5, heap-organized tables are supported starting from V4.3.5 BP1.
Limitations
For a local unique key of a heap-organized table or a primary key of a primary key table, all partitioning columns must be included.
When you create an index on a heap-organized table, the index name cannot be the same as the name of the first column of the primary key.
OceanBase Database does not support the following DDL operations on heap-organized tables:
- Primary key constraint operations
- Modifying a column to a primary key
- Changing the type or length of a primary key column from a larger to a smaller size
- Adding a primary key column
- Modifying a primary key column to an auto-increment column
- Dropping a primary key column
- Splitting partitions
Syntax for specifying the table organization
To specify the table organization when you create a table, add the ORGANIZATION table option after the CREATE TABLE statement. The SQL statement is as follows:
CREATE TABLE table_name column_definition ORGANIZATION [=] {INDEX | HEAP};
Note
In OceanBase Database V4.3.5, the CREATE TABLE statement supports the ORGANIZATION option starting from V4.3.5 BP1.
The ORGANIZATION parameter specifies the storage order of data rows in the table. Valid values:
INDEX: indicates that the table model is an index-organized table.HEAP: indicates that the table model is a heap-organized table.
If you do not specify the ORGANIZATION option, its value is the same as the value of the default_table_organization parameter.
Here are some examples:
Specify the
ORGANIZATIONattribute asHEAPwhen you create a table.CREATE TABLE ora_tbl1 (col1 INT, col2 INT) ORGANIZATION = HEAP;Do not specify the
ORGANIZATIONoption when you create a table.Change the tenant-level parameter
default_table_organization, and then create a table. For more information about thedefault_table_organizationparameter, see default_table_organization.Note
The
default_table_organizationparameter applies only to MySQL-compatible user tenants in OceanBase Database. It does not apply to sys (system) tenants and Oracle mode user tenants.Set the value of the
default_table_organizationparameter toHEAP.ALTER SYSTEM SET default_table_organization = 'HEAP';Create a table named
ora_tbl2.CREATE TABLE ora_tbl2(col1 INT, col2 INT);View the definition of the
ora_tbl2table.SHOW CREATE TABLE ora_tbl2;The return result is as follows:
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | ora_tbl2 | CREATE TABLE `ora_tbl2` ( `col1` int(11) DEFAULT NULL, `col2` int(11) DEFAULT NULL ) ORGANIZATION HEAP DEFAULT CHARSET = utf8mb4 ROW_FORMAT = DYNAMIC COMPRESSION = 'zstd_1.3.8' REPLICA_NUM = 1 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE ENABLE_MACRO_BLOCK_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0 | +----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set
Create a temporary table
In MySQL-compatible mode of OceanBase Database, a temporary table is a session-level special table used to temporarily store intermediate data.
Note
OceanBase Database V4.3.5 BP4 supports creating temporary tables.
You can execute the CREATE TEMPORARY TABLE statement to create a temporary table. The syntax is as follows:
CREATE TEMPORARY TABLE table_name column_definition;
Limitations
Routing of temporary tables: To support load balancing, temporary tables cannot be bound to a specific server by default. You can disable load balancing for temporary tables.
In the Serializable isolation level, OceanBase Database does not allow writing to temporary tables created within the same transaction.
If a temporary table with the same name as a regular table exists in the current session, using the database-level statistics collection feature in the session may yield unexpected results.
Temporary tables are cleaned up in the background. To minimize the impact of their DDL operations on business, the cleanup speed is limited. Therefore, it is recommended to manually delete temporary tables when they are no longer needed.
Temporary tables cannot be accessed through a DBLink.
When connecting to an OBServer through ODP, you need to manually adjust the
server_protocol,client_session_id_version, andproxy_idparameters.Here are the steps:
Execute the following command to modify the communication protocol configuration of OBProxy.
ALTER proxyconfig SET server_protocol = 'OceanBase 2.0';Execute the following command to set the algorithm for generating the client session ID (Client Session ID) to version 2.
ALTER proxyconfig SET client_session_id_version = 2;Execute the following command to set the ODP ID to 1. Different ODPs must have different IDs to avoid conflicts in the generated Client Session IDs.
ALTER proxyconfig SET proxy_id = 1;
Examples
Create a temporary table named tbl1:
CREATE TEMPORARY TABLE tbl1(col1 INT);