If a table already contains data and you want to import incremental data, you can use the incremental direct load feature. Although you can import incremental data using a full direct load, this process will overwrite all existing data, resulting in poor import performance. In contrast, incremental direct load only processes the incremental data, ensuring better performance.
This topic describes how to use the LOAD DATA, INSERT INTO SELECT, and CREATE TABLE AS SELECT statements to perform an incremental direct load.
Incremental direct load strategy
OceanBase Database V4.6.0 optimizes incremental direct load. The new incremental direct load strategy divides data into two parts. The part with primary key conflicts is written to a Mini SSTable (incremental SSTable), and the part without primary key conflicts is written to a Major SSTable (baseline SSTable). This way, for cases with few conflicts, the new incremental direct load strategy allows multiple imports without compromising query performance.
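To make the two incremental modes concrete, the following sketch contrasts them in a `LOAD DATA` hint (the table and file names are illustrative; the hint format is described in detail later in this topic):

```sql
-- 'inc' keeps insert semantics: a row whose primary key already exists
-- in the target table causes an error.
LOAD DATA /*+ direct(true, 1024, 'inc') parallel(4) */
  INFILE '/home/admin/tbl1.csv' INTO TABLE tbl1 FIELDS TERMINATED BY ',';

-- 'inc_replace' keeps replace semantics: a row whose primary key already
-- exists overwrites the existing row.
LOAD DATA /*+ direct(true, 1024, 'inc_replace') parallel(4) */
  INFILE '/home/admin/tbl1.csv' INTO TABLE tbl1 FIELDS TERMINATED BY ',';
```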
Considerations
When you use direct load, note the following considerations:
- Direct load is not recommended for small data volumes that can be imported within a few minutes, because incremental direct load triggers a major compaction.
- Direct load supports non-unique local indexes on the target table.
- The `LOAD DATA` statement can be executed in a multi-row transaction. When you execute the `LOAD DATA` statement, it automatically commits the previous transaction.
- When you use the `INSERT INTO SELECT` statement for direct load, only Parallel Data Manipulation Language (PDML) is supported. Direct load is not supported for non-PDML statements.
- When you use the `LOAD DATA` or `INSERT INTO SELECT` statement for direct load to a specified partition, the target table cannot be a replicated table, and the table cannot contain auto-increment columns, identity columns, or global indexes.
- Starting from V4.3.5 BP1, incremental direct load has the following changes:
  - If the last-level partition of the target table is a hash or key partition, you can perform direct load at the partition level.
  - Only heap tables with one unique local index are supported. If the target table contains multiple unique local indexes or global indexes, the import operation fails.
  - In the `LOAD DATA` or `INSERT INTO SELECT` statement, if multiple direct load tasks have overlapping partitions, parallel direct load is not supported. If no partitions overlap, parallel direct load is supported.
  - When the `foreign_key_checks` session-level variable is set to `False`, the direct load operation does not check foreign key constraints.
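As an illustration of the last point, you can disable foreign key checks for the current session before running a direct load (the variable name follows MySQL-mode conventions):

```sql
-- Disable foreign key checks for this session so that the subsequent
-- direct load does not validate foreign key constraints.
SET SESSION foreign_key_checks = 0;

-- ... run the LOAD DATA or INSERT /*+ DIRECT(...) */ statement here ...

-- Re-enable checks afterwards.
SET SESSION foreign_key_checks = 1;
```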
Use the LOAD DATA statement to perform a direct load
The LOAD DATA statement can perform a direct load by using the DIRECT() hint. If you do not specify the hint, the system determines the import behavior based on the default_load_mode configuration item.
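If you rely on `default_load_mode` rather than the hint, you can first check its current value and then set it; a sketch in MySQL mode:

```sql
-- Show the current value of the default_load_mode configuration item.
SHOW PARAMETERS LIKE 'default_load_mode';

-- Make plain LOAD DATA statements (without a DIRECT hint) use
-- incremental direct load by default.
ALTER SYSTEM SET default_load_mode = 'INC_DIRECT_WRITE';
```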
Limitations
- During import, you cannot execute two write operations on the same table at the same time. This is because a table lock is placed on the table during import, and only read operations are allowed throughout the import process.
- Import is not supported in triggers.
- Import is not supported for tables that contain generated columns. Note that some indexes create hidden generated columns, for example, `KEY idx_c2 (c2(16)) GLOBAL`.
- Import is not supported for single rows exceeding 2 MB.
- Import is not supported for Liboblog and flashback queries.
- Incremental direct load is not supported for tables with foreign keys.
- Incremental direct load is not supported for tables with `GLOBAL` indexes.
- Incremental direct load is not supported for tables with constraints.
- Incremental direct load is not supported for tables that have no primary key and more than one `LOCAL` unique index.
- Incremental direct load is not supported for tables with both a primary key and a unique index.
Syntax
```sql
LOAD DATA /*+ [DIRECT(need_sort,max_error,{'inc'|'inc_replace'})] parallel(N) */ [REMOTE_OSS | LOCAL] INFILE 'file_name' INTO TABLE table_name [PARTITION(PARTITION_OPTION)] [COMPRESSION]...
```
For more information about the LOAD DATA syntax, see LOAD DATA.
Parameter description:
| Parameter | Description |
|---|---|
| DIRECT() | Enables direct load by using the hint. `DIRECT()` takes the following parameters: `need_sort` specifies whether the data needs to be sorted before it is written; `max_error` specifies the maximum number of error rows tolerated before the statement fails; `'inc'` or `'inc_replace'` specifies the incremental load mode, where `inc` keeps insert semantics (a duplicate primary key causes an error) and `inc_replace` overwrites rows with conflicting primary keys. |
| parallel(N) | Required. The parallelism for data loading. Default value: 4. |
| REMOTE_OSS \| LOCAL | Optional. Specifies the location of the input file: `REMOTE_OSS` imports data files from object storage, and `LOCAL` imports local files from the client. If neither is specified, the file is read from the OBServer node. |
| file_name | The path and name of the input file, in the format that corresponds to the `REMOTE_OSS` or `LOCAL` option. **Note**: When you import files from object storage, make sure that the required access conditions are met. |
| table_name | The name of the table to which the data is imported. You can specify any number of columns in the table. |
| PARTITION_OPTION | The name of the partition when you perform direct load for a specified partition. |
| COMPRESSION | The compression format of the input file. |
Note
The LOAD DATA statement supports wildcard-based multi-file import, just like full direct load. However, the wildcard takes effect only for files on the OBServer node; it does not take effect when you log in to the OBServer node from another machine and use the LOAD DATA statement to load a local file.
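For example, a server-side wildcard import might look like the following (the file layout is illustrative):

```sql
-- Import every matching CSV file under /home/admin on the OBServer node.
-- The wildcard is expanded on the OBServer node, not on a remote client.
LOAD DATA /*+ direct(true, 1024, 'inc_replace') parallel(16) */
  INFILE '/home/admin/tbl1_part*.csv'
  INTO TABLE tbl1 FIELDS TERMINATED BY ',';
```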
Examples
The operation steps for incremental direct load in the `LOAD DATA` statement are the same as those for full direct load. You only need to change the value `full` to `inc` or `inc_replace` in the statement.
Note
The following example describes the procedure for importing data from a server-side file. OceanBase Database also supports the `LOCAL INFILE` method for loading local files. For more information about the `LOAD DATA LOCAL INFILE` statement, see Use the LOAD DATA statement to load data.
**MySQL mode**

1. Log in to the server where the OBServer node is located, and create a test data file named `tbl1.csv` in the `/home/admin` directory.

   **Note**: The `LOAD DATA` statement in OceanBase Database supports loading only input files that are local to the OBServer node. Therefore, you need to copy the file to an OBServer node before you import it.

   ```shell
   [xxx@xxx /home/admin]# ssh admin@10.10.10.1
   [admin@xxx /home/admin]# vi tbl1.csv
   1,11
   2,22
   3,33
   ```

2. Set the file path for import. Set the system variable `secure_file_priv` to specify the path that can be accessed when you import or export files.

   **Notice**: To improve security, you can set the `secure_file_priv` variable only by connecting to the database through a local Unix socket. For more information, see secure_file_priv.

   1. Log in to the server where the OBServer node is located.

      ```shell
      [xxx@xxx /home/admin]# ssh admin@10.10.10.1
      ```

   2. Connect to the `mysql001` tenant by using a local Unix socket.

      ```shell
      obclient -S /home/admin/oceanbase/run/sql.sock -uroot@mysql001 -p******
      ```

   3. Set the import path to `/home/admin`.

      ```sql
      obclient [(none)]> SET GLOBAL secure_file_priv = "/home/admin";
      Query OK, 0 rows affected
      ```

3. Connect to the database again, and execute the `LOAD /*+ DIRECT */ DATA` statement to import data to a table.

   1. Create a table named `tbl1`.

      ```sql
      obclient [test]> CREATE TABLE tbl1 (
          col1 INT PRIMARY KEY,
          col2 INT
      )
      PARTITION BY RANGE (col1)
      SUBPARTITION BY RANGE (col1)
      (
          PARTITION p0 VALUES LESS THAN (100)
          (
              SUBPARTITION p0_1 VALUES LESS THAN (50),
              SUBPARTITION p0_2 VALUES LESS THAN (100)
          ),
          PARTITION p1 VALUES LESS THAN (200)
          (
              SUBPARTITION p1_1 VALUES LESS THAN (150),
              SUBPARTITION p1_2 VALUES LESS THAN (200)
          )
      );
      Query OK, 0 rows affected
      ```

   2. Query whether data exists in the `tbl1` table. At this time, the query result shows that the table is empty.

      ```sql
      obclient [test]> SELECT * FROM tbl1;
      Empty set
      ```

   3. Import data from the `tbl1.csv` file to the `tbl1` table.

      - Import all columns of the `tbl1` table.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'inc_replace') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl1 FIELDS TERMINATED BY ',';
        Query OK, 3 rows affected
        Records: 3  Deleted: 0  Skipped: 0  Warnings: 0
        ```

      - Import specified columns of the `tbl1` table. For example, specify the `col1` and `col2` columns.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'inc_replace') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl1 FIELDS TERMINATED BY ',' (col1,col2);
        Query OK, 3 rows affected
        Records: 3  Deleted: 0  Skipped: 0  Warnings: 0
        ```

      - (Optional) Import data to specified partitions of the `tbl1` table.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'inc_replace') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl1 PARTITION (p0, p1) FIELDS TERMINATED BY ',';
        ```

      - (Optional) Import data to specified subpartitions of the `tbl1` table.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'inc_replace') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl1 PARTITION (p0sp0_1, p1sp1_1) FIELDS TERMINATED BY ',';
        ```

   4. Alternatively, use the `default_load_mode` parameter for the import.

      1. Set the `default_load_mode` parameter to `INC_DIRECT_WRITE` or `INC_REPLACE_DIRECT_WRITE`.

         ```sql
         obclient [test]> ALTER SYSTEM SET default_load_mode = 'INC_DIRECT_WRITE';
         ```

      2. Execute the `LOAD DATA` statement without the hint.

         ```sql
         obclient [test]> LOAD DATA INFILE '/home/admin/tbl1.csv' INTO TABLE tbl1 FIELDS TERMINATED BY ',';
         ```

4. Query whether data exists in the `tbl1` table.

   ```sql
   obclient [test]> SELECT * FROM tbl1;
   ```

   The query result is as follows:

   ```shell
   +------+------+
   | col1 | col2 |
   +------+------+
   |    1 |   11 |
   |    2 |   22 |
   |    3 |   33 |
   +------+------+
   3 rows in set
   ```

   The result shows that data has been imported to the `tbl1` table.
**Oracle mode**

1. Log in to the server where the OBServer node is located, and create a test data file named `tbl1.csv` in the `/home/admin` directory.

   **Note**: The `LOAD DATA` statement in OceanBase Database supports loading only input files that are local to the OBServer node. Therefore, you need to copy the file to an OBServer node before you import it.

   ```shell
   [xxx@xxx /home/admin]# ssh admin@10.10.10.1
   [admin@xxx /home/admin]# vi tbl1.csv
   1,11
   2,22
   3,33
   ```

2. Set the file path for import. Set the system variable `secure_file_priv` to specify the path that can be accessed when you import or export files.

   **Notice**: For security reasons, you can modify the global variable `secure_file_priv` only by using a local Unix socket connection to the database. For more information, see secure_file_priv.

   1. Log in to the server where the OBServer node is located.

      ```shell
      [xxx@xxx /home/admin]# ssh admin@10.10.10.1
      ```

   2. Connect to the `oracle001` tenant by using a local Unix socket connection.

      ```shell
      obclient -S /home/admin/oceanbase/run/sql.sock -usys@oracle001 -p******
      ```

   3. Set the import path to `/home/admin`.

      ```sql
      obclient [(none)]> SET GLOBAL secure_file_priv = "/home/admin";
      Query OK, 0 rows affected
      ```

3. After you reconnect to the database, use the `LOAD /*+ DIRECT */ DATA` statement to import data.

   1. Create a table named `tbl2`.

      ```sql
      obclient [test]> CREATE TABLE tbl2 (
          col1 INT PRIMARY KEY,
          col2 INT
      )
      PARTITION BY RANGE (col1)
      SUBPARTITION BY RANGE (col1)
      (
          PARTITION p0 VALUES LESS THAN (100)
          (
              SUBPARTITION sp0_1 VALUES LESS THAN (50),
              SUBPARTITION sp0_2 VALUES LESS THAN (100)
          ),
          PARTITION p1 VALUES LESS THAN (200)
          (
              SUBPARTITION sp1_1 VALUES LESS THAN (150),
              SUBPARTITION sp1_2 VALUES LESS THAN (200)
          )
      );
      Query OK, 0 rows affected
      ```

   2. Query whether the `tbl2` table contains data. The result shows that the table is empty.

      ```sql
      obclient [test]> SELECT * FROM tbl2;
      Empty set
      ```

   3. Import data from the `tbl1.csv` file to the `tbl2` table by using direct load.

      - Import data to all columns of the `tbl2` table.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'inc_replace') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 FIELDS TERMINATED BY ',';
        Query OK, 3 rows affected
        Records: 3  Deleted: 0  Skipped: 0  Warnings: 0
        ```

      - Import data to specified columns of the `tbl2` table. For example, import data to the `col1` and `col2` columns.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'inc_replace') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 FIELDS TERMINATED BY ',' (col1,col2);
        Query OK, 3 rows affected
        Records: 3  Deleted: 0  Skipped: 0  Warnings: 0
        ```

      - (Optional) Import data to partitions of the `tbl2` table.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'inc_replace') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 PARTITION (p0, p1) FIELDS TERMINATED BY ',';
        ```

      - (Optional) Import data to subpartitions of the `tbl2` table.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'inc_replace') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 PARTITION (p0sp0_1, p1sp1_1) FIELDS TERMINATED BY ',';
        ```

   4. Alternatively, import data by using the `default_load_mode` parameter.

      1. Set the value of `default_load_mode` to `INC_DIRECT_WRITE` or `INC_REPLACE_DIRECT_WRITE`.

         ```sql
         obclient [test]> ALTER SYSTEM SET default_load_mode = 'INC_DIRECT_WRITE';
         ```

      2. Execute the `LOAD DATA` statement without the hint.

         ```sql
         obclient [test]> LOAD DATA INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 FIELDS TERMINATED BY ',';
         ```

4. Verify whether data has been imported to the `tbl2` table.

   ```sql
   obclient [test]> SELECT * FROM tbl2;
   ```

   The query result is as follows:

   ```shell
   +------+------+
   | col1 | col2 |
   +------+------+
   |    1 |   11 |
   |    2 |   22 |
   |    3 |   33 |
   +------+------+
   3 rows in set
   ```

   The result indicates that data has been imported to the `tbl2` table.
Use the INSERT INTO SELECT statement to perform a direct load
The INSERT INTO SELECT statement can perform a direct load by using the direct() hint together with the enable_parallel_dml hint. If no hint is specified, the import behavior is determined by the default_load_mode configuration item.
Limitations
- Only PDML (Parallel Data Manipulation Language) is supported for direct load. Non-PDML statements cannot be used for direct load. For more information about parallel DML, see Parallel DML.
- During a direct load, you cannot execute two write operations on the same table simultaneously. This is because a table lock is placed on the table during the import, and only read operations are allowed during the entire import.
- Direct load is not supported in triggers.
- Tables with generated columns are not supported. Some indexes generate hidden generated columns. For example, the index `KEY idx_c2(c2(16)) GLOBAL` generates a hidden generated column.
- Direct load is not supported for Liboblog and flashback queries.
- Tables with indexes (excluding primary keys) are not supported for incremental direct load.
- Tables with foreign keys are not supported for incremental direct load.
- Tables with `GLOBAL` indexes are not supported for incremental direct load.
- Tables with constraints are not supported for incremental direct load.
- Tables without a primary key and with more than one `LOCAL` unique index are not supported for incremental direct load.
- Tables with both a primary key and a unique index are not supported for incremental direct load.
Syntax
```sql
INSERT /*+ [DIRECT(need_sort,max_error,{'inc'|'inc_replace'})] enable_parallel_dml parallel(N) */ INTO table_name [PARTITION(PARTITION_OPTION)] select_sentence
```
For more information about the INSERT INTO syntax, see INSERT (MySQL mode) and INSERT (Oracle mode).
Parameters:
| Parameter | Description |
|---|---|
| DIRECT() | Enables direct load by using the hint. `DIRECT()` takes the following parameters: `need_sort` specifies whether the data needs to be sorted before it is written; `max_error` specifies the maximum number of error rows tolerated before the statement fails; `'inc'` or `'inc_replace'` specifies the incremental load mode. |
| enable_parallel_dml | Enables parallel DML. **Note**: Generally, you must use the `enable_parallel_dml` hint together with the `parallel` hint. |
| parallel(N) | The parallelism of data loading. This parameter is required. The value is an integer greater than 1. |
| table_name | The name of the table to which the data is imported. You can specify any number of columns in the table. |
| PARTITION_OPTION | The name of the partition to which the data is imported when you perform direct load for a specified partition. |
Example
The steps for incremental direct load in an INSERT INTO SELECT statement are the same as those for full direct load. The only difference is that you need to replace the value of the full field with inc or inc_replace.
**MySQL mode**

Example 1: Full example of incremental direct load using the INSERT INTO SELECT statement.
Use direct load to import some data from table tbl2 to table tbl1.
1. Query whether table `tbl1` contains data. The result indicates that the table is empty.

   ```sql
   obclient [test]> SELECT * FROM tbl1;
   Empty set
   ```

2. Query whether table `tbl2` contains data.

   ```sql
   obclient [test]> SELECT * FROM tbl2;
   ```

   The query result indicates that table `tbl2` contains data:

   ```shell
   +------+------+------+
   | col1 | col2 | col3 |
   +------+------+------+
   |    1 | a1   |   11 |
   |    2 | a2   |   22 |
   |    3 | a3   |   33 |
   +------+------+------+
   3 rows in set
   ```

3. Use direct load to import data from table `tbl2` to table `tbl1`.

   - Specify the hint for the `INSERT INTO SELECT` statement.

     - Do not specify partitioned direct load.

       ```sql
       obclient [test]> INSERT /*+ DIRECT(true, 0, 'inc_replace') enable_parallel_dml parallel(16) */ INTO tbl1 SELECT t2.col1,t2.col3 FROM tbl2 t2;
       Query OK, 3 rows affected
       Records: 3  Duplicates: 0  Warnings: 0
       ```

     - (Optional) Specify partitioned direct load.

       ```sql
       obclient [test]> INSERT /*+ DIRECT(true, 0, 'inc_replace') enable_parallel_dml parallel(16) */ INTO tbl1 PARTITION (p0, p1) SELECT t2.col1,1 FROM tbl2 PARTITION (p0, p1) t2;
       Query OK, 3 rows affected
       Records: 3  Duplicates: 0  Warnings: 0

       obclient [test]> INSERT /*+ DIRECT(true, 0, 'inc_replace') enable_parallel_dml parallel(16) */ INTO tbl1 PARTITION (p0, p1) SELECT t2.col1,1 FROM tbl2 PARTITION (p0sp0_1, p1sp1_1) t2;
       Query OK, 3 rows affected
       Records: 3  Duplicates: 0  Warnings: 0
       ```

   - Do not specify the hint for the `INSERT INTO SELECT` statement. Set the value of the `default_load_mode` parameter to `INC_DIRECT_WRITE` or `INC_REPLACE_DIRECT_WRITE`.

     ```sql
     obclient [test]> ALTER SYSTEM SET default_load_mode = 'INC_DIRECT_WRITE';
     obclient [test]> INSERT INTO tbl1 SELECT t2.col1,t2.col3 FROM tbl2 t2;
     ```

4. Query whether data has been imported to table `tbl1`.

   ```sql
   obclient [test]> SELECT * FROM tbl1;
   ```

   The query result is as follows:

   ```shell
   +------+------+
   | col1 | col2 |
   +------+------+
   |    1 |   11 |
   |    2 |   22 |
   |    3 |   33 |
   +------+------+
   3 rows in set
   ```

   The result indicates that data has been imported to table `tbl1`.

5. (Optional) In the `Note` field of the output of the `EXPLAIN EXTENDED` statement, check whether data is written by direct load.

   ```sql
   obclient [test]> EXPLAIN EXTENDED INSERT /*+ direct(true, 0, 'inc_replace') enable_parallel_dml parallel(16) */ INTO tbl1 SELECT t2.col1,t2.col3 FROM tbl2 t2;
   ```

   The return result is as follows:
+--------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Query Plan | +--------------------------------------------------------------------------------------------------------------------------------------------------------------+ | ============================================================================== | | |ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)| | | ------------------------------------------------------------------------------ | | |0 |PX COORDINATOR | |3 |27 | | | |1 |└─EXCHANGE OUT DISTR |:EX10001 |3 |27 | | | |2 | └─INSERT | |3 |26 | | | |3 | └─EXCHANGE IN DISTR | |3 |1 | | | |4 | └─EXCHANGE OUT DISTR (RANDOM)|:EX10000 |3 |1 | | | |5 | └─SUBPLAN SCAN |ANONYMOUS_VIEW1|3 |1 | | | |6 | └─PX BLOCK ITERATOR | |3 |1 | | | |7 | └─TABLE FULL SCAN |t2 |3 |1 | | | ============================================================================== | | Outputs & filters: | | ------------------------------------- | | 0 - output(nil), filter(nil), rowset=16 | | 1 - output(nil), filter(nil), rowset=16 | | dop=16 | | 2 - output(nil), filter(nil) | | columns([{tbl1: ({tbl1: (tbl1.__pk_increment(0x7efa518277d0), tbl1.col1(0x7efa518119c0), tbl1.col3(0x7efa51811e00))})}]), partitions(p0), | | column_values([T_HIDDEN_PK(0x7efa51827c10)], [column_conv(INT,PS:(11,0),NULL,ANONYMOUS_VIEW1.col1(0x7efa51826f50))(0x7efa51828030)], [column_conv(INT, | | PS:(11,0),NULL,ANONYMOUS_VIEW1.col3(0x7efa51827390))(0x7efa5182ff60)]) | | 3 - output([T_HIDDEN_PK(0x7efa51827c10)], [ANONYMOUS_VIEW1.col1(0x7efa51826f50)], [ANONYMOUS_VIEW1.col3(0x7efa51827390)]), filter(nil), rowset=16 | | 4 - output([T_HIDDEN_PK(0x7efa51827c10)], [ANONYMOUS_VIEW1.col1(0x7efa51826f50)], [ANONYMOUS_VIEW1.col3(0x7efa51827390)]), filter(nil), rowset=16 | | dop=16 | | 5 - output([ANONYMOUS_VIEW1.col1(0x7efa51826f50)], [ANONYMOUS_VIEW1.col3(0x7efa51827390)]), filter(nil), rowset=16 | | 
access([ANONYMOUS_VIEW1.col1(0x7efa51826f50)], [ANONYMOUS_VIEW1.col3(0x7efa51827390)]) | | 6 - output([t2.col1(0x7efa51825f10)], [t2.col3(0x7efa518267c0)]), filter(nil), rowset=16 | | 7 - output([t2.col1(0x7efa51825f10)], [t2.col3(0x7efa518267c0)]), filter(nil), rowset=16 | | access([t2.col1(0x7efa51825f10)], [t2.col3(0x7efa518267c0)]), partitions(p0) | | is_index_back=false, is_global_index=false, | | range_key([t2.__pk_increment(0x7efa51856410)]), range(MIN ; MAX)always true | | Used Hint: | | ------------------------------------- | | /*+ | | | | USE_PLAN_CACHE( NONE ) | | PARALLEL(16) | | ENABLE_PARALLEL_DML | | DIRECT(TRUE, 0, 'INC_REPLACE') | | */ | | Qb name trace: | | ------------------------------------- | | stmt_id:0, stmt_type:T_EXPLAIN | | stmt_id:1, INS$1 | | stmt_id:2, SEL$1 | | Outline Data: | | ------------------------------------- | | /*+ | | BEGIN_OUTLINE_DATA | | PARALLEL(@"SEL$1" "t2"@"SEL$1" 16) | | FULL(@"SEL$1" "t2"@"SEL$1") | | USE_PLAN_CACHE( NONE ) | | PARALLEL(16) | | ENABLE_PARALLEL_DML | | OPTIMIZER_FEATURES_ENABLE('4.3.3.0') | | DIRECT(TRUE, 0, 'INC_REPLACE') | | END_OUTLINE_DATA | | */ | | Optimization Info: | | ------------------------------------- | | t2: | | table_rows:3 | | physical_range_rows:3 | | logical_range_rows:3 | | index_back_rows:0 | | output_rows:3 | | table_dop:16 | | dop_method:Global DOP | | avaiable_index_name:[tbl2] | | stats info:[version=0, is_locked=0, is_expired=0] | | dynamic sampling level:0 | | estimation method:[DEFAULT, STORAGE] | | Plan Type: | | DISTRIBUTED | | Note: | | Degree of Parallelism is 16 because of hint | | Direct-mode is enabled in insert into select | +--------------------------------------------------------------------------------------------------------------------------------------------------------------+ 77 rows in set (0.009 sec)
Example 2: Specify the primary partition for incremental direct load for a RANGE-HASH partitioned table.
The target table is a secondary partitioned table. The primary partition is a RANGE partition, and the secondary partition is a HASH partition. Specify the primary partition for incremental direct load.
1. Create a table `tbl1` with two levels of partitioning. The primary partitions are RANGE partitions, and the subpartitions are HASH partitions.

   ```sql
   obclient [test]> CREATE TABLE tbl1(col1 INT, col2 INT)
       PARTITION BY RANGE COLUMNS(col1)
       SUBPARTITION BY HASH(col2) SUBPARTITIONS 3
       (
           PARTITION p0 VALUES LESS THAN(10),
           PARTITION p1 VALUES LESS THAN(20)
       );
   ```

2. Use direct load to import data from table `tbl2` to the `p0` and `p1` partitions of table `tbl1`.

   ```sql
   obclient [test]> insert /*+ direct(true, 0, 'inc_replace') enable_parallel_dml parallel(3) append */ into tbl1 partition(p0,p1) select * from tbl2 where col1 < 20;
   ```
**Oracle mode**

Example 1: Full example of incremental direct load using the INSERT INTO SELECT statement.
Use direct load to import some data from table tbl4 to table tbl3.
1. Query whether table `tbl3` contains data. The result indicates that the table is empty.

   ```sql
   obclient [test]> SELECT * FROM tbl3;
   Empty set
   ```

2. Query whether table `tbl4` contains data.

   ```sql
   obclient [test]> SELECT * FROM tbl4;
   ```

   The query result indicates that table `tbl4` contains data:

   ```shell
   +------+------+------+
   | COL1 | COL2 | COL3 |
   +------+------+------+
   |    1 | a1   |   11 |
   |    2 | a2   |   22 |
   |    3 | a3   |   33 |
   +------+------+------+
   3 rows in set (0.000 sec)
   ```

3. Use direct load to import data from table `tbl4` to table `tbl3`.

   - Specify the hint for the `INSERT INTO SELECT` statement.

     - Do not specify partitioned direct load.

       ```sql
       obclient [test]> INSERT /*+ direct(true, 0, 'inc_replace') enable_parallel_dml parallel(16) */ INTO tbl3 SELECT t2.col1, t2.col3 FROM tbl4 t2 WHERE ROWNUM <= 10000;
       Query OK, 3 rows affected
       Records: 3  Duplicates: 0  Warnings: 0
       ```

     - (Optional) Specify partitioned direct load.

       ```sql
       obclient [test]> INSERT /*+ DIRECT(true, 0, 'inc_replace') enable_parallel_dml parallel(16) */ INTO tbl3 PARTITION (p0, p1) SELECT t2.col1,1 FROM tbl4 PARTITION (p0, p1) t2;
       Query OK, 3 rows affected
       Records: 3  Duplicates: 0  Warnings: 0

       obclient [test]> INSERT /*+ DIRECT(true, 0, 'inc_replace') enable_parallel_dml parallel(16) */ INTO tbl3 PARTITION (p0, p1) SELECT t2.col1,1 FROM tbl4 PARTITION (p0sp0_1, p1sp1_1) t2;
       Query OK, 3 rows affected
       Records: 3  Duplicates: 0  Warnings: 0
       ```

   - Do not specify the hint for the `INSERT INTO SELECT` statement. Set the value of the `default_load_mode` parameter to `INC_DIRECT_WRITE` or `INC_REPLACE_DIRECT_WRITE`.

     ```sql
     obclient [test]> ALTER SYSTEM SET default_load_mode = 'INC_DIRECT_WRITE';
     obclient [test]> INSERT INTO tbl3 SELECT t2.col1, t2.col3 FROM tbl4 t2 WHERE ROWNUM <= 10000;
     ```

4. Query whether data has been imported to table `tbl3`.

   ```sql
   obclient [test]> SELECT * FROM tbl3;
   ```

   The query result is as follows:

   ```shell
   +------+------+
   | col1 | col3 |
   +------+------+
   |    1 |   11 |
   |    2 |   22 |
   |    3 |   33 |
   +------+------+
   3 rows in set
   ```

   The result indicates that data has been imported to table `tbl3`.

5. (Optional) In the
Notefield of the return result of theEXPLAIN EXTENDEDstatement, check whether data is written by direct load.obclient [test]> EXPLAIN EXTENDED INSERT /*+ direct(true, 0, 'inc_replace') enable_parallel_dml parallel(16) */ INTO tbl3 SELECT t2.col1,t2.col3 FROM tbl4 t2 WHERE ROWNUM <= 10000; The return result is as follows: ```shell +--------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Query Plan | +--------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | ========================================================================================= | | |ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)| | | ----------------------------------------------------------------------------------------- | | |0 |PX COORDINATOR | |3 |34 | | | |1 |└─EXCHANGE OUT DISTR |:EX10002 |3 |33 | | | |2 | └─INSERT | |3 |32 | | | |3 | └─EXCHANGE IN DISTR | |3 |7 | | | |4 | └─EXCHANGE OUT DISTR (RANDOM) |:EX10001 |3 |7 | | | |5 | └─MATERIAL | |3 |3 | | | |6 | └─SUBPLAN SCAN |ANONYMOUS_VIEW1|3 |3 | | | |7 | └─LIMIT | |3 |3 | | | |8 | └─EXCHANGE IN DISTR | |3 |3 | | | |9 | └─EXCHANGE OUT DISTR |:EX10000 |3 |1 | | | |10| └─LIMIT | |3 |1 | | | |11| └─PX BLOCK ITERATOR | |3 |1 | | | |12| └─COLUMN TABLE FULL SCAN|T2 |3 |1 | | | ========================================================================================= | | Outputs & filters: | | ------------------------------------- | | 0 - output(nil), filter(nil), rowset=16 | | 1 - output(nil), filter(nil), rowset=16 | | dop=16 | | 2 - output(nil), filter(nil) | | columns([{TBL3: ({TBL3: (TBL3.__pk_increment(0x7efaad22b0e0), TBL3.COL1(0x7efaad2123f0), TBL3.COL3(0x7efaad212830))})}]), partitions(p0), | | column_values([T_HIDDEN_PK(0x7efaad22b520)], [column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.COL1(0x7efaad22a860))(0x7efaad22b930)], 
[column_conv(NUMBER, | | PS:(-1,0),NULL,ANONYMOUS_VIEW1.COL3(0x7efaad22aca0))(0x7efaad233850)]) | | 3 - output([T_HIDDEN_PK(0x7efaad22b520)], [ANONYMOUS_VIEW1.COL1(0x7efaad22a860)], [ANONYMOUS_VIEW1.COL3(0x7efaad22aca0)]), filter(nil), rowset=16 | | 4 - output([T_HIDDEN_PK(0x7efaad22b520)], [ANONYMOUS_VIEW1.COL1(0x7efaad22a860)], [ANONYMOUS_VIEW1.COL3(0x7efaad22aca0)]), filter(nil), rowset=16 | | is_single, dop=1 | | 5 - output([ANONYMOUS_VIEW1.COL1(0x7efaad22a860)], [ANONYMOUS_VIEW1.COL3(0x7efaad22aca0)]), filter(nil), rowset=16 | | 6 - output([ANONYMOUS_VIEW1.COL1(0x7efaad22a860)], [ANONYMOUS_VIEW1.COL3(0x7efaad22aca0)]), filter(nil), rowset=16 | | access([ANONYMOUS_VIEW1.COL1(0x7efaad22a860)], [ANONYMOUS_VIEW1.COL3(0x7efaad22aca0)]) | | 7 - output([T2.COL1(0x7efaad229470)], [T2.COL3(0x7efaad229f40)]), filter(nil), rowset=16 | | limit(cast(FLOOR(cast(10000, NUMBER(-1, -85))(0x7efaad227ef0))(0x7efaad25ac30), BIGINT(-1, 0))(0x7efaad25b6d0)), offset(nil) | | 8 - output([T2.COL1(0x7efaad229470)], [T2.COL3(0x7efaad229f40)]), filter(nil), rowset=16 | | 9 - output([T2.COL1(0x7efaad229470)], [T2.COL3(0x7efaad229f40)]), filter(nil), rowset=16 | | dop=16 | | 10 - output([T2.COL1(0x7efaad229470)], [T2.COL3(0x7efaad229f40)]), filter(nil), rowset=16 | | limit(cast(FLOOR(cast(10000, NUMBER(-1, -85))(0x7efaad227ef0))(0x7efaad25ac30), BIGINT(-1, 0))(0x7efaad25b6d0)), offset(nil) | | 11 - output([T2.COL1(0x7efaad229470)], [T2.COL3(0x7efaad229f40)]), filter(nil), rowset=16 | | 12 - output([T2.COL1(0x7efaad229470)], [T2.COL3(0x7efaad229f40)]), filter(nil), rowset=16 | | access([T2.COL1(0x7efaad229470)], [T2.COL3(0x7efaad229f40)]), partitions(p0) | | limit(cast(FLOOR(cast(10000, NUMBER(-1, -85))(0x7efaad227ef0))(0x7efaad25ac30), BIGINT(-1, 0))(0x7efaad25b6d0)), offset(nil), is_index_back=false, | | is_global_index=false, | | range_key([T2.__pk_increment(0x7efaad25a720)]), range(MIN ; MAX)always true | | Used Hint: | | ------------------------------------- | | /*+ | | | | 
USE_PLAN_CACHE( NONE ) | | PARALLEL(16) | | ENABLE_PARALLEL_DML | | DIRECT(TRUE, 0, 'INC_REPLACE') | | */ | | Qb name trace: | | ------------------------------------- | | stmt_id:0, stmt_type:T_EXPLAIN | | stmt_id:1, INS$1 | | stmt_id:2, SEL$1 | | stmt_id:3, parent:SEL$1 > SEL$658037CB > SEL$DCAFFB86 | | Outline Data: | | ------------------------------------- | | /*+ | | BEGIN_OUTLINE_DATA | | PARALLEL(@"SEL$DCAFFB86" "T2"@"SEL$1" 16) | | FULL(@"SEL$DCAFFB86" "T2"@"SEL$1") | | USE_COLUMN_TABLE(@"SEL$DCAFFB86" "T2"@"SEL$1") | | MERGE(@"SEL$658037CB" < "SEL$1") | | USE_PLAN_CACHE( NONE ) | | PARALLEL(16) | | ENABLE_PARALLEL_DML | | OPTIMIZER_FEATURES_ENABLE('4.3.3.0') | | DIRECT(TRUE, 0, 'INC_REPLACE') | | END_OUTLINE_DATA | | */ | | Optimization Info: | | ------------------------------------- | | T2: | | table_rows:3 | | physical_range_rows:3 | | logical_range_rows:3 | | index_back_rows:0 | | output_rows:3 | | table_dop:16 | | dop_method:Global DOP | | avaiable_index_name:[TBL4] | | stats info:[version=0, is_locked=0, is_expired=0] | | dynamic sampling level:0 | | estimation method:[DEFAULT, STORAGE] | | Plan Type: | | DISTRIBUTED | | Note: | | Degree of Parallelism is 16 because of hint | | Direct-mode is enabled in insert into select | | Expr Constraints: | | cast(FLOOR(cast(10000, NUMBER(-1, -85))), BIGINT(-1, 0)) >= 1 result is TRUE | +--------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 96 rows in set (0.007 sec)
Example 2: Specify the primary partitions for incremental direct load for a RANGE-HASH partitioned table.
The target table is a subpartitioned table. The primary partitions are RANGE partitions, and the subpartitions are HASH partitions. Data is imported by incremental direct load into the specified primary partitions.

1. Create a table `tbl1_1` with two primary partitions. The primary partitions are RANGE partitions, and the subpartitions are HASH partitions.

   ```sql
   obclient [test]> CREATE TABLE tbl1_1 (
       col1 INT,
       col2 INT
   )
   PARTITION BY RANGE (col1)
   SUBPARTITION BY HASH (col2)
   SUBPARTITION TEMPLATE
   (
       SUBPARTITION sp1,
       SUBPARTITION sp2,
       SUBPARTITION sp3
   )
   (
       PARTITION p0 VALUES LESS THAN (10),
       PARTITION p1 VALUES LESS THAN (20)
   );
   ```

2. Use direct load to import data from table `tbl2` to primary partitions `p0` and `p1` of table `tbl1_1`.

   ```sql
   obclient [test]> insert /*+ direct(true, 0, 'inc_replace') enable_parallel_dml parallel(3) append */ into tbl1_1 partition(p0,p1) select * from tbl2 where col1 < 20;
   ```
Use the CREATE TABLE AS SELECT statement to perform a direct load
The CREATE TABLE AS SELECT statement specifies the direct load method by using the DIRECT() and PARALLEL(N) hints. If no hints are specified, the import behavior is determined based on the configuration item default_load_mode.
Syntax
```sql
CREATE /*+ [DIRECT(need_sort,max_error,{'inc'|'inc_replace'})] parallel(N) */ TABLE table_name [AS] select_sentence
```
For more information about the CREATE TABLE AS SELECT statement, see CREATE TABLE (MySQL mode) and CREATE TABLE (Oracle mode).
Parameters:
| Parameter | Description |
|---|---|
| DIRECT() | Enables direct load by using the hint. `DIRECT()` takes the following parameters: `need_sort` specifies whether the data needs to be sorted before it is written; `max_error` specifies the maximum number of error rows tolerated before the statement fails; `'inc'` or `'inc_replace'` specifies the incremental load mode. |
| parallel(N) | The parallelism of data loading. This parameter is required. The value is an integer greater than 1. |
Examples
**MySQL mode**

Use direct load to import data from table tbl1 to another table. Perform the following steps in sequence.
1. Create table `tbl1`.

   ```sql
   obclient [test]> CREATE TABLE tbl1(c1 int);
   ```

2. Insert data into table `tbl1`.

   ```sql
   obclient [test]> INSERT INTO tbl1 VALUES (1),(2),(3);
   ```

3. Query data from table `tbl1`.

   ```sql
   obclient [test]> SELECT * FROM tbl1;
   ```

   The query result is as follows:

   ```shell
   +------+
   | c1   |
   +------+
   |    1 |
   |    2 |
   |    3 |
   +------+
   3 rows in set (0.027 sec)
   ```

4. Use direct load to import data from table `tbl1` to table `tbl2`.

   - Specify the hint for the `CREATE TABLE AS SELECT` statement.

     - Use the `inc` mode for direct load. The hint takes the form `DIRECT(need_sort, max_error, 'inc')`, for example `DIRECT(false, 100, 'inc')`.

       ```sql
       obclient [test]> CREATE /*+ direct(true, 0, 'inc') parallel(4) */ TABLE tbl2 AS SELECT * FROM tbl1;
       ```

     - Use the `inc_replace` mode for direct load. The hint takes the form `DIRECT(need_sort, max_error, 'inc_replace')`, for example `DIRECT(false, 100, 'inc_replace')`.

       ```sql
       obclient [test]> CREATE /*+ direct(true, 0, 'inc_replace') parallel(4) */ TABLE tbl2 AS SELECT * FROM tbl1;
       ```

   - Do not specify the hint for the `CREATE TABLE AS SELECT` statement. Set the value of the `default_load_mode` parameter to `INC_DIRECT_WRITE` or `INC_REPLACE_DIRECT_WRITE`.

     ```sql
     obclient [test]> ALTER SYSTEM SET default_load_mode = 'INC_DIRECT_WRITE';
     obclient [test]> CREATE TABLE tbl2 AS SELECT * FROM tbl1;
     ```

5. Verify whether data has been imported to table `tbl2`.

   ```sql
   obclient [test]> SELECT * FROM tbl2;
   ```

   The query result is as follows:

   ```shell
   +------+
   | c1   |
   +------+
   |    1 |
   |    3 |
   |    2 |
   +------+
   3 rows in set (0.050 sec)
   ```

   The result indicates that data has been imported to table `tbl2`.
**Oracle mode**

Use direct load to import data from table tbl1 to another table. Perform the following steps in sequence.
1. Create table `tbl1`.

   ```sql
   obclient [SYS]> CREATE TABLE tbl1(c1 int);
   ```

2. Insert data into table `tbl1`.

   ```sql
   obclient [SYS]> INSERT INTO tbl1 VALUES (1),(2),(3);
   ```

3. Query data from table `tbl1`.

   ```sql
   obclient [SYS]> SELECT * FROM tbl1;
   ```

   The query result is as follows:

   ```shell
   +------+
   | C1   |
   +------+
   |    1 |
   |    2 |
   |    3 |
   +------+
   3 rows in set (0.045 sec)
   ```

4. Use direct load to import data from table `tbl1` to table `tbl2`.

   - Specify the hint for the `CREATE TABLE AS SELECT` statement.

     - Use the `inc` mode for direct load. The hint takes the form `DIRECT(need_sort, max_error, 'inc')`, for example `DIRECT(false, 100, 'inc')`.

       ```sql
       obclient [SYS]> CREATE /*+ direct(true, 0, 'inc') parallel(4) */ TABLE tbl2 AS SELECT * FROM tbl1;
       ```

     - Use the `inc_replace` mode for direct load. The hint takes the form `DIRECT(need_sort, max_error, 'inc_replace')`, for example `DIRECT(false, 100, 'inc_replace')`.

       ```sql
       obclient [SYS]> CREATE /*+ direct(true, 0, 'inc_replace') parallel(4) */ TABLE tbl2 AS SELECT * FROM tbl1;
       ```

   - Do not specify the hint for the `CREATE TABLE AS SELECT` statement. Set the value of the `default_load_mode` parameter to `INC_DIRECT_WRITE` or `INC_REPLACE_DIRECT_WRITE`.

     ```sql
     obclient [SYS]> ALTER SYSTEM SET default_load_mode = 'INC_DIRECT_WRITE';
     obclient [SYS]> CREATE TABLE tbl2 AS SELECT * FROM tbl1;
     ```

5. Verify whether data has been imported to the `tbl2` table.

   ```sql
   obclient [SYS]> SELECT * FROM tbl2;
   ```

   The query result is as follows:

   ```shell
   +------+
   | C1   |
   +------+
   |    3 |
   |    2 |
   |    1 |
   +------+
   3 rows in set (0.013 sec)
   ```

   The result indicates that data has been imported to the `tbl2` table.
