When you want to import incremental data into a table that already contains data, you can use the incremental direct load feature. Although full direct load can also import incremental data, it rewrites all of the existing data and therefore has poor import performance. Incremental direct load differs from full direct load in that it operates only on the incremental data during the import, which ensures high import performance.
This topic describes how to implement incremental direct load by using the LOAD DATA, INSERT INTO SELECT, and CREATE TABLE AS SELECT statements.
Considerations
Take note of the following considerations when you use direct load:
- Data imported in incremental direct load mode triggers a minor compaction. If the data volume is small and can be imported within minutes, we recommend that you do not use incremental direct load mode.
- Incremental direct load supports a target table with a non-unique local index.
- The `LOAD DATA` statement supports running in multi-row transactions and proactively commits the previous transaction during execution.
- When you use the `INSERT INTO SELECT` statement to perform direct load, only Parallel Data Manipulation Language (PDML) is supported. Non-PDML statements cannot be used for direct load.
- When you use specified-partition direct load in the `LOAD DATA` and `INSERT INTO SELECT` statements, the target table must not be a replicated table and must not contain auto-increment columns, identifier columns, or global indexes.
- Starting from V4.3.5 BP1, incremental direct load has the following changes:
  - If the last-level partition of the target table is hash- or key-partitioned, partition-level direct load is supported.
  - Only one local unique index is supported. If the target table has multiple local or global unique indexes, the import operation fails.
  - In the `LOAD DATA` and `INSERT INTO SELECT` statements, parallel partition import is supported only when the import tasks do not have overlapping partitions.
  - When the `foreign_key_checks` session variable is set to `False`, foreign key constraints are not checked during direct load operations (see the sketch after this list).
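The last point can be exercised from the client session before the import. The following is a minimal sketch, assuming MySQL-mode syntax for the session variable; the table and file names are placeholders reused from the examples later in this topic.

```sql
-- Sketch: skip foreign key checks for the current session before an incremental direct load.
-- Assumes MySQL-mode syntax for the session variable; verify against your tenant mode.
SET SESSION foreign_key_checks = 0;

LOAD DATA /*+ DIRECT(true, 1024, 'inc_replace') PARALLEL(4) */
    INFILE '/home/admin/tbl1.csv'
    INTO TABLE tbl1
    FIELDS TERMINATED BY ',';
```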
Use the LOAD DATA statement for direct load
The LOAD DATA statement supports the DIRECT() hint for incremental direct load. If the hint is not specified, the behavior of importing data is determined based on the default_load_mode parameter.
Limitations
- Only one write statement can be executed at a time during the import process because the import process locks the table and allows only read operations.
- DDL operations are not supported.
- Tables with generated columns are not supported. Note that some indexes create hidden generated columns, for example, `KEY idx_c2(c2(16)) GLOBAL`.
- Data import for a single row that exceeds 2 MB is not supported.
- Liboblog and flashback queries are not supported.
- Tables with foreign keys are not supported.
Syntax
LOAD DATA /*+ [DIRECT(need_sort,max_error,{'inc'|'inc_replace'})] parallel(N) */ [REMOTE_OSS | LOCAL] INFILE 'file_name' INTO TABLE table_name [PARTITION(PARTITION_OPTION)] [COMPRESSION]...
For more information about the LOAD DATA syntax, see LOAD DATA.
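Before the parameter descriptions, here is a minimal sketch of the two ways to trigger incremental direct load described above; the file path and table name are placeholders taken from the walkthrough later in this topic.

```sql
-- Hint form: the DIRECT() hint selects incremental direct load explicitly.
LOAD DATA /*+ DIRECT(true, 1024, 'inc_replace') PARALLEL(4) */
    INFILE '/home/admin/tbl1.csv'
    INTO TABLE tbl1
    FIELDS TERMINATED BY ',';

-- No-hint form: the import behavior follows the default_load_mode parameter.
ALTER SYSTEM SET default_load_mode = 'INC_DIRECT_WRITE';
LOAD DATA INFILE '/home/admin/tbl1.csv' INTO TABLE tbl1 FIELDS TERMINATED BY ',';
```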
Parameter description:
| Parameter | Description |
|---|---|
| DIRECT() | Enables direct load. The hint takes three parameters: `need_sort`, `max_error`, and the load mode, which is `'inc'` or `'inc_replace'` for incremental direct load. |
| parallel(N) | Required. The degree of parallelism for data loading. The default value is 4. |
| REMOTE_OSS \| LOCAL | Optional. Specifies where the input file is located: `REMOTE_OSS` for a file stored in OSS, and `LOCAL` for a file on the local client. |
| file_name | The path and name of the input file. The format of this parameter corresponds to the `REMOTE_OSS \| LOCAL` type. <br>**Note**: Additional requirements apply when you import a file from OSS. For more information, see LOAD DATA. |
| table_name | The name of the table to which the data is imported. You can specify any number of columns in the table. |
| PARTITION_OPTION | The name of the partition to which data is imported in direct load (bypass) mode. |
| COMPRESSION | Optional. Specifies the compression format of the input file. For the valid values, see LOAD DATA. |
Note
Like full direct load, incremental direct load supports importing data from multiple files by using wildcards in the LOAD DATA statement, as shown in the sketch below.
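For instance, the following sketch assumes a set of CSV files whose names match `tbl1_*.csv` under `/home/admin`; the wildcard pattern and file names are illustrative, so verify the exact pattern rules against the LOAD DATA reference.

```sql
-- Sketch: import all files matching a wildcard pattern in one statement.
-- The file name pattern below is hypothetical.
LOAD DATA /*+ DIRECT(true, 1024, 'inc_replace') PARALLEL(16) */
    INFILE '/home/admin/tbl1_*.csv'
    INTO TABLE tbl1
    FIELDS TERMINATED BY ',';
```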
Example
The steps for enabling incremental direct load by using the LOAD DATA statement are the same as those for enabling full direct load, except that the full option is replaced with the inc or inc_replace option.
Note
The following example shows how to import data from a server-side file. OceanBase Database also supports importing data from a client-side local file by using the LOCAL INFILE clause of the LOAD DATA statement. For more information about LOAD DATA LOCAL INFILE, see Import data by using the LOAD DATA statement.
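As a hedged sketch of the client-side variant, the statement below assumes that the CSV file sits on the client machine and that the obclient session permits `LOCAL INFILE` (for example, by starting obclient with the `--local-infile` option); the client-side path is a placeholder.

```sql
-- Sketch: load a client-side file instead of a server-side file.
-- Assumes the client session allows LOCAL INFILE; the path below is a placeholder.
LOAD DATA /*+ DIRECT(true, 1024, 'inc_replace') PARALLEL(4) */
    LOCAL INFILE '/path/on/client/tbl1.csv'
    INTO TABLE tbl1
    FIELDS TERMINATED BY ',';
```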
**MySQL mode**

1. Log in to the server where the OBServer node to be connected is located and create the `tbl1.csv` test data file in the `/home/admin` directory.

   Note

   The `LOAD DATA` statement in OceanBase Database only supports loading input files stored on the local storage of an OBServer node. Therefore, you need to copy the file to an OBServer node before importing it.

   ```shell
   [xxx@xxx /home/admin]# ssh admin@10.10.10.1
   [admin@xxx /home/admin]# vi tbl1.csv

   1,11
   2,22
   3,33
   ```

2. Set the path of the imported file.

   Set the system variable `secure_file_priv` to specify the path that can be accessed when you import or export files.

   Notice

   For security reasons, the SQL statement that modifies the global variable `secure_file_priv` can be executed only through a local socket connection to the database. For more information, see secure_file_priv.

   1. Log in to the server where the OBServer node to connect to resides.

      ```shell
      [xxx@xxx /home/admin]# ssh admin@10.10.10.1
      ```

   2. Connect to the `mysql001` tenant through a local Unix socket.

      ```shell
      obclient -S /home/admin/oceanbase/run/sql.sock -uroot@mysql001 -p******
      ```

   3. Set the import path to `/home/admin`.

      ```sql
      obclient [(none)]> SET GLOBAL secure_file_priv = "/home/admin";
      Query OK, 0 rows affected
      ```

3. After you reconnect to the database, use the `LOAD /*+ DIRECT */ DATA` statement to import data.

   1. Create a table named `tbl1`.

      ```sql
      obclient [test]> CREATE TABLE tbl1 (
          col1 INT PRIMARY KEY,
          col2 INT
      )
      PARTITION BY RANGE (col1)
      SUBPARTITION BY RANGE (col1)
      (
        PARTITION p0 VALUES LESS THAN (100)
        (
          SUBPARTITION p0_1 VALUES LESS THAN (50),
          SUBPARTITION p0_2 VALUES LESS THAN (100)
        ),
        PARTITION p1 VALUES LESS THAN (200)
        (
          SUBPARTITION p1_1 VALUES LESS THAN (150),
          SUBPARTITION p1_2 VALUES LESS THAN (200)
        )
      );
      Query OK, 0 rows affected
      ```

   2. Query table `tbl1` for data. The table is empty.

      ```sql
      obclient [test]> SELECT * FROM tbl1;
      Empty set
      ```

   3. Use direct load to import the data from the `tbl1.csv` file to the `tbl1` table.

      - Import data to all columns of the `tbl1` table.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'inc_replace') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl1 FIELDS TERMINATED BY ',';
        Query OK, 3 rows affected
        Records: 3  Deleted: 0  Skipped: 0  Warnings: 0
        ```

      - Import data to specified columns of the `tbl1` table, for example, `col1` and `col2`.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'inc_replace') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl1 FIELDS TERMINATED BY ','(col1,col2);
        Query OK, 3 rows affected
        Records: 3  Deleted: 0  Skipped: 0  Warnings: 0
        ```

      - (Optional) Import data to specified partitions of the `tbl1` table.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'inc_replace') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl1 partition(p0, p1) FIELDS TERMINATED BY ',';
        ```

      - (Optional) Import data to specified subpartitions of the `tbl1` table.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'inc_replace') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl1 partition(p0sp0_1, p1sp1_1) FIELDS TERMINATED BY ',';
        ```

   4. Alternatively, use the `default_load_mode` parameter to import data.

      1. Set the value of `default_load_mode` to `INC_DIRECT_WRITE` or `INC_REPLACE_DIRECT_WRITE`.

         ```sql
         obclient [test]> ALTER SYSTEM SET default_load_mode ='INC_DIRECT_WRITE';
         ```

      2. Run the `LOAD DATA` statement without hints.

         ```sql
         obclient [test]> LOAD DATA INFILE '/home/admin/tbl1.csv' INTO TABLE tbl1 FIELDS TERMINATED BY ',';
         ```

4. Check whether data has been imported to the `tbl1` table.

   ```sql
   obclient [test]> SELECT * FROM tbl1;
   ```

   The query result is as follows:

   ```shell
   +------+------+
   | col1 | col2 |
   +------+------+
   |    1 |   11 |
   |    2 |   22 |
   |    3 |   33 |
   +------+------+
   3 rows in set
   ```

   The result shows that data has been imported into table `tbl1`.
**Oracle mode**

1. Log in to the server where the OBServer node to be connected is located and create the `tbl1.csv` test data file in the `/home/admin` directory.

   Note

   The `LOAD DATA` statement in OceanBase Database only supports loading input files that are stored on the local storage of an OBServer node. Therefore, you need to copy the file to an OBServer node before importing it.

   ```shell
   [xxx@xxx /home/admin]# ssh admin@10.10.10.1
   [admin@xxx /home/admin]# vi tbl1.csv

   1,11
   2,22
   3,33
   ```

2. Set the path of the imported file.

   Set the system variable `secure_file_priv` to configure the path that can be accessed during file import or export.

   Notice

   For security reasons, the SQL statement that modifies the global variable `secure_file_priv` can be executed only through a local socket connection to the database. For more information, see secure_file_priv.

   1. Log in to the server where the OBServer node to connect to resides.

      ```shell
      [xxx@xxx /home/admin]# ssh admin@10.10.10.1
      ```

   2. Connect to the `oracle001` tenant through a local Unix socket.

      ```shell
      obclient -S /home/admin/oceanbase/run/sql.sock -usys@oracle001 -p******
      ```

   3. Set the import path to `/home/admin`.

      ```sql
      obclient [(none)]> SET GLOBAL secure_file_priv = "/home/admin";
      Query OK, 0 rows affected
      ```

3. After you reconnect to the database, use the `LOAD /*+ DIRECT */ DATA` statement to import data.

   1. Create a table named `tbl2`.

      ```sql
      obclient [test]> CREATE TABLE tbl2 (
          col1 INT PRIMARY KEY,
          col2 INT
      )
      PARTITION BY RANGE (col1)
      SUBPARTITION BY RANGE (col1)
      (
        PARTITION p0 VALUES LESS THAN (100)
        (
          SUBPARTITION sp0_1 VALUES LESS THAN (50),
          SUBPARTITION sp0_2 VALUES LESS THAN (100)
        ),
        PARTITION p1 VALUES LESS THAN (200)
        (
          SUBPARTITION sp1_1 VALUES LESS THAN (150),
          SUBPARTITION sp1_2 VALUES LESS THAN (200)
        )
      );
      Query OK, 0 rows affected
      ```

   2. Query table `tbl2` for data. The table is empty.

      ```sql
      obclient [test]> SELECT * FROM tbl2;
      Empty set
      ```

   3. Use direct load to import the data from the `tbl1.csv` file to the `tbl2` table.

      - Import data to all columns of the `tbl2` table.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'inc_replace') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 FIELDS TERMINATED BY ',';
        Query OK, 3 rows affected
        Records: 3  Deleted: 0  Skipped: 0  Warnings: 0
        ```

      - Import data to specified columns of the `tbl2` table, for example, `col1` and `col2`.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'inc_replace') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 FIELDS TERMINATED BY ','(col1,col2);
        Query OK, 3 rows affected
        Records: 3  Deleted: 0  Skipped: 0  Warnings: 0
        ```

      - (Optional) Import data to specified partitions of the `tbl2` table.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'inc_replace') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 partition(p0, p1) FIELDS TERMINATED BY ',';
        ```

      - (Optional) Import data to specified subpartitions of the `tbl2` table.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'inc_replace') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 partition(p0sp0_1, p1sp1_1) FIELDS TERMINATED BY ',';
        ```

   4. Alternatively, use the `default_load_mode` parameter to import data.

      1. Set the value of `default_load_mode` to `INC_DIRECT_WRITE` or `INC_REPLACE_DIRECT_WRITE`.

         ```sql
         obclient [test]> ALTER SYSTEM SET default_load_mode ='INC_DIRECT_WRITE';
         ```

      2. Run the `LOAD DATA` statement without hints.

         ```sql
         obclient [test]> LOAD DATA INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 FIELDS TERMINATED BY ',';
         ```

4. Check whether data has been imported to the `tbl2` table.

   ```sql
   obclient [test]> SELECT * FROM tbl2;
   ```

   The query result is as follows:

   ```shell
   +------+------+
   | col1 | col2 |
   +------+------+
   |    1 |   11 |
   |    2 |   22 |
   |    3 |   33 |
   +------+------+
   3 rows in set
   ```

   The result shows that data has been imported to table `tbl2`.
Use the INSERT INTO SELECT statement for direct load
The INSERT INTO SELECT statement uses the direct() hint along with the enable_parallel_dml hint to perform direct load. If you do not specify these hints, the import mode is determined based on the default_load_mode parameter.
Limitations
- Only PDML (Parallel Data Manipulation Language) is supported. Data cannot be imported in bypass mode for statements that are not PDML. For more information about PDML, see Parallel DML.
- Two write statements that write data to the same table cannot be executed at the same time, because the table is locked during data import and only read operations are supported.
- This feature is not supported for tables with triggers.
- This feature is not supported for tables with generated columns. Note that some indexes create hidden generated columns, for example, `KEY idx_c2(c2(16)) GLOBAL`.
- This feature is not supported with Liboblog or flashback queries.
- Tables with indexes (excluding primary keys) are not supported.
- Tables with foreign keys are not supported.
Syntax
INSERT /*+ [DIRECT(need_sort,max_error,{'inc'|'inc_replace'})] enable_parallel_dml parallel(N) */ INTO table_name [PARTITION(PARTITION_OPTION)] select_sentence
For more information about the INSERT INTO syntax, see INSERT (MySQL-compatible mode) and INSERT (Oracle-compatible mode).
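As a quick reference before the parameter descriptions, the following minimal sketch shows the hint form and the no-hint form of the statement; the table and column names are placeholders that mirror the examples later in this topic.

```sql
-- Hint form: DIRECT() plus enable_parallel_dml select incremental direct load explicitly.
INSERT /*+ DIRECT(true, 0, 'inc_replace') enable_parallel_dml parallel(16) */
    INTO tbl1 SELECT t2.col1, t2.col3 FROM tbl2 t2;

-- No-hint form: the import behavior follows the default_load_mode parameter.
ALTER SYSTEM SET default_load_mode = 'INC_DIRECT_WRITE';
INSERT INTO tbl1 SELECT t2.col1, t2.col3 FROM tbl2 t2;
```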
Parameter description:
| Parameter | Description |
|---|---|
| DIRECT() | Enables direct load. The hint takes three parameters: `need_sort`, `max_error`, and the load mode, which is `'inc'` or `'inc_replace'` for incremental direct load. |
| enable_parallel_dml | Enables parallel DML (PDML) for the statement. <br>**Note**: In most cases, the `enable_parallel_dml` hint must be used together with the `parallel` hint to enable parallel DML. |
| parallel(N) | The degree of parallelism for loading data. This parameter is required and must be an integer greater than 1. |
| table_name | The name of the table to which the data is imported. You can specify any number of columns in the table. |
| PARTITION_OPTION | The name of the partition or subpartition into which data is imported by direct load. |
Example
The procedure for enabling incremental direct load by using the INSERT INTO SELECT statement is the same as that for enabling full direct load, except that the full option is replaced with the inc or inc_replace option.
**MySQL mode**

Example 1: A complete example of using INSERT INTO SELECT for incremental direct load.
Use direct load to import some data from tbl2 to tbl1.
1. Query table `tbl1` for data. The table is empty.

   ```sql
   obclient [test]> SELECT * FROM tbl1;
   Empty set
   ```

2. Query whether the `tbl2` table has data.

   ```sql
   obclient [test]> SELECT * FROM tbl2;
   ```

   Data is queried from the `tbl2` table:

   ```shell
   +------+------+------+
   | col1 | col2 | col3 |
   +------+------+------+
   |    1 | a1   |   11 |
   |    2 | a2   |   22 |
   |    3 | a3   |   33 |
   +------+------+------+
   3 rows in set
   ```

3. Use direct load to import data from table `tbl2` to table `tbl1`.

   - Specify hints in the `INSERT INTO SELECT` statement.

     - Import the data without specifying partitions.

       ```sql
       obclient [test]> INSERT /*+ DIRECT(true, 0, 'inc_replace') enable_parallel_dml parallel(16) */ INTO tbl1 SELECT t2.col1,t2.col3 FROM tbl2 t2;
       Query OK, 3 rows affected
       Records: 3  Duplicates: 0  Warnings: 0
       ```

     - (Optional) Specify partitions to import.

       ```sql
       obclient [test]> INSERT /*+ DIRECT(true, 0, 'inc_replace') enable_parallel_dml parallel(16) */ INTO tbl1 partition(p0, p1) SELECT t2.col1,1 FROM tbl2 partition(p0, p1) t2;
       Query OK, 3 rows affected
       Records: 3  Duplicates: 0  Warnings: 0

       obclient [test]> INSERT /*+ DIRECT(true, 0, 'inc_replace') enable_parallel_dml parallel(16) */ INTO tbl1 partition(p0, p1) SELECT t2.col1,1 FROM tbl2 partition(p0sp0_1, p1sp1_1) t2;
       Query OK, 3 rows affected
       Records: 3  Duplicates: 0  Warnings: 0
       ```

   - Do not specify hints in the `INSERT INTO SELECT` statement.

     Set the value of the `default_load_mode` parameter to `INC_DIRECT_WRITE` or `INC_REPLACE_DIRECT_WRITE`, and then run the statement without hints.

     ```sql
     obclient [test]> ALTER SYSTEM SET default_load_mode ='INC_DIRECT_WRITE';

     obclient [test]> INSERT INTO tbl1 SELECT t2.col1,t2.col3 FROM tbl2 t2;
     ```

4. Check whether data has been imported to the `tbl1` table.

   ```sql
   obclient [test]> SELECT * FROM tbl1;
   ```

   The query result is as follows:

   ```shell
   +------+------+
   | col1 | col2 |
   +------+------+
   |    1 |   11 |
   |    2 |   22 |
   |    3 |   33 |
   +------+------+
   3 rows in set
   ```

   The result shows that data has been imported into table `tbl1`.

5. (Optional) In the `Note` field of the result of the `EXPLAIN EXTENDED` statement, check whether the data is written through direct load.

   ```sql
   obclient [test]> EXPLAIN EXTENDED INSERT /*+ direct(true, 0, 'inc_replace') enable_parallel_dml parallel(16) */ INTO tbl1 SELECT t2.col1,t2.col3 FROM tbl2 t2;
   ```

   The return result is as follows:
+--------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Query Plan | +--------------------------------------------------------------------------------------------------------------------------------------------------------------+ | ============================================================================== | | |ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)| | | ------------------------------------------------------------------------------ | | |0 |PX COORDINATOR | |3 |27 | | | |1 |└─EXCHANGE OUT DISTR |:EX10001 |3 |27 | | | |2 | └─INSERT | |3 |26 | | | |3 | └─EXCHANGE IN DISTR | |3 |1 | | | |4 | └─EXCHANGE OUT DISTR (RANDOM)|:EX10000 |3 |1 | | | |5 | └─SUBPLAN SCAN |ANONYMOUS_VIEW1|3 |1 | | | |6 | └─PX BLOCK ITERATOR | |3 |1 | | | |7 | └─TABLE FULL SCAN |t2 |3 |1 | | | ============================================================================== | | Outputs & filters: | | ------------------------------------- | | 0 - output(nil), filter(nil), rowset=16 | | 1 - output(nil), filter(nil), rowset=16 | | dop=16 | | 2 - output(nil), filter(nil) | | columns([{tbl1: ({tbl1: (tbl1.__pk_increment(0x7efa518277d0), tbl1.col1(0x7efa518119c0), tbl1.col3(0x7efa51811e00))})}]), partitions(p0), | | column_values([T_HIDDEN_PK(0x7efa51827c10)], [column_conv(INT,PS:(11,0),NULL,ANONYMOUS_VIEW1.col1(0x7efa51826f50))(0x7efa51828030)], [column_conv(INT, | | PS:(11,0),NULL,ANONYMOUS_VIEW1.col3(0x7efa51827390))(0x7efa5182ff60)]) | | 3 - output([T_HIDDEN_PK(0x7efa51827c10)], [ANONYMOUS_VIEW1.col1(0x7efa51826f50)], [ANONYMOUS_VIEW1.col3(0x7efa51827390)]), filter(nil), rowset=16 | | 4 - output([T_HIDDEN_PK(0x7efa51827c10)], [ANONYMOUS_VIEW1.col1(0x7efa51826f50)], [ANONYMOUS_VIEW1.col3(0x7efa51827390)]), filter(nil), rowset=16 | | dop=16 | | 5 - output([ANONYMOUS_VIEW1.col1(0x7efa51826f50)], [ANONYMOUS_VIEW1.col3(0x7efa51827390)]), filter(nil), rowset=16 | | access([ANONYMOUS_VIEW1.col1(0x7efa51826f50)], [ANONYMOUS_VIEW1.col3(0x7efa51827390)]) | | 6 - output([t2.col1(0x7efa51825f10)], [t2.col3(0x7efa518267c0)]), filter(nil), rowset=16 | | 7 - output([t2.col1(0x7efa51825f10)], [t2.col3(0x7efa518267c0)]), filter(nil), rowset=16 | | access([t2.col1(0x7efa51825f10)], [t2.col3(0x7efa518267c0)]), partitions(p0) | | is_index_back=false, is_global_index=false, | | range_key([t2.__pk_increment(0x7efa51856410)]), range(MIN ; MAX)always true | | Used Hint: | | ------------------------------------- | | /*+ | | | | USE_PLAN_CACHE( NONE ) | | PARALLEL(16) | | ENABLE_PARALLEL_DML | | DIRECT(TRUE, 0, 'INC_REPLACE') | | */ | | Qb name trace: | | ------------------------------------- | | stmt_id:0, stmt_type:T_EXPLAIN | | stmt_id:1, INS$1 | | stmt_id:2, SEL$1 | | Outline Data: | | ------------------------------------- | | /*+ | | BEGIN_OUTLINE_DATA | | PARALLEL(@"SEL$1" "t2"@"SEL$1" 16) | | FULL(@"SEL$1" "t2"@"SEL$1") | | USE_PLAN_CACHE( NONE ) | | PARALLEL(16) | | ENABLE_PARALLEL_DML | | OPTIMIZER_FEATURES_ENABLE('4.3.3.0') | | DIRECT(TRUE, 0, 'INC_REPLACE') | | END_OUTLINE_DATA | | */ | | Optimization Info: | | ------------------------------------- | | t2: | | table_rows:3 | | physical_range_rows:3 | | logical_range_rows:3 | | index_back_rows:0 | | output_rows:3 | | table_dop:16 | | dop_method:Global DOP | | avaiable_index_name:[tbl2] | | stats info:[version=0, is_locked=0, is_expired=0] | | dynamic sampling level:0 | | estimation method:[DEFAULT, STORAGE] | | Plan Type: | | DISTRIBUTED | | Note: | | Degree of Parallelism is 16 
because of hint | | Direct-mode is enabled in insert into select | +--------------------------------------------------------------------------------------------------------------------------------------------------------------+ 77 rows in set (0.009 sec)
Example 2: Import data to specified partitions of a RANGE-HASH subpartitioned table.
The target table is doubly partitioned, with the first level partitioned by RANGE and the second level partitioned by HASH. The incremental direct load is specified for the first-level partitions.
1. Create a table named `tbl1` that uses two-level partitioning: the first level is RANGE-partitioned, and the second level is HASH-partitioned.

   ```sql
   obclient [test]> CREATE TABLE tbl1(col1 INT,col2 INT)
          PARTITION BY RANGE COLUMNS(col1)
          SUBPARTITION BY HASH(col2) SUBPARTITIONS 3
          (
            PARTITION p0 VALUES LESS THAN(10),
            PARTITION p1 VALUES LESS THAN(20)
          );
   ```

2. Use direct load to import the data from table `tbl2` to partitions `p0` and `p1` of table `tbl1`.

   ```sql
   obclient [test]> insert /*+ direct(true, 0, 'inc_replace') enable_parallel_dml parallel(3) append */ into tbl1 partition(p0,p1) select * from tbl2 where col1 <20;
   ```
**Oracle mode**

Example 1: A complete example of using INSERT INTO SELECT for incremental direct load.
Use direct load to import some data from table tbl4 to tbl3.
1. Query table `tbl3` for data. The table is empty.

   ```sql
   obclient [test]> SELECT * FROM tbl3;
   Empty set
   ```

2. Query whether the `tbl4` table has data.

   ```sql
   obclient [test]> SELECT * FROM tbl4;
   ```

   Data is found in the `tbl4` table:

   ```shell
   +------+------+------+
   | COL1 | COL2 | COL3 |
   +------+------+------+
   |    1 | a1   |   11 |
   |    2 | a2   |   22 |
   |    3 | a3   |   33 |
   +------+------+------+
   3 rows in set (0.000 sec)
   ```

3. Use direct load to import the data from table `tbl4` to table `tbl3`.

   - Specify hints in the `INSERT INTO SELECT` statement.

     - Import the data without specifying partitions.

       ```sql
       obclient [test]> INSERT /*+ direct(true, 0, 'inc_replace') enable_parallel_dml parallel(16) */ INTO tbl3 SELECT t2.col1, t2.col3 FROM tbl4 t2 WHERE ROWNUM <= 10000;
       Query OK, 3 rows affected
       Records: 3  Duplicates: 0  Warnings: 0
       ```

     - (Optional) Specify partitions to import.

       ```sql
       obclient [test]> INSERT /*+ DIRECT(true, 0, 'inc_replace') enable_parallel_dml parallel(16) */ INTO tbl3 partition(p0, p1) SELECT t2.col1,1 FROM tbl4 partition(p0, p1) t2;
       Query OK, 3 rows affected
       Records: 3  Duplicates: 0  Warnings: 0

       obclient [test]> INSERT /*+ DIRECT(true, 0, 'inc_replace') enable_parallel_dml parallel(16) */ INTO tbl3 partition(p0, p1) SELECT t2.col1,1 FROM tbl4 partition(p0sp0_1, p1sp1_1) t2;
       Query OK, 3 rows affected
       Records: 3  Duplicates: 0  Warnings: 0
       ```

   - Do not specify hints in the `INSERT INTO SELECT` statement.

     Set the value of the `default_load_mode` parameter to `INC_DIRECT_WRITE` or `INC_REPLACE_DIRECT_WRITE`, and then run the statement without hints.

     ```sql
     obclient [test]> ALTER SYSTEM SET default_load_mode ='INC_DIRECT_WRITE';

     obclient [test]> INSERT INTO tbl3 SELECT t2.col1, t2.col3 FROM tbl4 t2 WHERE ROWNUM <= 10000;
     ```

4. Check whether data has been imported to the `tbl3` table.

   ```sql
   obclient [test]> SELECT * FROM tbl3;
   ```

   The query result is as follows:

   ```shell
   +------+------+
   | col1 | col3 |
   +------+------+
   |    1 |   11 |
   |    2 |   22 |
   |    3 |   33 |
   +------+------+
   3 rows in set
   ```

   The result shows that data has been imported into table `tbl3`.

5. (Optional) In the
Notefield of the result set of theEXPLAIN EXTENDEDstatement, check whether the data is written through direct load.obclient [test]> EXPLAIN EXTENDED INSERT /*+ direct(true, 0, 'inc_replace') enable_parallel_dml parallel(16) */ INTO tbl3 SELECT t2.col1,t2.col3 FROM tbl4 t2 WHERE ROWNUM <= 10000; The return result is as follows: ```shell +--------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Query Plan | +--------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | ========================================================================================= | | |ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)| | | ----------------------------------------------------------------------------------------- | | |0 |PX COORDINATOR | |3 |34 | | | |1 |└─EXCHANGE OUT DISTR |:EX10002 |3 |33 | | | |2 | └─INSERT | |3 |32 | | | |3 | └─EXCHANGE IN DISTR | |3 |7 | | | |4 | └─EXCHANGE OUT DISTR (RANDOM) |:EX10001 |3 |7 | | | |5 | └─MATERIAL | |3 |3 | | | |6 | └─SUBPLAN SCAN |ANONYMOUS_VIEW1|3 |3 | | | |7 | └─LIMIT | |3 |3 | | | |8 | └─EXCHANGE IN DISTR | |3 |3 | | | |9 | └─EXCHANGE OUT DISTR |:EX10000 |3 |1 | | | |10| └─LIMIT | |3 |1 | | | |11| └─PX BLOCK ITERATOR | |3 |1 | | | |12| └─COLUMN TABLE FULL SCAN|T2 |3 |1 | | | ========================================================================================= | | Outputs & filters: | | ------------------------------------- | | 0 - output(nil), filter(nil), rowset=16 | | 1 - output(nil), filter(nil), rowset=16 | | dop=16 | | 2 - output(nil), filter(nil) | | columns([{TBL3: ({TBL3: (TBL3.__pk_increment(0x7efaad22b0e0), TBL3.COL1(0x7efaad2123f0), TBL3.COL3(0x7efaad212830))})}]), partitions(p0), | | column_values([T_HIDDEN_PK(0x7efaad22b520)], [column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.COL1(0x7efaad22a860))(0x7efaad22b930)], [column_conv(NUMBER, | | PS:(-1,0),NULL,ANONYMOUS_VIEW1.COL3(0x7efaad22aca0))(0x7efaad233850)]) | | 3 - output([T_HIDDEN_PK(0x7efaad22b520)], [ANONYMOUS_VIEW1.COL1(0x7efaad22a860)], [ANONYMOUS_VIEW1.COL3(0x7efaad22aca0)]), filter(nil), rowset=16 | | 4 - output([T_HIDDEN_PK(0x7efaad22b520)], [ANONYMOUS_VIEW1.COL1(0x7efaad22a860)], [ANONYMOUS_VIEW1.COL3(0x7efaad22aca0)]), filter(nil), rowset=16 | | is_single, dop=1 | | 5 - output([ANONYMOUS_VIEW1.COL1(0x7efaad22a860)], [ANONYMOUS_VIEW1.COL3(0x7efaad22aca0)]), filter(nil), rowset=16 | | 6 - output([ANONYMOUS_VIEW1.COL1(0x7efaad22a860)], [ANONYMOUS_VIEW1.COL3(0x7efaad22aca0)]), filter(nil), rowset=16 | | access([ANONYMOUS_VIEW1.COL1(0x7efaad22a860)], [ANONYMOUS_VIEW1.COL3(0x7efaad22aca0)]) | | 7 - output([T2.COL1(0x7efaad229470)], [T2.COL3(0x7efaad229f40)]), filter(nil), rowset=16 | | limit(cast(FLOOR(cast(10000, NUMBER(-1, -85))(0x7efaad227ef0))(0x7efaad25ac30), BIGINT(-1, 0))(0x7efaad25b6d0)), offset(nil) | | 8 - output([T2.COL1(0x7efaad229470)], [T2.COL3(0x7efaad229f40)]), filter(nil), rowset=16 | | 9 - output([T2.COL1(0x7efaad229470)], [T2.COL3(0x7efaad229f40)]), filter(nil), rowset=16 | | dop=16 | | 10 - output([T2.COL1(0x7efaad229470)], [T2.COL3(0x7efaad229f40)]), filter(nil), rowset=16 | | limit(cast(FLOOR(cast(10000, NUMBER(-1, -85))(0x7efaad227ef0))(0x7efaad25ac30), BIGINT(-1, 0))(0x7efaad25b6d0)), offset(nil) | | 11 - output([T2.COL1(0x7efaad229470)], [T2.COL3(0x7efaad229f40)]), filter(nil), rowset=16 | | 12 - output([T2.COL1(0x7efaad229470)], 
[T2.COL3(0x7efaad229f40)]), filter(nil), rowset=16 | | access([T2.COL1(0x7efaad229470)], [T2.COL3(0x7efaad229f40)]), partitions(p0) | | limit(cast(FLOOR(cast(10000, NUMBER(-1, -85))(0x7efaad227ef0))(0x7efaad25ac30), BIGINT(-1, 0))(0x7efaad25b6d0)), offset(nil), is_index_back=false, | | is_global_index=false, | | range_key([T2.__pk_increment(0x7efaad25a720)]), range(MIN ; MAX)always true | | Used Hint: | | ------------------------------------- | | /*+ | | | | USE_PLAN_CACHE( NONE ) | | PARALLEL(16) | | ENABLE_PARALLEL_DML | | DIRECT(TRUE, 0, 'INC_REPLACE') | | */ | | Qb name trace: | | ------------------------------------- | | stmt_id:0, stmt_type:T_EXPLAIN | | stmt_id:1, INS$1 | | stmt_id:2, SEL$1 | | stmt_id:3, parent:SEL$1 > SEL$658037CB > SEL$DCAFFB86 | | Outline Data: | | ------------------------------------- | | /*+ | | BEGIN_OUTLINE_DATA | | PARALLEL(@"SEL$DCAFFB86" "T2"@"SEL$1" 16) | | FULL(@"SEL$DCAFFB86" "T2"@"SEL$1") | | USE_COLUMN_TABLE(@"SEL$DCAFFB86" "T2"@"SEL$1") | | MERGE(@"SEL$658037CB" < "SEL$1") | | USE_PLAN_CACHE( NONE ) | | PARALLEL(16) | | ENABLE_PARALLEL_DML | | OPTIMIZER_FEATURES_ENABLE('4.3.3.0') | | DIRECT(TRUE, 0, 'INC_REPLACE') | | END_OUTLINE_DATA | | */ | | Optimization Info: | | ------------------------------------- | | T2: | | table_rows:3 | | physical_range_rows:3 | | logical_range_rows:3 | | index_back_rows:0 | | output_rows:3 | | table_dop:16 | | dop_method:Global DOP | | avaiable_index_name:[TBL4] | | stats info:[version=0, is_locked=0, is_expired=0] | | dynamic sampling level:0 | | estimation method:[DEFAULT, STORAGE] | | Plan Type: | | DISTRIBUTED | | Note: | | Degree of Parallelism is 16 because of hint | | Direct-mode is enabled in insert into select | | Expr Constraints: | | cast(FLOOR(cast(10000, NUMBER(-1, -85))), BIGINT(-1, 0)) >= 1 result is TRUE | +--------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 96 rows in set (0.007 sec)
Example 2: Import data to specified partitions of a RANGE-HASH subpartitioned table.
The target table is doubly partitioned, with the first level partitioned by RANGE and the second level partitioned by HASH. The incremental direct load is specified for the first-level partitions.
1. Create a table named `tbl1_1` that uses two-level partitioning: the first level is RANGE-partitioned, and the second level is HASH-partitioned.

   ```sql
   obclient [test]> CREATE TABLE tbl1_1 (
       col1 INT,
       col2 INT
   )
   PARTITION BY RANGE (col1)
   SUBPARTITION BY HASH (col2)
   SUBPARTITION TEMPLATE
   (
     SUBPARTITION sp1,
     SUBPARTITION sp2,
     SUBPARTITION sp3
   )
   (
     PARTITION p0 VALUES LESS THAN (10),
     PARTITION p1 VALUES LESS THAN (20)
   );
   ```

2. Use direct load to import the data from table `tbl2` into the `p0` and `p1` partitions of table `tbl1_1`.

   ```sql
   obclient [test]> insert /*+ direct(true, 0, 'inc_replace') enable_parallel_dml parallel(3) append */ into tbl1_1 partition(p0,p1) select * from tbl2 where col1 <20;
   ```
Import data by using the CREATE TABLE AS SELECT statement
The CREATE TABLE AS SELECT statement specifies the direct load mode for the data import by using the DIRECT() and PARALLEL() hints. If no hint is specified, the data import mode is determined based on the default_load_mode parameter.
Syntax
CREATE /*+ [DIRECT(need_sort,max_error,{'inc'|'inc_replace'})] parallel(N) */ TABLE table_name [AS] select_sentence
For more information about the CREATE TABLE AS SELECT syntax, see CREATE TABLE (MySQL-compatible mode) and CREATE TABLE (Oracle-compatible mode).
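As a quick reference, the following minimal sketch instantiates the syntax above; the target table name `tbl1_copy` and the WHERE filter are illustrative placeholders.

```sql
-- Sketch: create a new table and populate it through incremental direct load in one step.
-- tbl1_copy and the WHERE filter are placeholders.
CREATE /*+ DIRECT(true, 0, 'inc') parallel(4) */ TABLE tbl1_copy
    AS SELECT * FROM tbl1 WHERE c1 > 0;
```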
Parameter description:
| Parameter | Description |
|---|---|
| DIRECT() | Enables direct load. The hint takes three parameters: `need_sort`, `max_error`, and the load mode, which is `'inc'` or `'inc_replace'` for incremental direct load. |
| parallel(N) | The degree of parallelism for loading data. This parameter is required and must be an integer greater than 1. |
Examples
**MySQL mode**

Use direct load to import data from table tbl1 to other tables.
1. Create a table named `tbl1`.

   ```sql
   obclient [test]> CREATE TABLE tbl1(c1 int);
   ```

2. Insert data into the `tbl1` table.

   ```sql
   obclient [test]> INSERT INTO tbl1 VALUES (1),(2),(3);
   ```

3. Query the data in the `tbl1` table.

   ```sql
   obclient [test]> SELECT * FROM tbl1;
   ```

   The query result is as follows:

   ```shell
   +------+
   | c1   |
   +------+
   |    1 |
   |    2 |
   |    3 |
   +------+
   3 rows in set (0.027 sec)
   ```

4. Use direct load to import data from table `tbl1` to table `tbl2`.

   - Specify hints in the `CREATE TABLE AS SELECT` statement.

     - Use the `inc` mode to perform direct load.

       ```sql
       obclient [test]> CREATE /*+ direct(true, 0, 'inc') parallel(4) */ TABLE tbl2 AS SELECT * FROM tbl1;
       ```

     - Use the `inc_replace` mode to perform direct load.

       ```sql
       obclient [test]> CREATE /*+ direct(true, 0, 'inc_replace') parallel(4) */ TABLE tbl2 AS SELECT * FROM tbl1;
       ```

   - Do not specify hints in the `CREATE TABLE AS SELECT` statement.

     Set the value of the `default_load_mode` parameter to `INC_DIRECT_WRITE` or `INC_REPLACE_DIRECT_WRITE`, and then run the statement without hints.

     ```sql
     obclient [test]> ALTER SYSTEM SET default_load_mode ='INC_DIRECT_WRITE';

     obclient [test]> CREATE TABLE tbl2 AS SELECT * FROM tbl1;
     ```

5. Check whether data has been imported to the `tbl2` table.

   ```sql
   obclient [test]> SELECT * FROM tbl2;
   ```

   The query result is as follows:

   ```shell
   +------+
   | c1   |
   +------+
   |    1 |
   |    3 |
   |    2 |
   +------+
   3 rows in set (0.050 sec)
   ```

   The result shows that data has been imported into table `tbl2`.
**Oracle mode**

Use direct load to import data from table tbl1 to other tables.
1. Create a table named `tbl1`.

   ```sql
   obclient [SYS]> CREATE TABLE tbl1(c1 int);
   ```

2. Insert data into the `tbl1` table.

   ```sql
   obclient [SYS]> INSERT INTO tbl1 VALUES (1),(2),(3);
   ```

3. Query the data in the `tbl1` table.

   ```sql
   obclient [SYS]> SELECT * FROM tbl1;
   ```

   The query result is as follows:

   ```shell
   +------+
   | C1   |
   +------+
   |    1 |
   |    2 |
   |    3 |
   +------+
   3 rows in set (0.045 sec)
   ```

4. Use direct load to import data from table `tbl1` to table `tbl2`.

   - Specify hints in the `CREATE TABLE AS SELECT` statement.

     - Use the `inc` mode to perform direct load.

       ```sql
       obclient [SYS]> CREATE /*+ direct(true, 0, 'inc') parallel(4) */ TABLE tbl2 AS SELECT * FROM tbl1;
       ```

     - Use the `inc_replace` mode to perform direct load.

       ```sql
       obclient [SYS]> CREATE /*+ direct(true, 0, 'inc_replace') parallel(4) */ TABLE tbl2 AS SELECT * FROM tbl1;
       ```

   - Do not specify hints in the `CREATE TABLE AS SELECT` statement.

     Set the value of the `default_load_mode` parameter to `INC_DIRECT_WRITE` or `INC_REPLACE_DIRECT_WRITE`, and then run the statement without hints.

     ```sql
     obclient [SYS]> ALTER SYSTEM SET default_load_mode ='INC_DIRECT_WRITE';

     obclient [SYS]> CREATE TABLE tbl2 AS SELECT * FROM tbl1;
     ```

5. Verify whether data has been imported to the `tbl2` table.

   ```sql
   obclient [SYS]> SELECT * FROM tbl2;
   ```

   The query result is as follows:

   ```shell
   +------+
   | C1   |
   +------+
   |    3 |
   |    2 |
   |    1 |
   +------+
   3 rows in set (0.013 sec)
   ```

   The result shows that data has been imported into table `tbl2`.