This topic describes how to perform a full direct load by using the LOAD DATA, INSERT INTO SELECT, and CREATE TABLE AS SELECT statements.
Considerations
When performing a full direct load, note the following:
- OceanBase Database supports direct load for LOB data types starting from V4.3.0.
- During a direct load, a table lock is applied, making the table read-only and preventing other data from being written to it.
- Full direct load is suitable for scenarios such as the initial import of large tables, data migration at the scale of 10 GB to the TB level, and environments with limited CPU and memory resources, because the direct load execution path is shorter and saves CPU overhead.
- The LOAD DATA statement supports execution within multi-row transactions and proactively commits any previous transaction during execution.
- When you use the INSERT INTO SELECT statement for direct load, only parallel data manipulation language (PDML) is supported. Non-PDML statements cannot use direct load.
- When you perform partition-specific direct load by using the LOAD DATA or INSERT INTO SELECT statement, the target table cannot be a replicated table, and it cannot contain auto-increment columns, identity columns, or global indexes.
- Starting from V4.3.5 BP1, the following changes apply to full direct load:
  - If the target table's last-level partition is a hash/key partition, partition-level direct load is supported.
  - When the session-level variable foreign_key_checks is set to False, the direct load operation does not perform foreign key constraint checks (see the sketch after this list).
- For LOAD DATA statements, full direct load on tables with unique indexes does not currently support the REPLACE or IGNORE keywords when duplicate unique index keys are encountered, nor does it support setting error-tolerant rows.
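To illustrate the foreign key behavior introduced in V4.3.5 BP1, the following is a minimal sketch in MySQL mode; the table name and file path are placeholders.

```sql
-- Hedged sketch (placeholders: tbl_child, /home/admin/tbl_child.csv).
-- On V4.3.5 BP1 or later, disabling foreign_key_checks for the session lets the
-- full direct load skip foreign key constraint checks.
SET SESSION foreign_key_checks = 0;

LOAD DATA /*+ direct(true, 0, 'full') parallel(8) */
  INFILE '/home/admin/tbl_child.csv'
  INTO TABLE tbl_child
  FIELDS TERMINATED BY ',';

SET SESSION foreign_key_checks = 1;  -- restore the default behavior
```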
Use the LOAD DATA statement for direct load
The LOAD DATA statement performs direct load by using the APPEND or DIRECT() hint. If no hint is specified, the import behavior is determined by the default_load_mode parameter, which you can check as shown below.
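The following is a minimal sketch of checking and, if needed, changing the default load behavior; it assumes a MySQL-mode connection with privileges to query and modify cluster parameters.

```sql
-- Check how LOAD DATA behaves when no direct load hint is given.
SHOW PARAMETERS LIKE 'default_load_mode';

-- Optionally make hint-less LOAD DATA use full direct load (cluster-level change).
ALTER SYSTEM SET default_load_mode = 'FULL_DIRECT_WRITE';
```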
Notice
OceanBase Database optimizes the data import speed of the LOAD DATA statement through parallel processing. This operation divides the data into multiple subtasks that are executed in parallel, with each subtask treated as an independent transaction. The execution order is not fixed. As a result, for tables without primary keys, the data write order may differ from the order in the original file.
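If a deterministic row order matters when you verify an import into a table without a primary key, sort explicitly in the verification query. A minimal sketch, with tbl2 and col1 as placeholder names:

```sql
-- Rows may arrive in any order after a parallel direct load,
-- so add an ORDER BY when a stable result order is needed.
SELECT * FROM tbl2 ORDER BY col1;
```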
Limitations
- Two write operations cannot be executed on the table at the same time during the import, because the import applies a table lock and only read operations are allowed until the import completes.
- Triggers are not supported during the import process.
- Tables containing generated columns are not supported. Note that some indexes may create hidden generated columns, such as KEY idx_c2 (c2(16)) GLOBAL.
- Importing data with a single row exceeding 2 MB is not supported.
- Liboblog and flashback queries are not supported.
Syntax
```sql
LOAD DATA /*+ [APPEND | DIRECT(need_sort,max_error,'full')] parallel(N) */ [REMOTE_OSS | LOCAL] INFILE 'file_name' INTO TABLE table_name [PARTITION(PARTITION_OPTION)] [COMPRESSION]...
```
For more information about the LOAD DATA syntax, see LOAD DATA.
Parameter description:
| Parameter | Description |
|---|---|
| APPEND \| DIRECT() | Enables the direct load feature by using a hint. |
| parallel(N) | Required. The degree of parallelism for loading data. The default value is 4. |
| REMOTE_OSS \| LOCAL | Optional. Specifies where the input file resides: REMOTE_OSS indicates a file stored in OSS, and LOCAL indicates a file on the client. If neither is specified, the file must be stored on the local storage of an OBServer node. A sketch of the LOCAL form follows this table. |
| file_name | The path and name of the input file. The format of this parameter corresponds to the REMOTE_OSS \| LOCAL option that is specified. Note: additional requirements apply when you import files from OSS. |
| table_name | The name of the table to which data is imported. You can specify any number of columns in the table. |
| PARTITION_OPTION | Specifies the partitions for direct load. You can specify partition names or subpartition names. |
| COMPRESSION | Specifies the compression format of the file. |
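As a complement to the server-local examples that follow, here is a hedged sketch of loading a client-local file with the LOCAL keyword; it assumes the client session permits local file loading, and the path and table name are placeholders.

```sql
-- Hedged sketch: full direct load of a file that resides on the client machine.
-- Assumes LOCAL INFILE is allowed by the client; '/path/on/client/tbl1.csv' is a placeholder.
LOAD DATA /*+ direct(true, 0, 'full') parallel(16) */
  LOCAL INFILE '/path/on/client/tbl1.csv'
  INTO TABLE tbl2
  FIELDS TERMINATED BY ',';
```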
Example
Note
The following example shows how to import data from a server file. The LOAD DATA statement of OceanBase Database also supports importing data by loading local files (LOCAL INFILE). For more information about LOAD DATA LOCAL INFILE, see Import data by using the LOAD DATA statement.
MySQL mode:

1. Log in to the machine where the OBServer node is located and create test data in the /home/admin directory.

   Note: The LOAD DATA statement in OceanBase Database only supports loading input files that are stored on the local storage of an OBServer node. Therefore, you need to copy the file to an OBServer node before importing it.

   1. Log in to the server where the OBServer node resides.

      ```shell
      [xxx@xxx /home/admin]# ssh admin@10.10.10.1
      ```

   2. Create the test data file tbl1.csv.

      ```shell
      [admin@xxx /home/admin]# vi tbl1.csv
      1,11
      2,22
      3,33
      ```
2. Set the path of the imported file.

   Set the system variable secure_file_priv to configure the path that can be accessed when you import or export files.

   Notice: For security reasons, when you set the system variable secure_file_priv, you can connect to the database only through a local socket to execute the SQL statement that modifies the global variable. For more information, see secure_file_priv.

   1. Log in to the server where the OBServer node to connect to resides.

      ```shell
      [xxx@xxx /home/admin]# ssh admin@10.10.10.1
      ```

   2. Execute the following command to connect to the mysql001 tenant through a local Unix socket.

      ```shell
      obclient -S /home/admin/oceanbase/run/sql.sock -uroot@mysql001 -p******
      ```

   3. Set the import path to /home/admin.

      ```sql
      obclient [(none)]> SET GLOBAL secure_file_priv = "/home/admin";
      Query OK, 0 rows affected
      ```
3. After reconnecting to the database, use the LOAD DATA statement with a direct load hint to import data.

   1. Create a table named tbl2.

      ```sql
      obclient [test]> CREATE TABLE tbl2 (
          col1 INT PRIMARY KEY,
          col2 INT
      )
      PARTITION BY RANGE (col1)
      SUBPARTITION BY RANGE (col1)
      (
          PARTITION p0 VALUES LESS THAN (100)
          (
              SUBPARTITION p0_1 VALUES LESS THAN (50),
              SUBPARTITION p0_2 VALUES LESS THAN (100)
          ),
          PARTITION p1 VALUES LESS THAN (200)
          (
              SUBPARTITION p1_1 VALUES LESS THAN (150),
              SUBPARTITION p1_2 VALUES LESS THAN (200)
          )
      );
      Query OK, 0 rows affected
      ```

   2. Query table tbl2 for data. The table is empty.

      ```sql
      obclient [test]> SELECT * FROM tbl2;
      Empty set
      ```

   3. Use direct load to import the data from the tbl1.csv file to the tbl2 table.

      - Import data to all columns of the tbl2 table.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'full') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 FIELDS TERMINATED BY ',';
        Query OK, 3 rows affected
        Records: 3  Deleted: 0  Skipped: 0  Warnings: 0
        ```

      - Specify the columns of the tbl2 table into which data is imported. For example, specify col1 and col2.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'full') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 FIELDS TERMINATED BY ','(col1,col2);
        Query OK, 3 rows affected
        Records: 3  Deleted: 0  Skipped: 0  Warnings: 0
        ```

      - Import data to the specified partitions of the tbl2 table.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'full') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 partition(p0, p1) FIELDS TERMINATED BY ',';
        ```

      - Import data to the specified subpartitions of the tbl2 table.

        ```sql
        obclient [test]> LOAD DATA /*+ direct(true,1024,'full') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 partition(p0sp0_1, p1sp1_1) FIELDS TERMINATED BY ',';
        ```

      - Use the default_load_mode parameter instead of a hint: set default_load_mode to FULL_DIRECT_WRITE, and then run the LOAD DATA statement without hints.

        ```sql
        obclient [test]> ALTER SYSTEM SET default_load_mode ='FULL_DIRECT_WRITE';
        obclient [test]> LOAD DATA INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 FIELDS TERMINATED BY ',';
        ```
4. Verify whether data has been imported to the tbl2 table.

   ```sql
   obclient [test]> SELECT * FROM tbl2;
   ```

   The query result is as follows:

   ```sql
   +------+------+
   | col1 | col2 |
   +------+------+
   |    1 |   11 |
   |    2 |   22 |
   |    3 |   33 |
   +------+------+
   3 rows in set
   ```

   The result shows that data has been imported into table tbl2.
Oracle mode:

1. Log in to the server where the OBServer node to be connected is located and create the test data file tbl1.csv in the /home/admin directory.

   Note: The LOAD DATA statement in OceanBase Database only supports loading input files that are stored on the local disk of an OBServer node. Therefore, you need to copy the file to an OBServer node before importing it.

   1. Log in to the server where the OBServer node resides.

      ```shell
      [xxx@xxx /home/admin]# ssh admin@10.10.10.1
      ```

   2. Create the test data file tbl1.csv.

      ```shell
      [admin@xxx /home/admin]# vi tbl1.csv
      1,11
      2,22
      3,33
      ```
2. Set the path of the imported file.

   Set the system variable secure_file_priv to specify the path that can be accessed when you import or export files.

   Notice: For security reasons, when you set the system variable secure_file_priv, you can connect to the database only through a local socket to execute the SQL statement that modifies the global variable. For more information, see secure_file_priv.

   1. Log in to the server where the OBServer node to connect to resides.

      ```shell
      [xxx@xxx /home/admin]# ssh admin@10.10.10.1
      ```

   2. Execute the following command to connect to the oracle001 tenant through a local Unix socket.

      ```shell
      obclient -S /home/admin/oceanbase/run/sql.sock -usys@oracle001 -p******
      ```

   3. Set the import path to /home/admin.

      ```sql
      obclient [(none)]> SET GLOBAL secure_file_priv = "/home/admin";
      Query OK, 0 rows affected
      ```
3. After reconnecting to the database, use the LOAD DATA statement with a direct load hint to import data.

   1. Create a table named tbl2.

      ```sql
      obclient [SYS]> CREATE TABLE tbl2 (
          col1 INT PRIMARY KEY,
          col2 INT
      )
      PARTITION BY RANGE (col1)
      SUBPARTITION BY RANGE (col1)
      (
          PARTITION p0 VALUES LESS THAN (100)
          (
              SUBPARTITION p0_1 VALUES LESS THAN (50),
              SUBPARTITION p0_2 VALUES LESS THAN (100)
          ),
          PARTITION p1 VALUES LESS THAN (200)
          (
              SUBPARTITION p1_1 VALUES LESS THAN (150),
              SUBPARTITION p1_2 VALUES LESS THAN (200)
          )
      );
      Query OK, 0 rows affected
      ```

   2. Query table tbl2 for data. The table is empty.

      ```sql
      obclient [SYS]> SELECT * FROM tbl2;
      Empty set
      ```

   3. Use direct load to import the data from the tbl1.csv file to the tbl2 table.

      - Import data to all columns of table tbl2.

        ```sql
        obclient [SYS]> LOAD DATA /*+ direct(true,1024,'full') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 FIELDS TERMINATED BY ',';
        Query OK, 3 rows affected
        Records: 3  Deleted: 0  Skipped: 0  Warnings: 0
        ```

      - Specify the columns of the tbl2 table into which data is imported. For example, specify col1 and col2.

        ```sql
        obclient [SYS]> LOAD DATA /*+ direct(true,1024,'full') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 FIELDS TERMINATED BY ','(col1,col2);
        Query OK, 3 rows affected
        Records: 3  Deleted: 0  Skipped: 0  Warnings: 0
        ```

      - (Optional) Import data to the specified partitions of the tbl2 table.

        ```sql
        obclient [SYS]> LOAD DATA /*+ direct(true,1024,'full') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 partition(p0, p1) FIELDS TERMINATED BY ',';
        ```

      - (Optional) Import data to the specified subpartitions of the tbl2 table.

        ```sql
        obclient [SYS]> LOAD DATA /*+ direct(true,1024,'full') parallel(16) */ INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 partition(p0sp0_1, p1sp1_1) FIELDS TERMINATED BY ',';
        ```

      - Use the default_load_mode parameter instead of a hint: set default_load_mode to FULL_DIRECT_WRITE, and then run the LOAD DATA statement without hints.

        ```sql
        obclient [SYS]> ALTER SYSTEM SET default_load_mode ='FULL_DIRECT_WRITE';
        obclient [SYS]> LOAD DATA INFILE '/home/admin/tbl1.csv' INTO TABLE tbl2 FIELDS TERMINATED BY ',';
        ```
4. Check whether data has been imported to the tbl2 table.

   ```sql
   obclient [SYS]> SELECT * FROM tbl2;
   ```

   The query result is as follows:

   ```sql
   +------+------+
   | col1 | col2 |
   +------+------+
   |    1 |   11 |
   |    2 |   22 |
   |    3 |   33 |
   +------+------+
   3 rows in set
   ```

   The result shows that data has been imported into table tbl2.
Use the INSERT INTO SELECT statement for direct load
The INSERT INTO SELECT statement performs direct load by using the APPEND or DIRECT() hint together with the enable_parallel_dml hint. If no hint is specified, the import behavior is determined by the default_load_mode parameter.
Limitations
- Only PDML is supported. Non-PDML statements cannot use direct load. For more details about parallel DML, see Parallel DML.
- Two write operations, such as writes to the same table, cannot be executed at the same time during the import, because a table lock is applied at the start of the import and only read operations are permitted until it completes.
- Direct load is classified as a DDL operation and cannot be executed within a multi-statement transaction, that is, a transaction containing multiple operations.
  - It cannot be executed within a BEGIN block, and autocommit must be set to 1 (see the sketch after this list).
- Not supported for use in triggers.
- Tables with generated columns are not supported. Note that some indexes may create hidden generated columns, such as KEY idx_c2 (c2(16)) GLOBAL.
- Liboblog and flashback queries are not supported.
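As referenced in the limitations above, the following is a minimal sketch of satisfying the autocommit requirement before a direct load INSERT; tbl_src and tbl_dst are placeholder tables.

```sql
-- Minimal sketch (placeholders: tbl_src, tbl_dst).
-- Direct load behaves like DDL, so it must not run inside an open multi-statement
-- transaction; keep autocommit on and do not wrap the statement in a BEGIN block.
SET autocommit = 1;

INSERT /*+ DIRECT(true, 0, 'full') enable_parallel_dml parallel(8) */
  INTO tbl_dst
  SELECT * FROM tbl_src;
```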
Syntax
```sql
INSERT /*+ [APPEND | DIRECT(need_sort,max_error,'full')] enable_parallel_dml parallel(N) */ INTO table_name [PARTITION(PARTITION_OPTION)] select_sentence
```
For more information about the INSERT INTO syntax, see INSERT (MySQL mode) and INSERT (Oracle mode).
Parameter description:
| Parameter | Description |
|---|---|
| APPEND \| DIRECT() | Uses a hint to enable direct load. |
| enable_parallel_dml | Enables parallel DML. Note: Generally, the enable_parallel_dml hint takes effect only when it is used together with the parallel(N) hint. |
| parallel(N) | The degree of parallelism for loading data. This parameter is required and must be an integer greater than 1. |
| table_name | The name of the table to which the data is imported. You can specify any number of columns in the table. |
| PARTITION_OPTION | Specifies the partitions for direct load. You can specify partition names or subpartition names. |
Example
MySQL mode:

Example 1: Use the INSERT INTO SELECT statement to perform a full direct load.
Use direct load to import part of the data from tbl2 to tbl1.
1. Query table tbl1 for data. The table is empty.

   ```sql
   obclient [test]> SELECT * FROM tbl1;
   Empty set
   ```

2. Query whether the tbl2 table has data.

   ```sql
   obclient [test]> SELECT * FROM tbl2;
   ```

   The tbl2 table contains the following data:

   ```sql
   +------+------+------+
   | col1 | col2 | col3 |
   +------+------+------+
   |    1 | a1   |   11 |
   |    2 | a2   |   22 |
   |    3 | a3   |   33 |
   +------+------+------+
   3 rows in set
   ```
3. Use direct load to import the data from table tbl2 to table tbl1.

   - Specify a hint in the INSERT INTO SELECT statement.

     - Import the data without specifying partitions.

       ```sql
       obclient [test]> INSERT /*+ DIRECT(true, 0, 'full') enable_parallel_dml parallel(16) */ INTO tbl1 SELECT t2.col1,t2.col3 FROM tbl2 t2;
       Query OK, 3 rows affected
       Records: 3  Duplicates: 0  Warnings: 0
       ```

     - (Optional) Specify the partitions to import.

       ```sql
       obclient [test]> INSERT /*+ DIRECT(true, 0, 'full') enable_parallel_dml parallel(16) */ INTO tbl1 partition(p0, p1) SELECT t2.col1,1 FROM tbl2 partition(p0, p1) t2;
       Query OK, 3 rows affected
       Records: 3  Duplicates: 0  Warnings: 0

       obclient [test]> INSERT /*+ DIRECT(true, 0, 'full') enable_parallel_dml parallel(16) */ INTO tbl1 partition(p0, p1) SELECT t2.col1,1 FROM tbl2 partition(p0sp0_1, p1sp1_1) t2;
       Query OK, 3 rows affected
       Records: 3  Duplicates: 0  Warnings: 0
       ```

   - Do not specify hints in the INSERT INTO SELECT statement: set the default_load_mode parameter to FULL_DIRECT_WRITE, and then run the statement without hints.

     ```sql
     obclient [test]> ALTER SYSTEM SET default_load_mode ='FULL_DIRECT_WRITE';
     obclient [test]> INSERT INTO tbl1 SELECT t2.col1,t2.col3 FROM tbl2 t2;
     ```
4. Check whether data has been imported to the tbl1 table.

   ```sql
   obclient [test]> SELECT * FROM tbl1;
   ```

   The query result is as follows:

   ```sql
   +------+------+
   | col1 | col2 |
   +------+------+
   |    1 |   11 |
   |    2 |   22 |
   |    3 |   33 |
   +------+------+
   3 rows in set
   ```

   The result shows that data has been imported into table tbl1.

5. (Optional) Check the Note section in the result returned by the EXPLAIN EXTENDED statement to see whether the data was written by using direct load.

   ```sql
   obclient [test]> EXPLAIN EXTENDED INSERT /*+ direct(true, 0, 'full') enable_parallel_dml parallel(16) */ INTO tbl1 SELECT t2.col1,t2.col3 FROM tbl2 t2;
   ```

   The return result is as follows:
+--------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Query Plan | +--------------------------------------------------------------------------------------------------------------------------------------------------------------+ | ============================================================================== | | |ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)| | | ------------------------------------------------------------------------------ | | |0 |PX COORDINATOR | |3 |27 | | | |1 |└─EXCHANGE OUT DISTR |:EX10001 |3 |27 | | | |2 | └─INSERT | |3 |26 | | | |3 | └─EXCHANGE IN DISTR | |3 |1 | | | |4 | └─EXCHANGE OUT DISTR (RANDOM)|:EX10000 |3 |1 | | | |5 | └─SUBPLAN SCAN |ANONYMOUS_VIEW1|3 |1 | | | |6 | └─PX BLOCK ITERATOR | |3 |1 | | | |7 | └─TABLE FULL SCAN |t2 |3 |1 | | | ============================================================================== | | Outputs & filters: | | ------------------------------------- | | 0 - output(nil), filter(nil), rowset=16 | | 1 - output(nil), filter(nil), rowset=16 | | dop=16 | | 2 - output(nil), filter(nil) | | columns([{tbl1: ({tbl1: (tbl1.__pk_increment(0x7efa63627790), tbl1.col1(0x7efa63611980), tbl1.col3(0x7efa63611dc0))})}]), partitions(p0), | | column_values([T_HIDDEN_PK(0x7efa63627bd0)], [column_conv(INT,PS:(11,0),NULL,ANONYMOUS_VIEW1.col1(0x7efa63626f10))(0x7efa63627ff0)], [column_conv(INT, | | PS:(11,0),NULL,ANONYMOUS_VIEW1.col3(0x7efa63627350))(0x7efa6362ff20)]) | | 3 - output([T_HIDDEN_PK(0x7efa63627bd0)], [ANONYMOUS_VIEW1.col1(0x7efa63626f10)], [ANONYMOUS_VIEW1.col3(0x7efa63627350)]), filter(nil), rowset=16 | | 4 - output([T_HIDDEN_PK(0x7efa63627bd0)], [ANONYMOUS_VIEW1.col1(0x7efa63626f10)], [ANONYMOUS_VIEW1.col3(0x7efa63627350)]), filter(nil), rowset=16 | | dop=16 | | 5 - output([ANONYMOUS_VIEW1.col1(0x7efa63626f10)], [ANONYMOUS_VIEW1.col3(0x7efa63627350)]), filter(nil), rowset=16 | | access([ANONYMOUS_VIEW1.col1(0x7efa63626f10)], [ANONYMOUS_VIEW1.col3(0x7efa63627350)]) | | 6 - output([t2.col1(0x7efa63625ed0)], [t2.col3(0x7efa63626780)]), filter(nil), rowset=16 | | 7 - output([t2.col1(0x7efa63625ed0)], [t2.col3(0x7efa63626780)]), filter(nil), rowset=16 | | access([t2.col1(0x7efa63625ed0)], [t2.col3(0x7efa63626780)]), partitions(p0) | | is_index_back=false, is_global_index=false, | | range_key([t2.__pk_increment(0x7efa63656410)]), range(MIN ; MAX)always true | | Used Hint: | | ------------------------------------- | | /*+ | | | | USE_PLAN_CACHE( NONE ) | | PARALLEL(16) | | ENABLE_PARALLEL_DML | | DIRECT(TRUE, 0, 'FULL') | | */ | | Qb name trace: | | ------------------------------------- | | stmt_id:0, stmt_type:T_EXPLAIN | | stmt_id:1, INS$1 | | stmt_id:2, SEL$1 | | Outline Data: | | ------------------------------------- | | /*+ | | BEGIN_OUTLINE_DATA | | PARALLEL(@"SEL$1" "t2"@"SEL$1" 16) | | FULL(@"SEL$1" "t2"@"SEL$1") | | USE_PLAN_CACHE( NONE ) | | PARALLEL(16) | | ENABLE_PARALLEL_DML | | OPTIMIZER_FEATURES_ENABLE('4.3.3.0') | | DIRECT(TRUE, 0, 'FULL') | | END_OUTLINE_DATA | | */ | | Optimization Info: | | ------------------------------------- | | t2: | | table_rows:3 | | physical_range_rows:3 | | logical_range_rows:3 | | index_back_rows:0 | | output_rows:3 | | table_dop:16 | | dop_method:Global DOP | | avaiable_index_name:[tbl2] | | stats info:[version=0, is_locked=0, is_expired=0] | | dynamic sampling level:0 | | estimation method:[DEFAULT, STORAGE] | | Plan Type: | | DISTRIBUTED | | Note: | | Degree of Parallelism is 16 because of hint | 
| Direct-mode is enabled in insert into select | +--------------------------------------------------------------------------------------------------------------------------------------------------------------+ 77 rows in set (0.009 sec)
Example 2: For a RANGE-HASH subpartitioned table, specify level-one partitions for direct load.

The target table is a subpartitioned table whose level-one partitions are RANGE partitions and whose subpartitions are HASH partitions. Specify the level-one partitions for full direct load.

1. Create a table named tbl1 with two partitions. The table is partitioned by RANGE and subpartitioned by HASH.

   ```sql
   obclient [test]> CREATE TABLE tbl1(col1 INT,col2 INT)
       PARTITION BY RANGE COLUMNS(col1)
       SUBPARTITION BY HASH(col2) SUBPARTITIONS 3
       (
           PARTITION p0 VALUES LESS THAN(10),
           PARTITION p1 VALUES LESS THAN(20)
       );
   ```

2. Use direct load to import the data from table tbl2 to partitions p0 and p1 of table tbl1.

   ```sql
   obclient [test]> insert /*+ direct(true, 0, 'full') enable_parallel_dml parallel(3) append */ into tbl1 partition(p0,p1) select * from tbl2 where col1 < 20;
   ```
Oracle mode:

Example 1: A complete example of using INSERT INTO SELECT for full direct load.
Use direct load to import part of the data from table tbl4 to tbl3.
1. Check whether table tbl3 contains any data. At this point, the table is empty.

   ```sql
   obclient [test]> SELECT * FROM tbl3;
   Empty set
   ```

2. Check whether table tbl4 contains any data.

   ```sql
   obclient [test]> SELECT * FROM tbl4;
   ```

   The query shows that table tbl4 contains data:

   ```sql
   +------+------+------+
   | COL1 | COL2 | COL3 |
   +------+------+------+
   |    1 | a1   |   11 |
   |    2 | a2   |   22 |
   |    3 | a3   |   33 |
   +------+------+------+
   3 rows in set (0.000 sec)
   ```
3. Use direct load to import the data from table tbl4 to table tbl3.

   - Specify a hint in the INSERT INTO SELECT statement.

     - Do not specify partitions for the import.

       ```sql
       obclient [test]> INSERT /*+ direct(true, 0, 'full') enable_parallel_dml parallel(16) */ INTO tbl3 SELECT t2.col1, t2.col3 FROM tbl4 t2 WHERE ROWNUM <= 10000;
       Query OK, 3 rows affected
       Records: 3  Duplicates: 0  Warnings: 0
       ```

     - (Optional) Specify the partitions to import.

       ```sql
       obclient [test]> INSERT /*+ DIRECT(true, 0, 'full') enable_parallel_dml parallel(16) */ INTO tbl3 partition(p0, p1) SELECT t2.col1,1 FROM tbl4 partition(p0, p1) t2;
       Query OK, 3 rows affected
       Records: 3  Duplicates: 0  Warnings: 0

       obclient [test]> INSERT /*+ DIRECT(true, 0, 'full') enable_parallel_dml parallel(16) */ INTO tbl3 partition(p0, p1) SELECT t2.col1,1 FROM tbl4 partition(p0sp0_1, p1sp1_1) t2;
       Query OK, 3 rows affected
       Records: 3  Duplicates: 0  Warnings: 0
       ```

   - Do not specify hints in the INSERT INTO SELECT statement: set the default_load_mode parameter to FULL_DIRECT_WRITE, and then run the statement without hints.

     ```sql
     obclient [test]> ALTER SYSTEM SET default_load_mode ='FULL_DIRECT_WRITE';
     obclient [test]> INSERT INTO tbl3 SELECT t2.col1, t2.col3 FROM tbl4 t2 WHERE ROWNUM <= 10000;
     ```
4. Verify whether data has been imported to the tbl3 table.

   ```sql
   obclient [test]> SELECT * FROM tbl3;
   ```

   The query result is as follows:

   ```sql
   +------+------+
   | col1 | col3 |
   +------+------+
   |    1 |   11 |
   |    2 |   22 |
   |    3 |   33 |
   +------+------+
   3 rows in set
   ```

   The result shows that data has been imported into table tbl3.

5. (Optional) Check the Note section in the result returned by the EXPLAIN EXTENDED statement to verify whether the data was written by using direct load.

   ```sql
   obclient [test]> EXPLAIN EXTENDED INSERT /*+ direct(true, 0, 'full') enable_parallel_dml parallel(16) */ INTO tbl3 SELECT t2.col1,t2.col3 FROM tbl4 t2;
   ```

   The return result is as follows:
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Query Plan | +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | =================================================================================== | | |ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)| | | ----------------------------------------------------------------------------------- | | |0 |OPTIMIZER STATS MERGE | |3 |29 | | | |1 |└─PX COORDINATOR | |3 |29 | | | |2 | └─EXCHANGE OUT DISTR |:EX10001 |3 |27 | | | |3 | └─INSERT | |3 |27 | | | |4 | └─OPTIMIZER STATS GATHER | |3 |1 | | | |5 | └─EXCHANGE IN DISTR | |3 |1 | | | |6 | └─EXCHANGE OUT DISTR (RANDOM) |:EX10000 |3 |1 | | | |7 | └─SUBPLAN SCAN |ANONYMOUS_VIEW1|3 |1 | | | |8 | └─PX BLOCK ITERATOR | |3 |1 | | | |9 | └─COLUMN TABLE FULL SCAN|T2 |3 |1 | | | =================================================================================== | | Outputs & filters: | | ------------------------------------- | | 0 - output(nil), filter(nil), rowset=16 | | 1 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.COL1(0x7ef8f6027720))(0x7ef8f60287f0)], [column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.COL3(0x7ef8f6027b60))(0x7ef8f6030710)]), filter(nil), rowset=16 | | 2 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.COL1(0x7ef8f6027720))(0x7ef8f60287f0)], [column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.COL3(0x7ef8f6027b60))(0x7ef8f6030710)]), filter(nil), rowset=16 | | dop=16 | | 3 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.COL1(0x7ef8f6027720))(0x7ef8f60287f0)], [column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.COL3(0x7ef8f6027b60))(0x7ef8f6030710)]), filter(nil) | | columns([{TBL3: ({TBL3: (TBL3.__pk_increment(0x7ef8f6027fa0), TBL3.COL1(0x7ef8f6011d50), TBL3.COL3(0x7ef8f6012190))})}]), partitions(p0), | | column_values([T_HIDDEN_PK(0x7ef8f60283e0)], [column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.COL1(0x7ef8f6027720))(0x7ef8f60287f0)], [column_conv(NUMBER, | | PS:(-1,0),NULL,ANONYMOUS_VIEW1.COL3(0x7ef8f6027b60))(0x7ef8f6030710)]) | | 4 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.COL1(0x7ef8f6027720))(0x7ef8f60287f0)], [column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.COL3(0x7ef8f6027b60))(0x7ef8f6030710)], | | [T_HIDDEN_PK(0x7ef8f60283e0)]), filter(nil), rowset=16 | | 5 - output([T_HIDDEN_PK(0x7ef8f60283e0)], [ANONYMOUS_VIEW1.COL1(0x7ef8f6027720)], [ANONYMOUS_VIEW1.COL3(0x7ef8f6027b60)]), filter(nil), rowset=16 | | 6 - output([T_HIDDEN_PK(0x7ef8f60283e0)], [ANONYMOUS_VIEW1.COL1(0x7ef8f6027720)], [ANONYMOUS_VIEW1.COL3(0x7ef8f6027b60)]), filter(nil), rowset=16 | | dop=16 | | 7 - output([ANONYMOUS_VIEW1.COL1(0x7ef8f6027720)], [ANONYMOUS_VIEW1.COL3(0x7ef8f6027b60)]), filter(nil), rowset=16 | | access([ANONYMOUS_VIEW1.COL1(0x7ef8f6027720)], [ANONYMOUS_VIEW1.COL3(0x7ef8f6027b60)]) | | 8 - output([T2.COL1(0x7ef8f60264c0)], [T2.COL3(0x7ef8f6026f90)]), filter(nil), rowset=16 | | 9 - output([T2.COL1(0x7ef8f60264c0)], [T2.COL3(0x7ef8f6026f90)]), filter(nil), rowset=16 | | access([T2.COL1(0x7ef8f60264c0)], [T2.COL3(0x7ef8f6026f90)]), partitions(p0) | | is_index_back=false, is_global_index=false, | | range_key([T2.__pk_increment(0x7ef8f60584a0)]), range(MIN ; MAX)always true | | Used Hint: | | 
------------------------------------- | | /*+ | | | | USE_PLAN_CACHE( NONE ) | | PARALLEL(16) | | ENABLE_PARALLEL_DML | | DIRECT(TRUE, 0, 'FULL') | | */ | | Qb name trace: | | ------------------------------------- | | stmt_id:0, stmt_type:T_EXPLAIN | | stmt_id:1, INS$1 | | stmt_id:2, SEL$1 | | Outline Data: | | ------------------------------------- | | /*+ | | BEGIN_OUTLINE_DATA | | PARALLEL(@"SEL$1" "T2"@"SEL$1" 16) | | FULL(@"SEL$1" "T2"@"SEL$1") | | USE_COLUMN_TABLE(@"SEL$1" "T2"@"SEL$1") | | USE_PLAN_CACHE( NONE ) | | PARALLEL(16) | | ENABLE_PARALLEL_DML | | OPTIMIZER_FEATURES_ENABLE('4.3.3.0') | | DIRECT(TRUE, 0, 'FULL') | | END_OUTLINE_DATA | | */ | | Optimization Info: | | ------------------------------------- | | T2: | | table_rows:3 | | physical_range_rows:3 | | logical_range_rows:3 | | index_back_rows:0 | | output_rows:3 | | table_dop:16 | | dop_method:Global DOP | | avaiable_index_name:[TBL4] | | stats info:[version=0, is_locked=0, is_expired=0] | | dynamic sampling level:0 | | estimation method:[DEFAULT, STORAGE] | | Plan Type: | | DISTRIBUTED | | Note: | | Degree of Parallelism is 16 because of hint | +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 82 rows in set (0.006 sec)
Example 2: For a range-hash subpartitioned table, specify level-one partitions for direct load.

The target table is a subpartitioned table whose level-one partitions are range partitions and whose subpartitions are hash partitions. Specify the level-one partitions for full direct load.

1. Create a table named tbl1_1 with two partitions. The table is partitioned by range and subpartitioned by hash.

   ```sql
   obclient [test]> CREATE TABLE tbl1_1 (
       col1 INT,
       col2 INT
   )
   PARTITION BY RANGE (col1)
   SUBPARTITION BY HASH (col2)
   SUBPARTITION TEMPLATE
   (
       SUBPARTITION sp1,
       SUBPARTITION sp2,
       SUBPARTITION sp3
   )
   (
       PARTITION p0 VALUES LESS THAN (10),
       PARTITION p1 VALUES LESS THAN (20)
   );
   ```

2. Use direct load to import the data from table tbl2 to partitions p0 and p1 of table tbl1.

   ```sql
   obclient [test]> insert /*+ direct(true, 0, 'full') enable_parallel_dml parallel(3) append */ into tbl1 partition(p0,p1) select * from tbl2 where col1 < 20;
   ```
Use the CREATE TABLE AS SELECT statement for direct load
You can use the CREATE TABLE AS SELECT statement to specify the direct load mode by using the DIRECT() hint. If the hint is not specified, the direct load mode is determined based on the default_load_mode parameter.
Syntax
```sql
CREATE /*+ [APPEND | DIRECT(need_sort,max_error,load_type)] parallel(N) */ TABLE table_name [AS] select_sentence
```
For more information about the CREATE TABLE syntax, see CREATE TABLE (MySQL mode) and CREATE TABLE (Oracle mode).
Parameter description:
| Parameter | Description |
|---|---|
| APPEND \| DIRECT() | Uses a hint to enable direct load. |
| parallel(N) | The degree of parallelism for loading data. This parameter is required and must be an integer greater than 1. |
Example
MySQL mode:

Use direct load to import data from table tbl1 to other tables.
1. Create a table named tbl1.

   ```sql
   obclient [test]> CREATE TABLE tbl1(c1 int);
   ```

2. Insert data into the tbl1 table.

   ```sql
   obclient [test]> INSERT INTO tbl1 VALUES (1),(2),(3);
   ```

3. Query the data in the tbl1 table.

   ```sql
   obclient [test]> SELECT * FROM tbl1;
   ```

   The query result is as follows:

   ```sql
   +------+
   | c1   |
   +------+
   |    1 |
   |    2 |
   |    3 |
   +------+
   3 rows in set (0.027 sec)
   ```
4. Use direct load to import data from table tbl1 to table tbl2.

   - Specify a hint in the CREATE TABLE AS SELECT statement.

     - Use the APPEND hint for direct load.

       ```sql
       obclient [test]> CREATE /*+ append parallel(4) */ TABLE tbl2 AS SELECT * FROM tbl1;
       ```

     - Use the DIRECT hint for direct load.

       ```sql
       obclient [test]> CREATE /*+ direct(true, 0, 'full') parallel(4) */ TABLE tbl2 AS SELECT * FROM tbl1;
       ```

   - Do not specify hints in the CREATE TABLE AS SELECT statement: set the default_load_mode parameter to FULL_DIRECT_WRITE, and then run the statement without hints.

     ```sql
     obclient [test]> ALTER SYSTEM SET default_load_mode ='FULL_DIRECT_WRITE';
     obclient [test]> CREATE TABLE tbl2 AS SELECT * FROM tbl1;
     ```
5. Verify whether data has been imported to the tbl2 table.

   ```sql
   obclient [test]> SELECT * FROM tbl2;
   ```

   The query result is as follows:

   ```sql
   +------+
   | c1   |
   +------+
   |    1 |
   |    3 |
   |    2 |
   +------+
   3 rows in set (0.050 sec)
   ```

   The result shows that data has been imported into table tbl2.
Oracle mode:

Use direct load to import data from table tbl1 to other tables.
1. Create a table named tbl1.

   ```sql
   obclient [SYS]> CREATE TABLE tbl1(c1 int);
   ```

2. Insert data into the tbl1 table.

   ```sql
   obclient [SYS]> INSERT INTO tbl1 VALUES (1),(2),(3);
   ```

3. Query the data in the tbl1 table.

   ```sql
   obclient [SYS]> SELECT * FROM tbl1;
   ```

   The query result is as follows:

   ```sql
   +------+
   | C1   |
   +------+
   |    1 |
   |    2 |
   |    3 |
   +------+
   3 rows in set (0.045 sec)
   ```
4. Use direct load to import data from table tbl1 to table tbl2.

   - Specify a hint in the CREATE TABLE AS SELECT statement.

     - Use the APPEND hint for direct load.

       ```sql
       obclient [SYS]> CREATE /*+ append parallel(4) */ TABLE tbl2 AS SELECT * FROM tbl1;
       ```

     - Use the DIRECT hint for direct load.

       ```sql
       obclient [SYS]> CREATE /*+ direct(true, 0, 'full') parallel(4) */ TABLE tbl2 AS SELECT * FROM tbl1;
       ```

   - Do not specify hints in the CREATE TABLE AS SELECT statement: set default_load_mode to FULL_DIRECT_WRITE, and then run the statement without hints.

     ```sql
     obclient [SYS]> ALTER SYSTEM SET default_load_mode ='FULL_DIRECT_WRITE';
     obclient [SYS]> CREATE TABLE tbl2 AS SELECT * FROM tbl1;
     ```
5. Check whether data has been imported to the tbl2 table.

   ```sql
   obclient [SYS]> SELECT * FROM tbl2;
   ```

   The query result is as follows:

   ```sql
   +------+
   | C1   |
   +------+
   |    3 |
   |    2 |
   |    1 |
   +------+
   3 rows in set (0.013 sec)
   ```

   The result shows that data has been imported into table tbl2.