INTERSECT operators perform an intersection-set operation on the output of the left and right suboperators and deduplicate the result.
OceanBase Database supports the following types of INTERSECT operators: MERGE INTERSECT DISTINCT and HASH INTERSECT DISTINCT.
MERGE INTERSECT DISTINCT
In the following example, Q1 joins the two queries with INTERSECT. Since c1 is sorted, the MERGE INTERSECT DISTINCT operator is used as Operator 0 to obtain the intersection set and deduplicate the result. c2 is not sorted, so a SORT operator is assigned before Operator 3 for sorting. When these operators are executed, they read sorted inputs from the left and right subnodes and merge and deduplicate the rows to obtain the intersection set.
obclient> CREATE TABLE t1(c1 INT PRIMARY KEY, c2 INT);
Query OK, 0 rows affected
obclient> INSERT INTO t1 VALUES(1,1);
Query OK, 1 rows affected
obclient> INSERT INTO t1 VALUES(2,2);
Query OK, 1 rows affected
Q1:
obclient> EXPLAIN SELECT c1 FROM t1 INTERSECT SELECT c2 FROM t1;
Query Plan:
=================================================
|ID|OPERATOR |NAME|EST. ROWS|COST|
-------------------------------------------------
|0 |MERGE INTERSECT DISTINCT| |2 |77 |
|1 | TABLE SCAN |T1 |2 |37 |
|2 | SORT | |2 |39 |
|3 | TABLE SCAN |T1 |2 |37 |
=================================================
Outputs & filters:
-------------------------------------
0 - output([INTERSECT(T1.C1, T1.C2)]), filter(nil)
1 - output([T1.C1]), filter(nil),
access([T1.C1]), partitions(p0)
2 - output([T1.C2]), filter(nil), sort_keys([T1.C2, ASC])
3 - output([T1.C2]), filter(nil),
access([T1.C2]), partitions(p0)
In the preceding example, the Outputs & filters section shows in detail the output information of the MERGE INTERSECT DISTINCT operator.
| Field | Description |
|---|---|
| output | The output expressions of the operator. The output of the two suboperators are joined by INTERSECT to generate the intersection set. Within the brackets are the output columns of the left and right suboperators. |
| filter | The filter conditions of the operator. In this example, filter is set to nil because no filter condition is configured for the MERGE INTERSECT DISTINCT operator. |
HASH INTERSECT DISTINCT
In the following example, Q2 joins the two queries with INTERSECT, no sorting method is available, and the HASH INTERSECT DISTINCT operator is used as Operator 0 to obtain the intersection set and deduplicate the result. When the operator is being executed, it first reads the output of the subnode on one side, creates a hash table, and removes duplicate rows. Then, it reads the output of the subnode on the other side and uses the hash table to generate the intersection set and deduplicate the result.
Q2:
obclient> EXPLAIN SELECT c2 FROM t1 INTERSECT SELECT c2 FROM t1;
Query Plan:
================================================
|ID|OPERATOR |NAME|EST. ROWS|COST|
------------------------------------------------
|0 |HASH INTERSECT DISTINCT| |2 |77 |
|1 | TABLE SCAN |T1 |2 |37 |
|2 | TABLE SCAN |T1 |2 |37 |
================================================
Outputs & filters:
-------------------------------------
0 - output([INTERSECT(T1.C2, T1.C2)]), filter(nil)
1 - output([T1.C2]), filter(nil),
access([T1.C2]), partitions(p0)
2 - output([T1.C2]), filter(nil),
access([T1.C2]), partitions(p0)
In the preceding example, the Outputs & filters section shows in detail the output information of the HASH INTERSECT DISTINCT operator. The fields of the operator have the same meaning as those of the MERGE INTERSECT DISTINCT operator.