UNION operators perform union set operations on the output of two queries.
OceanBase Database supports the following types of UNION operators: UNION ALL, HASH UNION DISTINCT, and MERGE UNION DISTINCT.
UNION ALL
The UNION ALL operator directly merges the output of two queries.
In the following example, Q1 joins the two queries with UNION ALL, which performs a union set operation on their output. During execution, the operator sequentially generates the output of the left and right subnodes.
obclient> CREATE TABLE t1(c1 INT PRIMARY KEY, c2 INT);
Query OK, 0 rows affected
obclient> INSERT INTO t1 VALUES(1,1);
Query OK, 1 rows affected
obclient> INSERT INTO t1 VALUES(2,2);
Query OK, 1 rows affected
Q1:
obclient> EXPLAIN SELECT c1 FROM t1 UNION ALL SELECT c2 FROM t1;
Query Plan:
====================================
|ID|OPERATOR |NAME|EST. ROWS|COST|
------------------------------------
|0 |UNION ALL | |4 |74 |
|1 | TABLE SCAN|T1 |2 |37 |
|2 | TABLE SCAN|T1 |2 |37 |
====================================
Outputs & filters:
-------------------------------------
0 - output([UNION(T1.C1, T1.C2)]), filter(nil)
1 - output([T1.C1]), filter(nil),
access([T1.C1]), partitions(p0)
2 - output([T1.C2]), filter(nil),
access([T1.C2]), partitions(p0)
In the preceding example, the Outputs & filters section shows in detail the output information of the UNION ALL operator.
| Field | Description |
|---|---|
| output | The output expression of the operator. |
| filter | The filter conditions of the operator. In this example, filter is set to nil because no filter condition is configured for the UNION ALL operator. |
MERGE UNION DISTINCT
The MERGE UNION DISTINCT operator performs a union set operation on the output of queries and deduplicates the result.
In the following example, Q2 joins the two queries with UNION DISTINCT. Since c1 is sorted, the MERGE UNION DISTINCT operator is used as Operator 0 to find the union set and deduplicate the result. c2 is not sorted, so a SORT operator is assigned before Operator 3 for sorting. When the operator is executed, it reads the sorted output from the left and right subnodes, merges them to generate a sorted output, and deduplicates the output.
Q2:
obclient> EXPLAIN SELECT c1 FROM t1 UNION SELECT c2 FROM t1;
Query Plan:
=============================================
|ID|OPERATOR |NAME|EST. ROWS|COST|
---------------------------------------------
|0 |MERGE UNION DISTINCT| |4 |77 |
|1 | TABLE SCAN |T1 |2 |37 |
|2 | SORT | |2 |39 |
|3 | TABLE SCAN |T1 |2 |37 |
=============================================
Outputs & filters:
-------------------------------------
0 - output([UNION(T1.C1, T1.C2)]), filter(nil)
1 - output([T1.C1]), filter(nil),
access([T1.C1]), partitions(p0)
2 - output([T1.C2]), filter(nil), sort_keys([T1.C2, ASC])
3 - output([T1.C2]), filter(nil),
access([T1.C2]), partitions(p0)
In the preceding example, the Outputs & filters section shows in detail the output information of the MERGE UNION DISTINCT operator. The fields of the MERGE UNION DISTINCT operator have the same meaning as those of the UNION ALL operator.
HASH UNION DISTINCT
The HASH UNION DISTINCT operator performs a union set operation on the output of queries and deduplicates the result.
In the following example, Q3 uses UNION DISTINCT to join the two queries, the columns are unsorted, and the HASH UNION DISTINCT operator is used as Operator 0 to find the union set and deduplicate the result. When the operator is executed, it reads the output from the left and right subnodes, creates a hash table, deduplicates the data, and outputs the result.
Q3:
obclient> EXPLAIN SELECT c2 FROM t1 UNION SELECT c2 FROM t1;
Query Plan:
============================================
|ID|OPERATOR |NAME|EST. ROWS|COST|
--------------------------------------------
|0 |HASH UNION DISTINCT| |4 |77 |
|1 | TABLE SCAN |T1 |2 |37 |
|2 | TABLE SCAN |T1 |2 |37 |
============================================
Outputs & filters:
-------------------------------------
0 - output([UNION(T1.C2, T1.C2)]), filter(nil)
1 - output([T1.C2]), filter(nil),
access([T1.C2]), partitions(p0)
2 - output([T1.C2]), filter(nil),
access([T1.C2]), partitions(p0)
In the preceding example, the Outputs & filters section shows in detail the output information of the HASH UNION DISTINCT operator. The fields of the operator have the same meaning as those of the UNION ALL operator.