A subquery is a SELECT query nested inside another SQL statement. A subquery can return a single row, multiple rows, or no rows. A subquery in the FROM clause of a SELECT statement is also called an inline view, and you can nest any number of subqueries in an inline view. A subquery in the WHERE clause of a SELECT statement is also called a nested subquery.
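As a minimal sketch of the two placements, the following statements assume the table_a and table_b tables created in the Examples section. The alias v is illustrative only.

```sql
-- Inline view: the subquery in the FROM clause acts as a derived table.
SELECT v.PK, v.name
FROM (SELECT PK, name FROM table_b WHERE PK > 5) v;

-- Nested subquery: the subquery in the WHERE clause returns a single row
-- that is compared with T1.PK.
SELECT T1.PK, T1.name
FROM table_a T1
WHERE T1.PK = (SELECT MAX(T2.PK) FROM table_b T2 WHERE T2.PK < 8);
```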
Subqueries are either correlated or noncorrelated. A correlated subquery depends on variables of the outer query, so it is usually executed multiple times. A noncorrelated subquery does not depend on variables of the outer query, so it is usually executed only once. Noncorrelated subqueries and some correlated subqueries can be rewritten to eliminate the subquery, which unnests the nested subquery into the outer query.
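As a rough illustration of the distinction, the following sketch contrasts the two kinds, assuming the table_a and table_b tables created in the Examples section. Whether the database actually executes each subquery once or many times depends on the plan that the optimizer chooses.

```sql
-- Noncorrelated: the subquery references only table_b, so it can be
-- evaluated once and its result reused for every row of table_a.
SELECT T1.PK, T1.name
FROM table_a T1
WHERE T1.PK > (SELECT MIN(T2.PK) FROM table_b T2 WHERE T2.PK > 7);

-- Correlated: the subquery references T1.name from the outer query, so
-- conceptually it is re-evaluated for each row of table_a.
SELECT T1.PK, T1.name
FROM table_a T1
WHERE EXISTS (SELECT 1 FROM table_b T2 WHERE T2.name = T1.name);
```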
Subquery syntax
The syntax of a subquery is as follows:
SELECT [ hint ] [ { { DISTINCT | UNIQUE } | ALL } ] select_list
FROM { table_reference | join_clause | ( join_clause ) }
[ , { table_reference | join_clause | (join_clause) } ]
[ where_clause ]
[ hierarchical_query_clause ]
[ group_by_clause ]
| subquery { UNION [ALL] | INTERSECT | MINUS } subquery [ { UNION [ALL] | INTERSECT | MINUS } subquery ]
| ( subquery ) [ order_by_clause ] [ row_limiting_clause ]
The following table describes the parameters.
| Parameter | Description |
|---|---|
| select_list | The list of columns or expressions that the query returns. |
| subquery | A nested query, which follows the same syntax. |
| hint | An optimizer hint. |
| table_reference | The table to query. |
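The set-operator and parenthesized forms in the syntax can be sketched as follows, assuming the table_a and table_b tables created in the Examples section. The FETCH FIRST form is assumed here as one possible row_limiting_clause.

```sql
-- subquery MINUS subquery: rows of table_a that do not appear in table_b.
SELECT PK, name FROM table_a
MINUS
SELECT PK, name FROM table_b;

-- ( subquery ) order_by_clause row_limiting_clause: sort the combined
-- result by the first column and keep only the first three rows.
(SELECT PK, name FROM table_a
 UNION
 SELECT PK, name FROM table_b)
ORDER BY 1
FETCH FIRST 3 ROWS ONLY;
```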
If a column in the subquery has the same name as a column in the outer query, you must qualify each reference to the outer query's column with the table name or an alias.
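The following sketch shows why the qualification matters, assuming the table_a and table_b tables created in the Examples section, both of which have columns named PK and name. The aliases T1 and T2 are illustrative.

```sql
-- Qualified: T1.name clearly refers to the outer table_a, and T2.name to
-- the table_b inside the subquery.
SELECT T1.PK, T1.name
FROM table_a T1
WHERE EXISTS (SELECT 1 FROM table_b T2 WHERE T2.name = T1.name);

-- Unqualified: both PK references resolve to table_b inside the subquery,
-- so the condition compares table_b.PK with itself and never references
-- the outer query.
SELECT *
FROM table_a
WHERE EXISTS (SELECT 1 FROM table_b WHERE PK = PK);
```

The second statement is valid SQL but is probably not what was intended, because the subquery no longer depends on the outer row.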
A subquery is executed when the containing (upper-level) statement references its columns. The containing statement can be a SELECT, UPDATE, or DELETE statement. Subqueries are used for the following purposes:
- To define the row set to be inserted into the target table in an INSERT or CREATE TABLE statement.
- To define the row set to be included in the view in a CREATE VIEW or CREATE MATERIALIZED VIEW statement.
- To define one or more values to be assigned to existing rows in an UPDATE statement.
- To provide condition values in the WHERE, HAVING, or START WITH clause.
- To define a table to be operated on by the containing query.
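Several of these uses can be sketched as follows, assuming the table_a and table_b tables created in the Examples section. The names table_c and view_common are hypothetical and exist only for this illustration.

```sql
-- CREATE TABLE ... AS: the subquery defines the initial row set.
CREATE TABLE table_c AS
  SELECT PK, name FROM table_a WHERE PK <= 5;

-- INSERT: the subquery defines the rows to insert.
INSERT INTO table_c (PK, name)
  SELECT PK, name FROM table_b WHERE PK > 7;

-- CREATE VIEW: the subquery defines the rows exposed by the view.
CREATE VIEW view_common AS
  SELECT T1.PK, T1.name
  FROM table_a T1
  WHERE T1.PK IN (SELECT T2.PK FROM table_b T2);

-- UPDATE: a scalar subquery supplies the value assigned to existing rows.
-- This assumes table_b has at most one row per PK, as in the sample data.
UPDATE table_c C
SET C.name = (SELECT T2.name FROM table_b T2 WHERE T2.PK = C.PK)
WHERE EXISTS (SELECT 1 FROM table_b T2 WHERE T2.PK = C.PK);

-- DELETE: the subquery provides the condition values in the WHERE clause.
DELETE FROM table_c
WHERE PK NOT IN (SELECT PK FROM table_a);
```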
Unnesting of nested subqueries
Unnesting of nested subqueries is an optimization strategy that lifts certain subqueries into the outer (parent) query. In essence, it converts those subqueries into equivalent multi-table join operations. The optimizer can then make full use of access paths, join methods, and join orders, and the query nesting depth is reduced as much as possible.
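To make the idea concrete, the following hand-written pair shows a subquery and a join that return the same rows, assuming the table_a and table_b tables created in the Examples section. This is only an illustration of the equivalence. The optimizer's actual rewrite may differ, for example by using a semi join instead of DISTINCT.

```sql
-- Original form: a nested IN subquery.
SELECT T1.PK, T1.name
FROM table_a T1
WHERE T1.PK IN (SELECT T2.PK FROM table_b T2);

-- Equivalent join form: DISTINCT removes duplicate PK values from table_b
-- so that each row of table_a is returned at most once.
SELECT T1.PK, T1.name
FROM table_a T1
JOIN (SELECT DISTINCT PK FROM table_b) T2
  ON T1.PK = T2.PK;
```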
The database unnests nested subqueries in the following cases:
- Noncorrelated IN subqueries.
- Correlated subqueries in IN and EXISTS clauses that do not contain aggregate functions or GROUP BY clauses.
You can use the UNNEST hint to control whether to unnest nested subqueries.
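The following is a sketch of how the hint might be written, assuming the table_a and table_b tables created in the Examples section. Two points are assumptions rather than statements from this document: that the hint is placed in the query block of the subquery, and that NO_UNNEST is the counterpart hint that prevents unnesting. Check the hint reference for your version before relying on either.

```sql
-- Assumed placement: ask the optimizer to unnest this subquery.
EXPLAIN
SELECT * FROM table_a T1
WHERE T1.PK IN (SELECT /*+ UNNEST */ T2.PK FROM table_b T2);

-- Assumed counterpart: ask the optimizer to keep the subquery nested.
EXPLAIN
SELECT * FROM table_a T1
WHERE T1.PK IN (SELECT /*+ NO_UNNEST */ T2.PK FROM table_b T2);
```

Comparing the two plans is one way to see whether the subquery was converted into a join.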
Examples
The following statements create tables table_a and table_b and insert data into the tables.
CREATE TABLE table_a(PK INT, name VARCHAR(25));
INSERT INTO table_a VALUES(1,'Foxy');
INSERT INTO table_a VALUES(2,'Police');
INSERT INTO table_a VALUES(3,'Taxi');
INSERT INTO table_a VALUES(4,'Lincoln');
INSERT INTO table_a VALUES(5,'Arizona');
INSERT INTO table_a VALUES(6,'Washington');
INSERT INTO table_a VALUES(7,'Dell');
INSERT INTO table_a VALUES(10,'Lucent');
CREATE TABLE table_b(PK INT, name VARCHAR(25));
INSERT INTO table_b VALUES(1,'Foxy');
INSERT INTO table_b VALUES(2,'Police');
INSERT INTO table_b VALUES(3,'Taxi');
INSERT INTO table_b VALUES(6,'Washington');
INSERT INTO table_b VALUES(7,'Dell');
INSERT INTO table_b VALUES(8,'Microsoft');
INSERT INTO table_b VALUES(9,'Apple');
INSERT INTO table_b VALUES(11,'Scottish Whisky');
A noncorrelated subquery
obclient> SELECT * FROM TABLE_A T1 WHERE T1.PK IN (SELECT T2.PK FROM TABLE_B T2);
+------+------------+
| PK   | NAME       |
+------+------------+
|    1 | Foxy       |
|    2 | Police     |
|    3 | Taxi       |
|    6 | Washington |
|    7 | Dell       |
+------+------------+
5 rows in set
A correlated subquery that uses the variable T1.PK of the outer query
obclient> SELECT * FROM TABLE_A T1 WHERE T1.PK IN (SELECT T2.PK FROM TABLE_B T2 WHERE T2.PK = T1.PK);
+------+------------+
| PK   | NAME       |
+------+------------+
|    1 | Foxy       |
|    2 | Police     |
|    3 | Taxi       |
|    6 | Washington |
|    7 | Dell       |
+------+------------+
5 rows in set
A correlated subquery that is unnested and rewritten as a join
obclient> EXPLAIN SELECT * FROM TABLE_A T1 WHERE T1.PK IN (SELECT T2.NAME FROM TABLE_B T2 WHERE T2.NAME = T1.NAME);
+------------------------------------+
| Query Plan                         |
+------------------------------------+
=============================================
|ID|OPERATOR            |NAME|EST. ROWS|COST|
---------------------------------------------
|0 |HASH RIGHT SEMI JOIN|    |8        |107 |
|1 | TABLE SCAN         |T2  |8        |38  |
|2 | TABLE SCAN         |T1  |8        |38  |
=============================================
Outputs & filters:
-------------------------------------
  0 - output([T1.PK], [T1.NAME]), filter(nil),
      equal_conds([T1.PK = T2.NAME], [T2.NAME = T1.NAME]), other_conds(nil)
  1 - output([T2.NAME]), filter(nil),
      access([T2.NAME]), partitions(p0)
  2 - output([T1.NAME], [T1.PK]), filter(nil),
      access([T1.NAME], [T1.PK]), partitions(p0)
+------------------------------------+
1 row in set
