A subquery is a SELECT query that contains one or more nested SELECT queries. A subquery can return a single row, multiple rows, or no rows. A subquery in the FROM clause of a SELECT query is also called an inline view. You can nest any number of subqueries in an inline view. A subquery in the WHERE clause of a SELECT query is also called a nested subquery.
Subqueries can be categorized into correlated subqueries and noncorrelated subqueries. A correlated subquery is a subquery whose execution depends on variables from the outer query. This type of subquery is usually executed multiple times. A noncorrelated subquery is a subquery whose execution does not depend on variables from the outer query. This type of subquery is usually executed only once. For noncorrelated subqueries and partially correlated subqueries, you can rewrite them to eliminate subqueries and expand nested subqueries.
Subquery syntax
The syntax of a subquery is as follows:
SELECT [ hint ] [ { { DISTINCT | UNIQUE } | ALL } ] select_list
FROM { table_reference | join_clause | ( join_clause ) }
[ , { table_reference | join_clause | (join_clause) } ]
[ where_clause ]
[ hierarchical_query_clause ]
[ group_by_clause ]
| subquery { UNION [ALL] | INTERSECT | MINUS } subquery [ { UNION [ALL] | INTERSECT | MINUS } subquery ]
| ( subquery ) [ order_by_clause ] [ row_limiting_clause ]
The following table describes the parameters.
| Parameter | Description |
|---|---|
| select_list | Specifies the query list. |
| subquery | The subquery. |
| hint | A comment. |
| table_reference | The target table to query. |
If a column in the subquery has the same name as a column in the outer query, you must specify the table name before the repeated column name in the outer query or use an alias.
When an upper-level query references a column in the subquery, the subquery is executed. The upper-level query can be a SELECT, UPDATE, or DELETE statement. The subquery can be used in the following ways in these statements:
Define the row set to be inserted into the target table in an
INSERTorCREATE TABLEstatement.Define the row set to be included in the view in a
CREATE VIEWorCREATE MATERIALIZED VIEWstatement.Define one or more values to be assigned to existing rows in an
UPDATEstatement.Provide condition values in the
WHERE,HAVING, orSTART WITHclause.Define a table that contains a query operation.
Unnesting of nested subqueries
Unnesting of nested subqueries is an optimization strategy that moves some subqueries to the outer parent query. Essentially, this strategy converts certain subqueries into equivalent multi-table join operations. One clear benefit of this strategy is that it allows for the effective use of access paths, join methods, and join order, thereby minimizing the query's nesting level.
The database performs unnesting of nested subqueries in the following scenarios:
Noncorrelated
INsubqueries.Correlated subqueries in
INandEXISTSclauses that do not contain aggregate functions orGROUP BYclauses.
You can use the UNNEST hint to control whether to unnest nested subqueries.
Examples
The following statements create tables table_a and table_b and insert data into them.
CREATE TABLE table_a(PK INT, name VARCHAR(25));
INSERT INTO table_a VALUES(1,'Foxy');
INSERT INTO table_a VALUES(2,'Police');
INSERT INTO table_a VALUES(3,'Taxi');
INSERT INTO table_a VALUES(4,'Lincoln');
INSERT INTO table_a VALUES(5,'Arizona');
INSERT INTO table_a VALUES(6,'Washington');
INSERT INTO table_a VALUES(7,'Dell');
INSERT INTO table_a VALUES(10,'Lucent');
CREATE TABLE table_b(PK INT, name VARCHAR(25));
INSERT INTO table_b VALUES(1,'Foxy');
INSERT INTO table_b VALUES(2,'Police');
INSERT INTO table_b VALUES(3,'Taxi');
INSERT INTO table_b VALUES(6,'Washington');
INSERT INTO table_b VALUES(7,'Dell');
INSERT INTO table_b VALUES(8,'Microsoft');
INSERT INTO table_b VALUES(9,'Apple');
INSERT INTO table_b VALUES(11,'Scotch Whisky');
A subquery that is not dependent on the outer query
obclient> SELECT * FROM TABLE_A T1 WHERE T1.PK IN (SELECT T2.PK FROM TABLE_B T2); +------+-----------+ | PK | NAME | +------+-----------+ | 1 | Foxy | | 2 | Police | | 3 | Taxi | | 6 | Washington| | 7 | Dell | +------+-----------+ 5 rows in setA subquery that is dependent on the outer query and uses the variable T1.PK from the outer query
SELECT * FROM TABLE_A T1 WHERE T1.PK IN (SELECT T2.PK FROM TABLE_B T2 WHERE T2.PK = T1.PK); +------+-----------+ | PK | NAME | +------+-----------+ | 1 | Foxy | | 2 | Police | | 3 | Taxi | | 6 | Washington| | 7 | Dell | +------+-----------+ 5 rows in setA dependent subquery that is unnested and rewritten as a join
obclient> EXPLAIN SELECT * FROM TABLE_A T1 WHERE T1.PK IN (SELECT T2.NAME FROM TABLE_B T2 WHERE T2.NAME = T1.NAME); +------------------------------------+ | Query Plan | +------------------------------------+ ============================================= |ID|OPERATOR |NAME|EST. ROWS|COST| --------------------------------------------- |0 |HASH RIGHT SEMI JOIN| |8 |107 | |1 | TABLE SCAN |T2 |8 |38 | |2 | TABLE SCAN |T1 |8 |38 | ============================================= Outputs & filters: ------------------------------------- 0 - output([T1.PK], [T1.NAME]), filter(nil), equal_conds([T1.PK = T2.NAME], [T2.NAME = T1.NAME]), other_conds(nil) 1 - output([T2.NAME]), filter(nil), access([T2.NAME]), partitions(p0) 2 - output([T1.NAME], [T1.PK]), filter(nil), access([T1.NAME], [T1.PK]), partitions(p0) +------------------------------------+ 1 row in set