A subquery is a SELECT statement that is nested within another or multiple SELECT statements. It can return a single-row result, multiple-row results, or no results. A subquery in the FROM clause of a SELECT statement is also known as an inline view. You can nest any number of subqueries within an inline view. A subquery in the WHERE clause of a SELECT statement is also known as a nested subquery.
Subqueries can be categorized into correlated and noncorrelated subqueries. A correlated subquery is one whose execution depends on variables from the outer query. This type of subquery is typically executed multiple times. A noncorrelated subquery is one whose execution does not depend on variables from the outer query. This type of subquery is generally executed only once. For noncorrelated and partially correlated subqueries, you can rewrite them to eliminate the subqueries, thereby expanding the nested subqueries.
Subquery syntax
The general syntax of a subquery is as follows:
SELECT [ hint ] [ { { DISTINCT | UNIQUE } | ALL } ] select_list
FROM { table_reference | join_clause | ( join_clause ) }
[ , { table_reference | join_clause | (join_clause) } ]
[ where_clause ]
[ hierarchical_query_clause ]
[ group_by_clause ]
| subquery { UNION [ALL] | INTERSECT | MINUS } subquery [ { UNION [ALL] | INTERSECT | MINUS } subquery ]
| ( subquery ) [ order_by_clause ] [ row_limiting_clause ]
The following table describes the parameters.
| Parameter | Description |
|---|---|
| select_list | Specifies the query list. |
| subquery | The subquery. |
| hint | A comment. |
| table_reference | The target table to query. |
If a column in the subquery has the same name as a column in the outer query, you must add the table name before the repeated column name in the outer query or use an alias.
When an upper-level query references a correlated column in the subquery, the subquery is executed. The upper-level query can be a SELECT, UPDATE, or DELETE statement. The subquery can be used in the following ways in these statements:
Define the row set to be inserted into the target table in the
INSERTorCREATE TABLEstatement.Define the row set to be included in the view in the
CREATE VIEWorCREATE MATERIALIZED VIEWstatement.Define one or more values to be assigned to existing rows in the
UPDATEstatement.Provide condition values in the
WHERE,HAVING, orSTART WITHclause.Define a table that contains query operations.
Unnesting of nested subqueries
Unnesting of nested subqueries is an optimization strategy used by the database. It involves moving some subqueries to the outer parent query, effectively converting them into equivalent multi-table join operations. This strategy offers a clear advantage by allowing the effective use of access paths, join methods, and join order, thereby minimizing the query hierarchy.
The database performs unnesting of nested subqueries in the following scenarios:
Noncorrelated
INsubqueries.Correlated subqueries in
INandEXISTSclauses that do not contain aggregate functions orGROUP BYclauses.
You can control whether to unnest nested subqueries using the UNNEST hint.
Examples
The following statements create tables table_a and table_b and insert data into them.
CREATE TABLE table_a(PK INT, name VARCHAR(25));
INSERT INTO table_a VALUES(1,'Foxy');
INSERT INTO table_a VALUES(2,'Police');
INSERT INTO table_a VALUES(3,'Taxi');
INSERT INTO table_a VALUES(4,'Lincoln');
INSERT INTO table_a VALUES(5,'Arizona');
INSERT INTO table_a VALUES(6,'Washington');
INSERT INTO table_a VALUES(7,'Dell');
INSERT INTO table_a VALUES(10,'Lucent');
CREATE TABLE table_b(PK INT, name VARCHAR(25));
INSERT INTO table_b VALUES(1,'Foxy');
INSERT INTO table_b VALUES(2,'Police');
INSERT INTO table_b VALUES(3,'Taxi');
INSERT INTO table_b VALUES(6,'Washington');
INSERT INTO table_b VALUES(7,'Dell');
INSERT INTO table_b VALUES(8,'Microsoft');
INSERT INTO table_b VALUES(9,'Apple');
INSERT INTO table_b VALUES(11,'Scotch Whisky');
A subquery with no dependencies
obclient> SELECT * FROM TABLE_A T1 WHERE T1.PK IN (SELECT T2.PK FROM TABLE_B T2); +------+-----------+ | PK | NAME | +------+-----------+ | 1 | Foxy | | 2 | Police | | 3 | Taxi | | 6 | Washington| | 7 | Dell | +------+-----------+ 5 rows in setA subquery with dependencies, where the subquery uses the outer query variable T1.PK
SELECT * FROM TABLE_A T1 WHERE T1.PK IN (SELECT T2.PK FROM TABLE_B T2 WHERE T2.PK = T1.PK); +------+-----------+ | PK | NAME | +------+-----------+ | 1 | Foxy | | 2 | Police | | 3 | Taxi | | 6 | Washington| | 7 | Dell | +------+-----------+ 5 rows in setA subquery with dependencies that is rewritten as a join
obclient> EXPLAIN SELECT * FROM TABLE_A T1 WHERE T1.PK IN (SELECT T2.NAME FROM TABLE_B T2 WHERE T2.NAME = T1.NAME); +------------------------------------+ | Query Plan | +------------------------------------+ ============================================= |ID|OPERATOR |NAME|EST. ROWS|COST| --------------------------------------------- |0 |HASH RIGHT SEMI JOIN| |8 |107 | |1 | TABLE SCAN |T2 |8 |38 | |2 | TABLE SCAN |T1 |8 |38 | ============================================= Outputs & filters: ------------------------------------- 0 - output([T1.PK], [T1.NAME]), filter(nil), equal_conds([T1.PK = T2.NAME], [T2.NAME = T1.NAME]), other_conds(nil) 1 - output([T2.NAME]), filter(nil), access([T2.NAME]), partitions(p0) 2 - output([T1.NAME], [T1.PK]), filter(nil), access([T1.NAME], [T1.PK]), partitions(p0) +------------------------------------+ 1 row in set