Subquery

2023-10-24 09:23:03  Updated

A subquery is a query nested within another SELECT statement and can return one row, multiple rows, or nothing. A SELECT statement can be nested with one or more other SELECT statements. A subquery in the FROM clause of a SELECT statement is also called an inner join view. You can nest any number of subqueries in a nested view. A subquery in the WHERE clause of a SELECT statement is also called a nested subquery.

Subqueries can be divided into dependent subqueries and independent subqueries. A dependent subquery means that the execution of this subquery depends on variables of outer queries. Dependent queries are usually executed multiple times. An independent subquery means that the execution of this subquery does not depend on variables of outer queries. Independent subqueries are typically executed only once. You can rewrite independent subqueries and some of the dependent subqueries to unnest nested subqueries.

Syntax

General syntax for subqueries:

SELECT [ hint ] [ { { DISTINCT  UNIQUE }  ALL } ] select_list 
FROM { table_reference  join_clause  ( join_clause ) }
  [ , { table_reference  join_clause  (join_clause) } ]
  [ where_clause ]
  [ hierarchical_query_clause ]
  [ group_by_clause ]
 subquery { UNION [ALL]  INTERSECT  MINUS } subquery [ { UNION [ALL]  INTERSECT  MINUS } subquery ]
 ( subquery ) [ order_by_clause ] [ row_limiting_clause ]

The following table describes the parameters in the syntax.

Parameter Description
select_list The select list.
subquery The subquery.
hint The comment.
table_reference The table to be queried.

If a column in a subquery has the same name as a column in the outer query, the table name or an alias must be prefixed to the duplicate column name in the outer query.

When the outer query references a column in a subquery, the subquery is executed. The outer query can be the SELECT, UPDATE, or DELETE statement. The rules for referencing subqueries in different statements are as follows:

  • Define the row set to be inserted into the target table of the INSERT or CREATE TABLE statement.

  • Define the row set to be included in the view or materialized view of the CREATE VIEW or CREATE MATERIALIZED VIEW statement.

  • Define one or more values to be assigned to the existing rows in the UPDATE statement.

  • Specify conditions in the WHERE, HAVING, or START WITH clause.

  • Define a table that involves query operations.

Unnesting of nested subqueries

Subquery unnesting is a database optimization strategy. It places some subqueries in the outer parent query to convert some subqueries into equivalent multi-table join operations. This strategy can reduce the query layers as possible by using the access path, join method, and join order.

OceanBase Database will unnest the following types of subqueries:

  • Independent IN subqueries

  • Dependent subqueries in IN and EXISTS that do not contain aggregate functions or the GROUP BY clause

You can control whether to unnest the nested subqueries by using the UNNEST hint.

Examples

Create two tables respectively named table_a and table_b, and insert data into them.

CREATE TABLE table_a(PK INT, name VARCHAR(25));
INSERT INTO table_a VALUES(1,'Fox');
INSERT INTO table_a VALUES(2,'Police');  
INSERT INTO table_a VALUES(3,'Taxi');  
INSERT INTO table_a VALUES(4,'Lincoln');  
INSERT INTO table_a VALUES(5,'Arizona');  
INSERT INTO table_a VALUES(6,'Washington');  
INSERT INTO table_a VALUES(7,'Dell');  
INSERT INTO table_a VALUES(10,'Lucent'); 
CREATE TABLE table_b(PK INT, name VARCHAR(25));
INSERT INTO table_b VALUES(1,'Fox');
INSERT INTO table_b VALUES(2,'Police');  
INSERT INTO table_b VALUES(3,'Taxi');  
INSERT INTO table_b VALUES(6,'Washington');  
INSERT INTO table_b VALUES(7,'Dell');  
INSERT INTO table_b VALUES(8,'Microsoft');  
INSERT INTO table_b VALUES(9,'Apple'); 
INSERT INTO table_b VALUES(11,'Scotch whisky');

Example 1: an independent subquery

obclient> SELECT * FROM TABLE_A T1 WHERE T1.PK IN (SELECT T2.PK FROM TABLE_B T2);
+------+-----------+
 PK    NAME      
+------+-----------+
    1  Fox    
    2  Police      
    3  Taxi      
    6  Washington    
    7  Dell      
+------+-----------+
5 rows in set

Example 2: a dependent subquery, which references the T1.PK variable of the outer query

SELECT * FROM TABLE_A T1 WHERE T1.PK IN (SELECT T2.PK FROM TABLE_B T2 WHERE T2.PK = T1.PK);
+------+-----------+
 PK    NAME      
+------+-----------+
    1  Fox    
    2  Police      
    3  Taxi      
    6  Washington    
    7  Dell      
+------+-----------+
5 rows in set

Example 3: a dependent subquery that is unnested and rewritten into a join

obclient> EXPLAIN SELECT * FROM TABLE_A T1 
   WHERE T1.PK IN (SELECT T2.NAME FROM TABLE_B T2 WHERE T2.NAME = T1.NAME);
+------------------------------------+
 Query Plan                         
+------------------------------------+
=============================================
IDOPERATOR            NAMEEST. ROWSCOST
---------------------------------------------
0 HASH RIGHT SEMI JOIN    8        107 
1  TABLE SCAN         T2  8        38  
2  TABLE SCAN         T1  8        38  
=============================================
Outputs & filters:
-------------------------------------
  0 - output([T1.PK], [T1.NAME]), filter(nil),
      equal_conds([T1.PK = T2.NAME], [T2.NAME = T1.NAME]), other_conds(nil)
  1 - output([T2.NAME]), filter(nil),
      access([T2.NAME]), partitions(p0)
  2 - output([T1.NAME], [T1.PK]), filter(nil),
      access([T1.NAME], [T1.PK]), partitions(p0)
+------------------------------------+
1 row in set

Contact Us