CTE

2023-07-28 02:55:42  Updated

A common table expression (CTE) is a named temporary result set that applies to the current statement. It is not stored as an object and is used only during queries. OceanBase Database supports recursive CTEs and non-recursive CTEs.

CTE syntax

CTE is optional in the syntax of DML statements and is defined by using the WITH clause. You can separate multiple subclauses in a WITH clause with commas (,). Each subclause provides a subquery to generate a result set and associates the subquery with a name. Syntax:

with_clause:
    WITH [RECURSIVE]
        cte_name [(column_name [, column_name] ...)] AS (subquery)
        [, cte_name [(column_name [, column_name] ...)] AS (subquery)] ...

Parameters:

  • cte_name names a CTE, which can be referenced by tables with a WITH clause.

  • AS(subquery) is referred to as a CTE subquery and is used to generate a CTE result set. AS must be followed with parentheses.

If a CTE subquery references its own name, the CTE is recursive. If all CTEs in a WITH clause are recursive, the RECURSIVE keyword must be included.

In the following example, two CTEs named cte1 and cte2 are defined in the WITH clause and are referenced in the top-level SELECT statement behind the WITH clause.

WITH
   cte1 AS (SELECT col1, col2 FROM tbl1),
   cte2 AS (SELECT col3, col4 FROM tbl2)
SELECT col2, col4 FROM cte1 JOIN cte2
WHERE cte1.col1 = cte2.col3;

If a CTE name is followed by a list of names quoted in a pair of parentheses, these names are column names, and the number of names in the list must be the same as the number of columns in the result set. Otherwise, the column names are sourced from the first select list in AS(subquery).

Application scenarios of the WITH clause

You can use the WITH clause in the following scenarios:

  • At the beginning of a SELECT statement

    WITH ... SELECT ...
    
  • At the beginning of a subquery, including derived table subqueries

    SELECT ... WHERE id IN (WITH ... SELECT ...) ...
    SELECT * FROM (WITH ... SELECT ...) AS dt ...
    
  • Right before the SELECT statement in a statement that contains a SELECT statement

    INSERT ... WITH ... SELECT ...
    REPLACE ... WITH ... SELECT ...
    CREATE TABLE ... WITH ... SELECT ...
    CREATE VIEW ... WITH ... SELECT ...
    

Only one WITH clause is allowed at the same level. You can separate multiple subclauses in a WITH clause with commas (,).

WITH cte1 AS (...), cte2 AS (...) SELECT ...

A WITH clause can define one or multiple CTEs. Make sure that the CTE names are unique in the same WITH clause. The following example shows an invalid WITH clause:

WITH cte1 AS (...), cte1 AS (...) SELECT ...

Recursive CTE

A CTE with a subquery that references the name of the CTE is a recursive CTE.

Structure of a recursive CTE

The following describes the structure of a recursive CTE:

  • If a CTE in a WITH clause references itself, the WITH clause must begin with the RECURSIVE keyword. Otherwise, the RECURSIVE keyword is optional.

  • A recursive CTE subquery comprises two parts separated with UNION [ALL] or UNION DISTINCT.

    SELECT ...      -- Returns the initial row set.
    UNION ALL
    SELECT ...      -- Returns an additional row set.
    

    The first SELECT generates one or more initial rows for the CTE without referencing the CTE name. The second SELECT generates additional rows and references the CTE name in the FROM clause for recursion. The recursion ends when no new rows are generated. Therefore, a recursive CTE consists of a non-recursive SELECT part and a recursive SELECT part. Each SELECT part can be a combination of multiple SELECT statements.

  • The type of result columns of a CTE is inferred from the column type of the non-recursive SELECT part, and all columns can be null. The recursive SELECT part is ignored in determining the type of result columns. If the recursive and non-recursive parts are separated with UNION DISTINCT, duplicate rows are removed. This avoids infinite loops when queries that perform transitive closures are executed.

  • Each iteration of the recursive part applies only to the rows generated in the previous iteration. If the recursive part contains multiple query blocks, the iteration of each query block is scheduled based on a built-in order, and the rows operated by each query block are those generated by the last iteration or those generated by other query blocks since the last iteration ends.

Sample code:

WITH RECURSIVE cte1 (n) AS
(
  SELECT 1                /*The non-recursive part that retrieves a single row to generate the initial row set.*/
  UNION ALL
  SELECT n + 2 FROM cte WHERE n < 10     /*The recursive part that generates a new value greater than value n in the previous row set by 2 until n is not smaller than 10.*/
)
SELECT * FROM cte1;

Application conditions

The following syntax constraints apply to recursive CTE subqueries:

  • The recursive SELECT part must not contain the following structures:

    • Aggregate functions such as SUM()

    • Analytic functions

    • GROUP BY

    • ORDER BY

    • DISTINCT

  • The recursive SELECT part can reference the CTE name in the subquery of the FROM clause only once. The recursive part can reference another table and join it with the CTE. In this case, the CTE cannot be located on the right side of LEFT JOIN.

For a recursive CTE, the estimated cost displayed by using EXPLAIN represents the cost of each iteration, which may differ greatly from the total cost. The optimizer cannot predict the number of iterations because it cannot predict when the WHERE clause will become False.

Contact Us