Common Table Expressions

2025-12-04 05:37:52  Updated

A Common Table Expression (CTE) is a temporary named result set that is used only within the statement that contains it. OceanBase Database supports both recursive and non-recursive CTEs.

CTE syntax

A common table expression is an optional part of the DML statement syntax. It is defined by using the WITH clause. Multiple clauses can be separated by commas. Each clause provides a name and a subquery to generate a result set. The syntax is as follows:

with_clause:
    WITH [RECURSIVE]
        cte_name [(column_name [, column_name] ...)] AS (subquery)
        [, cte_name [(column_name [, column_name] ...)] AS (subquery)] ...

The parameters are described as follows:

  • cte_name is the name of the common table expression. It can be referenced by a table in the WITH clause.

  • AS(subquery) is referred to as the "CTE subquery," which is used to generate the result set of the CTE. AS must be followed by a pair of parentheses.

If the CTE subquery references its own name, the common table expression is recursive. If all the CTEs in the WITH clause are recursive, the RECURSIVE keyword must be included.

In the following example, the WITH clause is used to define common table expressions named cte1 and cte2. Then, the SELECT statement after the WITH clause is used to reference the common table expressions:

WITH
   cte1 AS (SELECT col1, col2 FROM tbl1),
   cte2 AS (SELECT col3, col4 FROM tbl2)
SELECT col2, col4 FROM cte1 JOIN cte2
WHERE cte1.col1 = cte2.col3;

If a name of a CTE is followed by a pair of parentheses containing a name list, the names in the parentheses are the column names. The number of names in the parentheses must be equal to the number of columns in the result set. Otherwise, the column names are taken from the first select list in the AS(subquery) part.

Scenarios supported by the WITH clause

The WITH clause is supported at the beginning of the following statements:

  • SELECT statements

    WITH ... SELECT ...
    
  • Subqueries (including derived table subqueries)

    SELECT ... WHERE id IN (WITH ... SELECT ...) ...
    SELECT * FROM (WITH ... SELECT ...) AS dt ...
    
  • Statements that contain SELECT statements, immediately preceding the SELECT statement.

    INSERT ... WITH ... SELECT ...
    REPLACE ... WITH ... SELECT ...
    CREATE TABLE ... WITH ... SELECT ...
    CREATE VIEW ... WITH ... SELECT ...
    

Only one WITH clause is allowed at the same level. Multiple clauses can be separated by commas.

WITH cte1 AS (...), cte2 AS (...) SELECT ...

The WITH clause can define one or more common table expressions. However, each CTE name must be unique within the WITH clause. The following example is illegal:

WITH cte1 AS (...), cte1 AS (...) SELECT ...

Recursive CTEs

A recursive common table expression (CTE) is a CTE that contains a subquery that references the CTE itself.

Structure of a recursive CTE

A recursive CTE has the following structure:

  • If the CTE in the WITH clause references itself, the WITH clause must start with WITH RECURSIVE. Otherwise, RECURSIVE is not required.

  • The recursive CTE's subquery consists of two SELECT statements separated by UNION [ALL] or UNION DISTINCT:

    SELECT ...      -- Returns an initial set of rows.
    UNION ALL
    SELECT ...      -- Returns additional rows and references the CTE name in its `FROM` clause for recursion.
    

    The first SELECT statement generates one or more initial rows for the CTE and does not reference the CTE name. The second SELECT statement generates additional rows and references the CTE name in its FROM clause for recursion. When this statement no longer generates new rows, the recursion ends. Therefore, a recursive CTE consists of a non-recursive SELECT statement and a recursive SELECT statement. Each SELECT statement can be a combination of multiple SELECT statements.

  • The data types of the result columns of a recursive CTE are inferred from the column types of the initial rows generated by the non-recursive SELECT statement. All columns can be null. The types of the result columns are determined without considering the recursive SELECT statement. If UNION DISTINCT is used to separate the non-recursive and recursive SELECT statements, duplicate rows are eliminated. This can prevent infinite loops in queries that compute transitive closures.

  • The recursive part processes only the rows generated by the previous iteration. If the recursive part contains multiple query blocks, the iterations of each query block are scheduled in an unspecified order, and each query block processes the rows generated by the previous iteration of itself or other query blocks since the last iteration ended.

Here is an example:

WITH RECURSIVE cte1 (n) AS
(
  SELECT 1                /*Non-recursive part, which retrieves a single row to generate the initial row set.*/
  UNION ALL
  SELECT n + 2 FROM cte1 WHERE n < 10     /*Recursive part, which generates a new value that is 2 greater than the values in the preceding row set until the new value is not less than 10.*/
)
SELECT * FROM cte1;

Prerequisites

The subquery in a recursive CTE must meet the following requirements:

  • The recursive SELECT statement must not contain any of the following structures:

    • Aggregate functions such as SUM()

    • Window functions

    • GROUP BY

    • ORDER BY

    • DISTINCT

  • The recursive SELECT statement must reference only once the subquery in its FROM clause. It can reference tables other than the CTE and join with the CTE. If used in such a join, the CTE cannot be located on the right side of a LEFT JOIN.

For a recursive CTE, the cost estimates displayed by the EXPLAIN statement represent the cost of each iteration and can differ significantly from the total cost. The optimizer cannot predict the number of iterations because it cannot predict when the WHERE condition becomes false.

Contact Us