Common table expressions|V4.2.2| docs|Distributed Database

Common table expressions

Last Updated：2025-12-02 02:50:56 Updated

A common table expression (CTE) is a temporary result set that is used only within the statement that defines it. A CTE does not actually store data but calculates and returns data during the execution of a query. Unlike derived tables, CTEs can be self-referential and can be referenced multiple times in the same query.

Scenarios:

CTEs can reuse the same subquery multiple times without rewriting the same logic multiple times.
CTEs can simplify recursive queries, such as those used to query hierarchical data.
CTEs can break complex queries into smaller parts to simplify the query logic.

OceanBase Database supports both recursive and non-recursive CTEs.

CTE syntax

A common table expression is an optional part of the DML statement syntax. It is defined by using the WITH clause. Multiple WITH clauses can be separated with commas. Each WITH clause contains a subquery that generates a result set and associates the result set with a name. The syntax is as follows:

 WITH [RECURSIVE]
      cte_name [(column_name [, column_name] ...)] AS (subquery)
      [, cte_name [(column_name [, column_name] ...)] AS (subquery)] ...

Arguments

Argument	Description
`[RECURSIVE]`	The keyword to specify whether to create a recursive CTE. Recursive CTE: The `RECURSIVE` keyword is specified. A recursive common table expression (CTE) is an expression where a CTE references itself. It is used to process data that has a recursive structure, such as tree structures and graphs. A recursive CTE includes an initial query (starting condition) in its definition and recursively operates on the result of the initial query until the stopping condition is met. Non-recursive CTE: The `RECURSIVE` keyword is not specified. A non-recursive CTE is used to generate a one-time temporary result set that does not involve recursion.
`cte_name`	The name of the CTE, which can be referenced by tables in the WITH clause.
`column_name`	The selected column names that are used to alias the columns in the CTE. This enables the use of more readable column names in the main query.
`AS(subquery)`	The subquery that is used to generate the result set of the CTE. The AS keyword must be followed by a pair of parentheses.

If the CTE name is followed by a pair of parentheses containing a name list, the names in the parentheses are column names. The number of column names must be the same as the number of columns in the SELECT statement of the CTE. If no column names are specified, the column names in the first SELECT list of the AS(subquery) part are used.

Scenarios where the WITH clause can be used

You can use the WITH clause in the following scenarios:

At the beginning of a SELECT statement.
```
WITH ... SELECT ...
```

At the beginning of a subquery (including a derived table subquery).

SELECT ... WHERE id IN (WITH ... SELECT ...) ...
SELECT * FROM (WITH ... SELECT ...) AS dt ...

Immediately before the SELECT clause in a statement that contains a SELECT statement.

INSERT ... WITH ... SELECT ...
REPLACE ... WITH ... SELECT ...
CREATE TABLE ... WITH ... SELECT ...
CREATE VIEW ... WITH ... SELECT ...

You can use at most one WITH clause at the same level. If multiple WITH clauses are included in the WITH clause, separate them with commas.

WITH cte1 AS (...), cte2 AS (...) SELECT ...

The WITH clause can define one or more common table expressions (CTEs). However, each CTE name must be unique within the WITH clause. The following example is illegal:

WITH cte1 AS (...), cte1 AS (...) SELECT ...

Structure of a recursive CTE

A recursive CTE has the following structure:

If the WITH clause contains a CTE that references itself, the WITH clause must start with WITH RECURSIVE. Otherwise, it does not need to start with RECURSIVE.
A recursive CTE consists of two parts separated by UNION [ALL]:
```
SELECT ...      -- Returns an initial row set.
UNION ALL
SELECT ...      -- Returns additional row sets.
```
The first SELECT statement generates one or more initial rows for the CTE and does not reference the CTE name. The second SELECT statement generates additional rows through recursion by referencing the CTE name in its FROM clause. The recursion ends when the second SELECT statement does not generate new rows. Therefore, a recursive CTE consists of a non-recursive SELECT part and a recursive SELECT part. Each SELECT part can be multiple SELECT statements joined together.
The types of the columns in the result set of a CTE are inferred from the column types in the result set of the non-recursive SELECT part. All columns can be null. The types of the columns in the recursive SELECT part are ignored.
The rows generated by the recursive part are operated on only by the current iteration. If the recursive part contains multiple query blocks, the iterations of the query blocks are scheduled in an unspecified order, and a query block can operate on the rows generated by the previous iteration of itself or other query blocks since the last iteration ended.

Here is an example:

WITH RECURSIVE cte1 (n) AS
(
  SELECT 1                /*Non-recursive part. It retrieves a single row to generate the initial row set.*/
  UNION ALL
  SELECT n + 2 FROM cte1 WHERE n < 10     /*Recursive part. It generates a new value that is 2 greater than the value of n in the previous row set until the value of n is not less than 10.*/
)
SELECT * FROM cte1;

Limitations

The recursive SELECT part of a recursive CTE query must meet the following requirements:

It must not contain the following elements:
- Aggregate functions such as SUM()
- Window functions
- GROUP BY
- ORDER BY
- DISTINCT
It must reference a table only through a subquery in its FROM clause. It can join tables and reference CTEs. If it joins a CTE, the CTE cannot be on the right side of a LEFT JOIN.

The cost estimation displayed in the EXPLAIN statement for a recursive CTE represents the cost of each iteration and can differ significantly from the total cost. The optimizer cannot predict the number of iterations because it cannot predict when the WHERE condition becomes false.

The UNION [DISTINCT] and LIMIT structures are not supported between the recursive SELECT part and the non-recursive SELECT part.

Examples

The following example shows how to create a student hierarchy table that contains the IDs and names of students and their mentor IDs. This example is used to illustrate the differences between recursive CTEs and non-recursive CTEs.

First, create a student table and insert some data into the table:

obclient> CREATE TABLE student (
    student_id INT PRIMARY KEY,
    name VARCHAR(100),
    mentor_id INT,
    FOREIGN KEY (mentor_id) REFERENCES student(student_id)
);
Query OK, 0 rows affected

obclient> INSERT INTO student (student_id, name, mentor_id) VALUES
(1, 'Alice', NULL),
(2, 'Bob', 1),
(3, 'Charlie', 1),
(4, 'David', 2),
(5, 'Eve', 3);
Query OK, 5 rows affected

In this data model, Alice is a top-level student with no mentor. Bob and Charlie are Alice's students. David is Bob's student. Eve is Charlie's student.

Non-recursive CTE example

A non-recursive CTE does not reference itself. For example, if you want to select all students directly mentored by Alice (i.e., the first-level students), you can use the following non-recursive CTE:

WITH Alice_Students AS (
    SELECT * FROM student WHERE mentor_id = 1
)
SELECT * FROM Alice_Students;

The execution result is as follows:

+------------+---------+-----------+
| student_id | name    | mentor_id |
+------------+---------+-----------+
|          2 | Bob     |         1 |
|          3 | Charlie |         1 |
+------------+---------+-----------+
2 rows in set

Recursive CTE example

If you want to find all students directly or indirectly mentored by Alice (i.e., all students at all levels), you must use a recursive CTE. A recursive CTE references itself to query students at subsequent levels.

WITH RECURSIVE Student_Hierarchy AS (
    -- Anchor member: Select Alice as the starting point.
    SELECT student_id, name, mentor_id FROM student WHERE mentor_id IS NULL
    UNION ALL
    -- Recursive member: Select direct students of students at the previous level.
    SELECT s.student_id, s.name, s.mentor_id
    FROM student s
    INNER JOIN Student_Hierarchy sh ON s.mentor_id = sh.student_id
)
SELECT * FROM Student_Hierarchy;

This recursive CTE selects Alice as the starting point, then recursively selects direct students of each student, until no more students are found.

The execution result is as follows:

+------------+---------+-----------+
| student_id | name    | mentor_id |
+------------+---------+-----------+
|          1 | Alice   |      NULL |
|          3 | Charlie |         1 |
|          2 | Bob     |         1 |
|          5 | Eve     |         3 |
|          4 | David   |         2 |
+------------+---------+-----------+
5 rows in set

In this example, the recursive CTE enables you to traverse the hierarchical structure of students. A non-recursive CTE can only select students at a fixed level.