A Common Table Expression (CTE) is a temporary result set that is used only within the statement that contains it. A CTE does not store data as a permanent object in the database. A CTE can be self-referential or referenced multiple times within the same query. This is a feature not available with derived tables.
Scenarios:
- A CTE can be reused multiple times within the same query to avoid redundant logic.
- A CTE can simplify recursive queries, such as those used to process hierarchical data.
- A CTE can break a complex query into smaller parts to make the query logic clearer and easier to understand.
OceanBase Database supports both recursive and non-recursive CTEs.
CTE syntax
A common table expression is an optional part of the DML statement syntax. It is defined by using the WITH clause. Multiple WITH clauses can be separated by commas. Each WITH clause contains a subquery that generates a result set and associates the result set with a name. The syntax is as follows:
WITH [RECURSIVE]
cte_name [(column_name [, column_name] ...)] AS (subquery)
[, cte_name [(column_name [, column_name] ...)] AS (subquery)] ...
Parameters
| Parameter | Description |
|---|---|
[RECURSIVE] |
Specifies whether to create a recursive CTE.
|
cte_name |
The name of the CTE, which can be referenced by tables in the WITH clause. |
column_name |
A list of selected column names that are used to alias the columns in the CTE. This enables you to use more readable column names in the main query. |
AS(subquery) |
The subquery that is used to generate the result set of the CTE. The parentheses after the AS keyword are mandatory. |
If the CTE name is followed by a name list in parentheses, the names in the list are column names. The number of names must be the same as the number of columns in the SELECT statement of the CTE. If no column names are specified, the first column name list in the AS(subquery) part is used.
Scenarios where the WITH clause can be used
You can use the WITH clause in the following scenarios:
At the start of a
SELECTstatement.WITH ... SELECT ...At the start of a subquery (including a derived table subquery).
SELECT ... WHERE id IN (WITH ... SELECT ...) ... SELECT * FROM (WITH ... SELECT ...) AS dt ...Immediately before the
SELECTclause in a statement that contains aSELECTclause.INSERT ... WITH ... SELECT ... REPLACE ... WITH ... SELECT ... CREATE TABLE ... WITH ... SELECT ... CREATE VIEW ... WITH ... SELECT ...
You can use only one WITH clause at the same level. If multiple WITH clauses are included in the WITH clause, separate them with commas.
WITH cte1 AS (...), cte2 AS (...) SELECT ...
The WITH clause can define one or more common table expressions (CTEs). However, each CTE name must be unique within the WITH clause. The following example is illegal because the CTE names are not unique:
WITH cte1 AS (...), cte1 AS (...) SELECT ...
Structure of a recursive CTE
A recursive CTE has the following structure:
If a
WITHclause contains a CTE that references itself, theWITHclause must start withWITH RECURSIVE. Otherwise, it does not need to start withRECURSIVE.A recursive CTE comprises two parts separated by
UNION [ALL]:SELECT ... -- Returns an initial set of rows. UNION ALL SELECT ... -- Returns additional rows and references the name of the CTE.The first
SELECTgenerates one or more initial rows for the CTE and does not reference the CTE name. The secondSELECTgenerates additional rows through recursion by using theFROMclause to reference the CTE name. The recursion ends when thisSELECTdoes not generate new rows. Therefore, a recursive CTE comprises a non-recursiveSELECTpart and a recursiveSELECTpart. EachSELECTpart can be multipleSELECTstatements joined together.The types of the columns in the result set of a CTE are inferred from the column types in the result set of the non-recursive
SELECTpart, and all columns can be null. The types of the columns in the result set of the recursiveSELECTpart are ignored for inference.The
FROMclause of the recursiveSELECTpart references only rows generated by the previous iteration. If the recursiveSELECTpart contains multiple query blocks, the iterations of the query blocks are scheduled in an unspecified order, and each query block operates on the row(s) generated by the previous iteration or by another query block after the last iteration.
Here is an example:
WITH RECURSIVE cte1 (n) AS
(
SELECT 1 /*Non-recursive part. It retrieves a single row to generate the initial row set.*/
UNION ALL
SELECT n + 2 FROM cte1 WHERE n < 10 /*Recursive part. It generates a new value that is 2 greater than the value of n in the previous row set. The process continues until n is no less than 10.*/
)
SELECT * FROM cte1;
Limitations
The recursive SELECT part of a recursive CTE must meet the following requirements:
It must not contain the following structures:
Aggregate functions such as
SUM()Window functions
GROUP BYORDER BYDISTINCT
It must reference a table included in the
FROMclause of the recursive CTE only once, namely, in theFROMclause of a subquery. It can reference a table not included in theWITHclause and join it with the CTE. If it uses a join, the CTE cannot be on the right side of aLEFT JOIN.
The cost estimates displayed in the EXPLAIN statement for a recursive CTE represent the cost of each iteration and can differ significantly from the total cost. The optimizer cannot estimate the number of iterations because it cannot determine when the condition in the WHERE clause becomes False.
- The
UNION [DISTINCT]andLIMITstructures are not supported between the recursiveSELECTpart and the non-recursiveSELECTpart.
Examples
The following example shows how to create a student hierarchy table that contains the IDs, names, and mentors of students to demonstrate the difference between recursive and non-recursive CTEs.
First, create the student hierarchy table and insert some data:
obclient> CREATE TABLE student (
student_id INT PRIMARY KEY,
name VARCHAR(100),
mentor_id INT,
FOREIGN KEY (mentor_id) REFERENCES student(student_id)
);
Query OK, 0 rows affected
obclient> INSERT INTO student (student_id, name, mentor_id) VALUES
(1, 'Alice', NULL),
(2, 'Bob', 1),
(3, 'Charlie', 1),
(4, 'David', 2),
(5, 'Eve', 3);
Query OK, 5 rows affected
In this data model, Alice is a top-level student who does not have a mentor. Bob and Charlie are Alice's students. David is Bob's student. Eve is Charlie's student.
Non-recursive CTE example
A non-recursive CTE does not reference itself. For example, if we want to select all students directly under Alice (the first-level students), we can use the following non-recursive CTE:
WITH Alice_Students AS (
SELECT * FROM student WHERE mentor_id = 1
)
SELECT * FROM Alice_Students;
The execution result is as follows:
+------------+---------+-----------+
| student_id | name | mentor_id |
+------------+---------+-----------+
| 2 | Bob | 1 |
| 3 | Charlie | 1 |
+------------+---------+-----------+
2 rows in set
Recursive CTE example
If we want to find all students, both direct and indirect, under Alice (students of all levels), we need to use a recursive CTE. A recursive CTE references itself to query students of subsequent levels.
WITH RECURSIVE Student_Hierarchy AS (
-- Anchor member: Select Alice as the starting point.
SELECT student_id, name, mentor_id FROM student WHERE mentor_id IS NULL
UNION ALL
-- Recursive member: Select direct students of students from the previous level.
SELECT s.student_id, s.name, s.mentor_id
FROM student s
INNER JOIN Student_Hierarchy sh ON s.mentor_id = sh.student_id
)
SELECT * FROM Student_Hierarchy;
This recursive CTE selects Alice as the starting point, then recursively selects direct students of each student, until no more students are found.
The execution result is as follows:
+------------+---------+-----------+
| student_id | name | mentor_id |
+------------+---------+-----------+
| 1 | Alice | NULL |
| 3 | Charlie | 1 |
| 2 | Bob | 1 |
| 5 | Eve | 3 |
| 4 | David | 2 |
+------------+---------+-----------+
5 rows in set
In this example, the recursive CTE enables us to traverse the hierarchical structure of students, while the non-recursive CTE can only select students of a fixed level.