A Common Table Expression (CTE) is a temporary named result set that is used only within the statement that contains it. OceanBase Database supports both recursive and non-recursive CTEs.
CTE syntax
A common table expression is an optional part of the DML statement syntax. It is defined by using the WITH clause. Multiple clauses can be separated by commas. Each clause provides a name and a subquery to generate a result set. The syntax is as follows:
with_clause:
WITH [RECURSIVE]
cte_name [(column_name [, column_name] ...)] AS (subquery)
[, cte_name [(column_name [, column_name] ...)] AS (subquery)] ...
The parameters are described as follows:
cte_nameis the name of the common table expression. It can be referenced by a table in theWITHclause.AS(subquery)is referred to as the "CTE subquery," which is used to generate the result set of the CTE. AS must be followed by a pair of parentheses.
If the CTE subquery references its own name, the common table expression is recursive. If all the CTEs in the WITH clause are recursive, the RECURSIVE keyword must be included.
In the following example, the WITH clause is used to define common table expressions named cte1 and cte2. Then, the SELECT statement after the WITH clause is used to reference the common table expressions:
WITH
cte1 AS (SELECT col1, col2 FROM tbl1),
cte2 AS (SELECT col3, col4 FROM tbl2)
SELECT col2, col4 FROM cte1 JOIN cte2
WHERE cte1.col1 = cte2.col3;
If a name of a CTE is followed by a pair of parentheses containing a name list, the names in the parentheses are the column names. The number of names in the parentheses must be equal to the number of columns in the result set. Otherwise, the column names are taken from the first select list in the AS(subquery) part.
Scenarios supported by the WITH clause
The WITH clause is supported at the beginning of the following statements:
SELECTstatementsWITH ... SELECT ...Subqueries (including derived table subqueries)
SELECT ... WHERE id IN (WITH ... SELECT ...) ... SELECT * FROM (WITH ... SELECT ...) AS dt ...Statements that contain
SELECTstatements, immediately preceding theSELECTstatement.INSERT ... WITH ... SELECT ... REPLACE ... WITH ... SELECT ... CREATE TABLE ... WITH ... SELECT ... CREATE VIEW ... WITH ... SELECT ...
Only one WITH clause is allowed at the same level. Multiple clauses can be separated by commas.
WITH cte1 AS (...), cte2 AS (...) SELECT ...
The WITH clause can define one or more common table expressions. However, each CTE name must be unique within the WITH clause. The following example is illegal:
WITH cte1 AS (...), cte1 AS (...) SELECT ...
Recursive CTEs
A recursive common table expression (CTE) is a CTE that contains a subquery that references the CTE itself.
Structure of a recursive CTE
A recursive CTE has the following structure:
If the CTE in the
WITHclause references itself, theWITHclause must start withWITH RECURSIVE. Otherwise,RECURSIVEis not required.The recursive CTE's subquery consists of two
SELECTstatements separated byUNION [ALL]orUNION DISTINCT:SELECT ... -- Returns an initial set of rows. UNION ALL SELECT ... -- Returns additional rows and references the CTE name in its `FROM` clause for recursion.The first
SELECTstatement generates one or more initial rows for the CTE and does not reference the CTE name. The secondSELECTstatement generates additional rows and references the CTE name in itsFROMclause for recursion. When this statement no longer generates new rows, the recursion ends. Therefore, a recursive CTE consists of a non-recursiveSELECTstatement and a recursiveSELECTstatement. EachSELECTstatement can be a combination of multipleSELECTstatements.The data types of the result columns of a recursive CTE are inferred from the column types of the initial rows generated by the non-recursive
SELECTstatement. All columns can be null. The types of the result columns are determined without considering the recursiveSELECTstatement. IfUNION DISTINCTis used to separate the non-recursive and recursiveSELECTstatements, duplicate rows are eliminated. This can prevent infinite loops in queries that compute transitive closures.The recursive part processes only the rows generated by the previous iteration. If the recursive part contains multiple query blocks, the iterations of each query block are scheduled in an unspecified order, and each query block processes the rows generated by the previous iteration of itself or other query blocks since the last iteration ended.
Here is an example:
WITH RECURSIVE cte1 (n) AS
(
SELECT 1 /*Non-recursive part, which retrieves a single row to generate the initial row set.*/
UNION ALL
SELECT n + 2 FROM cte1 WHERE n < 10 /*Recursive part, which generates a new value that is 2 greater than the values in the preceding row set until the new value is not less than 10.*/
)
SELECT * FROM cte1;
Prerequisites
The subquery in a recursive CTE must meet the following requirements:
The recursive
SELECTstatement must not contain any of the following structures:Aggregate functions such as
SUM()Window functions
GROUP BYORDER BYDISTINCT
The recursive
SELECTstatement must reference only once the subquery in itsFROMclause. It can reference tables other than the CTE and join with the CTE. If used in such a join, the CTE cannot be located on the right side of aLEFT JOIN.
For a recursive CTE, the cost estimates displayed by the EXPLAIN statement represent the cost of each iteration and can differ significantly from the total cost. The optimizer cannot predict the number of iterations because it cannot predict when the WHERE condition becomes false.