This topic describes the basic concept and syntax of lateral derived tables, and provides some examples to introduce how to use lateral derived tables.
Concept
A derived table is a subquery used in a FROM clause. The result set of this subquery is temporarily used as a table in outer queries. A derived table is usually used to group and aggregate data, or used to create a dataset that meets the specified conditions for use in the main query.
Lateral derived tables are a special type of derived tables. You can use the LATERAL keyword to specify a lateral derived table so that it can reference fields in another table or derived table previously defined in the same FROM clause. This way, a subquery in the lateral derived table can reference tables defined in the same FROM clause and access column values in these tables.
A lateral derived table differs from a normal derived table in that it can reference columns in tables that are previously defined in the same FROM clause.
Syntax
SELECT select_list
FROM table_name1,
LATERAL (SELECT select_list
FROM table_name2
WHERE table_name2.col_name = table_name1.col_name) AS lateral_derived_table_name
[...];
The parameters are described as follows:
| Parameter | Description |
|---|---|
| select_list | The list of columns to retrieve, which can be column names, expressions, or aggregate functions. Separate multiple columns with commas (,). |
| table_name1 | The primary table to query. |
| LATERAL | A lateral join subquery, namely, a lateral derived table. |
| table_name2 | A secondary table referenced in the LATERAL subquery to provide related additional information for each row in the primary table specified by table_name1. |
| lateral_derived_table_name | The alias of the secondary table, which is referenced in subsequent queries. |
| [...] | An optional query clause, such as a WHERE clause. |
Examples
Create a test table and add test data to the table
Create a table named
students.CREATE TABLE students ( id NUMBER PRIMARY KEY, name VARCHAR2(50) NOT NULL, age NUMBER );Insert 3 data records into the
studentstable.INSERT INTO students VALUES (1, 'name1', 20), (2, 'name2', 22), (3, 'name3', 21);Create a table named
scores.CREATE TABLE scores ( id NUMBER PRIMARY KEY, student_id NUMBER, subject VARCHAR2(50) NOT NULL, score DECIMAL(5, 2), FOREIGN KEY (student_id) REFERENCES students(id) );Insert 9 data records into the
scorestable.INSERT INTO scores VALUES (1, 1, 'A', 86.5), (2, 1, 'B', 90.0), (3, 1, 'C', 91.5), (4, 2, 'A', 86.0), (5, 2, 'B', 92.0), (6, 2, 'C', 89.5), (7, 3, 'A', 93.0), (8, 3, 'B', 92.5), (9, 3, 'C', 85.0);
Query the test data
Query the students and scores tables for the name, average score, and highest score of each student.
Q1: Use multiple subqueries to obtain required data. Perform a group operation and an aggregate operation on the scores table. Scan the scores table twice to calculate the average score and highest score of each student. Then, use the WHERE clause to join the results with the students table.
SELECT st.name, sc.avg_score, scs.max_score
FROM students st,
(SELECT student_id, AVG(score) avg_score
FROM scores
GROUP BY student_id) sc,
(SELECT student_id, MAX(score) max_score
FROM scores
GROUP BY student_id) scs
WHERE sc.student_id = st.id
AND scs.student_id = st.id;
The result is as follows:
+-------+-------------------------------------------+-----------+
| NAME | AVG_SCORE | MAX_SCORE |
+-------+-------------------------------------------+-----------+
| name1 | 89.33333333333333333333333333333333333333 | 91.5 |
| name2 | 89.16666666666666666666666666666666666667 | 92 |
| name3 | 90.16666666666666666666666666666666666667 | 93 |
+-------+-------------------------------------------+-----------+
3 rows in set
Q2: Use the LATERAL keyword to obtain required data. Use the LATERAL keyword (lateral derived table) to calculate the average score and highest score of each student in an SQL statement, and scan the scores table once for the ID of each student.
SELECT st.name, ld_tbl.avg_score, ld_tbl.max_score
FROM students st,
LATERAL (SELECT AVG(score) avg_score, MAX(score) max_score
FROM scores sc
WHERE sc.student_id = st.id) ld_tbl;
The result is as follows:
+-------+-------------------------------------------+-----------+
| NAME | AVG_SCORE | MAX_SCORE |
+-------+-------------------------------------------+-----------+
| name1 | 89.33333333333333333333333333333333333333 | 91.5 |
| name2 | 89.16666666666666666666666666666666666667 | 92 |
| name3 | 90.16666666666666666666666666666666666667 | 93 |
+-------+-------------------------------------------+-----------+
3 rows in set
Q2 has higher performance than Q1 in processing a large dataset because it avoids repeatedly scanning the scores table. The syntax of Q2 is clearer and more compact and easier to understand and maintain.