A hierarchical query is a special query that can present hierarchical data in a hierarchical order.
Hierarchical data refers to data with a hierarchical relationship in relational tables. This type of relationship is very common in our daily life. Here are some examples:
The relationship between team leaders and team members
The relationship between upper-level and lower-level departments in enterprises
The relationship between source and destination web pages in page jumping scenarios
Syntax
The syntax of SELECT for a simple query is as follows:
SELECT [level], column, expr... FROM table [WHERE condition] [ START WITH start_expression ]
CONNECT BY [NOCYCLE] { PRIOR child_expr = parent_expr | parent_expr = PRIOR child_expr }
[ ORDER SIBLINGS BY ...] [ GROUP BY ... ] [ HAVING ... ] [ ORDER BY ... ]
The following table describes the parameters in the syntax.
| Parameter | Description |
|---|---|
| level | The level, namely, the hierarchy of the node. It is a pseudocolumn. Counting from the start point of a query, the level starts from 1, and so forth. |
| CONNECT_BY_ISLEAF | Indicates whether the current row is a leaf node. It is a pseudocolumn. Valid values:
|
| CONNECT_BY_ISCYCLE | Indicates whether the current row is in a cycle. It is a pseudocolumn. Valid values:
|
| CONNECT_BY_ROOT | CONNECT_BY_ROOT is a unary operator, indicating that the column in the parameter comes from the root node of the hierarchical query. It has the same precedence as the unary operators + and -. |
| condition | The condition. |
| CONNECT BY | Indicates how to determine the parent-child relationship. An equality expression is usually used. Other expressions are also supported. |
| START WITH | The root row in the hierarchical query. |
| PRIOR | PRIOR is a unary operator, indicating that the column in the parameter comes from the parent row. It has the same precedence as the unary operators + and -. |
| NOCYCLE | If this keyword is specified, the result can still be returned even if it contains a cycle. The CONNECT_BY_ISCYCLE pseudocolumn can be used to specify where the cycle occurs. Otherwise, an error is returned. |
| ORDER SIBLINGS BY | The sorting order for rows of siblings of the same parent. |
Execution process
Before you take use of a hierarchical query, you must understand its execution process. The following describes the general execution process of a hierarchical query:
Execute the
SCANorJOINoperation after theFROMclause.Generate a hierarchical relationship based on
START WITHandCONNECT BY.The procedure for generating a hierarchical relationship is as follows:
Obtain the root rows based on the expression in
START WITH.Select the child rows of each root row based on the expression in
CONNECT BY.
Use the child rows generated in Step 2 as the new root rows and generate child rows for them. Repeat these steps cyclically until no new rows are generated.
Execute the rest clauses such as
WHERE,GROUP, andORDER BYbased on the general query execution process.
Examples
Create a table named emp and insert data into the emp_id, position, and mgr_id columns of the table.
CREATE TABLE emp(emp_id INT,position VARCHAR(50),mgr_id INT);
INSERT INTO emp VALUES (1,'Global manager',NULL);
INSERT INTO emp VALUES (2,'Manager of European region',1);
INSERT INTO emp VALUES (3,'Manager of Asia-Pacific region',1);
INSERT INTO emp VALUES (4,'Manager of Americas region',1);
INSERT INTO emp VALUES (5,'Manager of Italy region',2);
INSERT INTO emp VALUES (6,'Manager of France region',2);
INSERT INTO emp VALUES (7,'Manager of China region',3);
INSERT INTO emp VALUES (8,'Manager of South Korea region',3);
INSERT INTO emp VALUES (9,'Manager of Japan region',3);
INSERT INTO emp VALUES (10,'Manager of US region',4);
INSERT INTO emp VALUES (11,'Manager of Canada region',4);
INSERT INTO emp VALUES (12,'Manager of Beijing, China region',7);
In the preceding table, the position column has a clear hierarchical relationship.
Execute the following statement to present the results in a hierarchical structure:
obclient> SELECT emp_id, mgr_id, position, level FROM emp
START WITH mgr_id IS NULL CONNECT BY PRIOR emp_id = mgr_id;
+--------+--------+----------------------------------+-------+
| EMP_ID | MGR_ID | POSITION | LEVEL |
+--------+--------+----------------------------------+-------+
| 1 | NULL | Global manager | 1 |
| 2 | 1 | Manager of Europe region | 2 |
| 5 | 2 | Manager of Italy region | 3 |
| 6 | 2 | Manager of France region | 3 |
| 3 | 1 | Manager of Asia-Pacific region | 2 |
| 7 | 3 | Manager of China region | 3 |
| 12 | 7 | Manager of Beijing, China region | 4 |
| 8 | 3 | Manager of South Korea region | 3 |
| 9 | 3 | Manager of Japan region | 3 |
| 4 | 1 | Manager of Americas region | 2 |
| 10 | 4 | Manager of US region | 3 |
| 11 | 4 | Manager of Canada region | 3 |
+--------+--------+----------------------------------+-------+
12 rows in set
Execute the following statement to query the hierarchical structure of the Asia-Pacific region:
obclient> SELECT emp_id, mgr_id, position, level FROM emp
START WITH position = 'Manager of Asia-Pacific region' CONNECT BY PRIOR emp_id = mgr_id;
+--------+--------+----------------------------------+-------+
| EMP_ID | MGR_ID | POSITION | LEVEL |
+--------+--------+----------------------------------+-------+
| 3 | 1 | Manager of Asia-Pacific region | 1 |
| 7 | 3 | Manager of China region | 2 |
| 12 | 7 | Manager of Beijing, China region | 3 |
| 8 | 3 | Manager of South Korea region | 2 |
| 9 | 3 | Manager of Japan region | 2 |
+--------+--------+----------------------------------+-------+
5 rows in set