Purpose
This function calculates the cumulative distribution of a value in a set of values, returning a value in the range (0,1]. It can be used as an aggregate or analytic function.
Note
- As an analytic function, it calculates the relative position of a specified value within a set of values. For a row
row, assuming the rows are sorted in ascending order, thecume_distofrowis the number of rows with values less than or equal torow's value, divided by the total number of rows (the entire query result set or partition) being calculated. - As an aggregate function, its parameters identify a single hypothetical row within each aggregate group. Therefore, they must all be constant expressions within each aggregate group. The constant parameter expressions and the expressions in the
ORDER BYclause of the aggregate function match by position. Therefore, the number of parameters must be the same, and their types must be compatible. If a value in the grouping sequence is repeated, the two repeated values are treated as one value.
Syntax
/*Aggregate syntax*/
CUME_DIST( expr[,expr]...)
WITHIN GROUP (ORDER BY expr_col [ DESC | ASC ][ NULLS { FIRST | LAST } ]
[,expr_col [ DESC | ASC ][ NULLS { FIRST | LAST } ]]...
)
/*Analytic syntax*/
CUME_DIST() OVER ([ query_partition_clause ] order_by_clause)
Parameters
| Parameter | Description |
|---|---|
| expr | Specifies the value to query. |
| expr_col | Specifies the column name corresponding to the value to query. |
| DESC | ASC | Specifies the sorting order, which is optional.
|
| NULLS { FIRST | LAST } | Specifies the position of NULL values in expr_col after sorting, which is optional.
|
| OVER | Use the OVER clause to define the window for calculation. For more information, see Analytic Function Description. |
Return type
The return type is NUMBER.
Examples
Assume that the emp_msg table has been created.
obclient> SELECT * FROM emp_msg;
+--------+--------+------+------+
| DEPTNO | ENAME | SAL | MGR |
+--------+--------+------+------+
| 10 | CLARK | 2750 | 7839 |
| 10 | KING | 5300 | NULL |
| 10 | MILLER | 1600 | 7782 |
| 20 | ADAMS | 1400 | 7788 |
| 20 | FORD | 3300 | 7566 |
| 20 | JONES | 3275 | 7839 |
| 20 | SCOTT | 3300 | 7566 |
| 20 | SMITH | 1100 | 7902 |
| 30 | ALLEN | 1900 | 7698 |
| 30 | BLAKE | 3150 | 7839 |
| 30 | JAMES | 1250 | 7698 |
| 30 | MARTIN | 1550 | 7698 |
| 30 | TURNER | 1800 | 7698 |
| 30 | WARD | 1550 | 7698 |
| 30 | SCLARK | 1750 | 7839 |
+--------+--------+------+------+
15 rows in set
Aggregate function example
Return the position of 3300 in the sal column.
obclient> SELECT CUME_DIST(3300) WITHIN GROUP (ORDER BY sal) FROM emp_msg;
+----------------------------------------+
| CUME_DIST(3300)WITHINGROUP(ORDERBYSAL) |
+----------------------------------------+
| .9375 |
+----------------------------------------+
1 row in set
Analytic function example
Group by the deptno column and sort the sal column in descending order, then return the position of each value in the sal column.
obclient> SELECT deptno,ename,sal,
CUME_DIST ( ) over (PARTITION BY deptno ORDER BY sal DESC ) "RANK"
FROM emp_msg;
+--------+--------+------+-------------------------------------------+
| DEPTNO | ENAME | SAL | RANK |
+--------+--------+------+-------------------------------------------+
| 10 | KING | 5300 | .3333333333333333333333333333333333333333 |
| 10 | CLARK | 2750 | .6666666666666666666666666666666666666667 |
| 10 | MILLER | 1600 | 1 |
| 20 | FORD | 3300 | .4 |
| 20 | SCOTT | 3300 | .4 |
| 20 | JONES | 3275 | .6 |
| 20 | ADAMS | 1400 | .8 |
| 20 | SMITH | 1100 | 1 |
| 30 | BLAKE | 3150 | .1428571428571428571428571428571428571429 |
| 30 | ALLEN | 1900 | .2857142857142857142857142857142857142857 |
| 30 | TURNER | 1800 | .4285714285714285714285714285714285714286 |
| 30 | SCLARK | 1750 | .5714285714285714285714285714285714285714 |
| 30 | MARTIN | 1550 | .8571428571428571428571428571428571428571 |
| 30 | WARD | 1550 | .8571428571428571428571428571428571428571 |
| 30 | JAMES | 1250 | 1 |
+--------+--------+------+-------------------------------------------+
15 rows in set