Purpose
This function is the inverse of the continuous distribution model. It returns an interpolated value at the specified percentile for the given sort order. Null values are ignored in the calculation.
Note
- When used as an analytic function, you must use the
OVERclause to define the window for computation. It calculates over a set of rows and returns multiple values. - When used as an aggregate function, it aggregates over a set of rows and returns a single value. In this case, the
OVERclause is not required.
Syntax
PERCENTILE_CONT(percentile) WITHIN GROUP (ORDER BY expr [ DESC | ASC ])
[ OVER (query_partition_clause) ]
Parameters
| Parameter | Description |
|---|---|
| percentile | The percentile value, which is a constant of the numeric data type. The value ranges from [0,1]. Note The MEDIAN function is equivalent to setting percentile to 0.5. |
| expr | The expression for the sort order, which is of the numeric or datetime data type. Note expr must be a single expression involving column references. Multiple expressions are not allowed. |
| DESC | ASC | The sort order, which is optional.
|
| OVER | Use the OVER clause to define the window for computation. For more information, see Analytic Function Description. |
Note
After sorting the rows based on the sort order, the function calculates the row number using the specified percentile (P) and the number of non-null rows (N) with the formula RN = (1+ (P*(N-1)). The final result is computed through linear interpolation between the values at row numbers CRN = CEILING(RN) and FRN = FLOOR(RN). The final result will be as follows:
If (CRN = FRN = RN), the result is (value of expression from row at RN); otherwise, the result is:
(CRN - RN) * (value of expression for row at FRN) + (RN - FRN) * (value of expression for row at CRN).
Return type
The same data type as the expr parameter.
Examples
Assume that the tbl1 table has been created.
obclient> SELECT * FROM tbl1;
+------+------+------+------+
| COL1 | COL2 | COL3 | COL4 |
+------+------+------+------+
| 1 | A1 | 8 | 12 |
| 1 | A2 | 10 | 15 |
| 1 | A3 | 11 | 16 |
| 2 | B1 | 9 | 14 |
| 2 | B2 | 10 | 15 |
| 2 | B3 | 8 | 13 |
| 2 | B4 | 11 | 16 |
| 3 | C1 | 8 | 18 |
| 3 | C2 | 9 | 16 |
| 3 | C3 | 10 | 15 |
| 3 | C4 | 11 | 12 |
| 3 | C5 | 12 | 10 |
+------+------+------+------+
12 rows in set
Aggregate function example
Calculate the 10th percentile value of the col4 column.
obclient> SELECT PERCENTILE_CONT(0.1) WITHIN GROUP (ORDER BY col4) FROM tbl1;
+----------------------------------------------+
| PERCENTILE_CONT(0.1)WITHINGROUP(ORDERBYCOL4) |
+----------------------------------------------+
| 12 |
+----------------------------------------------+
1 row in set
Analytic function example
Group by the col1 column and calculate the 50th percentile value of the col4 column (i.e., the median of the col4 column).
obclient> SELECT col1,col4,PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY col4) OVER (PARTITION BY col1)
FROM tbl1;
+------+------+-------------------------------------------------------------------+
| COL1 | COL4 | PERCENTILE_CONT(0.5)WITHINGROUP(ORDERBYCOL4)OVER(PARTITIONBYCOL1) |
+------+------+-------------------------------------------------------------------+
| 1 | 12 | 15 |
| 1 | 15 | 15 |
| 1 | 16 | 15 |
| 2 | 14 | 14.5 |
| 2 | 15 | 14.5 |
| 2 | 13 | 14.5 |
| 2 | 16 | 14.5 |
| 3 | 18 | 15 |
| 3 | 16 | 15 |
| 3 | 15 | 15 |
| 3 | 12 | 15 |
| 3 | 10 | 15 |
+------+------+-------------------------------------------------------------------+
12 rows in set