Purpose
This function is an inverse distribution function that assumes a continuous distribution model. You can call this function to return an interpolated value within a given percentile based on the specified percentile value and sorting specification. Nulls are ignored in the calculation.
Note
- When you use it as an analytic function, use the
OVERclause to define a window over the data on which the function operates. The function operates on a group of rows to return a list of values.- When you use it as an aggregate function, the function operates on a set of rows and returns a single value. You do not need to add the
OVERclause.
Syntax
PERCENTILE_CONT(percentile) WITHIN GROUP (ORDER BY expr [ DESC ASC ])
[ OVER (query_partition_clause) ]
Parameters
| Parameter | Description |
|---|---|
| percentile | Specifies the percentile value. It is a constant of the numeric data type in the range of [0,1]. Note The MEDIAN function is a special case where percentile equals 0.5. |
| expr | Specifies the sorting specification. The data type is numeric or datetime. Notice expr must be a single expression involving column references. Multiple expressions are not allowed. |
| DESC ASC | (Optional) The sorting method of the list.
|
| OVER | You can use the OVER clause to define a window over the data on which the function operates. For more information about analytic functions, see Analytic functions. |
Note
This function sorts the data according to the sorting specification, and then calculates the row number based on the given percentile
(P)and non-empty rows(N). The formula for calculating the row number isRN = (1+ (P*(N-1)). The final result of this function is calculated upon linear interpolation on row values between the row numberCRN = CEILING(RN)and the row numberFRN = FLOOR(RN). The final result is determined based on the following rules:
- If
(CRN = FRN = RN), the result is(value of expression from row at RN).- Otherwise, the result is:
(CRN - RN) * (value of expression for row at FRN) + (RN - FRN) * (value of expression for row at CRN)
Return types
This function returns the same value type as the expr parameter.
Examples
Assume that you have created the table tbl1.
obclient> SELECT * FROM tbl1;
+------+------+------+------+
COL1 COL2 COL3 COL4
+------+------+------+------+
1 A1 8 12
1 A2 10 15
1 A3 11 16
2 B1 9 14
2 B2 10 15
2 B3 8 13
2 B4 11 16
3 C1 8 18
3 C2 9 16
3 C3 10 15
3 C4 11 12
3 C5 12 10
+------+------+------+------+
12 rows in set
Aggregate example
Calculate the value at the 10th percentile in the col4 column.
obclient> SELECT PERCENTILE_CONT(0.1) WITHIN GROUP (ORDER BY col4) FROM tbl1;
+----------------------------------------------+
PERCENTILE_CONT(0.1)WITHINGROUP(ORDERBYCOL4)
+----------------------------------------------+
12
+----------------------------------------------+
1 row in set
Analytic example
Group data by the col1 column. Calculate the value at the 50th percentile in the col4 column, that is the median of the col4 column.
obclient> SELECT col1,col4,PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY col4) OVER (PARTITION BY col1)
FROM tbl1;
+------+------+-------------------------------------------------------------------+
COL1 COL4 PERCENTILE_CONT(0.5)WITHINGROUP(ORDERBYCOL4)OVER(PARTITIONBYCOL1)
+------+------+-------------------------------------------------------------------+
1 12 15
1 15 15
1 16 15
2 14 14.5
2 15 14.5
2 13 14.5
2 16 14.5
3 18 15
3 16 15
3 15 15
3 12 15
3 10 15
+------+------+-------------------------------------------------------------------+
12 rows in set