Purpose
This function returns the variance of a specified column. It can be used as an aggregate or analytic function.
Note
- When used as an analytic function, you must use the OVER clause to define the window for computation. The function computes the variance over the window of each row and returns one value per row.
- When used as an aggregate function, it aggregates a set of rows and returns a single value. In this case, the OVER clause is not required.
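For reference, the results in the Examples section of this topic are consistent with the sample variance (divisor n - 1), with 0 returned when the set contains a single row. This is an observation drawn from those results rather than a formal definition; the symbols x_i (the values of expr), n (the number of rows), and x-bar (their mean) are notation introduced here for illustration.

```latex
s^2 \;=\; \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2
    \;=\; \frac{\sum_{i=1}^{n} x_i^2 \;-\; \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2}{n-1}

% Two earliest salaries in the example table (1700 and 1400), mean 1550:
s^2 = \frac{(1700-1550)^2 + (1400-1550)^2}{2-1} = 45000

% All eight salaries: sum = 70300, sum of squares = 871990000:
s^2 = \frac{871990000 - 70300^2/8}{8-1} = \frac{254228750}{7} \approx 36318392.857
```

Both worked values match the corresponding rows of the analytic function example (45000 for the first two rows, 36318392.857... once all eight rows are in the window).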
Syntax
VARIANCE([ DISTINCT | UNIQUE | ALL ] expr) [ OVER (analytic_clause) ]
Parameters
| Parameter | Description |
|---|---|
| DISTINCT \| UNIQUE \| ALL | Specifies whether to eliminate duplicates. This is an optional parameter with a default value of ALL. |
| expr | A numeric data type or any expression that can be implicitly converted to a numeric data type. |
| OVER | Defines the window for computation using the OVER clause. For more information, see Analytic Function Description. |
Notice
If you specify the DISTINCT or UNIQUE keyword, the order_by_clause and windowing_clause cannot be specified in the analytic_clause.
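The restriction above can be sketched as follows, assuming the employees table shown in the Examples section. With DISTINCT, the analytic_clause contains only a partition clause. The column alias is illustrative and no output is shown because the query has not been run here.

```sql
-- DISTINCT (or UNIQUE) in the analytic form: the analytic_clause may not
-- contain an order_by_clause or a windowing_clause, so only PARTITION BY is used.
-- Duplicate salaries within a department are counted once.
SELECT department_id,
       last_name,
       salary,
       VARIANCE(DISTINCT salary) OVER (PARTITION BY department_id) "Distinct variance"
FROM employees;
```

Adding ORDER BY hiredate inside the OVER clause of this query would violate the restriction stated in the notice.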
Return type
The return type is the same as the data type of the expr parameter.
Examples
Assume that the employees table has been created.
obclient> SELECT * FROM employees;
+---------------+-----------+------------+--------+
| DEPARTMENT_ID | LAST_NAME | HIREDATE | SALARY |
+---------------+-----------+------------+--------+
| 30 | Raphaely | 2017-07-01 | 1700 |
| 30 | De Haan | 2018-05-01 | 11000 |
| 40 | Errazuriz | 2017-07-21 | 1400 |
| 50 | Hartstein | 2019-10-05 | 14000 |
| 50 | Raphaely | 2017-07-22 | 1700 |
| 50 | Weiss | 2019-10-05 | 13500 |
| 90 | Russell | 2019-07-11 | 13000 |
| 90 | Partners | 2018-12-01 | 14000 |
+---------------+-----------+------------+--------+
8 rows in set
Aggregate function example
Calculate the variance of all values in the salary column.
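obclient> SELECT VARIANCE(salary) FROM employees;
+-------------------------------------------+
| VARIANCE(SALARY)                          |
+-------------------------------------------+
| 36318392.85714285714285714285714285714286 |
+-------------------------------------------+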
1 row in set
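As an aggregate function, VARIANCE can also be combined with GROUP BY to return one value per group instead of one value for the whole table. The following is a minimal sketch against the same employees table; the alias is illustrative and the output is omitted because the query has not been run here.

```sql
-- One variance per department.
-- Department 40 contains a single row, so its group has only one value.
SELECT department_id,
       VARIANCE(salary) "Variance"
FROM employees
GROUP BY department_id
ORDER BY department_id;
```

Judging by the first row of the analytic function example, where a one-row window yields 0, a single-row group such as department 40 would be expected to return 0 as well.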
Analytic function example
Calculate the cumulative variance of the salary column, with rows ordered by the hiredate column in ascending order.
obclient> SELECT last_name,hiredate,salary,VARIANCE(salary) OVER (ORDER BY hiredate) "Variance"
FROM employees;
+-----------+------------+--------+-------------------------------------------+
| LAST_NAME | HIREDATE | SALARY | Variance |
+-----------+------------+--------+-------------------------------------------+
| Raphaely | 2017-07-01 | 1700 | 0 |
| Errazuriz | 2017-07-21 | 1400 | 45000 |
| Raphaely | 2017-07-22 | 1700 | 30000 |
| De Haan | 2018-05-01 | 11000 | 22110000 |
| Partners | 2018-12-01 | 14000 | 36783000 |
| Russell | 2019-07-11 | 13000 | 37686666.6666666666666666666666666666666 |
| Hartstein | 2019-10-05 | 14000 | 36318392.85714285714285714285714285714286 |
| Weiss | 2019-10-05 | 13500 | 36318392.85714285714285714285714285714286 |
+-----------+------------+--------+-------------------------------------------+
8 rows in set
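In the output above, Hartstein and Weiss share the same hiredate and also the same variance. This is consistent with the usual default window when ORDER BY is given without a windowing clause (RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW), in which rows with equal sort keys are peers and are all included in the window. Assuming that default applies here, a strictly row-by-row running variance can be sketched with an explicit ROWS frame. The alias and the extra sort key are illustrative.

```sql
-- Explicit ROWS frame: the window grows by exactly one row at a time.
-- last_name is added as a tie-breaker so that the order of the two rows
-- hired on 2019-10-05 is deterministic.
SELECT last_name,
       hiredate,
       salary,
       VARIANCE(salary) OVER (ORDER BY hiredate, last_name
                              ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) "Running variance"
FROM employees;
```

With this frame, the two rows hired on 2019-10-05 would be expected to show different values, because the first of them no longer sees the second in its window.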
