Purpose
This function calculates the sample covariance of a set of numerical pairs. It can be used as an aggregate or analytic function.
Note
- When used as an analytic function, the
OVERclause must be used to define the window for calculation. It calculates over a set of rows and returns multiple values. - When used as an aggregate function, it aggregates over a set of rows and returns a single value. In this case, the
OVERclause is not needed.
Syntax
COVAR_SAMP(expr1, expr2) [ OVER (analytic_clause) ]
Parameters
Parameter |
Description |
|---|---|
| expr1 | Specifies the first numerical expression, which is of a numeric data type or can be implicitly converted to a numeric data type. |
| expr2 | Specifies the second numerical expression, which is of a numeric data type or can be implicitly converted to a numeric data type. |
| OVER | Use the OVER clause to define the window for calculation. For more information, see Analytic Function Description. |
Note
- The database determines the parameter with the highest numeric precedence, implicitly converts the other parameters to this data type, and returns the result in this data type.
- The positions of parameters
expr1andexpr2do not affect the result. That is,COVAR_POP(expr1, expr2)is equal toCOVAR_POP(expr2, expr1). - The formula for calculating the result is:
(SUM(expr1 * expr2) - SUM(expr1) * SUM(expr2) / n) / (n-1), wherenis the number of (expr1, expr2) pairs, and bothexpr1andexpr2are not empty.
Return type
If any parameter is empty or there is only one row of data, it returns NULL. Otherwise, it returns a NUMBER value.
Examples
A table named tbl1 has been created.
obclient> SELECT * FROM tbl1;
+------+------+------+------+
| COL1 | COL2 | COL3 | COL4 |
+------+------+------+------+
| 1 | A1 | 8 | 12 |
| 1 | A2 | 10 | 15 |
| 1 | A3 | 11 | 16 |
| 2 | B1 | 9 | 14 |
| 2 | B2 | 10 | 15 |
| 2 | B3 | 8 | 13 |
| 2 | B4 | 11 | 16 |
| 3 | C1 | 8 | 18 |
| 3 | C2 | 9 | 16 |
| 3 | C3 | 10 | 15 |
| 3 | C4 | 11 | 12 |
| 3 | C5 | 12 | 10 |
+------+------+------+------+
12 rows in set
Example of aggregate function
Calculate the sample covariance of the numerical pairs in columns col3 and col4.
obclient> SELECT COVAR_SAMP(col3,col4) FROM tbl1;
+--------------------------------------------+
| CORR(COL3,COL4) |
+--------------------------------------------+
| -.2705008904002296868793073195758520224002 |
+--------------------------------------------+
1 row in set
Example of analytic function
Group by column col1 and sort by column col2, then calculate the cumulative sample covariance of the numerical pairs in column col3 and col4.
obclient> SELECT col1,col3,col4,
COVAR_SAMP(col3,col4) OVER(PARTITION BY col1 ORDER BY col2) "COVAR_SAMP"
FROM tbl1;
+------+------+------+-------------------------------------------+
| COL1 | COL3 | COL4 | COVAR_SAMP |
+------+------+------+-------------------------------------------+
| 1 | 8 | 12 | NULL |
| 1 | 10 | 15 | 3 |
| 1 | 11 | 16 | 3.1666666666666666666666666666666666665 |
| 2 | 9 | 14 | NULL |
| 2 | 10 | 15 | .5 |
| 2 | 8 | 13 | 1 |
| 2 | 11 | 16 | 1.66666666666666666666666666666666666667 |
| 3 | 8 | 18 | NULL |
| 3 | 9 | 16 | -1 |
| 3 | 10 | 15 | -1.5 |
| 3 | 11 | 12 | -3.16666666666666666666666666666666666667 |
| 3 | 12 | 10 | -5 |
+------+------+------+-------------------------------------------+
12 rows in set
