Purpose
This function calculates the sample covariance of a set of numerical pairs. You can use this function as an aggregate function or an analytic function.
Note
- When used as an analytic function, you must use the
OVERclause to define the window for calculation. It calculates the covariance for a set of rows and returns multiple values. - When used as an aggregate function, it aggregates the covariance for a set of rows and returns a single value. In this case, the
OVERclause is not required.
Syntax
COVAR_SAMP(expr1, expr2) [ OVER (analytic_clause) ]
Parameters
Parameter |
Description |
|---|---|
| expr1 | Specifies the first numerical expression, which can be a value of a numeric data type or a value that can be implicitly converted to a numeric data type. |
| expr2 | Specifies the second numerical expression, which can be a value of a numeric data type or a value that can be implicitly converted to a numeric data type. |
| OVER | Specifies the OVER clause to define the window for calculation. For more information, see Analytic Function Description. |
Note
- The database determines the parameter with the highest numeric precedence, implicitly converts the other parameters to this data type, and returns the result in this data type.
- The positions of parameters
expr1andexpr2do not affect the result, i.e.,COVAR_POP(expr1, expr2)is equal toCOVAR_POP(expr2, expr1). - The calculation formula for the result is:
(SUM(expr1 * expr2) - SUM(expr1) * SUM(expr2) / n) / (n-1), wherenis the number of pairs of (expr1,expr2), and bothexpr1andexpr2are not empty.
Return type
If any parameter is empty or only one row of data exists, it returns NULL. Otherwise, it returns a value of the NUMBER type.
Examples
Assume that the table tbl1 has been created.
obclient> SELECT * FROM tbl1;
+------+------+------+------+
| COL1 | COL2 | COL3 | COL4 |
+------+------+------+------+
| 1 | A1 | 8 | 12 |
| 1 | A2 | 10 | 15 |
| 1 | A3 | 11 | 16 |
| 2 | B1 | 9 | 14 |
| 2 | B2 | 10 | 15 |
| 2 | B3 | 8 | 13 |
| 2 | B4 | 11 | 16 |
| 3 | C1 | 8 | 18 |
| 3 | C2 | 9 | 16 |
| 3 | C3 | 10 | 15 |
| 3 | C4 | 11 | 12 |
| 3 | C5 | 12 | 10 |
+------+------+------+------+
12 rows in set
Example of an aggregate function
Calculate the sample covariance of the numerical pairs in columns col3 and col4.
obclient> SELECT COVAR_SAMP(col3,col4) FROM tbl1;
+--------------------------------------------+
| CORR(COL3,COL4) |
+--------------------------------------------+
| -.2705008904002296868793073195758520224002 |
+--------------------------------------------+
1 row in set
Example of an analytic function
Group the data by column col1 and sort the data by column col2, and calculate the cumulative sample covariance of the numerical pairs in column col3 and col4.
obclient> SELECT col1,col3,col4,
COVAR_SAMP(col3,col4) OVER(PARTITION BY col1 ORDER BY col2) "COVAR_SAMP"
FROM tbl1;
+------+------+------+-------------------------------------------+
| COL1 | COL3 | COL4 | COVAR_SAMP |
+------+------+------+-------------------------------------------+
| 1 | 8 | 12 | NULL |
| 1 | 10 | 15 | 3 |
| 1 | 11 | 16 | 3.1666666666666666666666666666666666665 |
| 2 | 9 | 14 | NULL |
| 2 | 10 | 15 | .5 |
| 2 | 8 | 13 | 1 |
| 2 | 11 | 16 | 1.66666666666666666666666666666666666667 |
| 3 | 8 | 18 | NULL |
| 3 | 9 | 16 | -1 |
| 3 | 10 | 15 | -1.5 |
| 3 | 11 | 12 | -3.16666666666666666666666666666666666667 |
| 3 | 12 | 10 | -5 |
+------+------+------+-------------------------------------------+
12 rows in set
