Purpose
CORR() returns the correlation coefficient of a set of numeric values. The correlation coefficient measures the strength of correlation between number pairs. The value range is [-1, 1]. The value 0 indicates that the number pairs are unrelated. A value less than 0 indicates negative correlation. A value greater than 0 indicates positive correlation. You can use this function as an aggregate or analytic function.
Note
- When you use this function as an analytic function, use the
OVERclause to define a window over the data on which the function operates. The function operates on a group of rows to return a list of values. - When you use this function as an aggregate function, the function operates on a set of rows and returns a single value. You do not need to add the
OVERclause.
Syntax
CORR(expr1, expr2) [ OVER (analytic_clause) ]
Parameters
| Parameter | Description |
|---|---|
| expr1 | The first parameter, which is of a numeric data type or a data type that can be implicitly converted to a numeric data type. |
| expr2 | The second parameter, which is of a numeric data type or a data type that can be implicitly converted to a numeric data type. |
| OVER | You can use the OVER clause to define a window over the data on which the function operates. For more information, see Analytic functions. |
Note
- The database determines the parameter with the highest numeric precedence, implicitly converts the remaining parameter values to this data type, and returns a result of this data type.
- The locations of the
expr1andexpr2parameters have no influence over the return result. - The return result is calculated by using the following formula:
COVAR_POP(expr1, expr2) / (STDDEV_POP(expr1) * STDDEV_POP(expr2)).
Return type
If any parameter is NULL or you have only one row of data, NULL is returned. Otherwise, a value of the NUMBER type is returned.
Examples
Create a table named tbl1 and insert data into it.
obclient> CREATE TABLE tbl1(col1 INT,col2 varchar(10),col3 INT,col4 INT);
Query OK, 0 rows affected
obclient> INSERT INTO tbl1 VALUES(1,'A1',8,12),(1,'A2',10,15),(1,'A3',11,16),
(2,'B1',9,14),(2,'B2',10,15),(2,'B3',8,13),(2,'B4',11,16),(3,'C1',8,18),
(3,'C2',9,16),(3,'C3',10,15),(3,'C4',11,12),(3,'C5',12,10);
Query OK, 12 rows affected
Records: 12 Duplicates: 0 Warnings: 0
obclient> SELECT * FROM tbl1;
+------+------+------+------+
| COL1 | COL2 | COL3 | COL4 |
+------+------+------+------+
| 1 | A1 | 8 | 12 |
| 1 | A2 | 10 | 15 |
| 1 | A3 | 11 | 16 |
| 2 | B1 | 9 | 14 |
| 2 | B2 | 10 | 15 |
| 2 | B3 | 8 | 13 |
| 2 | B4 | 11 | 16 |
| 3 | C1 | 8 | 18 |
| 3 | C2 | 9 | 16 |
| 3 | C3 | 10 | 15 |
| 3 | C4 | 11 | 12 |
| 3 | C5 | 12 | 10 |
+------+------+------+------+
12 rows in set
Example of an aggregate function
Calculate the correlation coefficient between the col3 column and the col4 column.
obclient> SELECT CORR(col3,col4) FROM tbl1;
+--------------------------------------------+
| CORR(COL3,COL4) |
+--------------------------------------------+
| -.2705008904002296868793073195758520224002 |
+--------------------------------------------+
1 row in set
Example of an analytic function
Group data by the col1 column. Calculate the correlation coefficient between the col3 column and the col4 column.
obclient> SELECT col1,col3,col4,CORR(col3,col4) OVER(PARTITION BY col1) "corr" FROM tbl1;
+------+------+------+--------------------------------------------+
| COL1 | COL3 | COL4 | corr |
+------+------+------+--------------------------------------------+
| 1 | 8 | 12 | .9958705948858223809835060513429288056548 |
| 1 | 10 | 15 | .9958705948858223809835060513429288056548 |
| 1 | 11 | 16 | .9958705948858223809835060513429288056548 |
| 2 | 9 | 14 | 1 |
| 2 | 10 | 15 | 1 |
| 2 | 8 | 13 | 1 |
| 2 | 11 | 16 | 1 |
| 3 | 8 | 18 | -.9901475429766743091532731291244706579003 |
| 3 | 9 | 16 | -.9901475429766743091532731291244706579003 |
| 3 | 10 | 15 | -.9901475429766743091532731291244706579003 |
| 3 | 11 | 12 | -.9901475429766743091532731291244706579003 |
| 3 | 12 | 10 | -.9901475429766743091532731291244706579003 |
+------+------+------+--------------------------------------------+
12 rows in set