Purpose
APPROX_COUNT_DISTINCT() returns an approximate number of unique rows in a column. You can use this function to further calculate the selectivity of referenced columns.
Compared with the COUNT(DISTINCT x) function, APPROX_COUNT_DISTINCT() returns an approximate value, and therefore has an extremely fast calculation speed. When processing a large amount of data, COUNT(DISTINCT x) often takes a long time. APPROX_COUNT_DISTINCT, however, greatly improves the computing efficiency at a slight cost of accuracy.
Syntax
APPROX_COUNT_DISTINCT(expr)
Parameters
expr indicates the column to deduplicate.
Return type
The return type is NUMBER.
Examples
Create a table named tbl1 and insert 10 data entries into it.
obclient> CREATE TABLE tbl1 (col1 INT,col2 INT,col3 varchar(10));
Query OK, 0 rows affected
obclient> INSERT INTO tbl1 VALUES (1,10,'a'),(2,20,'b'),(3,30,'c'),
(4,40,'a'),(5,50,'c'),(1,10,'a'),(2,20,'b'),(3,30,'c'),(4,30,'a'),(5,40,'b');
Query OK, 10 rows affected
Records: 10 Duplicates: 0 Warnings: 0
Return the number of unique values in the
col2column.obclient> SELECT APPROX_COUNT_DISTINCT(col2) FROM tbl1; +-----------------------------+ | APPROX_COUNT_DISTINCT(COL2) | +-----------------------------+ | 5 | +-----------------------------+ 1 row in setGroup and deduplicate the data records by the
col1andcol2columns, and count the number of each value in thecol1column.obclient> SELECT col1,APPROX_COUNT_DISTINCT(col2) FROM tbl1 GROUP BY col1; +------+-----------------------------+ | COL1 | APPROX_COUNT_DISTINCT(COL2) | +------+-----------------------------+ | 1 | 1 | | 2 | 1 | | 3 | 1 | | 4 | 2 | | 5 | 2 | +------+-----------------------------+ 5 rows in setGroup and deduplicate the data records by the
col1,col2, andcol3columns, and count the number of each value pair in thecol1andcol3columns.obclient> SELECT col1,col3,APPROX_COUNT_DISTINCT(col2) FROM tbl1 GROUP BY col1,col3; +------+------+-----------------------------+ | COL1 | COL3 | APPROX_COUNT_DISTINCT(COL2) | +------+------+-----------------------------+ | 1 | a | 1 | | 2 | b | 1 | | 3 | c | 1 | | 4 | a | 2 | | 5 | b | 1 | | 5 | c | 1 | +------+------+-----------------------------+ 6 rows in set