Purpose
APPROX_COUNT_DISTINCT() returns an approximate number of unique rows in a column. You can use this function to further calculate the selectivity of referenced columns.
Compared with the COUNT(DISTINCT x) function, APPROX_COUNT_DISTINCT returns an approximate value, and therefore has an extremely fast calculation speed. When processing a large amount of data, COUNT(DISTINCT x) often takes a long time. APPROX_COUNT_DISTINCT, however, greatly improves the computing efficiency at a slight cost of accuracy.
Syntax
APPROX_COUNT_DISTINCT(expr)
Parameters
expr indicates the column to deduplicate.
Return type
The return type is NUMBER.
Examples
Execute the following statement to create table tbl1 and insert 10 data entries into it.
obclient> CREATE TABLE tbl1 (col1 INT,col2 INT,col3 varchar(10));
Query OK, 0 rows affected
obclient> INSERT INTO tbl1 VALUES (1,10,'a'),(2,20,'b'),(3,30,'c'),
(4,40,'a'),(5,50,'c'),(1,10,'a'),(2,20,'b'),(3,30,'c'),(4,30,'a'),(5,40,'b');
Query OK, 10 rows affected
Records: 10 Duplicates: 0 Warnings: 0
Return the number of unique values in the
col2column.obclient> SELECT APPROX_COUNT_DISTINCT(col2) FROM tbl1; +-----------------------------+ | APPROX_COUNT_DISTINCT(COL2) | +-----------------------------+ | 5 | +-----------------------------+ 1 row in setGroup and deduplicate the data records by the
col1andcol2columns, and count the number of each value in thecol1column.obclient> SELECT col1,APPROX_COUNT_DISTINCT(col2) FROM tbl1 GROUP BY col1; +------+-----------------------------+ | COL1 | APPROX_COUNT_DISTINCT(COL2) | +------+-----------------------------+ | 1 | 1 | | 2 | 1 | | 3 | 1 | | 4 | 2 | | 5 | 2 | +------+-----------------------------+ 5 rows in setGroup and deduplicate the data records by the
col1,col2, andcol3columns, and count the number of each value pair in thecol1andcol3columns.obclient> SELECT col1,col3,APPROX_COUNT_DISTINCT(col2) FROM tbl1 GROUP BY col1,col3; +------+------+-----------------------------+ | COL1 | COL3 | APPROX_COUNT_DISTINCT(COL2) | +------+------+-----------------------------+ | 1 | a | 1 | | 2 | b | 1 | | 3 | c | 1 | | 4 | a | 2 | | 5 | b | 1 | | 5 | c | 1 | +------+------+-----------------------------+ 6 rows in set