Purpose
This function calculates the number of unique rows in the specified column and returns an approximate value. This function can be used to determine the selectivity of the referenced column.
Compared with the COUNT(DISTINCT x) function, APPROX_COUNT_DISTINCT returns an approximate value, which makes it significantly faster. When processing large amounts of data, COUNT(DISTINCT x) can take a long time. APPROX_COUNT_DISTINCT sacrifices a small amount of precision for a significant improvement in computational efficiency.
Syntax
APPROX_COUNT_DISTINCT(expr)
Parameters
expr specifies the numeric column to be deduplicated.
Return type
Returns data of the NUMBER type.
Examples
Create a table named tbl1 and insert ten rows of data.
obclient> CREATE TABLE tbl1 (col1 INT,col2 INT,col3 varchar(10));
Query OK, 0 rows affected
obclient> INSERT INTO tbl1 VALUES (1,10,'a'),(2,20,'b'),(3,30,'c'),
(4,40,'a'),(5,50,'c'),(1,10,'a'),(2,20,'b'),(3,30,'c'),(4,30,'a'),(5,40,'b');
Query OK, 10 rows affected
Records: 10 Duplicates: 0 Warnings: 0
Count the number of unique values in the
col2column.obclient> SELECT APPROX_COUNT_DISTINCT(col2) FROM tbl1; +-----------------------------+ | APPROX_COUNT_DISTINCT(COL2) | +-----------------------------+ | 5 | +-----------------------------+ 1 row in setGroup by the
col1andcol2columns, deduplicate, and count the number of unique values in thecol1column.obclient> SELECT col1,APPROX_COUNT_DISTINCT(col2) FROM tbl1 GROUP BY col1; +------+-----------------------------+ | COL1 | APPROX_COUNT_DISTINCT(COL2) | +------+-----------------------------+ | 1 | 1 | | 2 | 1 | | 3 | 1 | | 4 | 2 | | 5 | 2 | +------+-----------------------------+ 5 rows in setGroup by the
col1,col2, andcol3columns, deduplicate, and count the number of unique combinations of values in thecol1andcol3columns.obclient> SELECT col1,col3,APPROX_COUNT_DISTINCT(col2) FROM tbl1 GROUP BY col1,col3; +------+------+-----------------------------+ | COL1 | COL3 | APPROX_COUNT_DISTINCT(COL2) | +------+------+-----------------------------+ | 1 | a | 1 | | 2 | b | 1 | | 3 | c | 1 | | 4 | a | 2 | | 5 | b | 1 | | 5 | c | 1 | +------+------+-----------------------------+ 6 rows in set
