Purpose
The GENERATOR function is a special built-in function designed for use within the Table Function framework. Its primary role is to generate a virtual table with a specified number of rows. This table does not contain any actual data but provides a structure for inserting data. The GENERATOR function is commonly used to generate a set of consecutive rows for batch operations or to simulate datasets.
Table Functions are an advanced feature in SQL that allows functions to return entire tables, not just single values. This means Table Functions can produce multiple rows and columns, similar to regular data tables.
TABLE(GENERATOR(N)) generates a virtual table with N rows, where N is a 64-bit integer greater than or equal to 0. If you want this virtual table to include specific data, you need to add data generation logic to your SELECT statement, such as constants, calculated expressions, function calls, or user-defined functions (UDFs).
TABLE(GENERATOR(N)) supports parallel execution. You can use it in parallel SELECT queries or parallel Data Manipulation Language (DML) operations. However, the process of generating these virtual rows is not parallel; only one thread is responsible for reading the rows generated by TABLE(GENERATOR(N)). Therefore, in Data Flow Operations (DFOs) that include TABLE(GENERATOR(N)), the Degree of Parallelism (DOP) setting should be 1.
Limitations and considerations
TABLE(GENERATOR(N))generates a virtual table without actual data, which needs to be filled in theSELECTstatement.- Although
TABLE(GENERATOR(N))can be used in parallel operations, the row generation process itself is single-threaded.
Syntax
TABLE(GENERATOR(N))
Parameters
N: A 64-bit positive integer greater than or equal to 0, used to specify the number of rows to generate.
Examples
- Example 1:
obclient [SYS]> SELECT * FROM TABLE(GENERATOR(10));
The above SQL statement generates a virtual table with 10 rows, each of which is empty because no data is specified to display.
The result is as follows:
+--------------+
| COLUMN_VALUE |
+--------------+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |
+--------------+
10 rows in set
If you want to generate specific data, you can write it like this:
obclient [SYS]> SELECT 'Hello, World!' AS greeting FROM TABLE(GENERATOR(10));
This query generates a virtual table with 10 rows, each of which has the greeting column containing the string Hello, World!.
The result is as follows:
+---------------+
| GREETING |
+---------------+
| Hello, World! |
| Hello, World! |
| Hello, World! |
| Hello, World! |
| Hello, World! |
| Hello, World! |
| Hello, World! |
| Hello, World! |
| Hello, World! |
| Hello, World! |
+---------------+
10 rows in set
- Example 2: Use the
GENERATOR()function to generate random numbers that meet the conditions.
obclient> SELECT COUNT(*) FROM TABLE(GENERATOR(100000));
+----------+
| COUNT(*) |
+----------+
| 100000 |
+----------+
1 row in set
obclient> SELECT NORMAL(0, 1, RANDOM()) FROM TABLE(GENERATOR(5));
+------------------------+
| NORMAL(0, 1, RANDOM()) |
|------------------------|
| 0.227384164 |
| 0.9945290748 |
| -0.2045078571 |
| -1.594607893 |
| -0.8213296842 |
+------------------------+
5 rows in set
obclient> SELECT RANDSTR(1, ZIPF(1, 5, RANDOM())) str FROM TABLE(GENERATOR(5));
+------------------------+
| str |
|------------------------|
| A |
| D |
| A |
| A |
| C |
+------------------------+
5 rows in set
- Example 3: The
GENERATOR()function can also be joined with other tables, as shown in the example below:
obclient> CREATE TABLE t1 (c1 INT);
Query OK, 0 rows affected
obclient> INSERT INTO t1 VALUES (1), (2);
Query OK, 2 rows affected
Records: 2 Duplicates: 0 Warnings: 0
obclient> SELECT c1, RANDOM(1) FROM t1, TABLE(GENERATOR(3));
+------+----------------------+
| c1 | RANDOM(1) |
+------+----------------------+
| 1 | -6753783847308464280 |
| 2 | -6707106347154343346 |
| 1 | -899926183391115878 |
| 2 | -8835543475904200562 |
| 1 | -2750444335953844424 |
| 2 | 7588216632478230601 |
+------+----------------------+
6 rows in set
obclient> EXPLAIN SELECT c1, RANDOM(1) FROM t1, TABLE(GENERATOR(3));
+----------------------------------------------------------------------------------+
| Query Plan |
+----------------------------------------------------------------------------------+
| ================================================================== |
| |ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)| |
| ------------------------------------------------------------------ |
| |0 |NESTED-LOOP JOIN CARTESIAN | |398 |14 | |
| |1 |├─FUNCTION_TABLE |FUNC_TABLE1|199 |1 | |
| |2 |└─MATERIAL | |2 |2 | |
| |3 | └─TABLE FULL SCAN |T1 |2 |2 | |
| ================================================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([T1.C1], [RANDOM(cast(1, BIGINT(-1, 0)))]), filter(nil), rowset=256 |
| conds(nil), nl_params_(nil), use_batch=false |
| 1 - output(nil), filter(nil) |
| value(GENERATOR(cast(3, BIGINT(-1, 0)))) |
| 2 - output([T1.C1]), filter(nil), rowset=256 |
| 3 - output([T1.C1]), filter(nil), rowset=256 |
| access([T1.C1]), partitions(p0) |
| is_index_back=false, is_global_index=false, |
| range_key([T1.__pk_increment]), range(MIN ; MAX)always true |
+----------------------------------------------------------------------------------+
19 rows in set
obclient> SELECT /*+ PARALLEL(2) */ c1, RANDOM(1) FROM t1, TABLE(GENERATOR(3));
+------+----------------------+
| c1 | random(1) |
+------+----------------------+
| 1 | -6753783847308464280 |
| 2 | -6707106347154343346 |
| 1 | -899926183391115878 |
| 2 | -8835543475904200562 |
| 1 | -2750444335953844424 |
| 2 | 7588216632478230601 |
+------+----------------------+
6 rows in set
obclient> EXPLAIN SELECT /*+ PARALLEL(2) */ c1, RANDOM(1) FROM t1, TABLE(GENERATOR(3));
+----------------------------------------------------------------------------------+
| Query Plan |
+----------------------------------------------------------------------------------+
| ================================================================== |
| |ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)| |
| ------------------------------------------------------------------ |
| |0 |NESTED-LOOP JOIN CARTESIAN | |398 |13 | |
| |1 |├─FUNCTION_TABLE |FUNC_TABLE1|199 |1 | |
| |2 |└─MATERIAL | |2 |3 | |
| |3 | └─PX COORDINATOR | |2 |3 | |
| |4 | └─EXCHANGE OUT DISTR |:EX10000 |2 |2 | |
| |5 | └─PX BLOCK ITERATOR | |2 |1 | |
| |6 | └─TABLE FULL SCAN |T1 |2 |1 | |
| ================================================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([T1.C1], [RANDOM(cast(1, BIGINT(-1, 0)))]), filter(nil), rowset=256 |
| conds(nil), nl_params_(nil), use_batch=false |
| 1 - output(nil), filter(nil) |
| value(GENERATOR(cast(3, BIGINT(-1, 0)))) |
| 2 - output([T1.C1]), filter(nil), rowset=256 |
| 3 - output([T1.C1]), filter(nil), rowset=256 |
| 4 - output([T1.C1]), filter(nil), rowset=256 |
| dop=2 |
| 5 - output([T1.C1]), filter(nil), rowset=256 |
| 6 - output([T1.C1]), filter(nil), rowset=256 |
| access([T1.C1]), partitions(p0) |
| is_index_back=false, is_global_index=false, |
| range_key([T1.__pk_increment]), range(MIN ; MAX)always true |
+----------------------------------------------------------------------------------+
26 rows in set
