ZIPF

2023-10-31 11:17:12  Updated

Syntax

ZIPF(<s> , <N> , <gen>)

Purpose

ZIPF() returns a Zipf-distributed integer in the range [0, N), where s is the distribution exponent.

  • The larger the value of s, the more skewed the generated sequence and the steeper its distribution curve.
  • s and N must be scalar values that do not change from row to row. For example, they can be integer or floating-point constants, or scalar functions. In PL, they can also be expressions such as @v1 or 1+@v3. The value range of s is [1, +∞). The value range of N is [1, 16777215].
  • The storage and computing resources consumed by the Zipf algorithm depend on the value of N. The space complexity of the algorithm is O(N), and the time complexity of generating each integer is O(log N). For this reason, N is limited to the range [1, 16777215].
  • gen is a function that generates numeric values. Generally, the RANDOM() function is used. If gen is a constant, ZIPF() returns the same value for every row.
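
The O(N) space and O(log N) time costs described above can be illustrated with a minimal Python sketch. It assumes an inverse-CDF approach: a table of N cumulative weights is built once, and each integer is then found by binary search. This is only one way to obtain those complexities and is not necessarily how OceanBase implements ZIPF(). The class name ZipfSampler, the method sample, and the mapping of gen to a uniform value u in [0, 1) are hypothetical and chosen for illustration.

```python
import bisect
import itertools
import random


class ZipfSampler:
    """Illustrative inverse-CDF Zipf sampler (hypothetical, not OceanBase's code)."""

    def __init__(self, s: float, n: int) -> None:
        # Same parameter ranges as documented for ZIPF().
        if s < 1:
            raise ValueError("s must be in [1, +inf)")
        if not 1 <= n <= 16777215:
            raise ValueError("N must be in [1, 16777215]")
        # O(N) space: one cumulative weight per rank; rank k has weight 1 / k**s.
        weights = (1.0 / (k ** s) for k in range(1, n + 1))
        self.cdf = list(itertools.accumulate(weights))
        self.total = self.cdf[-1]

    def sample(self, u: float) -> int:
        """Map u in [0, 1) to an integer in [0, N) by binary search, O(log N)."""
        idx = bisect.bisect_right(self.cdf, u * self.total)
        # Guard against floating-point rounding at the upper end of the table.
        return min(idx, len(self.cdf) - 1)


sampler = ZipfSampler(s=1, n=10)

# A random input (assumed here to play the role of gen) gives varying values.
rng = random.Random(42)
values = [sampler.sample(rng.random()) for _ in range(1000)]

# A constant input gives the same value every time.
fixed = [sampler.sample(0.5) for _ in range(6)]
```

In this sketch, small integers appear far more often than large ones in values, and every element of fixed is identical, which mirrors the behavior described for a constant gen.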

Examples

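The following examples show how to use the ZIPF() function to return Zipf-distributed integers. When gen is RANDOM(), the returned values vary from row to row. When gen is a constant, every row returns the same value.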

obclient> SELECT ZIPF(1, 10, RANDOM()) FROM TABLE(GENERATOR(6));
+-----------------------+
| ZIPF(1, 10, RANDOM()) |
+-----------------------+
|                     2 |
|                     0 |
|                     0 |
|                     0 |
|                     3 |
|                     3 |
+-----------------------+
6 rows in set

obclient> SELECT ZIPF(1, 10, 0415) FROM TABLE(GENERATOR(6));
+-------------------+
| ZIPF(1, 10, 0415) |
+-------------------+
|                 1 |
|                 1 |
|                 1 |
|                 1 |
|                 1 |
|                 1 |
+-------------------+
6 rows in set

obclient> SELECT ZIPF(ABS(-1), 23, RANDOM()) FROM DUAL;
+-----------------------------+
| ZIPF(ABS(-1), 23, RANDOM()) |
+-----------------------------+
|                           9 |
+-----------------------------+
1 row in set
