Auto-increment columns and sequences are widely used in applications as primary keys or serial numbers for table data, providing incremental and unique values. However, in OceanBase Database's distributed architecture, auto-increment columns and sequences differ significantly from those in traditional centralized databases. Overlooking these differences can result in unexpected implementation outcomes and performance issues. This topic provides an in-depth understanding of the concepts, implementation methods, and behavioral characteristics of auto-increment columns and sequences in OceanBase Database, along with relevant usage recommendations.
Basic concepts
Auto-increment column (AUTO_INCREMENT): A column attribute in a database table that automatically generates unique, incrementing values, typically used to represent the unique identifier for a row.
Sequence (SEQUENCE): A unique and usually incrementing value generated by the database according to specific rules, often used to generate unique identifiers and existing independently of tables.
Feature updates
OceanBase Database has continuously optimized auto-increment and sequence features across different versions:
V3.2.1: Auto-increment primary keys can be used as partitioning keyss. Auto-increment values are globally unique, but their order is not guaranteed within a partition—this differs from native MySQL behavior. All INSERT operations generate distributed insert plans, which may result in decreased performance.
V4.0.0: In MySQL-compatible mode, OceanBase Database supports two auto-increment modes. You can control the default mode using the tenant-level parameter DEFAULT_AUTO_INCREMENT_MODE, or specify AUTO_INCREMENT_MODE when creating a table. The default mode is ORDER.
- ORDER: Uses centralized caching at the underlying level, ensuring globally monotonically increasing values. This mode offers better compatibility with MySQL behavior.
- NOORDER: Uses distributed caching, guaranteeing only global uniqueness. Partitioned tables achieve better performance in this mode (auto-increment values are guaranteed to increase within each partition, but global monotonicity across the entire table is not ensured).
V4.2.2: Online modification is now supported for changing the data type of INTEGER columns. For primary keys, partitioning keyss, index columns, columns referenced by generated columns, and columns with CHECK constraints, if the column is of integer type and is changed to a type with a larger value range such as INT to BIGINT, this was previously handled via offline DDL in V4.2.1, which involved table locks and blocked reads and writes. Starting with V4.2.2, offline DDL has been upgraded to online DDL, so changing integer column types no longer affects business write operations.
V4.2.3 and later: The starting value for auto-increment columns can now be reduced. If a table has existing data and the maximum value in the auto-increment column is greater than or equal to the new specified AUTO_INCREMENT value, the new value will automatically adjust to the next value after the current maximum. For example, if the current maximum is 5, the current AUTO_INCREMENT value is 8, and you set AUTO_INCREMENT to any value between 0 and 6, after execution, the actual AUTO_INCREMENT value will be set to 6. This behavior is compatible with native MySQL.
In OceanBase Database V4.2.3, you can set the auto-increment cache size (CACHE SIZE) individually for each table. Previously, the cache size for all tables was controlled by the global parameter AUTO_INCREMENT_CACHE_SIZE, which applied to each node. Now, when creating a table, you can specify the AUTO_INCREMENT_CACHE_SIZE parameter to set different cache strategies for different tables.
In OceanBase Database V4.2.3, when using ORDER mode, the auto-increment value remains continuous after a switchover (failover).
How auto-increment columns work
OceanBase Database supports two modes for auto-increment columns, which differ primarily in how they manage caching.
NOORDER mode (distributed caching): Each OBServer node independently requests and caches auto-increment intervals from an internal table. This mode delivers high performance, but does not guarantee global incrementality and value gaps may occur. This mode is suitable for performance-sensitive scenarios or applications that can tolerate non-sequential values.
ORDER mode (centralized caching): A leader OBServer node is elected to serve as the auto-increment service node, and other nodes request values via RPC calls to this leader. This mode generally produces continuous incremental values and is compatible with MySQL behavior. However, under high concurrency, its performance is slightly lower than NOORDER mode, and value gaps may still occur during leader switchover. Starting with OceanBase Database V4.2.3, value gaps no longer occur after a leader switchover.
Comparison with centralized databases
| Item | Centralized database (MySQL) | Distributed database (OceanBase) |
|---|---|---|
| Auto-increment continuity | Strongly guaranteed (single-node) | Not strictly guaranteed, depends on mode |
| Performance | High (local cache) | High in NOORDER mode, slightly lower in ORDER mode |
| Fault tolerance | Single point of failure | Multi-replica disaster recovery |
| Risk of value gaps | Low (only on restart) | Higher (multiple nodes, leader switchover) |
| Architecture | Single-node cache | Distributed coordination |
How sequences work
Among OceanBase Database users, many applications were originally running on DB2 or Oracle and are now planning to migrate to MySQL. This involves migrating their workloads from DB2 or Oracle to OceanBase Database's MySQL-compatible tenants. To reduce the complexity for customers in refactoring applications that make extensive use of SEQUENCE during migration, OceanBase Database provides SEQUENCE functionality in MySQL-compatible mode that is compatible with Oracle behavior.
Scenarios:
- Migrating workloads from DB2 or Oracle to a MySQL-compatible tenant in OceanBase Database.
- When auto-increment columns bound to tables cannot meet business requirements. Sequences (SEQUENCE) are not bound to a specific table—they can be created independently and used across multiple tables.
- When auto-increment columns lack the CYCLE capability and stop working after reaching MAXVALUE, which does not meet business needs. Sequences support cyclic behavior with the CYCLE option.
Syntax for creating an auto-increment column
-- Define an auto-increment column when creating a table (in ORDER mode by default)
CREATE TABLE t_user (
id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(50)
) AUTO_INCREMENT = 1;
-- Explicitly specify the NOORDER mode
CREATE TABLE t_order (
id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
order_no VARCHAR(20)
) AUTO_INCREMENT_MODE = 'NOORDER';
Syntax for creating sequences
A sequence is an object independent of tables and can be used for sharing primary keys across multiple tables or for generating complex business numbers.
CREATE SEQUENCE seq_order_id
START WITH 1
INCREMENT BY 1
MINVALUE 1
MAXVALUE 9999999999
NOCYCLE
NOORDER
CACHE 100;
Comparison: Auto-increment columns vs. Sequences
- Auto-increment columns: Bound to a specific table and can only be used for generating primary keys within that table.
- Sequences: Independent of tables, can be used across multiple tables, and support cyclic sequences (CYCLE).
Example:
-- Create a table with an auto-increment column. Auto-increment columns are tightly bound to tables.
obclient [test]> CREATE TABLE t1 (id bigint not null AUTO_INCREMENT PRIMARY KEY, name VARCHAR(50));
obclient [test]> INSERT INTO t1 (name) VALUES ('A'),('B'),('C');
obclient [test]> SELECT * FROM t1;
+----+------+
| id | name |
+----+------+
| 1 | A |
| 2 | B |
| 3 | C |
+----+------+
3 rows in set (0.021 sec)
-- Create a sequence starting at 1, with a minimum value of 1, maximum value of 5, increment of 2. The sequence does not restart after reaching the maximum value.
obclient [test]> CREATE SEQUENCE seq1 START WITH 1 MINVALUE 1 MAXVALUE 5 INCREMENT BY 2 NOCYCLE;
obclient [test]> SELECT seq1.nextval FROM DUAL;
+---------+
| nextval |
+---------+
| 1 |
+---------+
1 row in set (0.012 sec)
obclient [test]> SELECT seq1.nextval FROM DUAL;
+---------+
| nextval |
+---------+
| 3 |
+---------+
1 row in set (0.004 sec)
obclient [test]> SELECT seq1.nextval FROM DUAL;
+---------+
| nextval |
+---------+
| 5 |
+---------+
1 row in set (0.004 sec)
-- If NOCYCLE is specified, no larger sequence values can be generated after the maximum value is reached.
obclient [test]> SELECT seq1.nextval FROM DUAL;
ERROR 4332 (HY000): sequence exceeds MAXVALUE and cannot be instantiated
-- Create another sequence starting at 1, minimum value 1, maximum value 5, increment 2, with cycling enabled and 2 sequence values preallocated in memory.
obclient [test]> CREATE SEQUENCE seq7 START WITH 1 MINVALUE 1 MAXVALUE 5 INCREMENT BY 2 CYCLE CACHE 2;
obclient [test]> SELECT seq7.nextval FROM DUAL;
+---------+
| nextval |
+---------+
| 1 |
+---------+
1 row in set (0.009 sec)
obclient [test]> SELECT seq7.nextval FROM DUAL;
+---------+
| nextval |
+---------+
| 3 |
+---------+
1 row in set (0.005 sec)
obclient [test]> SELECT seq7.nextval FROM DUAL;
+---------+
| nextval |
+---------+
| 5 |
+---------+
1 row in set (0.005 sec)
obclient [test]> SELECT seq7.nextval FROM DUAL;
+---------+
| nextval |
+---------+
| 1 |
+---------+
1 row in set (0.001 sec)
-- Sequences can be used in SELECT, INSERT, and UPDATE statements.
obclient [test]> CREATE TABLE t2 (c1 int);
obclient [test]> INSERT INTO t2 VALUES (seq7.nextval);
obclient [test]> SELECT * FROM t2;
+------+
| c1 |
+------+
| 3 |
+------+
1 row in set (0.001 sec)
obclient [test]> UPDATE t2 SET c1 = seq7.nextval;
obclient [test]> SELECT * FROM t2;
+------+
| c1 |
+------+
| 5 |
+------+
1 row in set (0.001 sec)
Note
Differences between sequences and auto-increment columns:
- When creating a sequence, the default attribute is NOORDER (for compatibility with Oracle behavior).
- When creating an auto-increment column, the default attribute is ORDER (for compatibility with MySQL behavior).
Value gap analysis in NOORDER mode
As a distributed database, OceanBase Database may experience value gaps in the auto-increment mechanism during scenarios such as high availability switching, major compaction, and node failures.
Scenario 1: Auto-increment values generated by multiple machines and partitions
Assume that the AUTO_INCREMENT_CACHE_SIZE parameter is set to 100. When the INSERT INTO VALUES (NULL) statement is sequentially received by the OBServer nodes hosting the partitioned table, namely, OBServer1, OBServer2, and OBServer3, the internal processing logic of each node is as follows:
- OBServer1 detects that it has no cache, requests an auto-increment range [1,100] from the internal table, and generates the auto-increment value 1.
- OBServer2 detects that it has no cache, requests an auto-increment range [101,200] from the internal table, and generates the auto-increment value 101.
- OBServer3 detects that it has no cache, requests an auto-increment range [201,300] from the internal table, and generates the auto-increment value 201.
- OBServer1 uses its cache [2,100] to generate the auto-increment value 2.
- OBServer2 uses its cache [102,200] to generate the auto-increment value 102.
- ...
In this case, the order of auto-increment values is 1, 101, 201, 2, 102, ..., indicating that value gaps always occur in the auto-increment values.
Scenario 2: Insert a specified maximum value using the INSERT statement
In a MySQL database, if a specified value is inserted into an auto-increment table, subsequent auto-increment values will not be less than this value. If a value within the current cache range is inserted, the cache will be abandoned.
In the distributed environment of OceanBase Database, if a specified value is inserted and it is the maximum value in the auto-increment table, the OBServer node must not only recognize that the maximum value has been inserted, but also synchronize this value to other OBServer nodes and the internal table. This synchronization process is time-consuming. To avoid this synchronization every time a maximum value is specified, the system will abandon the current cache when a maximum value is inserted. This way, no further synchronization is needed until the next cache is allocated.
For example, when the OBServer nodes hosting the partitioned table, namely, OBServer1, OBServer2, and OBServer3, sequentially receive requests to insert an explicitly specified incrementing sequence (1, 2, 3, ...), and assuming that these nodes have caches:
- OBServer1 receives the value 1, abandons the cache [1,100], and then obtains a new cache range [301,400] from the internal table. It also synchronizes the value 101 to the internal table and other OBServer nodes.
- OBServer2 receives the value 2, which is smaller than the current cache range [101,200], so it takes no action.
- OBServer3 receives the value 3, which is smaller than the current cache range [201,300], so it takes no action.
- OBServer1 receives the value 4, which is smaller than the current cache range [301,400], so it takes no action.
In this case, if some values are inserted and then the auto-increment column is used to generate a sequence, value gaps will occur in the auto-increment values. For example, OBServer1 skips from the initial cache range [1,100] directly to 301.
In addition to multi-node environments, in single-node environments, inserting a specified maximum value can also cause value gaps in auto-increment values. Here is an example:
Create a table named t1 with an auto-increment column:
obclient> CREATE TABLE t1 (c1 int not null AUTO_INCREMENT) AUTO_INCREMENT_MODE = 'NOORDER';
Set the AUTO_INCREMENT_CACHE_SIZE parameter to 100.
Insert data into the table multiple times:
obclient> INSERT INTO t1 VALUES (NULL);
obclient> INSERT INTO t1 VALUES (3);
obclient> INSERT INTO t1 VALUES (NULL);
After the insertion is successful, view the data in the table:
obclient> SELECT * FROM t1;
The query result is as follows:
+-----+
| c1 |
+-----+
| 1 |
| 3 |
| 101 |
+-----+
We can see that the auto-increment column jumps from 3 to 101.
Scenario 3: Server restart or node failure
The cache for the auto-increment column is an in-memory structure. If an OBServer node experiences a restart or failure, the unused portion of the cache range on that node will not be written back to the internal table. This means that the unused part of the cache range will not be used again. For example, assume that the initial cache range for the auto-increment column on OBServer1 is [1,100], and auto-increment values 1 and 2 have already been generated. If OBServer1 fails, after it restarts, the cache range on the node becomes a new range [101,200], and the next auto-increment value is 101. This results in an auto-increment value sequence of 1, 2, 101, ..., indicating a value gap.
Value gap analysis in ORDER mode
To address the auto-increment value gap issue in NOORDER mode, OceanBase Database introduced ORDER mode in V4.x. This mode offers better compatibility with MySQL databases. It prevents auto-increment value gaps when generating values across multiple servers and partitions, and also when inserting a specified maximum value using an INSERT statement.
Although ORDER mode resolves the value gap issues in scenarios such as generating auto-increment values across multiple servers and partitions, and inserting a specified maximum value with an INSERT statement, auto-increment value gaps can still occur if the OBServer node serving as the leader restarts or fails, or if a leader switch occurs.
Scenario 1: Server restart or node failure
In ORDER mode, the OBServer node serving as the leader maintains an in-memory cache range of auto-increment values. If the node hosting the leader fails or restarts, the unused auto-increment values in the current cache range are not reused; instead, a new cache range is allocated, resulting in a value gap.
Note
In this scenario, a value gap only occurs if the OBServer node serving as the leader fails or restarts. Follower OBServer nodes do not maintain a cache, so even if they fail, the continuity of auto-increment value generation is not affected.
Scenario 2: Leader switchover
Assume the initial auto-increment cache range for the auto-increment column on OBServer2 is [1, 100], and values 1 and 2 have already been generated. During a leader switchover:
The leader switches to OBServer1. OBServer1 requests a new auto-increment range [101, 200] from the internal table and continues to generate values 101 and 102. After OBServer2 restarts, the leader switches back to OBServer2, which continues to use the previous cache range [3, 100] to generate values 3 and 4. This shows that the auto-increment values are not monotonically increasing, as the value drops from 101 to 3.
To prevent non-increasing auto-increment values due to repeated leader switches, OceanBase Database clears the cache range of the original leader OBServer node during a leader switch, which causes a value gap.
Behavior of value gaps in different modes
| Scenario | NOORDER mode (best performance) | ORDER mode (maximum compatibility) |
|---|---|---|
| Generating auto-increment values in multi-machine, multi-partition environments | Each machine maintains its own cache range, which may cause the insertion order in the table to experience value gaps. However, no auto-increment values are skipped or wasted due to these gaps. | All nodes request auto-increment values from the leader, so no value gaps occur. However, performance is lower compared to noorder mode. |
| Explicitly specifying auto-increment values (insert, insert on duplicate key update, replace) | If an explicit value is inserted into the auto-increment column, the system refreshes the auto-increment value cache range on all nodes to ensure subsequent values are not less than the specified value. This cache refresh can cause value gaps and some values to be skipped or wasted. | No value gap occurs. |
| INSERT ON DUPLICATE UPDATE/REPLACE INTO scenarios without specifying auto-increment values | Earlier versions required a global cache refresh, but this is no longer necessary in the latest versions of all branches. No special value gap occurs in this scenario. | Earlier versions required a global cache refresh, but this is no longer necessary in the latest versions of all branches. No special value gap occurs in this scenario. |
| Machine restart or failure | After a machine restarts or fails, any unused values in the cache range cannot be reused, and a new cache range must be obtained. This can result in value gaps and skipped auto-increment values due to the restart or failure. | The leader node maintains a cache range. If the leader node restarts or fails, unused auto-increment values in the cache range are not reused, leading to value gaps and skipped values. Note that this only occurs on the leader node; follower nodes do not maintain a cache, so their restart or failure does not affect the continuity of auto-increment value generation. |
| Active leader switchover (such as upgrading or downgrading OBServer) | There is no mechanism to prevent value gaps during a leader switchover. | In earlier versions, a leader switchover would clear the cache range of the original leader node, resulting in value gaps and skipped values. Since v4.2.3, this has been optimized so that normal leader switchovers do not cause value gaps. However, note that a leader switchover can affect server availability, and some insert requests may be retried after the switchover, potentially causing minor value gaps. |
Causes of sequence gaps
Sequence gaps are primarily caused by the CACHE mechanism and ORDER mode:
Gaps caused by the CACHE mechanism: When CACHE is set to 100, the system preallocates 100 values. If a node fails, any unused values are lost. After a restart, the system starts from the next cache segment, resulting in a gap.
Gaps in NOORDER mode during concurrent operations: When multiple sessions simultaneously request NEXTVAL, the system does not guarantee the order, so values may be returned out of sequence.
ORDER mode ensures sequence but sacrifices performance: Using ORDER mode guarantees that the sequence strictly increases, but it requires locking and coordination, which impacts concurrent performance.
CACHE setting recommendations
| Setting | Recommended value | Description |
|---|---|---|
| Auto-increment cache | AUTO_INCREMENT_CACHE_SIZE = 1000000 (default) |
A large cache reduces metadata contention and improves performance, but can cause large gaps during switchover. You can decrease the cache size as needed, such as setting it to 10,000. Table-level settings are supported. |
| Sequence cache | CACHE 100 ~ 1000 |
Balances performance and the risk of gaps. Avoid using NOCACHE in production environments, as it leads to poor performance. |
Note
- If your business cannot tolerate gaps, you can disable caching (NOCACHE), but you will accept a performance drop.
- After V4.2.3, sequences support order + cache, which can also prevent gaps.
ORDER setting recommendations
| Mechanism | Recommended setting | Description |
|---|---|---|
| Auto-increment column | Default ORDER mode | Ensures global increment. If you allow gaps and prioritize performance, you can set it to NOORDER (which only ensures uniqueness). |
| Sequence | NOORDER (default) | Provides optimal performance. Use ORDER only when strict increment is required. |
Notice
ORDER mode requires cross-node coordination and has high overhead. Use it only in scenarios that require strong ordering, such as financial applications.
Recommendations on data type selection
When should the data type of an auto-increment column be set to BIGINT?
- Customer data grows rapidly and is retained for long periods.
- Customers do not mind whether the auto-increment column uses BIGINT or INT.
- Leader failures and machine switchovers become more frequent (due to downtime or random load balancing), increasing the likelihood of value gaps.
- In NOORDER mode, the auto-increment value must be explicitly specified.
When is it acceptable to keep the data type of the auto-increment column as INT?
- The business data volume is well below the maximum limit of INT.
- Data is migrated from MySQL, and the business must use INT to avoid compatibility issues with applications.
- The system is deployed on a single node, and switchovers are rare.
- There is sufficient monitoring and operations capability to handle situations when the auto-increment value approaches its limit, such as rebuilding the table or converting INT to BIGINT (supported since V4.2.2).
Recommendations for mode selection
When can you change the AUTO_INCREMENT column to NOORDER mode?
- If you do not require ordered insertion for the AUTO_INCREMENT column and expect improved performance for high-concurrency operations, you can change the mode to NOORDER.
- If ordered insertion is required for the AUTO_INCREMENT column, but all leaders are located on a single OBServer node and you expect improved performance for high-concurrency operations, you can change the mode to NOORDER.
Note
In standalone mode, the performance difference between ORDER and NOORDER modes is not significant. However, in multi-node (multi-partition) scenarios, NOORDER mode outperforms ORDER mode under high concurrency.
When can you reduce the cache size of the AUTO_INCREMENT column?
- If skip gaps are prominent.
- If business traffic is very low.
- If performance is not a critical concern.
- If performance requirements are relatively high, but the system is running in standalone mode with all leaders concentrated on a single node.
Related configuration parameters
Global system variables
| Variable | Description | Default value |
|---|---|---|
AUTO_INCREMENT_CACHE_SIZE |
Number of auto-increment values cached. | 1,000,000 (V4.0 and later) |
AUTO_INCREMENT_INCREMENT |
Step size for auto-increment values. | 1. This variable can also be set at the session level. |
AUTO_INCREMENT_OFFSET |
Starting value for the auto-increment column. | 1. This variable can also be set at the session level. |
Tenant-level parameters
| Parameter | Description | Default value |
|---|---|---|
DEFAULT_AUTO_INCREMENT_MODE |
Default auto-increment mode for auto-increment columns. • order: Auto-increment values are continuous and incremental. • noorder: Auto-increment values are unique but not necessarily continuous. |
Before V4.0, only noorder was supported. V4.0 and later: order. |
Table options
| Option | Description | Default value |
|---|---|---|
AUTO_INCREMENT_MODE |
Auto-increment mode for the auto-increment column. | If not specified, the value of the DEFAULT_AUTO_INCREMENT_MODE parameter is used. |
AUTO_INCREMENT_CACHE_SIZE |
Number of auto-increment values cached in memory. | If not specified, the value of the AUTO_INCREMENT_CACHE_SIZE system variable is used. |
AUTO_INCREMENT_OFFSET |
Starting auto-increment value for the table. | 1. |
Performance optimization suggestions
How should you set relevant attributes when creating a sequence with performance overhead in mind?
If you set the ORDER attribute, each NEXTVALUE request must be processed by the central node to update a specific internal table in order to ensure global ordering. In high-concurrency scenarios, this can result in significant lock contention. If you do not require sequence values to be strictly increasing and only need them to be unique, it is recommended to set the sequence attribute to NOORDER.
If high performance is required, you should also pay attention to the CACHE and NOCACHE attributes:
- NOCACHE: Indicates that OBServer does not cache auto-increment values. In this mode, each NEXTVAL call triggers a SELECT and UPDATE on the internal table, which can impact database performance.
- CACHE: Specifies the number of auto-increment values cached in each OBServer's memory. The default value is 20.
Note
When creating a sequence, the default CACHE value is too small, so it should be specified manually. If the single-node TPS is 100, it is recommended to set the CACHE SIZE to 360,000.
You can also use ORDER + CACHE. In this configuration, the internal table is updated every CACHE values, which can help avoid value gaps.
Table-level CACHE SIZE specification
You can control the number of auto-increment values cached in memory per request for an auto-increment column by configuring AUTO_INCREMENT_CACHE_SIZE. Generally, the larger the cache size, the better the performance. However, if there are frequent cache gaps, the auto-increment column may be quickly exhausted, leading to primary key conflicts and errors.