In databases, hotspot rows are rows that are frequently accessed and modified. With the growth of online transactions and the e-commerce industry, hotspot concurrency has become an increasing challenge for business systems. Typical scenarios include frequent balance updates to hot accounts within a short period, or flash sale events for popular products during massive online promotions. Hotspot row updates essentially involve high-concurrency modifications to specific fields, such as balance or inventory, of the same data row within a short timeframe. However, to maintain transactional consistency, a relational database must update a row through an inherently serial process: lock the row, update it, write and commit the log, and release the lock. As a result, the ability to handle hotspot row updates becomes a performance bottleneck for the database. The key to improving this capability lies in reducing the lock holding time during transactions.
Although the early lock release (ELR) technique has been proposed in academia for some time, its complex exception-handling scenarios have hindered the development of mature industrial implementations. To address this, OceanBase Database has conducted continuous exploration and introduced an ELR implementation based on a distributed architecture. This approach enhances the database's ability to handle concurrent updates to single rows in similar business scenarios. The ELR feature is now a key capability of OceanBase Database in scalable online transaction processing (OLTP).
This topic discusses best practices for updating hotspot rows, including technical background, performance stress testing, and considerations.
Technical background
Before optimization
When you initiate a COMMIT operation, the database triggers the log persistence process, which involves the following steps:
- Serialize the in-memory data and submit it to the local buffer manager.
- Send the log data to all followers.
- The database considers log persistence successful only after the logs are synchronized to a majority of the followers.
- The transaction releases the lock and returns a commit success response to the client.
During this process, the transaction holds a lock for the following operations: data writing, log serialization, log synchronization between the leader and followers, and log flushing to disk. In scenarios where OceanBase Database is deployed across five IDCs in three regions or where disk performance is suboptimal, the lock holding duration can be significantly extended, which severely impacts the performance of hotspot rows.
After optimization
In the optimization solution, the overall commit process remains unchanged, but the timing of unlocking is adjusted. In the new process, once the logs are serialized and submitted to the buffer manager, the transaction is unlocked immediately, without waiting for the logs to be synchronized to a majority of the followers. This effectively reduces the transaction's lock holding time. Once the transaction is unlocked, subsequent transactions can operate on the same row, enabling higher concurrency and improving system throughput.
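As a rough illustration of why early unlocking raises concurrency, consider two sessions updating the same row. The sketch below uses a hypothetical account table (all names are illustrative, not part of OceanBase): with ELR, session B can acquire the row lock as soon as session A's commit logs reach the buffer manager, rather than after majority persistence.
-- Hypothetical hotspot table; names are illustrative only.
CREATE TABLE account (id INT PRIMARY KEY, balance DECIMAL(16,2) NOT NULL);
INSERT INTO account VALUES (1, 1000.00);

-- Session A:
BEGIN;
UPDATE account SET balance = balance - 100 WHERE id = 1; -- row lock acquired
COMMIT; -- with ELR, the lock is released once the logs reach the buffer manager

-- Session B, running concurrently:
BEGIN;
UPDATE account SET balance = balance + 100 WHERE id = 1; -- waits only until A unlocks early
COMMIT;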
Performance analysis
Based on the optimization solution described above, the performance of hotspot row scenarios can be calculated using the following formula:
TPS = 1/(lock holding time per transaction on a hotspot row)
Here, the lock holding time refers to the interval from the start of locking to the completion of the transaction commit.
In the deployment scenario of five IDCs across three regions, the overall SQL execution time is approximately 30 ms, and the transaction's COMMIT response time (RT) is also around 30 ms. With this optimization, performance can essentially match that of an intra-city deployment.
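As a back-of-the-envelope application of the formula above, suppose the lock is held for the full ~30 ms commit path (an assumption for illustration, not a measurement):
TPS = 1/0.030 s ≈ 33 transactions per second on a single hotspot row
If ELR shortens the lock holding time to roughly the local serialization cost, say 1 ms (again an assumed figure), the ceiling rises to 1/0.001 s ≈ 1,000 TPS, which is why the cross-region deployment can approach intra-city performance.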
Performance stress testing
Before implementing best practices for hotspot row updates, it is necessary to conduct performance testing to evaluate the current system's performance and compare it with the results after optimization.
Environment information
The deployment environment includes one OceanBase Database Proxy (ODP) instance on an ecs.g6.8xlarge machine and one OBServer instance on an ecs.r6.4xlarge machine. To improve performance and reliability, the storage uses three independent ESSD PL1 disks, so that the installation directory (including system logs), the data files, and the redo logs are deployed and managed separately. This separation optimizes disk I/O performance and fault recovery capabilities.
The specific content of the deploy.yaml configuration file is as follows:
oceanbase-ce:
  version: 4.2.0.0
  servers:
    - name: server1
      ip: xxx.xx.xxx.01
  global:
    devname: eth0
    memory_limit: 120G # The maximum running memory for an observer
    log_disk_percentage: 90
    datafile_disk_percentage: 90
    enable_syslog_wf: false # Print system logs whose levels are higher than WARNING to a separate log file. The default value is true.
    enable_syslog_recycle: true # Enable auto system log recycling or not. The default value is false.
    max_syslog_file_count: 1000 # The maximum number of reserved log files before enabling auto recycling. The default value is 0.
    appname: obcluster
    root_password: ******
  server1:
    mysql_port: 2881 # External port for OceanBase Database. The default value is 2881. DO NOT change this value after the cluster is started.
    rpc_port: 2882 # Internal port for OceanBase Database. The default value is 2882. DO NOT change this value after the cluster is started.
    home_path: /data/1/ob/obd/observer
    data_dir: /data/3/ob/obd/storage
    redo_dir: /data/2/ob/obd/redo
    zone: zone1
obproxy-ce:
  version: 4.2.0.0
  depends:
    - oceanbase-ce
  servers:
    - xxx.xx.xxx.02
  global:
    listen_port: 2886 # External port. The default value is 2883.
    prometheus_listen_port: 2887 # The Prometheus port. The default value is 2884.
    home_path: /root/obproxy
    enable_cluster_checkout: false
    skip_proxy_sys_private_check: true
    enable_strict_kernel_release: false
Configuration information
Tenant specifications: 15 CPU cores and 95 GB of memory.
Test procedure
Step 1: Create a test table and insert test data
Based on the Sysbench benchmark, the schema for table creation is as follows:
CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`)
);
INSERT INTO sbtest1 VALUES (1, 0, 'aa', 'aa');
Step 2: Adjust the SQL statement for stress testing
Because the id column is the primary key, updating only the row where id = 1 concentrates all updates on a single hotspot row. Change the value of non_index_updates in the oltp_common.lua script to "UPDATE sbtest1 SET k=k+1 WHERE id=1":
non_index_updates = {
   -- Fixed statement: every request updates the same row, creating a hotspot.
   "UPDATE sbtest1 SET k=k+1 WHERE id=1",
   -- The stock parameter definitions can stay; with no '?' placeholders
   -- in the statement, they are simply never bound.
   {t.CHAR, 120}, t.INT},
Step 3: Modify database parameters
ALTER SYSTEM SET enable_sql_audit=false;
ALTER SYSTEM SET enable_perf_event=false;
Notice
Parameters are critical in hotspot row update scenarios. The following settings can further reduce overhead during stress testing, but we recommend that you do not modify these parameters in a production environment:
ALTER SYSTEM SET syslog_level='PERF';
ALTER SYSTEM SET enable_record_trace_log=false;
ALTER SYSTEM SET _enable_defensive_check=false;
ALTER SYSTEM SET _lcl_op_interval = '0ms';
ALTER SYSTEM set_tp tp_no = 2100, error_code = 4001, frequency = 1;
ALTER SYSTEM SET _trace_control_info='' tenant=all;
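Before and after a test run, it can help to confirm which values are actually in effect and then restore the two observability switches. A minimal check, using OceanBase's SHOW PARAMETERS statement (both parameters default to true):
-- Inspect the current values before and after the stress test.
SHOW PARAMETERS LIKE 'enable_sql_audit';
SHOW PARAMETERS LIKE 'enable_perf_event';

-- Restore the defaults once testing is done.
ALTER SYSTEM SET enable_sql_audit = true;
ALTER SYSTEM SET enable_perf_event = true;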
Step 4: Collect the stress test results
Each cell is in the format QPS/95th percentile response time (ms)/average response time (ms).
| Concurrency | Default parameter values | Tuned parameter values |
|---|---|---|
| 8 | 2990.36/1.93/2.78 | 3283.19/1.96/2.68 |
| 16 | 3191.89/2.35/5.05 | 3775.30/2.03/4.42 |
| 32 | 3445.89/29.72/0.62 | 4409.73/3.68/7.52 |
| 64 | 3639.10/106.75/17.90 | 5323.44/102.97/12.61 |
| 128 | 3842.81/207.82/31.37 | 6347.05/110.66/20.69 |
| 256 | 5006.01/325.98/51.84 | 8026.34/211.60/32.14 |
| 512 | 6243.97/530.08/82.22 | 10016.74/320.17/50.71 |
| 1024 | 6334.96/802.05/162.51 | 10337.51/559.50/106.93 |
| 2048 | 4790.85/1506.29/413.44 | 7536.00/1109.09/289.70 |
Considerations
When verifying, testing, and tuning hotspot row performance, keep the following points in mind:
- Parameters are essential in hotspot row update scenarios. We recommend that you do not modify the parameters given in Step 3, except enable_sql_audit and enable_perf_event; adjust the others only based on the actual situation in the production environment.
- The sequence of operations on a table with hotspot rows within a transaction significantly affects performance. Place operations on hotspot rows as close to the COMMIT statement as possible to maximize performance, as shown in the sketch after this list.
- If there is high network latency between the leader and follower replicas, the client must use higher concurrency to achieve throughput equivalent to that of a single-IDC deployment.
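A minimal sketch of the ordering point, using hypothetical orders, buyer, and item_stock tables: because the hotspot UPDATE is issued immediately before COMMIT, its row lock is held only for the final step of the transaction rather than across all of it.
BEGIN;
-- Non-hotspot work first: these rows see little contention.
INSERT INTO orders (order_id, item_id, qty) VALUES (1001, 1, 1);
UPDATE buyer SET points = points + 10 WHERE buyer_id = 42;
-- Hotspot row last: its lock is held only from here to COMMIT.
UPDATE item_stock SET stock = stock - 1 WHERE item_id = 1;
COMMIT;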