Distributed transactions|V4.1.0| docs|Distributed Database

Distributed transactions

Last Updated：2023-07-28 02:55:42 Updated

The commit time of a transaction varies based on the transaction type. We recommend that you reduce the proportion of cross-server distributed transactions in performance tuning. You can optimize a transaction model from the following perspectives:

Overall business logic
Specific transactions
Proportions of multi-table and single-table transactions

Transaction types

SQL execution plans are divided into the local, remote, distributed, and uncertain types. A local execution plan indicates that the partition leader involved in the current statement is on the same server as that of the session. A remote execution plan indicates that the partition leader involved in the current statement is located on a server different from that of the session. A distributed or uncertain execution plan indicates that the relationship between the partition leader and the session is uncertain. They may be on the same server or multiple servers.

Single-server transactions demonstrate higher performance than distributed transactions. Based on the statistics on execution plan types, you can roughly estimate the proportion of distributed transactions and make the corresponding adjustments. Sample SQL statements:

MySQL [oceanbase]> select plan_type, count(1) from gv$ob_sql_audit where
request_time > time_to_usec('2021-08-24 18:00:00') group by plan_type;
+-----------+----------+
| plan_type | count(1) |
+-----------+----------+
|         1 |    17119 |
|         0 |     9614 |
|         3 |     4400 |
|         2 |    23429 |
+-----------+----------+
4 rows in set

The value of plan_type can be 1, 2, 3, or 4, indicating local plans, remote plans, distributed plans, or uncertain plans, respectively. Generally, 0 indicates SQL statements without a plan, such as set autocommit=0/1 and commit.

Analysis of non-local plans

Compared with single-server transactions, transactions for requests that use non-local plans (except for those whose plan_type is 0) may involve cross-server operations and affect the query performance. You can check for the following scenarios:

The primary zone is set to a single zone & a single unit.
The primary zone is set to a single zone & multiple units.
The primary zone is set to RANDOM.

Single zone & single unit for primary zone

Remote and uncertain plans are not expected for single-unit deployment for the following reasons:

The followers of the tenant are directly connected. You can check for the case based on the statistics on execution plan types.
The application is connected to OceanBase Database Proxy (ODP), but the first SQL statement of the transaction cannot be correctly forwarded by ODP. As a result, the session and the partition leader involved in the transaction are on different servers. In this case, check the all ODP logs in the cluster. Example of the key log information:
```
fail to caculate partition id, just use tenant server
```
The leaders of some partitions have just been switched, and the location cache maintained by the OBServer or ODP has not been updated.

For reason 1, connect the application to ODP. For reason 2, the current SQL statement is complex, and ODP cannot calculate the partitions to be accessed during parsing. Therefore, ODP has sent the statement randomly. In this case, you must adjust the SQL statement by adding the partitioning key. For reason 3, you do not need to take any action.

Single zone & multiple units for primary zone

In this scenario, the OBServer automatically balances the load among the partitions. Therefore, the partition leader and the session of a transaction are possibly on different servers. To avoid cross-server transactions, you can manage table groups to ensure single-server execution of transactions based on the execution of statements in a transaction.

Manage table groups:

-- Non-partition scenario
create tablegroup tg1;
create table t1 (id1 int, id2 int) tablegroup tg1;
create table t2 (id1 int, id2 int) tablegroup tg1;

-- HASH partition (MySQL mode)
create tablegroup tg2  partition by hash partitions 2;
create table pg_trans_test2_1(id1 int, id2 int)
  tablegroup tg2 partition by hash(id1 % 2) partitions 2;
create table pg_trans_test2_2(id1 int, id2 int)
  tablegroup tg2 partition by hash(id1 % 2) partitions 2;

-- HASH partition (Oracle mode)
create tablegroup tg2  partition by hash partitions 2;
create table pg_trans_test2_1(id1 int, id2 int)
  tablegroup tg2 partition by hash(id1) partitions 2;
create table pg_trans_test2_2(id1 int, id2 int)
  tablegroup tg2 partition by hash(id1) partitions 2;

-- RANGE partition (MySQL mode)
create tablegroup tg3  
  partition by range columns 1 (
          partition p0 values less than (10),
          partition p1 values less than(20));

create table pg_trans_test3_1(id1 int, id2 int) tablegroup tg3
  partition by range columns(id1)
      (partition p0 values less than (10),
         partition p1 values less than(20));
create table pg_trans_test3_2(id1 int, id2 int) tablegroup tg3
  partition by range columns(id1)
      (partition p0 values less than (10),
         partition p1 values less than(20));

-- RANGE partition (Oracle mode)
create tablegroup tg3  
  partition by range columns 1
      (partition p0 values less than (10),
         partition p1 values less than(20));
create table pg_trans_test3_1(id1 int, id2 int)
  tablegroup tg3 partition by range (id1)
      (partition p0 values less than (10),
         partition p1 values less than(20));
create table pg_trans_test3_2(id1 int, id2 int)
  tablegroup tg3 partition by range (id1)
      (partition p0 values less than (10),
         partition p1 values less than(20));

-- LIST partition (MySQL mode)
create tablegroup tg4 partition by list columns 1 (
    partition p0 values in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
    partition p1 values in (11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
);
create table pg_trans_test4_1(id1 int, id2 int) tablegroup tg4 partition by list columns(id1) (
    partition p0 values in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
    partition p1 values in (11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
);
create table pg_trans_test4_2(id1 int, id2 int) tablegroup tg4 partition by list columns(id1) (
    partition p0 values in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
    partition p1 values in (11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
);

-- LIST partition (Oracle mode)
create tablegroup tg4 partition by list columns 1 (
    partition p0 values (1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
    partition p1 values (11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
);
create table pg_trans_test4_1(id1 int, id2 int) tablegroup tg4 partition by list(id1) (
    partition p0 values (1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
    partition p1 values (11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
);
create table pg_trans_test4_2(id1 int, id2 int) tablegroup tg4 partition by list(id1) (
    partition p0 values (1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
    partition p1 values (11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
);

RANDOM for primary zone

The leaders in the tenant are distributed in RANDOM deployment mode, which means that cross-server transactions are at a high probability.

Transaction commit optimization solution

Single-table and multi-table single-server transactions

In OceanBase Database V4.0, the single-log stream architecture is modified. If the leaders of log streams involved in a transaction are distributed on the same server, the transaction is handled in the single-log stream model. This transaction model achieves optimal performance. Therefore, you do not need to modify any parameters.

Cross-server transactions

You may use one of the following solutions:

Utilize the multi-server capabilities as much as possible.
Implement traffic load balancing on OBServers.

Specific measures:

To avoid cross-server transactions, you can manage table groups to ensure single-server execution of transactions based on the execution of statements in a transaction. For more information, see the Single zone & single unit for primary zone section above.
For batch import scenarios, make full use of the parallel execution capability of PDML, which is available in OceanBase Database V3.2.x and later versions.
Adjust the number of network threads (net_thread_count) based on the load. By default, this parameter is set to 0. After the process starts, the number of network threads required for the server is calculated automatically based on the number of CPU cores on the current server. The calculation formula is min (6, cpu_core/8). You may manually adjust the value based on the actual situation. The parameter takes effect after the process restarts.

Considerations

In OceanBase Database V2.2.x and earlier versions, one-phase commit is not allowed.
In OceanBase Database V3.2.x and earlier versions, we recommend that you do not use partition groups in a production environment. To improve the performance, you can carry out performance tuning.