SQL execution
How do I locate errors in PL object creation?
You can execute the SHOW ERRORS statement to view the information about errors that occurred when stored procedures are created.
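For illustration, a hedged Oracle-mode sketch; the procedure name and the deliberate error are made up:

```sql
-- The SELECT below lacks an INTO clause, so creation reports a compilation error.
CREATE OR REPLACE PROCEDURE bad_proc AS
BEGIN
  SELECT 1 FROM DUAL;
END;
/
SHOW ERRORS;
```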
How do I analyze PL error logs?
You only need to pay attention to the error information returned to the client. OceanBase Database logs errors that occur during resolve operations, and if a PL object contains an EXCEPTION WHEN OTHERS handler, error information is also recorded in the logs when that handler is executed. The information recorded in PL error logs is therefore not reliable and can be ignored.
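For context, a minimal Oracle-mode sketch of the EXCEPTION WHEN OTHERS construct referred to above; the table name and message are illustrative:

```sql
CREATE OR REPLACE PROCEDURE demo_proc AS
BEGIN
  INSERT INTO t1 VALUES (1);
EXCEPTION
  WHEN OTHERS THEN
    -- Entering this handler causes error information to be written to server logs.
    DBMS_OUTPUT.PUT_LINE('caught: ' || SQLERRM);
END;
/
```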
How do I query the source code of a created PL object?
You can query the DBA_SOURCE, ALL_SOURCE, or USER_SOURCE view for the source code of a created PL object. The value in the TEXT column is the source code of the PL object.
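For example, a minimal sketch; the object name is illustrative:

```sql
SELECT TEXT
FROM USER_SOURCE
WHERE NAME = 'BAD_PROC'  -- PL object name, typically stored in uppercase
ORDER BY LINE;
```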
Is the INSERT ALL INTO syntax supported in the MySQL mode of OceanBase Database?
No. Currently, the INSERT ALL INTO syntax is supported only in Oracle mode.
How do the SQL engine and transaction engine allocate and isolate resources for tenants?
Both the SQL engine and transaction engine isolate resources among tenants, where:
- The plan cache of the SQL engine and the locks of the transaction engine are totally independent of each other.
- CPU: The CPU resources available for the SQL threads of a tenant are limited to control the number of active SQL threads.
- Memory: The SQL memory and transaction memory of different tenants are separately managed. If the memory of a tenant is used up, other tenants are not affected.
- Thread: The threads of the SQL engine and transaction engine of different tenants are totally independent of each other. If the threads of a tenant are suspended, other tenants are not affected.
What is batch execution in OceanBase Database?
When you use Java Database Connectivity (JDBC) in OceanBase Database, you can place multiple requests in the same group and transmit the requests at the same time. This is called batch execution or batch processing.
Why is batch execution used?
Sometimes, batch execution is used for data consistency. More often, batch execution is used to improve performance from the following aspects:
- Batch statements are rewritten to improve performance.
- Batch execution can reduce the number of interactions with the database.
- When OceanBase Database receives a batch execution request, it can perform optimization to further improve the performance.
Which classes/objects in JDBC are suitable for batch execution?
Both statements and prepared statements are suitable for batch execution.
Which JDBC attributes do I need to set to use batch execution?
- In terms of features, you must set `rewriteBatchedStatements` to `TRUE` to use batch execution.
- In terms of implementation and behavior, `useServerPrepStmts` determines the batch execution behavior.
- In terms of performance, `cachePrepStmts`, `prepStmtCacheSize`, `prepStmtCacheSqlLimit`, and `maxBatchTotalParamsNum` can all improve the performance.
The following table describes the attributes.
| Attribute | Default value | Description |
|---|---|---|
| allowMultiQueries | FALSE | Specifies whether semicolons (;) can be used to join requests in a statement. Batch execution does not depend on this attribute but on the rewriteBatchedStatements attribute. |
| rewriteBatchedStatements | FALSE | Specifies whether to rewrite INSERT statements in batch execution. |
| useServerPrepStmts | FALSE | Specifies whether to use prepared statements on the server. This attribute is valid only for prepared statements. |
| cachePrepStmts | FALSE/TRUE | Specifies whether the JDBC driver caches prepared statements. The cached content varies according to the prepared statements on the server and client. **Note**: The default value is FALSE for OceanBase Connector/J 1.x, and TRUE for OceanBase Connector/J 2.x. |
| prepStmtCacheSize | 25/250 | The number of prepared statements that can be cached. This attribute is valid only when cachePrepStmts is specified. **Note**: The default value is 25 for OceanBase Connector/J 1.x, and 250 for OceanBase Connector/J 2.x. |
| prepStmtCacheSqlLimit | 256/2048 | The maximum size of an SQL statement that can be cached. This attribute is valid only when cachePrepStmts is specified. **Note**: The default value is 256 for OceanBase Connector/J 1.x, and 2048 for OceanBase Connector/J 2.x. |
| maxBatchTotalParamsNum | 30000 | The maximum number of parameters that the executeBatch() method can take. **Note**: This attribute is valid only in OceanBase Connector/J 2.2.7 and later. |
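As an illustration, these attributes are typically appended to the JDBC connection URL. A minimal sketch; the host, port, schema, credentials, and cache sizes are placeholders to adapt to your environment:

```java
// Hypothetical connection URL; host/port/schema and the cache sizes are placeholders.
String obUrl = "jdbc:oceanbase://127.0.0.1:2883/test?"
    + "rewriteBatchedStatements=TRUE"
    + "&useServerPrepStmts=TRUE"
    + "&cachePrepStmts=TRUE"
    + "&prepStmtCacheSize=250"
    + "&prepStmtCacheSqlLimit=2048";
Connection conn = DriverManager.getConnection(obUrl, "username", "password");
```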
Which parameters in OceanBase Database are related to batch execution?
The following tables describe the parameters and variables related to batch execution.
| Parameter | Default value | Scope | Effective mode | Description |
|---|---|---|---|---|
| ob_enable_batched_multi_statement | FALSE | Tenant | Dynamic | Specifies whether to enable batch processing of multiple statements. If this parameter is set to TRUE, when the client and server use the text protocol for communication in a batch execution scenario, OceanBase Database parses multiple UPDATE statements of the same format as one statement and generates a batch physical plan based on the corresponding parameters and distribution of data. |
| _ob_enable_prepared_statement | FALSE | Cluster | Dynamic | Specifies whether to use prepared statements on the server. |
| _enable_static_typing_engine | TRUE | Cluster | Dynamic | Specifies whether to use a new SQL engine. The old engine can process only batch UPDATE statements that contain all primary keys. The new engine can process batch UPDATE statements that do not contain all primary keys. |
| Variable | Default value | Level | Description |
|---|---|---|---|
| _enable_dist_data_access_service | TRUE | SESSION/GLOBAL | Specifies whether to execute SQL statements in DAS mode. This variable must be set to TRUE if the optimization capability of batch UPDATE is needed. |
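For reference, a hedged sketch of enabling these settings; note that depending on your OceanBase version, ob_enable_batched_multi_statement may be exposed as a global system variable rather than a parameter, in which case SET GLOBAL applies:

```sql
-- Assumption: parameter form, per the table above. Some versions expose this
-- as a variable instead: SET GLOBAL ob_enable_batched_multi_statement = 1;
ALTER SYSTEM SET ob_enable_batched_multi_statement = TRUE;

-- Enable DAS execution, required for the batch UPDATE optimization.
SET GLOBAL _enable_dist_data_access_service = 1;
```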
What are the differences between statements and prepared statements in terms of behavior and use methods?
Statement:

```java
Connection conn = DriverManager.getConnection(obUrl);
conn.setAutoCommit(false);
Statement stmt = conn.createStatement();
String SQL = "INSERT INTO test1 (c1, c2) VALUES (1, 'test11')";
stmt.addBatch(SQL);
SQL = "INSERT INTO test1 (c1, c2) VALUES (2, 'test12')";
stmt.addBatch(SQL);
SQL = "INSERT INTO test1 (c1, c2) VALUES (3, 'test13')";
stmt.addBatch(SQL);
int[] count = stmt.executeBatch();
stmt.clearBatch();
conn.commit();
```

Prepared statement:

```java
Connection conn = DriverManager.getConnection(obUrl);
conn.setAutoCommit(false);
String SQL = "INSERT INTO TEST1 (C1, C2) VALUES (?, ?)";
PreparedStatement pstmt = conn.prepareStatement(SQL);
int rowCount = 5, batchCount = 10;
for (int k = 1; k <= batchCount; k++) {
    for (int i = 1; i <= rowCount; i++) {
        pstmt.setInt(1, (k * 100 + i));
        pstmt.setString(2, "test value");
        pstmt.addBatch();
    }
    int[] count = pstmt.executeBatch();
    pstmt.clearBatch();
}
conn.commit();
pstmt.close();
```
The following tables describe the batch execution behavior for prepared statements and statements when rewriteBatchedStatements is set to TRUE.
Prepared statement:
| useServerPrepStmts | INSERT | UPDATE | Scenario |
|---|---|---|---|
| TRUE | The `VALUES` clauses of multiple `INSERT` statements are concatenated with commas (,) in one `INSERT` statement in the format of `INSERT INTO TEST1 VALUES (?), (?),..., (?)`. | In multiple independent `UPDATE` statements, the variables are replaced with question marks (?). | Scenario 1 |
| FALSE | The `VALUES` clauses of multiple `INSERT` statements are concatenated with commas (,) in one `INSERT` statement in the format of `INSERT INTO TEST1 VALUES (1), (2), ...,(10)`. | Multiple independent `UPDATE` statements are concatenated with semicolons (;). | Scenario 2 |
Statement:
| useServerPrepStmts | INSERT | UPDATE | Scenario |
|---|---|---|---|
| TRUE | Multiple independent `INSERT` statements are concatenated with semicolons (;). | Multiple independent `UPDATE` statements are concatenated with semicolons (;). | Scenario 3 |
| FALSE | Multiple independent `INSERT` statements are concatenated with semicolons (;). | Multiple independent `UPDATE` statements are concatenated with semicolons (;). | Scenario 4 |
What are the batch execution types in OceanBase Database? What are their optimization measures on requests?
In OceanBase Database, the processing of batch execution on INSERT, UPDATE, and DELETE statements is different.
Note
Assume that rewriteBatchedStatements is set to TRUE in all the following scenarios.
INSERT
Scenario 1
The following table describes the scenario where the PreparedStatement object is used.

| useServerPrepStmts | INSERT |
|---|---|
| TRUE | The `VALUES` clauses of multiple `INSERT` statements are concatenated with commas (,) in one `INSERT` statement in the format of `INSERT INTO TEST1 VALUES (?), (?),..., (?)`. |

In scenario 1, an OBServer node receives a `COM_STMT_PREPARE` (request_type=5) request and a `COM_STMT_EXECUTE` (request_type=6) request for the `INSERT` statement. The benefits are as follows:

- The batch execution of the `INSERT` statement is completed in two communications.
- The natural attributes of a prepared statement help reduce the compilation time.
- If more batch execution requests are initiated and `cachePrepStmts` and related parameters are properly set, the number of Prepare requests (request_type=5) can be reduced and only the Execute request (request_type=6) needs to be executed.
Scenario 2
The following table describes the scenario where the PreparedStatement object is used.

| useServerPrepStmts | INSERT |
|---|---|
| FALSE | The `VALUES` clauses of multiple `INSERT` statements are concatenated with commas (,) in one `INSERT` statement in the format of `INSERT INTO TEST1 VALUES (1), (2), ...,(10)`. |

In scenario 2, the OBServer node receives a `COM_QUERY` (request_type=2) request for the `INSERT` statement. The benefit is as follows:

- The batch execution of the `INSERT` statement is completed in only one communication.
Scenario 3/4
The following table describes the scenario where the Statement object is used.

| useServerPrepStmts | INSERT | Scenario |
|---|---|---|
| TRUE | Multiple independent `INSERT` statements are concatenated with semicolons (;). | Scenario 3 |
| FALSE | Multiple independent `INSERT` statements are concatenated with semicolons (;). | Scenario 4 |

In scenarios 3 and 4, the OBServer node receives a request in which multiple independent `INSERT` statements are concatenated with semicolons and executes the statements in sequence. The benefit is as follows:

- The batch execution of the `INSERT` statements is completed in only one communication.
UPDATE
The following table describes the scenario where the PreparedStatement object is used.
| useServerPrepStmts | UPDATE | Scenario |
|---|---|---|
| TRUE | In multiple independent UPDATE statements, the variables are replaced with question marks (?). | Scenario 1 |
| FALSE | Multiple independent UPDATE statements are concatenated with semicolons (;). | Scenario 2 |
The following table describes the scenario where the Statement object is used.
| useServerPrepStmts | UPDATE | Scenario |
|---|---|---|
| TRUE | Multiple independent UPDATE statements are concatenated with semicolons (;). | Scenario 3 |
| FALSE | Multiple independent UPDATE statements are concatenated with semicolons (;). | Scenario 4 |
- If `ob_enable_batched_multi_statement` is not set to `TRUE`, for batch `UPDATE` execution in scenarios 1, 2, 3, and 4, the OBServer node executes the statements one by one without optimization.
- If `ob_enable_batched_multi_statement` is set to `TRUE`, for batch `UPDATE` execution in scenarios 2, 3, and 4, the OBServer node parses multiple statements of the same format as one statement and generates a batch physical plan based on the corresponding parameters and distribution of data. This effectively improves the batch execution efficiency. To use this feature, you must commit transactions explicitly.
DELETE

In the current version, no optimization is available for batch DELETE execution.
How do I choose different configurations in different scenarios?
Note
We recommend that you use a later version of the oceanbase-client JAR package.
The following tables describe the batch execution behavior for prepared statements and statements when rewriteBatchedStatements is set to TRUE.
The following table describes the scenario where the PreparedStatement object is used.
| useServerPrepStmts | INSERT | UPDATE | Scenario |
|---|---|---|---|
| TRUE | The `VALUES` clauses of multiple `INSERT` statements are concatenated with commas (,) in one `INSERT` statement in the format of `INSERT INTO TEST1 VALUES (?), (?),..., (?)`. | In multiple independent `UPDATE` statements, the variables are replaced with question marks (?). | Scenario 1 |
| FALSE | The `VALUES` clauses of multiple `INSERT` statements are concatenated with commas (,) in one `INSERT` statement in the format of `INSERT INTO TEST1 VALUES (1), (2), ...,(10)`. | Multiple independent `UPDATE` statements are concatenated with semicolons (;). | Scenario 2 |
The following table describes the scenario where the Statement object is used.
| useServerPrepStmts | INSERT | UPDATE | Scenario |
|---|---|---|---|
| TRUE | Multiple independent `INSERT` statements are concatenated with semicolons (;). | Multiple independent `UPDATE` statements are concatenated with semicolons (;). | Scenario 3 |
| FALSE | Multiple independent `INSERT` statements are concatenated with semicolons (;). | Multiple independent `UPDATE` statements are concatenated with semicolons (;). | Scenario 4 |
Batch INSERT

Scenarios 1 and 2 best showcase the benefits of batch execution and are therefore recommended.
Scenario 1
JDBC object: prepared statement
Parameter on the server:

```
_ob_enable_prepared_statement=TRUE
```

JDBC attributes:

```
rewriteBatchedStatements=TRUE
useServerPrepStmts=TRUE
cachePrepStmts=TRUE
prepStmtCacheSize=<Specify the value based on the actual situation.>
prepStmtCacheSqlLimit=<Specify the value based on the actual situation.>
maxBatchTotalParamsNum=<Specify the value based on the actual situation.>
```

Scenario 2
JDBC object: prepared statement
JDBC attributes:

```
rewriteBatchedStatements=TRUE
useServerPrepStmts=FALSE
```
Batch UPDATE

In scenarios 2, 3, and 4, the text protocol is used for communication, so the batch UPDATE optimization can take effect. Therefore, the configurations of these scenarios are recommended.
Scenario 2
JDBC object: prepared statement
Parameters on the server:

```
ob_enable_batched_multi_statement=TRUE
_enable_static_typing_engine=TRUE
```

Variable on the server:

```
_enable_dist_data_access_service=1
```

JDBC attributes:

```
rewriteBatchedStatements=TRUE
useServerPrepStmts=FALSE
allowMultiQueries=TRUE
```

These settings aim to avoid behavioral differences among different JDBC driver versions.
Scenario 3/4
JDBC object: statement
Parameters on the server:

```
ob_enable_batched_multi_statement=TRUE
_enable_static_typing_engine=TRUE
```

Variable on the server:

```
_enable_dist_data_access_service=1
```

JDBC attributes:

```
rewriteBatchedStatements=TRUE
allowMultiQueries=TRUE
```

These settings aim to avoid behavioral differences among different JDBC driver versions.
How can I check whether batch execution takes effect in OceanBase Database?
You can query the gv$sql_audit view to observe whether batch execution takes effect.
In scenario 1, if batch INSERT takes effect, you will see the following records in the gv$sql_audit view:

```
query_sql: insert into test_multi_queries (c1, c2) values (?, ?)
request_type: 5
ps_stmt_id: 1

query_sql: insert into test_multi_queries (c1, c2) values (?, ?),(?, ?),(?, ?)
request_type: 5
ps_stmt_id: 2

query_sql: insert into test_multi_queries (c1, c2) values (?, ?),(?, ?),(?, ?)
request_type: 6
ps_stmt_id: 2
```

In scenario 2, if batch INSERT takes effect, you will see the following record in the gv$sql_audit view:

```
query_sql: insert into test_multi_queries (c1, c2) values (1, 'PreparedStatement; rewriteBatchedStatements=true&allowMultiQueries=true&useLocalSessionState=true'),(2, 'PreparedStatement; rewriteBatchedStatements=true&allowMultiQueries=true&useLocalSessionState=true'),(3, 'PreparedStatement; rewriteBatchedStatements=true&allowMultiQueries=true&useLocalSessionState=true')
```

In scenario 2, if batch UPDATE takes effect, you will see the following record in the gv$sql_audit view:

```
query_sql: update test2 set c2='batch update1' where c1=1;update test2 set c2='batch update2' where c1=2;update test2 set c2='batch update3' where c1=3
ret_code: 0
is_batched_multi_stmt: 1
```

Notice

If ret_code is -5787, batch UPDATE does not take effect. In this case, you need to find out the cause based on the preceding description.
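A hedged sketch of pulling these fields out of the audit view; the filter string is illustrative, and the column names follow the records shown above:

```sql
SELECT query_sql, request_type, ret_code, is_batched_multi_stmt
FROM oceanbase.gv$sql_audit
WHERE query_sql LIKE 'insert into test_multi_queries%'
ORDER BY request_id DESC LIMIT 10;
```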
What is the value returned by the executeBatch() method during batch execution?
After the executeBatch() method is called, it returns an int[] array. For batch INSERT and batch UPDATE:

- If the statements are ultimately executed one by one, the array contains the number of rows updated by each operation in the batch.
- If the statements are executed as a whole, for example, when the JDBC driver rewrites multiple `INSERT` statements as one `INSERT` statement with multiple values (scenarios 1 and 2), or when the `UPDATE` statements are executed as a batch physical plan (scenario 2), each element in the array is -2, indicating that the execution was successful but the number of updated rows is unknown.
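For illustration, a minimal sketch of interpreting the returned array; it reuses the pstmt from the earlier prepared-statement example, and Statement.SUCCESS_NO_INFO is the standard JDBC constant for -2:

```java
int[] counts = pstmt.executeBatch();
for (int c : counts) {
    if (c == Statement.SUCCESS_NO_INFO) {  // -2: success, row count not reported
        System.out.println("executed as a whole; updated row count unknown");
    } else {
        System.out.println("rows updated: " + c);
    }
}
```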
How do I address the "large account" issue in SQL queries?
What is the "large account" issue?
The "large account" issue refers to a situation during SQL statement parsing and execution where the same SQL ID corresponds to different execution plans, resulting in significant performance differences under different query conditions. For example, after a SQL statement is parameterized, there may be an execution plan in the plan cache that is not suitable for all scenarios. This can lead to the following effects:
When the system executes a SQL statement for the first time (such as `select * from items where store = 'taobao';`), the generated execution plan (p1) might be a full table scan, which is suitable for cases with very low selectivity.

However, if another SQL statement with the same parameterized form is executed afterward (such as `select * from items where store = 'xiaomaibu';`) and its selectivity is very high, the cached plan p1 may no longer be optimal, resulting in degraded performance.

How to identify "large account" SQL statements?
The key to identifying "large account" SQL statements is to monitor the fluctuations of logical reads, affected rows, and returned rows. The specific rules are as follows:
- The fluctuation of logical reads exceeds 1,000 rows.
- The fluctuation of affected rows exceeds 1,000 rows.
- The fluctuation of returned rows exceeds 10,000 rows.
You can manually query the execution records and statistics in the `v$ob_sql_audit` view to identify "large account" SQL statements.

How to solve the "large account" SQL issue:
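A hedged sketch of such a query; the column names follow common (G)V$OB_SQL_AUDIT conventions but may vary by version, the thresholds mirror the rules above, and logical reads can be checked in the same way:

```sql
-- Per SQL ID, measure the spread of returned/affected rows across executions.
SELECT sql_id,
       MAX(return_rows)   - MIN(return_rows)   AS return_rows_spread,
       MAX(affected_rows) - MIN(affected_rows) AS affected_rows_spread,
       COUNT(*)                                AS executions
FROM v$ob_sql_audit
GROUP BY sql_id
HAVING return_rows_spread > 10000 OR affected_rows_spread > 1000;
```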
Disable plan cache
Disable plan cache for a single SQL statement:
Use the `USE_PLAN_CACHE` hint, for example:

```sql
SELECT /*+ USE_PLAN_CACHE(NONE) */ * FROM items WHERE store = 'taobao';
```

Or set the session variable `ob_enable_plan_cache`:

```sql
SET ob_enable_plan_cache = 0;
```
The above methods for disabling plan caches require modifying the SQL statements in the application. If the issue is discovered only after the application is online, you must release a new version of the application to resolve it. You may refer to the method of binding hints using SQL Outline, which allows a database administrator to add controls to the SQL statements.
Bind hints using SQL outline: Use SQL outline to have the DBA control the execution plan of the SQL.
```sql
CREATE OUTLINE otl_no_plan_cache1 ON select /*+ USE_PLAN_CACHE(NONE) */ * from items where store = 'taobao';
# Or use SQL_ID; SQL_ID can be found in views such as V$OB_SQL_AUDIT.
CREATE OUTLINE otl_idx_c2 ON 'ED570339F2C856BA96008A29EDF04C74' USING HINT /*+ USE_PLAN_CACHE(NONE) */;
```
Enable SQL plan management (SPM)
We recommend that you enable SPM in OceanBase Database V4.2.1 and later. SPM can alleviate the "large account" issue in some cases, especially when a single acceptable execution plan exists among multiple execution plans. SPM automatically evolves plans to ensure that plans are positively evolved when the plan cache is cleared, thereby improving execution performance.
Address the performance degradation caused by the "large account" issue
Manually optimize the SQL statement: When no single execution plan is suitable, you can manually optimize the SQL statement in the following ways:
- Add indexes: You can build indexes to improve query performance.
- Add the `cursor_sharing_exact` hint: You can add this hint to large-account queries to obtain independent execution plans.
Limit resource consumption: To prevent large account queries from degrading system performance, you can limit resource consumption in the following ways:
Limit the maximum number of concurrent executions: You can use the `MAX_CONCURRENT` hint, for example:

```sql
CREATE OUTLINE otl_sql_throttle1 ON 'SQL_ID' USING HINT /*+ MAX_CONCURRENT(10) */;
```

Limit resource consumption at the SQL statement level: You can bind large-account queries to resource groups to limit resource consumption during execution, for example:

```sql
CALL DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING(
    ATTRIBUTE => 'column',
    VALUE => 'items.store = \'taobao\'',
    CONSUMER_GROUP => 'slow_group');
```

In OceanBase Database V4.3.3 and later, you can use hints to specify a resource group for a specific SQL statement:

```sql
select /*+ RESOURCE_GROUP('slow_group') */ * from items where store = 'taobao';
```

Automatically isolate large queries: You can set the `large_query_threshold` parameter to identify large queries and limit their CPU usage so that small queries are processed first. When large queries and small queries are processed at the same time, large queries can occupy at most 30% of the tenant's worker threads; this percentage can be adjusted by using the `large_query_worker_percentage` parameter.
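A hedged sketch of tuning these two knobs; the values are illustrative, and the sketch assumes both are set with ALTER SYSTEM:

```sql
-- Statements running longer than 5s are treated as large queries (illustrative value).
ALTER SYSTEM SET large_query_threshold = '5s';
-- Cap large queries at 20% of the tenant worker threads (illustrative value).
ALTER SYSTEM SET large_query_worker_percentage = 20;
```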
SQL tuning
Decrease in query performance due to physical design of the database
The query performance depends largely on the physical design of the database, including the schema information of the object to be accessed. For example, in a query that involves a secondary index but the required projected column is not included in the indexed columns, the cost of the query is much higher because of the table access by index primary key operations. In this case, you can add the projected column to the indexed columns to form a "covering index" to avoid table access by index primary key.
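A minimal sketch of the idea; the table, column, and index names are illustrative:

```sql
-- The query filters on c2 and projects c3.
-- With this index, reading c3 requires table access by index primary key:
CREATE INDEX idx_c2 ON orders (c2);
-- Adding the projected column makes it a covering index and avoids table access:
CREATE INDEX idx_c2_c3 ON orders (c2, c3);
SELECT c3 FROM orders WHERE c2 = 100;
```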
Response of an SQL query affected by the system workload
The overall load of the system not only affects the overall throughput but also changes the response time of individual SQL queries. In OceanBase Database, if all worker threads of the SQL engine are occupied, new queries are queued until a thread completes its current query. You can find the queue time of a query in the (G)V$OB_SQL_AUDIT view.
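A hedged sketch of checking queue time in the audit view; the column names follow the view's common schema, but verify them against your version:

```sql
SELECT sql_id, queue_time, elapsed_time
FROM oceanbase.GV$OB_SQL_AUDIT
ORDER BY queue_time DESC LIMIT 10;
```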
Incorrect selection of the execution plan due to cost model defects
OceanBase Database uses its built-in cost model. The selection of the optimal execution plan depends on this cost model. Therefore, if an incorrect plan is selected due to possible defects in this cost model, you can only bind the "correct" execution plan to avoid selecting the incorrect plan.
Routing feedback logic error occurred between the client and the server
One of the main features of OceanBase Database Proxy (ODP) is to route SQL queries to the right server. Specifically, if you do not specify a weak-consistency read in the query, ODP routes the query to the server where the leader replica of a partition is located to avoid secondary forwarding between servers. Otherwise, ODP forwards the query to the right server based on preset rules. The loose coupling between ODP and the server may cause latencies in refreshing the physical distribution of data cached on ODP, resulting in routing errors. The routing information may change in the following circumstances:
- Re-election of the leader due to load balancing
- Re-election of the leader between servers due to network instability
- Re-election of the leader due to the addition or removal of an OBServer node or rotating major compactions
You need to consider these circumstances when a large number of remote execution queries are found in the SQL Audit or plan cache. The routing feedback logic exists between the client and the server, so the client refreshes the physical distribution of data when an error occurs. Then, the routing function is restored.
Index monitoring
Index monitoring is enabled by default. Will it affect the performance?
Index monitoring affects performance. However, the impact in the default mode (SAMPLED) is slight and can be ignored.
Is the data collected in SAMPLED mode accurate? For example, is every query recorded?
Sampling aims to filter some of the data. Therefore, records of a specific query may be discarded.
Does the index monitoring module of the standby database collect statistics of itself or synchronize statistics from the primary database?
Statistics are collected at the tenant level. In the standby database, you can query data synchronized from the primary database but cannot modify the data.
Partition exchange
In Oracle mode of OceanBase Database, why does an error occur during partition exchange between a partitioned table and a non-partitioned table that have the same global index? Is this caused by index mismatch?
In the Oracle mode of OceanBase Database, if you do not add the `LOCAL` keyword when you create an index, the system creates a global index by default. However, partition exchange requires that the local indexes of the partitioned table match all indexes of the non-partitioned table one by one.
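A minimal sketch of the fix; the table, column, and index names are illustrative. Create the partitioned table's index as a local index and a matching index on the non-partitioned table before the exchange:

```sql
-- Partitioned table: declare the index LOCAL explicitly.
CREATE INDEX idx_part_c2 ON part_t (c2) LOCAL;
-- Non-partitioned table: create the matching index.
CREATE INDEX idx_nonpart_c2 ON nonpart_t (c2);
-- The exchange can then succeed (Oracle-mode syntax; some versions also
-- require WITHOUT VALIDATION).
ALTER TABLE part_t EXCHANGE PARTITION p0 WITH TABLE nonpart_t INCLUDING INDEXES;
```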
Replicated tables
What is a replicated table?
A replicated table is a special table in OceanBase Database that allows you to read the latest data modifications on any replica. It is recommended for business scenarios that have low write frequency and high requirements for read latency and load balancing.
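For reference, a minimal sketch of creating a replicated table; the table name and columns are illustrative, and the DUPLICATE_SCOPE option marks the table as replicated:

```sql
CREATE TABLE exchange_rate (
  currency VARCHAR(8) PRIMARY KEY,
  rate     DECIMAL(12, 6)
) DUPLICATE_SCOPE = 'CLUSTER';
```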
In which scenarios can I use replicated tables?
Scenario 1: Scenarios with low write frequency and high requirements for read latency and load balancing
If the data volume of a table is small and it is accessed frequently, data will be concentrated on one node, forming a hot spot. This affects performance. In this case, you can use a replicated table.
Typical scenarios for replicated tables:
- Configuration table. Business systems read configuration information from this table.
- Table for storing exchange rates in financial scenarios. This table is not updated in real time and is updated only once a day.
- Table for storing branch or branch office information in banks. This table rarely adds new records.
Here are some best practices for using replicated tables:
- We recommend that you create replicated tables on demand instead of creating a large number of replicated tables in a tenant.
- We recommend that you not write to and read from a replicated table in the same transaction.
- When you query a replicated table, if the query statement contains a `JOIN` clause, write the SQL statement in the order of a normal table `JOIN`ing the replicated table, so that ODP routes based on the normal table (see the join-order discussion in scenario 3 below).
Scenario 2: Scenarios where a table cannot be partitioned or does not need to be partitioned but frequently joins a partitioned table
For business reasons, some tables cannot be partitioned or do not need to be partitioned (for example, if there are no clear partitioning conditions for queries). However, these tables frequently join partitioned tables. To reduce unnecessary distributed queries across servers, you can set these tables as replicated tables. Each node supports strong reads. Regardless of which server the primary replica of a partitioned table is on, the node can join the replica of the replicated table.
In which scenarios should you not use replicated tables?
Scenario 1: You set one of the two partitioned tables to be a replicated table to avoid cross-node joins between the two partitioned tables.
This approach is not advisable. Such tables may not be suitable as replicated tables, and the expected performance gains may not materialize.
- If the table frequently performs write operations, writing to a replicated table incurs performance overhead. The data needs to be synchronized to all replicas. In theory, synchronization is concurrent. However, if the synchronization is very slow on some nodes due to network issues, the write performance of the replicated table will decrease.
- If the data volume of a replicated table is large, the table will have a replica on each node, occupying a large amount of storage space.
Scenario 2: A transaction contains both write and read operations on a replicated table.
We recommend that you perform pure write or pure read operations on a replicated table. You only need to consider whether you can accept write latency.
If a transaction contains both write and read operations on a replicated table, the transaction will degrade to a normal table read, which cannot leverage the advantage of a replicated table that can read from its local replica.
Scenario 3: You perform a join between a replicated table and a normal table.
ODP (obproxy) uses the first table parsed from the join statement as the routing table. If the first table is a replicated table, obproxy randomly selects a replica for routing. If the current node is not the leader of the normal table, remote routing will occur, which affects performance.
We recommend that you adjust the join order to join a normal partitioned table with a replicated table. This way, obproxy will route the normal partitioned table, and generate a local plan as expected to optimize performance.
Scenario 4: You set a replicated table as a partitioned table.
A replicated table has a replica on each node. Therefore, you do not need to set it as a partitioned table.
Frequently asked questions about auto-increment columns
When should you set the data type of an auto-increment column to BIGINT?
- The business data grows rapidly and the data retention period is long.
- The business does not care whether `BIGINT` or `INT` is used to create a table.
- Leaders are scattered, and leader switches are more likely (for example, due to failures or random load balancing), which increases the probability of jumps in auto-increment values.
- The `NOORDER` mode requires that you explicitly specify the auto-increment value.
When can you keep the data type of an auto-increment column as INT?
- The business data volume is far below the upper limit of `INT`.
- The business is migrated from MySQL, and the application may be incompatible if `BIGINT` is used.
- Leader switches rarely occur, as in a single-machine scenario.
- You have monitoring and O&M capabilities: when auto-increment values approach the upper limit, you can perform maintenance operations such as rebuilding tables and statistics, or changing the column type from `INT` to `BIGINT` (from V4.2.2, changing the column type from `INT` to `BIGINT` is an online DDL operation).
When can you change the auto-increment column to NOORDER?
- If you do not have a requirement for auto-increment values to be ordered at the column level and want to optimize the performance of high-concurrency operations, you can change `ORDER` to `NOORDER`.
- If you have a requirement for auto-increment values to be ordered at the column level, but all leaders are on one OBServer node and you want to optimize the performance of high-concurrency operations, you can change `ORDER` to `NOORDER`.
When can you reduce the value of auto_increment_cache_size?
- The problem of jumps in auto-increment values is prominent.
- The business traffic is low.
- Performance is not sensitive.
- Performance matters, but the business runs in single-machine mode and the leader is on one node.
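A hedged sketch of reducing the cache size; this assumes auto_increment_cache_size is exposed as a global system variable, and the value shown is illustrative:

```sql
-- A smaller cache means smaller jumps after a leader switch, at some throughput cost.
SET GLOBAL auto_increment_cache_size = 100;
```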