In a narrow sense, SQL tuning refers to optimizing the execution plan of an SQL statement and designing well-structured index combinations, where reducing the number of rows scanned is the most effective means of optimization. Beyond that, SQL tuning is inseparable from application and database optimization.
In a broad sense, SQL tuning looks for the best practices by which applications retrieve data from databases through SQL requests, which is also the focus of this topic. For a single SQL query, the optimization options are limited. For example, an SQL query without any filter conditions is almost impossible to optimize, and you may wonder whether the query is even necessary.
This topic provides typical scenarios and cases of SQL tuning in OceanBase Database to help you better understand common methods for SQL diagnostics and optimization. At the same time, we expect to gradually build a library of SQL tuning methods based on these cases, and implement GUI-based SQL diagnostics and optimization capabilities to facilitate SQL risk management and optimization.
SQL capacity management
High SQL performance consumption
SQL execution gradually slows down
Scenarios
SQL execution gradually slows down over a period of time. Here are the typical business scenarios:
- The business periodically writes data, so the data volume increases gradually. As a result, SQL execution slows accordingly.
- For paging SQL queries on large datasets, as the number of pages increases, more data needs to be scanned, which also slows the execution of the queries.
Optimization suggestions
We recommend that you take the following measures in the first scenario:
- Add a covering index and use `ORDER BY` + `LIMIT` to control the number of rows processed at a time, so that data is processed smoothly and SQL performance stays relatively constant (see the sketch after this list).
- Maintain a balance between business data production and downstream consumption as far as possible to prevent data accumulation. If accumulation occurs, you can migrate the accumulated data through asynchronous tasks to distribute it evenly, thus ensuring stable performance of downstream consumption.
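The following is a minimal sketch of the first suggestion; the `task_queue` table and its columns are hypothetical. A covering index lets each batch read only index entries, and `ORDER BY` + `LIMIT` caps the rows handled per round so that execution time stays roughly constant as the table grows:

```sql
-- Hypothetical task table: a covering index on (status, id) serves
-- both the filter and the ORDER BY without touching the base table.
CREATE INDEX idx_status_id ON task_queue (status, id);

-- Each round processes at most 1000 rows, regardless of table size.
SELECT id
FROM task_queue
WHERE status = 'PENDING'
ORDER BY id
LIMIT 1000;
```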
We recommend that you take the following measures in the second scenario:
Do not use an offset for paging queries; otherwise, the performance of queries on deep pages becomes poor. Instead, pass the primary key `id`. For example, each time you query a page, pass the primary key `id` (or another identifier) of the last row on the previous page to the current query. This way, you can quickly locate the data based on the passed value together with `ORDER BY ... LIMIT n`.
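As a hedged illustration (the `orders` table and its columns are hypothetical), compare offset-based paging with keyset paging:

```sql
-- Offset paging: the database scans and discards all skipped rows,
-- so deep pages get slower and slower.
SELECT id, order_no, amount
FROM orders
ORDER BY id
LIMIT 1000 OFFSET 999000;

-- Keyset paging: pass the id of the last row on the previous page.
-- The primary key locates the start position directly, so every page
-- scans only the rows it returns.
SELECT id, order_no, amount
FROM orders
WHERE id > 999000   -- last id returned by the previous page
ORDER BY id
LIMIT 1000;
```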
SQL queries spike
Scenarios
In this scenario, SQL queries are executed much more frequently in a short period of time, causing high performance consumption and even exhausting the CPU resources of a tenant. This may affect the overall performance of the database. Here are the typical business scenarios:
- Scheduled tasks for business
- Promotion campaigns
- Hotspot cache invalidation
Optimization suggestions
- For business: Add traffic control at the business layer to prevent requests from instantly using up database resources in scenarios such as hotspot cache invalidation, promotion campaigns, and scheduled tasks.
- For database: Calculate the CPU capacity required by the database and add buffer CPU capacity. Alternatively, enable tenant-level CPU over-allocation to allow the tenant to automatically use over-allocated CPU capacity in emergency situations.
- For SQL emergency response: Throttle abnormal SQL requests to restore the overall service of the database at the expense of a small part of the business, as shown in the sketch after this list.
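As a reference for the emergency measure, OceanBase Database supports SQL throttling by binding an outline that carries the `MAX_CONCURRENT` hint. The statement text in the sketch below is hypothetical; the outline syntax follows OceanBase conventions:

```sql
-- Limit the matching statement to at most 10 concurrent executions;
-- requests beyond the limit fail fast instead of exhausting tenant CPU.
CREATE OUTLINE ol_throttle_hot_query
ON SELECT /*+ MAX_CONCURRENT(10) */ * FROM orders WHERE user_id = ?;

-- Remove the throttle once the spike is over.
DROP OUTLINE ol_throttle_hot_query;
```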
SQL business hotspots
Read hotspots
Scenarios
A read hotspot arises when a row is intensively accessed by SQL requests from a single account in a short period of time. Read hotspots are a type of SQL execution spike scenario. However, if you can locate the abnormal account, you can control the traffic from that account, which affects only the requests of this account and has no other business impact.
Optimization suggestions
During SQL emergency response, you can throttle requests from the account that leads to a hotspot to lower the concurrency and recover the database.
Write and lock hotspots
Scenarios
A write hotspot usually triggers lock contention, which leads to a large number of SQL retries or lock waits and eventually causes database exceptions. A write hotspot often arises when two UPDATE or SELECT FOR UPDATE operations are concurrently performed on the same row.
Optimization suggestions
We recommend that you use the NOWAIT keyword when executing SELECT FOR UPDATE and implement a retry mechanism at the business code layer to avoid lock contention at the database layer. During SQL emergency response, you can throttle requests from the account that leads to a hotspot to lower the lock concurrency.
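A minimal sketch of this pattern (the `account` table is hypothetical): `NOWAIT` makes the statement return an error immediately when the row is already locked, so the waiting and retrying move out of the database and into the application:

```sql
BEGIN;
-- Fails immediately with a "row is locked" error if another session
-- holds the lock, instead of queuing on the hot row.
SELECT balance FROM account WHERE account_id = 42 FOR UPDATE NOWAIT;
-- On success: apply the change and commit.
UPDATE account SET balance = balance - 100 WHERE account_id = 42;
COMMIT;
-- On the lock error: ROLLBACK, then retry with backoff in the
-- application code.
```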
Inverted-pyramid data storage
Scenarios
Inverted-pyramid data storage reflects business data skew. Usually, this problem occurs when a type of accounts (referred to as large accounts) store much more data in a table than other accounts (referred to as small accounts). SQL queries of large accounts scan a great deal of data, leading to performance fluctuations.
Case analysis
Inverted-pyramid data storage is highly prone to occur in the following business scenario:
When a database system serves businesses of different scales, the business data is stored in a structure like an inverted pyramid, that is, a few businesses occupy most of the storage space. As a result, SQL requests of the businesses with more data perform much worse.
Optimization suggestions
Routine SQL tuning does not apply to this scenario. You can perform business structure optimization in the following ways:
- Split data of large and small accounts. In addition to horizontal sharding, you can also vertically split tables based on the business data volume.
- Divert SQL requests of large and small accounts. Since different accounts may have different SQL index requirements, diverting SQL requests at the account level lets you not only efficiently control which indexes the requests use but also provide priority-based guarantees for key businesses. For example, if you fail to identify the abnormal account during SQL emergency response, you can throttle SQL requests in only one channel to ensure the stability of other businesses.
- Use paging queries if possible. Query only one shard at a time with the condition `id > <id of the last row on the previous page>`. Then, aggregate the data at the business layer. This ensures constant SQL performance for both large and small accounts.
For SQL emergency response, although there are no ideal measures other than tenant specification scale-out and SQL throttling, you can still throttle SQL requests from large accounts.
SQL performance optimization
Execution plan optimization
Execution plan deterioration
Scenarios
OceanBase Database provides the plan cache mechanism for SQL tuning, which is described as follows:
- When receiving the first SQL request, the database hard-parses the request, calculates costs based on parameters in the request, generates an execution plan, and then writes the plan to the plan cache.
- When the same SQL request arrives again, the database checks whether it hits a plan in the plan cache. If so, the database directly executes the cached plan.
Hard parsing consumes a large amount of CPU resources, so the plan cache mechanism improves SQL performance and overall database throughput. However, it also brings problems. The quality of an execution plan depends on the parameters used for hard parsing: if the parameters in the first SQL request are a "minority", that is, they do not represent most requests, the generated plan may be inappropriate for the "majority". In addition, if data distribution changes and the original plan no longer suits the new scenario, the performance of the SQL request also deteriorates.
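When a cached plan has clearly gone stale, one blunt emergency measure is to evict cached plans so that subsequent requests are hard-parsed against the current data distribution (this causes a short burst of hard parsing; the available statement options vary by OceanBase Database version):

```sql
-- Evict cached plans in the current tenant; the next execution of each
-- statement is hard-parsed and gets a plan based on current statistics.
ALTER SYSTEM FLUSH PLAN CACHE;
```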
Optimization suggestions
Execution plans that use indexes outperform those that involve full table scans. For emergency response, we recommend that you bind a correct plan to the SQL request, as shown in the sketch below.
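A hedged sketch of plan binding with an outline; the table, index, and statement text are hypothetical:

```sql
-- Any request whose text matches the TO clause is executed with the
-- INDEX hint applied, pinning the plan to idx_user_id.
CREATE OUTLINE ol_bind_idx
ON SELECT /*+ INDEX(orders idx_user_id) */ * FROM orders WHERE user_id = 42
TO SELECT * FROM orders WHERE user_id = 42;
```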
Buffer tables
Scenarios
The buffer table issue is an SQL exception that arises only when the following conditions are met:
Condition 1: Most data inserted into a table will be deleted soon, that is, only a small amount of data persists in the table. In general, SQL execution plans that access this table perform a full table scan. This is because OceanBase Database recycles data block "holes" (unused space) from the table, which makes the table's high-water mark rather low and thus greatly reduces the table scan cost.
Condition 2: A great deal of data is inserted and deleted in a short period of time, and the table's high-water mark remains high due to inefficient "hole" recycling or data in the table piles up because the amount of inserted data is larger than that of deleted data. In this case, the database actually scans lots of data in the table, leading to SQL performance deterioration.
Note that buffer tables arise only when both of the preceding conditions are met. In addition, frequent updates on indexed columns may also trigger this exception. This is because the updates are actually implemented by inserting data into and deleting data from the index table, which makes the index table a buffer table.
Optimization suggestions
Because buffer tables occur with quite low probability and under strict trigger conditions, we recommend that you do not optimize for them routinely. For emergency response, we recommend that you use an outline to bind the `settle_id` index. If you want to ensure stable SQL performance and prevent sudden deterioration, you can also specify an index containing the `settle_id` field in the hint, at the cost of some daily performance.
Multi-plan jitter
Scenarios
When multiple indexes are available for an SQL query, the OceanBase Database optimizer may choose different indexes, and thus different execution plans, based on data distribution. The optimizer always tries to find the optimal execution plan: it estimates the SQL execution cost based on the current data distribution and parameters, and selects the plan with the lowest cost as the best-performing one. However, data distribution changes as business data is constantly written or modified, so different execution plans may be generated at different points in time. In addition, because the distribution of different data values may vary greatly, the execution plan may flip back and forth frequently.
Optimization suggestions
You can reduce execution plan jitter in scenarios where multiple indexes are available in the following ways:
- Do not create too many redundant indexes. For example, the `status`, `env`, and `env + status` indexes overlap with each other. To avoid ambiguity, you can drop the single-column `status` index to prevent mis-selection of execution plans and reduce the storage cost of index tables (see the sketch after this list).
- Do not build an index on enum-like columns with few distinct values, such as gender, age, type, and status, unless the index is highly selective. For example, assume that the `status` column in a table has 99% `Initializing` values and 1% `Pending` values. The business inserts rows whose `status` is `Initializing` and then updates `status` to `Completed` for those rows based on the primary key, so after these operations only 1% of the rows meet the condition `status = 'Pending'`. In this case, if a query needs to obtain data in the pending state from the table, an index on the `status` column will provide high performance.
- For scenarios with serious data skew, you need to perform optimization at the business layer. For more information, see "Inverted-pyramid data storage".
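For the first point, a minimal sketch (the `task` table and index names are hypothetical):

```sql
-- The composite index serves filters on env alone (as a prefix) and
-- on env AND status together.
CREATE INDEX idx_env_status ON task (env, status);

-- Drop the overlapping single-column index to remove plan-choice
-- ambiguity and cut index storage.
ALTER TABLE task DROP INDEX idx_status;
```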
Index optimization
Incorrectly selected index
Scenarios
Multiple indexes are available for an SQL query, but the index selected in the plan does not achieve optimal performance. Possible causes are as follows:
- The index specified in the SQL hint leads to poor performance.
- An inappropriate index is bound to the SQL outline.
- The data distribution of input parameters used for hard parsing of the SQL query is applicable to a minority of scenarios. As a result, an incorrect execution path is selected.
- The plan generated based on the data distribution during SQL tuning is not suitable for the data distribution after the business runs for a period of time.
Optimization suggestions
We recommend that you either do not specify an index in the SQL hint, which allows the optimizer to select an index automatically, or specify the `idx_fund_inst_type_time` index (see the sketch below).
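If you do choose to specify the index, a minimal sketch looks as follows; the `fund_flow` table and its filter columns are hypothetical, while the index name comes from this case:

```sql
-- Pin this statement to idx_fund_inst_type_time; removing the hint
-- returns index selection to the optimizer.
SELECT /*+ INDEX(fund_flow idx_fund_inst_type_time) */ *
FROM fund_flow
WHERE inst_id = ? AND type = ? AND gmt_create > ?;
```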
No available index
Scenarios
No index is available for filter conditions in an SQL query, so the query has to perform a full table scan or use other indexes. This causes low performance.
Optimization suggestions
- For SQL execution: Create an index on an appropriate column. In this example, you can create a normal index on the `dag_task_id` column. Since the index also contains the primary key `task_id`, the sort on `task_id` is eliminated (see the sketch after this list).
- For business process: We recommend that you perform SQL review in the code integration phase to rule out SQL queries that have no available indexes.
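A hedged sketch of the fix (the `task_detail` table is hypothetical; the column names come from this case). In OceanBase Database, a secondary index entry carries the primary key, so within one `dag_task_id` the index entries are already ordered by `task_id` and no extra sort is needed:

```sql
CREATE INDEX idx_dag_task_id ON task_detail (dag_task_id);

-- The filter hits the index, and the ORDER BY is satisfied by the
-- index order, eliminating the sort on task_id.
SELECT task_id, status
FROM task_detail
WHERE dag_task_id = ?
ORDER BY task_id;
```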
Inappropriate index
Scenarios
Indexes available to an SQL query are inappropriate and fail to improve performance. However, there exist columns on which you can create a more efficient index.
Optimization suggestions
The optimization suggestions are as follows:
For SQL execution (see the sketch after this list):
- Remove the function operation on the `gmt_create` column by rewriting `TIMESTAMPDIFF(MINUTE, gmt_create, now()) > ?` as `gmt_create < DATE_SUB(now(), INTERVAL ? MINUTE)`, so that an index on `gmt_create` can be used.
- Change the index that covers only `scene` and `gmt_create` to an index that covers `scene`, `effective`, and `gmt_create`. This can improve performance by about three times.
For business process: We recommend that you perform SQL review in the code integration phase to rule out SQL queries with predicates involving function operations.
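A hedged sketch of both fixes (the `task_scene` table is hypothetical; the columns and the rewrite come from this case):

```sql
-- Before: the function wraps gmt_create, so no index range scan applies.
SELECT * FROM task_scene
WHERE scene = ? AND effective = 1
  AND TIMESTAMPDIFF(MINUTE, gmt_create, now()) > ?;

-- After: the bare column on one side of the predicate lets the
-- composite index serve the range scan.
CREATE INDEX idx_scene_eff_gmt ON task_scene (scene, effective, gmt_create);

SELECT * FROM task_scene
WHERE scene = ? AND effective = 1
  AND gmt_create < DATE_SUB(now(), INTERVAL ? MINUTE);
```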