SQL is a common way to manage data in databases: you use it to define, store, update, and query data. It is a highly non-procedural (declarative) programming language. That is, you tell it "what you want" instead of "how to get what you want", and you do not need to specify or understand how the data is stored. With a subject-predicate-object structure, SQL reads much like everyday speech, which makes it more "colloquial" than other programming languages. For example, to borrow a book from a library, you can say "I want a book named Modern Chinese History" to a librarian. The equivalent SQL statement is select * from book where book_name='Modern Chinese History'.
Librarians classify and sort all books in advance so that they are easy to find. For example, the book named "Modern Chinese History" is placed in the "No. 3 position on layer 2 of the modern history bookshelf in the literature and history area". If a large number of books were piled up at random, finding the required book would be inefficient or even impossible. This "classification and sorting" is just like indexing in a database schema. Efficient and reasonable index design is critical to SQL performance.
When we talk about SQL tuning, we usually focus on reducing the amount of data scanned during SQL execution, which means optimizing indexes. In practice, however, SQL tuning is much more complex than optimizing a single SQL statement. For example, you may have the following questions:
- How do I write SQL statements to improve the performance based on existing indexes?
- Which is the most appropriate index for an SQL statement?
- How do I design reasonable indexes for the best performance of a business system?
- How do I find out the problematic SQL statements and quickly recover a database when an exception occurs?
This topic describes how to build an SQL tuning system in the R&D, integration, and O&M phases.
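The library analogy from the introduction can be made concrete with a small sketch. It uses Python's built-in sqlite3 module as a stand-in database, which is an assumption made only so that the example is self-contained; OceanBase Database has its own EXPLAIN output and optimizer. The table book and the column book_name come from the statement above, while the area column and the index name idx_book_name are illustrative.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]

# In-memory stand-in database (assumption: SQLite, not OceanBase Database).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, book_name TEXT, area TEXT)")
conn.executemany(
    "INSERT INTO book (book_name, area) VALUES (?, ?)",
    [(f"Book {i}", "literature and history") for i in range(1000)]
    + [("Modern Chinese History", "literature and history")],
)

sql = "SELECT * FROM book WHERE book_name = ?"
args = ("Modern Chinese History",)

# Without an index, the engine has to look at every row ("books piled up at random").
plan_without_index = query_plan(conn, sql, args)

# With an index on book_name, the engine can go straight to the matching rows.
conn.execute("CREATE INDEX idx_book_name ON book (book_name)")
plan_with_index = query_plan(conn, sql, args)

print(plan_without_index)  # a SCAN step over the whole table
print(plan_with_index)     # a SEARCH step that uses idx_book_name
```

The statement itself does not change between the two runs. Only the physical design changes, which is why index design matters so much for the amount of data scanned.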
R&D phase: write efficient SQL statements
In the daily R&D process, writing efficient SQL statements involves the following key points:
- Formulate SQL R&D specifications and strictly follow the specifications.
- Plan schemas around business scenarios so that indexes are efficient.
- Optimize single SQL statements.
SQL R&D specifications are the foundation of SQL quality and an important means to ensure stability of SQL performance. SQL issues identified in the SQL execution phase may have already affected the business and are difficult to solve with common SQL emergency plans. Clear SQL R&D specifications help solve elementary SQL errors in code and avoid potential SQL risks in the R&D phase.
Schema design is a process of continuous optimization as business scenarios change. A complex business system involves a variety of SQL statements, and the load structure of these statements changes along with the business model. Data distribution also changes continuously as underlying business data is written to the system. We need to constantly optimize the schema design based on the SQL load to ensure stable SQL performance and business throughput when the load fluctuates. The goal is not the best possible performance for every individual SQL statement, which is unnecessary, difficult to achieve, and may cost more than it returns.
For a single SQL statement, the key point is to make sure that an appropriate index is chosen. This is the most effective optimization method because it reduces the number of rows to scan. In addition, we need to consider, from the business perspective, whether there is a more reasonable way to implement the business requirements of the SQL statement. We also need to predict whether the performance of the statement will reach an inflection point, or whether other SQL risks will emerge, as the data volume changes after the statement is launched.
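As a rough illustration of how the choice of index changes the number of rows to scan, the following sketch compares a single-column index with a composite index for the same statement. It again uses Python's sqlite3 module as a stand-in database. The columns area and shelf, the index names, and the generated data are all assumptions for illustration only.

```python
import sqlite3

def plan_text(conn, sql):
    """Join the 'detail' column of EXPLAIN QUERY PLAN into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, book_name TEXT, area TEXT, shelf INTEGER)"
)
# Illustrative data: 5 areas x 20 shelves x 10 books = 1000 rows.
conn.executemany(
    "INSERT INTO book (book_name, area, shelf) VALUES (?, ?, ?)",
    [(f"Book {a}-{s}-{n}", f"area-{a}", s)
     for a in range(5) for s in range(20) for n in range(10)],
)

sql = "SELECT * FROM book WHERE area = 'area-3' AND shelf = 7"

# Option 1: index on area only. The index narrows the search to one area,
# and the shelf condition is then checked row by row.
conn.execute("CREATE INDEX idx_area ON book (area)")
plan_single = plan_text(conn, sql)
rows_single = conn.execute(
    "SELECT COUNT(*) FROM book WHERE area = 'area-3'"
).fetchone()[0]

# Option 2: composite index on (area, shelf). Both conditions are resolved in the index.
conn.execute("DROP INDEX idx_area")
conn.execute("CREATE INDEX idx_area_shelf ON book (area, shelf)")
plan_composite = plan_text(conn, sql)
rows_composite = conn.execute(
    "SELECT COUNT(*) FROM book WHERE area = 'area-3' AND shelf = 7"
).fetchone()[0]

print(plan_single, "-> candidate rows:", rows_single)
print(plan_composite, "-> candidate rows:", rows_composite)
```

With this data, the single-column index leaves 200 candidate rows to filter, while the composite index leaves 10. Which index is appropriate in a real system depends on the whole SQL load, not on one statement.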
Writing efficient SQL statements in the R&D phase ensures their stability to the greatest extent and minimizes bad SQL statements at the source.
Integration phase: block risks
In the integration phase, where SQL statements are submitted to the production environment, the key task is SQL review. Just as code is reviewed before it is submitted, SQL statements must be reviewed before they go online. SQL review should focus on the following points:
- Check compliance with SQL R&D specifications.
- Evaluate SQL performance.
The first point requires you to scan SQL statements to check whether they comply with R&D specifications, which helps implement the specifications. The implementation of SQL R&D specifications not only reduces elementary SQL errors and low-performance SQL scenarios, but also increases the readability of SQL statements, facilitates the understanding of SQL business scenarios, and assists in manual SQL troubleshooting.
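A compliance scan of this kind could look roughly like the following sketch. The three rules are hypothetical examples and are not taken from any official specification; a team would substitute its own rules. The checks are regular expressions rather than a real SQL parser, so they can be fooled, for example by the word "where" inside a string literal.

```python
import re

# Hypothetical rules for illustration; replace them with your own specification.
RULES = [
    (
        "UPDATE/DELETE without WHERE",
        re.compile(r"^\s*(update|delete)\b(?!.*\bwhere\b)", re.IGNORECASE | re.DOTALL),
    ),
    (
        "leading-wildcard LIKE",
        re.compile(r"\blike\s+'%", re.IGNORECASE),
    ),
    (
        "ORDER BY RAND()",
        re.compile(r"\border\s+by\s+rand\s*\(", re.IGNORECASE),
    ),
]

def check_statement(sql):
    """Return the names of all rules that the statement violates."""
    return [name for name, pattern in RULES if pattern.search(sql)]

if __name__ == "__main__":
    for stmt in [
        "DELETE FROM book",
        "SELECT book_name FROM book WHERE book_name LIKE '%History'",
        "select * from book where book_name='Modern Chinese History'",
    ]:
        print(stmt, "->", check_statement(stmt) or "compliant")
```

Running such a check automatically on every submitted statement is what turns a written specification into one that is actually enforced.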
To evaluate SQL performance, you need to identify potential SQL risks in advance. The main focuses are to check whether an efficient index is used, determine whether the statement needs to be rewritten for optimization, and predict whether performance will degrade. If the review requires supporting data or a deeper understanding of the business scenario, obtain it and evaluate the implementation method further.
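The index check described above can be partly automated by inspecting execution plans. The sketch below is a heuristic only: it uses Python's sqlite3 module as a stand-in database and treats any full-scan step in the plan as a risk, although a full scan can be acceptable for small tables. The function names, the area column, and the index name are assumptions for illustration.

```python
import sqlite3

def full_scan_steps(conn, sql):
    """Return the plan steps that read a whole table or index for this statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows if row[3].startswith("SCAN")]

def review(conn, statements):
    """Map each statement to 'pass' or 'risk: full scan'."""
    return {
        sql: ("risk: full scan" if full_scan_steps(conn, sql) else "pass")
        for sql in statements
    }

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, book_name TEXT, area TEXT)")
conn.execute("CREATE INDEX idx_book_name ON book (book_name)")

candidates = [
    # The predicate compares the indexed column directly, so the index is usable.
    "SELECT * FROM book WHERE book_name = 'Modern Chinese History'",
    # Wrapping the column in a function hides it from the plain index.
    "SELECT * FROM book WHERE lower(book_name) = 'modern chinese history'",
]
report = review(conn, candidates)
for sql, verdict in report.items():
    print(verdict, "<-", sql)
```

The second statement is a typical candidate for a rewrite: the index exists, but the way the predicate is written prevents the optimizer from using it.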
When we identify potential SQL risks during SQL review in the integration phase, we can effectively prevent risky SQL statements from going online and reduce bad SQL statements in the process.
O&M phase: roll with the punches
SQL tuning in the O&M phase is the most common case. When SQL statements are newly launched, the data volume is small and the concurrency is low. Therefore, in this stage, most SQL statements do not have performance issues. As the business scale grows and data volume increases, SQL execution may become slower and slower or even degrade abruptly due to sudden changes in execution plans. In the O&M phase, SQL tuning encompasses the following tasks:
- Exception diagnostics, which identifies problematic SQL statements
- Root cause analysis, which locates the SQL statements that cause a sudden database crash
- SQL emergency response, which recovers databases based on root causes
- Daily governance, which continuously handles bad SQL statements in databases
- Continuous optimization, which involves case archiving for later reference
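One simple way to approach exception diagnostics and daily governance is to rank statements by their total resource consumption. The following sketch assumes that per-execution samples of the form (statement identifier, elapsed milliseconds) are available from some monitoring source. The identifiers and numbers are made up for illustration and do not come from any real system.

```python
from collections import defaultdict

def rank_by_total_elapsed(samples, top_n=3):
    """Aggregate samples per statement and return the top_n by total elapsed time."""
    totals = defaultdict(lambda: {"executions": 0, "total_ms": 0.0})
    for sql_id, elapsed_ms in samples:
        totals[sql_id]["executions"] += 1
        totals[sql_id]["total_ms"] += elapsed_ms
    ranked = sorted(totals.items(), key=lambda item: item[1]["total_ms"], reverse=True)
    return [(sql_id, s["executions"], s["total_ms"]) for sql_id, s in ranked[:top_n]]

# Hypothetical execution samples: (statement identifier, elapsed milliseconds).
samples = (
    [("q_lookup", 2.0)] * 500     # fast but very frequent
    + [("q_report", 900.0)] * 2   # slow but rare
    + [("q_update", 15.0)] * 40
)

top_sql = rank_by_total_elapsed(samples)
for sql_id, executions, total_ms in top_sql:
    print(f"{sql_id}: {executions} executions, {total_ms:.0f} ms in total")
```

Ranking by total time rather than by single-execution time keeps frequent, individually cheap statements visible, because they can still account for a large share of the load.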
If you are familiar with common SQL problems and experienced in SQL troubleshooting, you can quickly diagnose exceptions and analyze root causes. This facilitates emergency response and helps recover your business. The following topics in this chapter will introduce typical scenarios of SQL tuning.
Methods for SQL emergency response can be roughly classified into two types: execution plan intervention and SQL throttling. If a correct execution plan or an appropriate index already exists for an SQL statement, you can use an outline in OceanBase Database to bind the execution plan or index to the statement and respond to the emergency quickly. If a new index is required, the response takes longer because index creation is time-consuming. If the SQL emergency is not caused by the execution plan, it is often a problem related to SQL capacity. In this case, the effective solution is SQL throttling. However, SQL throttling is lossy for the business because it limits SQL concurrency.
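The trade-off behind throttling can be illustrated with a minimal application-side sketch. This is not how OceanBase Database implements SQL throttling. The class name StatementThrottle and its behavior (reject immediately when the limit is reached) are assumptions chosen to show why throttling protects the database but is lossy for the business.

```python
import threading

class StatementThrottle:
    """Caps how many executions of one statement may run at the same time."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, execute):
        # Non-blocking acquire: reject immediately instead of queueing the request.
        if not self._slots.acquire(blocking=False):
            return ("rejected", None)
        try:
            return ("ok", execute())
        finally:
            self._slots.release()

throttle = StatementThrottle(max_concurrent=1)

def slow_query():
    # While this execution holds the only slot, a second request arrives.
    return throttle.run(lambda: "second result")

outcome = throttle.run(slow_query)               # outer call succeeds, inner one is rejected
after = throttle.run(lambda: "third result")     # the slot is free again

print(outcome)
print(after)
```

The rejected request is exactly the business loss mentioned above: the database stays responsive, but some requests are not served until the root cause is fixed.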
Daily governance and continuous optimization continuously promote the optimization of low-performance SQL statements during database O&M, curbing database crash risks caused by bad SQL statements.