This topic describes the specifications for writing SELECT statements in OceanBase Database.
Specify specific field names in SELECT statements. Do not use "*".
This increases the parsing cost of the query analyzer.
Adding or removing fields may cause inconsistencies with the resultMap configuration.
Remove unnecessary columns in the SELECT clause. This is because extra columns cause the database to perform table lookups (accessing data pages to retrieve specific columns), resulting in more I/O operations.
In SQL statements involving partitioned tables, the predicate must include the partitioning key.
Do not use
count(column_name)orcount(constant)to replacecount(*).count(*)is a standard SQL92 syntax for counting rows, and it is not related to the database, NULL values, or non-NULL values.Note
count(*)counts rows with NULL values, whilecount(column_name)does not count rows where the specified column is NULL.Do not use UNION unless necessary. Instead, use UNION ALL, and limit the number of UNION clauses to 5.
Note
UNION automatically performs a sort to eliminate duplicates at the end, but duplicates are typically not present between different queries.
Avoid using IN clauses. If you must use IN, rewrite the subquery as a JOIN. If you cannot avoid IN, carefully evaluate the number of elements in the IN clause and limit it to 100 or fewer.
Avoid using
SELECT ... FOR UPDATE (WITH CS)for read locks. In high-concurrency scenarios, this can lead to lock waits, affecting business operations.Do not perform equality queries between columns of different types.
Note
If the field type is varchar and the field content in the WHERE condition is int, implicit conversion occurs, preventing the use of indexes and potentially degrading database performance.
In ORDER BY queries, the fields in the ORDER BY clause must be unique or a combination of unique fields. Sorting non-unique fields may result in inconsistent behavior between OceanBase Database and MySQL.
Avoid complex queries in core business processes, such as calculations, multi-table joins, and table traversal with CASE WHEN statements. Instead, split them into simple single-table queries.
For SQL statements with poor filtering conditions, use SQL plan baselines for emergency handling. For long-term use, modify the SQL logic. Excessive HINTs may affect future upgrades and maintenance.
For queries with DISTINCT, ORDER BY, or GROUP BY clauses, limit the intermediate result set to 10,000 rows. For large result sets (intermediate result set exceeding 10,000 rows), perform sorting and grouping on the application side.
When setting the parallelism, the optimal number of parallel threads is equal to the number of partitions in the table.
Avoid arithmetic operations and function calculations on table fields in the
WHEREclause.Avoid assigning values to variables in SQL statements, especially during distributed processing.
Ensure that the data types of condition fields in SQL statements are consistent to avoid implicit type conversions.
Do not use database reserved keywords in the
SELECTprojection fields of SQL statements.
