Applicability
This topic applies only to OceanBase Database in MySQL mode. For information about pagination in Oracle mode, see SIMPLE SELECT.
Pagination optimization
For scenarios where the query results in a database are excessive, it is recommended to implement pagination.
Pagination based on unique keys
Example: When performing pagination, you can use unique keys for pagination.
obclient> select * from tb where c1 = 'xxx' and c2 = 'xxx' and id > 'xxx' limit 100;
Keep the last id of every 100 records as the condition value for the next 100 records (create a composite index on c1, c2, and id).
Notice
- This method is suitable for scenarios where the data distribution is relatively uniform. If the query cost for retrieving 100 records per page is scanning tens of thousands or even hundreds of thousands of records, the performance will not meet the expected standards.
- This method is suitable for scenarios where only pagination based on the id is performed, such as
select * from tb where id > '' limit 100;.
Pagination using delayed association or subqueries
OB does not skip the offset rows but instead retrieves offset + N rows and then discards the first offset rows before returning N rows. Therefore, when the offset is particularly large, the efficiency is very low. When encountering such scenarios, you can consider controlling the total number of pages returned or rewriting the SQL for pages exceeding a specific threshold.
Control the total number of pages returned.
When implementing pagination logic in the code, if the count is 0, directly return without executing the subsequent pagination statements.
Example: In the "Service Market" transaction, when the pagination exceeds 1000 pages, clicking the last page causes the database to be almost paralyzed.
Rewrite the SQL for pages exceeding a specific threshold.
Example: First, quickly locate the required id range and then perform the association:
obclient> select a.name from a,(select id from a where id LIMIT 100000,20) b where a.id=b.id;
Examples
The following example shows how to create a table:
obclient> CREATE TABLE `business_case_analysis_effect_receive_catalog_details` (
`id` bigint(10) NOT NULL AUTO_INCREMENT,
`province_code` varchar(20) CHARACTER SET utf8 NOT NULL DEFAULT '330000' COMMENT 'Province code',
`province_name` varchar(20) CHARACTER SET utf8 NOT NULL DEFAULT 'Zhejiang' COMMENT 'Province name',
`area_code` varchar(20) CHARACTER SET utf8 DEFAULT NULL COMMENT 'City code',
`area_name` varchar(50) CHARACTER SET utf8 DEFAULT NULL COMMENT 'City name',
`region_code` varchar(50) CHARACTER SET utf8 NOT NULL COMMENT 'District code',
`region_name` varchar(50) CHARACTER SET utf8 NOT NULL COMMENT 'District name',
`dept_code` varchar(100) CHARACTER SET utf8 NOT NULL COMMENT 'Department code',
`dept_name` varchar(100) CHARACTER SET utf8 NOT NULL COMMENT 'Department name',
`catalog_id` varchar(50) CHARACTER SET utf8 DEFAULT NULL COMMENT 'Catalog code',
`catalog_name` varchar(500) CHARACTER SET utf8 DEFAULT NULL COMMENT 'Catalog name',
`catalog_code` varchar(50) CHARACTER SET utf8 DEFAULT NULL COMMENT 'Basic catalog code',
`business_code` varchar(50) CHARACTER SET utf8 DEFAULT NULL COMMENT 'Business code',
`business_name` varchar(50) CHARACTER SET utf8 DEFAULT NULL COMMENT 'Business name',
`received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of received items',
`app_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received via APP',
`pc_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received via PC',
`hall_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received at the hall',
`window_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received at the window',
`two_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received in version 2.0',
`two_app_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received via APP in version 2.0',
`two_pc_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received via PC in version 2.0',
`two_hall_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received at the hall in version 2.0',
`two_window_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received at the window in version 2.0',
`one_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received in version 1.0',
`one_app_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received via APP in version 1.0',
`one_pc_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received via PC in version 1.0',
`one_hall_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received at the hall in version 1.0',
`item_flag` varchar(10) CHARACTER SET utf8 NOT NULL COMMENT 'Whether it is a version 2.0 item (0-No, 1-Yes)',
`stat_date` datetime NOT NULL COMMENT 'Statistics date (the date when the data was generated, usually the previous day)',
`delete_flag` int(1) NOT NULL DEFAULT '0' COMMENT 'Whether it is deleted (0-Not deleted, 1-Deleted)',
`gmt_created` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Creation time',
`gmt_updated` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Update time',
`not_case_assigned_code_num` bigint(20) NOT NULL DEFAULT '0' COMMENT 'Number of items not assigned a code',
`item_type_code` varchar(50) CHARACTER SET utf8 NOT NULL COMMENT 'Power item type ID',
`item_dock_type` varchar(10) CHARACTER SET utf8 NOT NULL COMMENT 'Item docking method (0-All, 1-Overall docking, 2-Form docking for version 2.0)',
`apply_person_type` varchar(20) CHARACTER SET utf8 NOT NULL COMMENT 'Applicant type (0-All, 1-Individual, 2-Organization)',
`two_not_case_assigned_code_num` bigint(20) NOT NULL DEFAULT '0' COMMENT 'Number of items not assigned a code in version 2.0',
`one_not_case_assigned_code_num` bigint(20) NOT NULL DEFAULT '0' COMMENT 'Number of items not assigned a code in version 1.0',
`self_service_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items handled by self-service machines',
`two_self_service_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items handled by self-service machines in version 2.0',
`one_self_service_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items handled by self-service machines in version 1.0',
PRIMARY KEY (`id`) USING BTREE,
KEY `query_catalog_index` (`catalog_id`,`catalog_code`,`stat_date`,`area_code`,`region_code`,`dept_code`,`business_code`,`business_name`) USING BTREE GLOBAL,
KEY `idx_area_region` (`area_code`,`region_code`,`item_flag`,`item_type_code`) USING BTREE GLOBAL,
KEY `idx_statDate` (`stat_date`,`area_code`,`region_code`) USING BTREE GLOBAL,
KEY `idx_catalog_code` (`catalog_code`,`catalog_id`) USING BTREE GLOBAL,
KEY `idx_item_type_code` (`item_type_code`) USING BTREE GLOBAL,
KEY `idx_catalog_business` (`catalog_code`,`catalog_id`,`business_code`) USING BTREE GLOBAL
) AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4 COMMENT='Business dashboard - Case - Effectiveness - Received items - Catalog distribution';
The original query is as follows:
SELECT
catalog_id AS catalogId,
catalog_name AS catalogName,
catalog_code AS catalogCode,
business_code AS businessCode,
business_name AS businessName,
IFNULL(
CAST(
SUM(pc_received_num) / SUM(received_num) AS DECIMAL(14, 4)
),
0
) AS networkHandlePercent,
IFNULL(
CAST(
SUM(app_received_num) / SUM(received_num) AS DECIMAL(14, 4)
),
0
) AS palmtopHandlePercent,
IFNULL(
CAST(
SUM(two_received_num) /(
SUM(received_num) + SUM(not_case_assigned_code_num)
) AS DECIMAL(14, 4)
),
0
) AS netHandleRate,
IFNULL(SUM(not_case_assigned_code_num), 0) AS notCaseAssignedCodeNum,
IFNULL(SUM(received_num), 0) AS receivedNum,
IFNULL(SUM(two_received_num), 0) AS twoReceivedNum,
IFNULL(SUM(pc_received_num), 0) AS pcReceivedNum,
IFNULL(SUM(app_received_num), 0) AS appReceivedNum,
IFNULL(SUM(hall_received_num), 0) AS hallReceivedNum,
IFNULL(SUM(two_window_received_num), 0) AS windowReceivedNum,
IFNULL(SUM(self_service_num), 0) AS selfServiceNum
FROM
business_case_analysis_effect_receive_catalog_details
WHERE
stat_date <= 'xxxx-xx-xx'
AND stat_date >= 'xxxx-xx-xx'
AND item_type_code in ("xx", "xx", "xx", "xx", "xx", "xx", "xx")
GROUP BY
catalog_code,
catalog_id
LIMIT
offset, 15;
The table contains 5 million records.
By modifying the offset parameter, the query execution times are as follows:
| offset | 0 | 100 | 300 | 500 | 1000 | 2000 |
|---|---|---|---|---|---|---|
| Execution time (s) | 0.07 | 3.96 | 6.83 | 8.67 | 14.44 | 18.04 |
Since the table has no indexes, we added an index to the filter condition. However, the execution time did not change significantly. Therefore, we considered using parallel execution. The optimized SQL statement is as follows:
SELECT
/*+ PARALLEL(5),USE_HASH_AGGREGATION*/
catalog_id AS catalogId,
catalog_name AS catalogName,
catalog_code AS catalogCode,
business_code AS businessCode,
business_name AS businessName,
IFNULL(
CAST(
SUM(pc_received_num) / SUM(received_num) AS DECIMAL(14, 4)
),
0
) AS networkHandlePercent,
IFNULL(
CAST(
SUM(app_received_num) / SUM(received_num) AS DECIMAL(14, 4)
),
0
) AS palmtopHandlePercent,
IFNULL(
CAST(
SUM(two_received_num) /(
SUM(received_num) + SUM(not_case_assigned_code_num)
) AS DECIMAL(14, 4)
),
0
) AS netHandleRate,
IFNULL(SUM(not_case_assigned_code_num), 0) AS notCaseAssignedCodeNum,
IFNULL(SUM(received_num), 0) AS receivedNum,
IFNULL(SUM(two_received_num), 0) AS twoReceivedNum,
IFNULL(SUM(pc_received_num), 0) AS pcReceivedNum,
IFNULL(SUM(app_received_num), 0) AS appReceivedNum,
IFNULL(SUM(hall_received_num), 0) AS hallReceivedNum,
IFNULL(SUM(two_window_received_num), 0) AS windowReceivedNum,
IFNULL(SUM(self_service_num), 0) AS selfServiceNum
FROM
business_case_analysis_effect_receive_catalog_details
WHERE
stat_date <= 'xxxx-xx-xx'
AND stat_date >= 'xxxx-xx-xx'
AND item_type_code in ("xx", "xx", "xx", "xx", "xx", "xx", "xx")
GROUP BY
catalog_code,
catalog_id
LIMIT
offset, 15;
After re-executing the query with the same offset values, the response times were around 5 seconds, indicating successful SQL optimization.
Notice
After analyzing this case, the following points were identified:
- In an OceanBase cluster, as the offset increases, the query response time also increases.
- When setting the number of parallel threads, the optimal value is equal to the number of partitions in the table.
Order preservation in pagination
Order preservation in pagination is used to ensure the stability of the query results. It works by first sorting the query results and then performing the pagination to maintain the order of the results.
Enable order preservation in pagination
OceanBase Database provides two methods to control whether to enable order preservation in pagination: the hidden parameter _preserve_order_for_pagination and the query hint PRESERVE_ORDER_FOR_PAGINATION.
OceanBase Database prioritizes the query hint over the hidden parameter. For example, if the hidden parameter is set to enable order preservation in pagination, but the query includes the hint /*+OPT_PARAM('PRESERVE_ORDER_FOR_PAGINATION', 'FALSE')*/, the query will not enable order preservation in pagination.
Use the hidden parameter _preserve_order_for_pagination
The hidden parameter _preserve_order_for_pagination is at the tenant level and controls whether to enable order preservation in pagination for the tenant. The default behavior is to disable it. When enabled, the optimizer will automatically add an ORDER BY clause to ensure ordered output for paginated queries.
| Attribute | Description |
|---|---|
| Parameter type | Boolean |
| Default value | False, indicating that order preservation in pagination is disabled by default. |
| Value range | False or True.
|
| Effective mode | Immediately effective |
| Level | Tenant level |
Here is an example:
To enable order preservation in pagination for a tenant, you can use the following command:
ALTER SYSTEM SET _preserve_order_for_pagination = true;
Use the query hint PRESERVE_ORDER_FOR_PAGINATION
To provide more granular control, OceanBase Database also supports the query-level hint PRESERVE_ORDER_FOR_PAGINATION.
Here is an example:
To enable order preservation in pagination for a specific query.
SELECT /*+OPT_PARAM('PRESERVE_ORDER_FOR_PAGINATION', 'TRUE')*/ * FROM test_tbl1
LIMIT 10;
Notice
This hint cannot be used to control whether to enable order preservation in pagination for a specific query block. It can only be used to control whether to enable order preservation in pagination for the entire query.
Order preservation in pagination
When order preservation in pagination is enabled, OceanBase Database will rewrite the query and add an ORDER BY clause to ensure order preservation in pagination.
For example,
SELECT t1.c1, t1.c2 FROM t1, t2 WHERE t1.c1 = t2.c1 LIMIT 10,10;
For the above query, OceanBase Database will first sort the query results and then perform pagination on the sorted results. The rewritten query is as follows:
SELECT t1.c1, t1.c2 FROM t1, t2 WHERE t1.c1 = t2.c1
ORDER BY t1.c1, t2.c2 LIMIT 10,10;
Notice
After enabling order preservation in pagination, some queries may experience performance degradation due to the additional sorting operations, which can affect query execution performance.
Considerations when enabling order preservation in pagination
After enabling order preservation in pagination, the query output results may differ from previous ones due to the added sorting fields.
For other scenarios, the expected result order may change, for example:
SELECT t2.c1, t2.c2
FROM (SELECT * from t1 ORDER BY t1.c1, t1.c2) V
LEFT JOIN t2 ON V.c1 = t2.c1
LIMIT 10,10;
The user may expect to first sort the results from table t1 and then perform pagination. However, any SQL semantics do not guarantee that the query results need to be sorted by the results from table t1 before pagination. OceanBase Database will add the sorting fields t2.c1 and t2.c2 at the outermost level, sort the results by these fields (i.e., the fields from table t2), and then perform pagination. The equivalent query is as follows:
SELECT t2.c1, t2.c2
FROM t1
LEFT JOIN t2 ON t1.c1 = t2.c1
ORDER BY t2.c1, t2.c2
LIMIT 10,10;
