Applicability
This topic applies only to OceanBase Database in MySQL mode. For information about paging queries in Oracle mode, see SIMPLE SELECT.
Optimize pagination
When the query result set is large, perform a pagination query.
Pagination based on unique keys
Example: You can perform a pagination query based on unique keys.
obclient> select * from tb where c1 = 'xxx' and c2 = 'xxx' and id > 'xxx' limit 100;
Retain the last id of every 100 rows as the condition value for the next 100 rows (create a composite index on c1, c2, and id).
Notice
- This method is suitable for data that is evenly distributed. If the query cost of 100 rows per page is scanning tens of thousands of rows, the performance will not meet the expected level.
- This method is suitable for scenarios where you only need to retrieve all the data in the table based on the id, such as
select * from tb where id > '' limit 100;.
Pagination using delayed joins or subqueries
OB does not skip the offset rows, but rather retrieves offset+N rows and then returns the last N rows by discarding the first offset rows. Therefore, when the offset is very large, the efficiency is very low. When a business scenario involves such a situation, you can consider controlling the total number of pages returned or rewriting the SQL statement for pages exceeding a specific threshold.
Control the total number of pages returned.
When implementing pagination query logic in the code, if the count is 0, directly return the result without executing the subsequent pagination statements.
Bad example: When a transaction in the "Service Market" has more than 1,000 pages, the database is almost paralyzed when the user clicks the last page.
Rewrite the SQL statement for pages exceeding a specific threshold.
Example: First, quickly locate the id segment that needs to be retrieved, and then perform a join:
obclient> select a.name from a,(select id from a where id LIMIT 100000,20) b where a.id=b.id;
Examples
The following statement creates a table:
obclient> CREATE TABLE `business_case_analysis_effect_receive_catalog_details` (
`id` bigint(10) NOT NULL AUTO_INCREMENT,
`province_code` varchar(20) CHARACTER SET utf8 NOT NULL DEFAULT '330000' COMMENT 'Province code',
`province_name` varchar(20) CHARACTER SET utf8 NOT NULL DEFAULT 'Zhejiang Province' COMMENT 'Province name',
`area_code` varchar(20) CHARACTER SET utf8 DEFAULT NULL COMMENT 'City code',
`area_name` varchar(50) CHARACTER SET utf8 DEFAULT NULL COMMENT 'City name',
`region_code` varchar(50) CHARACTER SET utf8 NOT NULL COMMENT 'District code',
`region_name` varchar(50) CHARACTER SET utf8 NOT NULL COMMENT 'District name',
`dept_code` varchar(100) CHARACTER SET utf8 NOT NULL COMMENT 'Department code',
`dept_name` varchar(100) CHARACTER SET utf8 NOT NULL COMMENT 'Department name',
`catalog_id` varchar(50) CHARACTER SET utf8 DEFAULT NULL COMMENT 'Catalog code',
`catalog_name` varchar(500) CHARACTER SET utf8 DEFAULT NULL COMMENT 'Catalog name',
`catalog_code` varchar(50) CHARACTER SET utf8 DEFAULT NULL COMMENT 'Basic catalog code',
`business_code` varchar(50) CHARACTER SET utf8 DEFAULT NULL COMMENT 'Line code',
`business_name` varchar(50) CHARACTER SET utf8 DEFAULT NULL COMMENT 'Line name',
`received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of received items',
`app_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received via APP',
`pc_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received via PC',
`hall_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received at the service hall',
`window_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received at the dedicated window',
`two_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received under the 2.0 system',
`two_app_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received via APP under the 2.0 system',
`two_pc_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received via PC under the 2.0 system',
`two_hall_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received at the service hall under the 2.0 system',
`two_window_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received at the dedicated window under the 2.0 system',
`one_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received under the 1.0 system',
`one_app_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received via APP under the 1.0 system',
`one_pc_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received via PC under the 1.0 system',
`one_hall_received_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of items received at the service hall under the 1.0 system',
`item_flag` varchar(10) CHARACTER SET utf8 NOT NULL COMMENT 'Whether it is a 2.0 matter (0-No, 1-Yes)',
`stat_date` datetime NOT NULL COMMENT 'Statistical date (the date when the data was generated, usually the previous day)',
`delete_flag` int(1) NOT NULL DEFAULT '0' COMMENT 'Whether it has been deleted (0-Not deleted, 1-Deleted)',
`gmt_created` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Creation time',
`gmt_updated` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Update time',
`not_case_assigned_code_num` bigint(20) NOT NULL DEFAULT '0' COMMENT 'Number of unassigned items',
`item_type_code` varchar(50) CHARACTER SET utf8 NOT NULL COMMENT 'Power matter type ID',
`item_dock_type` varchar(10) CHARACTER SET utf8 NOT NULL COMMENT 'Matter docking method (0-All, 1-Overall docking, 2-2.0 form docking)',
`apply_person_type` varchar(20) CHARACTER SET utf8 NOT NULL COMMENT 'Applicant type (0-All, 1-Individual, 2-Organization)',
`two_not_case_assigned_code_num` bigint(20) NOT NULL DEFAULT '0' COMMENT 'Number of unassigned items under the 2.0 system',
`one_not_case_assigned_code_num` bigint(20) NOT NULL DEFAULT '0' COMMENT 'Number of unassigned items under the 1.0 system',
`self_service_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of self-service machine items',
`two_self_service_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of self-service machine items under the 2.0 system',
`one_self_service_num` bigint(11) NOT NULL DEFAULT '0' COMMENT 'Number of self-service machine items under the 1.0 system',
PRIMARY KEY (`id`) USING BTREE,
KEY `query_catalog_index` (`catalog_id`,`catalog_code`,`stat_date`,`area_code`,`region_code`,`dept_code`,`business_code`,`business_name`) USING BTREE GLOBAL,
KEY `idx_area_region` (`area_code`,`region_code`,`item_flag`,`item_type_code`) USING BTREE GLOBAL,
KEY `idx_statDate` (`stat_date`,`area_code`,`region_code`) USING BTREE GLOBAL,
KEY `idx_catalog_code` (`catalog_code`,`catalog_id`) USING BTREE GLOBAL,
KEY `idx_item_type_code` (`item_type_code`) USING BTREE GLOBAL,
KEY `idx_catalog_business` (`catalog_code`,`catalog_id`,`business_code`) USING BTREE GLOBAL
) AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4 COMMENT='Business dashboard - Cases - Case effectiveness - Received items - Catalog distribution';
The query SQL statement is as follows:
SELECT
catalog_id AS catalogId,
catalog_name AS catalogName,
catalog_code AS catalogCode,
business_code AS businessCode,
business_name AS businessName,
IFNULL(
CAST(
SUM(pc_received_num) / SUM(received_num) AS DECIMAL(14, 4)
),
0
) AS networkHandlePercent,
IFNULL(
CAST(
SUM(app_received_num) / SUM(received_num) AS DECIMAL(14, 4)
),
0
) AS palmtopHandlePercent,
IFNULL(
CAST(
SUM(two_received_num) /(
SUM(received_num) + SUM(not_case_assigned_code_num)
) AS DECIMAL(14, 4)
),
0
) AS netHandleRate,
IFNULL(SUM(not_case_assigned_code_num), 0) AS notCaseAssignedCodeNum,
IFNULL(SUM(received_num), 0) AS receivedNum,
IFNULL(SUM(two_received_num), 0) AS twoReceivedNum,
IFNULL(SUM(pc_received_num), 0) AS pcReceivedNum,
IFNULL(SUM(app_received_num), 0) AS appReceivedNum,
IFNULL(SUM(hall_received_num), 0) AS hallReceivedNum,
IFNULL(SUM(two_window_received_num), 0) AS windowReceivedNum,
IFNULL(SUM(self_service_num), 0) AS selfServiceNum
FROM
business_case_analysis_effect_receive_catalog_details
WHERE
stat_date <= 'xxxx-xx-xx'
AND stat_date >= 'xxxx-xx-xx'
AND item_type_code in ("xx", "xx", "xx", "xx", "xx", "xx", "xx")
GROUP BY
catalog_code,
catalog_id
LIMIT
offset, 15;
The table contains five million rows.
The query time is obtained by changing the value of the offset parameter:
| offset | 0 | 100 | 300 | 500 | 1000 | 2000 |
|---|---|---|---|---|---|---|
| Execution time (s) | 0.07 | 3.96 | 6.83 | 8.67 | 14.44 | 18.04 |
Because the table has no indexes, the execution time does not significantly change, and parallel execution is considered. The SQL statement is optimized as follows:
SELECT
/*+ PARALLEL(5),USE_HASH_AGGREGATION*/
catalog_id AS catalogId,
catalog_name AS catalogName,
catalog_code AS catalogCode,
business_code AS businessCode,
business_name AS businessName,
IFNULL(
CAST(
SUM(pc_received_num) / SUM(received_num) AS DECIMAL(14, 4)
),
0
) AS networkHandlePercent,
IFNULL(
CAST(
SUM(app_received_num) / SUM(received_num) AS DECIMAL(14, 4)
),
0
) AS palmtopHandlePercent,
IFNULL(
CAST(
SUM(two_received_num) /(
SUM(received_num) + SUM(not_case_assigned_code_num)
) AS DECIMAL(14, 4)
),
0
) AS netHandleRate,
IFNULL(SUM(not_case_assigned_code_num), 0) AS notCaseAssignedCodeNum,
IFNULL(SUM(received_num), 0) AS receivedNum,
IFNULL(SUM(two_received_num), 0) AS twoReceivedNum,
IFNULL(SUM(pc_received_num), 0) AS pcReceivedNum,
IFNULL(SUM(app_received_num), 0) AS appReceivedNum,
IFNULL(SUM(hall_received_num), 0) AS hallReceivedNum,
IFNULL(SUM(two_window_received_num), 0) AS windowReceivedNum,
IFNULL(SUM(self_service_num), 0) AS selfServiceNum
FROM
business_case_analysis_effect_receive_catalog_details
WHERE
stat_date <= 'xxxx-xx-xx'
AND stat_date >= 'xxxx-xx-xx'
AND item_type_code in ("xx", "xx", "xx", "xx", "xx", "xx", "xx")
GROUP BY
catalog_code,
catalog_id
LIMIT
offset, 15;
After the offset was adjusted, the returned time is around 5 seconds, optimizing the SQL queries.
Consider the following before you use this procedure:
An analysis of this case shows the following content:
- In an OceanBase cluster, as the offset increases, the query return time increases.
- Test results show that the optimal parallelism for a table is the number of partitions in the table.
Pagination order preservation
Pagination order preservation ensures the stability of query results. It works by sorting the query results and then applying pagination to maintain a consistent order.
Enable or disable pagination order preservation
OceanBase Database provides two methods to control whether to enable pagination order preservation: the hidden parameter _preserve_order_for_pagination and the query hint PRESERVE_ORDER_FOR_PAGINATION.
OceanBase Database prioritizes the query hint over the hidden parameter. For example, if a tenant has the hidden parameter set to enabled, but the query uses the hint /*+OPT_PARAM('PRESERVE_ORDER_FOR_PAGINATION', 'FALSE')*/, the pagination order preservation feature will be disabled for that query.
Use the hidden parameter to enable or disable pagination order preservation
The hidden parameter _preserve_order_for_pagination is at the tenant level and controls whether to enable or disable the pagination order preservation feature. The default behavior is disabled. When enabled, the optimizer adds an ORDER BY clause to ensure ordered output for paginated queries.
| Attribute | Description |
|---|---|
| Parameter type | Boolean |
| Default value | False, indicating that the pagination order preservation feature is disabled by default. |
| Value range | False or True.
|
| Effective mode | Immediately effective |
| Scope | Tenant-level |
Here is an example:
To enable the pagination order preservation feature at the tenant level, you can use the following command:
ALTER SYSTEM SET _preserve_order_for_pagination = true;
Use the query hint to enable or disable pagination order preservation
To provide more granular control, OceanBase Database also offers the query hint PRESERVE_ORDER_FOR_PAGINATION.
Here is an example:
To enable the pagination order preservation feature for a specific query.
SELECT /*+OPT_PARAM('PRESERVE_ORDER_FOR_PAGINATION', 'TRUE')*/ * FROM test_tbl1
LIMIT 10;
Notice
This hint cannot be used to enable or disable the pagination order preservation feature for specific query blocks. It can only be used to enable or disable the feature for the entire query.
Pagination order preservation
When the pagination order preservation feature is enabled, OceanBase Database will rewrite the query and add an ORDER BY clause to ensure ordered output.
For example,
SELECT t1.c1, t1.c2 FROM t1, t2 WHERE t1.c1 = t2.c1 LIMIT 10,10;
For the above query, OceanBase Database will first sort the query results and then apply pagination. The rewritten query is as follows:
SELECT t1.c1, t1.c2 FROM t1, t2 WHERE t1.c1 = t2.c1
ORDER BY t1.c1, t2.c2 LIMIT 10,10;
Notice
Enabling the pagination order preservation feature may cause performance degradation for some queries due to the additional sorting calculations, which may reduce query execution performance.
Considerations when enabling pagination order preservation
After enabling the pagination order preservation feature, the query output results may differ from before due to the added sorting fields.
In other scenarios, the expected result order may also change, for example:
SELECT t2.c1, t2.c2
FROM (SELECT * from t1 ORDER BY t1.c1, t1.c2) V
LEFT JOIN t2 ON V.c1 = t2.c1
LIMIT 10,10;
The user may expect to sort the results of table t1 and then apply pagination. However, in reality, any SQL semantics do not guarantee that the query results must be sorted by the results of table t1 before applying pagination. OceanBase Database will add sorting fields t2.c1 and t2.c2 at the outermost level, sort the results based on the output fields (i.e., the fields of table t2), and then apply pagination. The equivalent query is as follows:
SELECT t2.c1, t2.c2
FROM t1
LEFT JOIN t2 ON t1.c1 = t2.c1
ORDER BY t2.c1, t2.c2
LIMIT 10,10;