OceanBase logo

OceanBase

A unified distributed database ready for your transactional, analytical, and AI workloads.

DEPLOY YOUR WAY

OceanBase Cloud

The best way to deploy and scale OceanBase

OceanBase Enterprise

Run and manage OceanBase on your infra

TRY OPEN SOURCE

OceanBase Community Edition

The free, open-source distributed database

OceanBase seekdb

Open source AI native search database

Customer Stories

Real-world success stories from enterprises across diverse industries.

View All
BY USE CASES

Mission-Critical Transactions

Global & Multicloud Application

Elastic Scaling for Peak Traffic

Real-time Analytics

Active Geo-redundancy

Database Consolidation

Resources

Comprehensive knowledge hub for OceanBase.

Blog

Live Demos

Training & Certification

Documentation

Official technical guides, tutorials, API references, and manuals for all OceanBase products.

View All
PRODUCTS

OceanBase Cloud

OceanBase Database

Tools

Connectors and Middleware

QUICK START

OceanBase Cloud

OceanBase Database

BEST PRACTICES

Practical guides for utilizing OceanBase more effectively and conveniently

Company

Learn more about OceanBase – our company, partnerships, and trust and security initiatives.

About OceanBase

Partner

Trust Center

Contact Us

International - English
中国站 - 简体中文
日本 - 日本語
Sign In
Start on Cloud

A unified distributed database ready for your transactional, analytical, and AI workloads.

DEPLOY YOUR WAY

OceanBase Cloud

The best way to deploy and scale OceanBase

OceanBase Enterprise

Run and manage OceanBase on your infra

TRY OPEN SOURCE

OceanBase Community Edition

The free, open-source distributed database

OceanBase seekdb

Open source AI native search database

Customer Stories

Real-world success stories from enterprises across diverse industries.

View All
BY USE CASES

Mission-Critical Transactions

Global & Multicloud Application

Elastic Scaling for Peak Traffic

Real-time Analytics

Active Geo-redundancy

Database Consolidation

Comprehensive knowledge hub for OceanBase.

Blog

Live Demos

Training & Certification

Documentation

Official technical guides, tutorials, API references, and manuals for all OceanBase products.

View All
PRODUCTS
OceanBase CloudOceanBase Database
ToolsConnectors and Middleware
QUICK START
OceanBase CloudOceanBase Database
BEST PRACTICES

Practical guides for utilizing OceanBase more effectively and conveniently

Learn more about OceanBase – our company, partnerships, and trust and security initiatives.

About OceanBase

Partner

Trust Center

Contact Us

Start on Cloud
编组
All Products
    • Databases
    • iconOceanBase Database
    • iconOceanBase Cloud
    • iconOceanBase Tugraph
    • iconInteractive Tutorials
    • iconOceanBase Best Practices
    • Tools
    • iconOceanBase Cloud Platform
    • iconOceanBase Migration Service
    • iconOceanBase Developer Center
    • iconOceanBase Migration Assessment
    • iconOceanBase Admin Tool
    • iconOceanBase Loader and Dumper
    • iconOceanBase Deployer
    • iconKubernetes operator for OceanBase
    • iconOceanBase Diagnostic Tool
    • iconOceanBase Binlog Service
    • Connectors and Middleware
    • iconOceanBase Database Proxy
    • iconEmbedded SQL in C for OceanBase
    • iconOceanBase Call Interface
    • iconOceanBase Connector/C
    • iconOceanBase Connector/J
    • iconOceanBase Connector/ODBC
    • iconOceanBase Connector/NET
icon

OceanBase Database

SQL - V4.3.5

    Download PDF

    OceanBase logo

    The Unified Distributed Database for the AI Era.

    Follow Us
    Products
    OceanBase CloudOceanBase EnterpriseOceanBase Community EditionOceanBase seekdb
    Resources
    DocsBlogLive DemosTraining & Certification
    Company
    About OceanBaseTrust CenterLegalPartnerContact Us
    Follow Us

    © OceanBase 2026. All rights reserved

    Cloud Service AgreementPrivacy PolicySecurity
    Contact Us
    Document Feedback
    1. Documentation Center
    2. OceanBase Database
    3. SQL
    4. V4.3.5
    iconOceanBase Database
    SQL - V 4.3.5
    SQL
    KV
    • V 4.4.2
    • V 4.3.5
    • V 4.3.3
    • V 4.3.1
    • V 4.3.0
    • V 4.2.5
    • V 4.2.2
    • V 4.2.1
    • V 4.2.0
    • V 4.1.0
    • V 4.0.0
    • V 3.1.4 and earlier

    Rule-based query rewrite

    Last Updated:2026-04-09 08:28:54  Updated
    share
    What is on this page
    Subquery related rewrite
    View merge
    Subquery unnesting
    Rewrite ANY or ALL by using MAX or MIN
    Outer join elimination
    Condition simplification
    HAVING condition elimination
    Equivalence relation deduction
    Identically true/false elimination
    Non-SPJ rewrite
    Sorting redundancy elimination
    LIMIT pushdown to subquery
    LIMIT pushdown to outer join or cross join
    DISTINCT elimination
    MIN/MAX rewrite

    folded

    share

    Rule-based query rewrite includes subquery-related rewrite, outer join elimination, simplified condition rewrite, and non-select project join (non-SPJ) rewrite.

    Subquery related rewrite

    The optimizer usually executes subqueries in a nested way, which means that each time a parent query generates a row of data, the optimizer executes a subquery. This method requires multiple subqueries to be executed, resulting in low execution efficiency. To optimize the execution, the nested operation is rewritten to a join, which greatly improves the execution efficiency because of the following benefits:

    • It avoids the repeated execution of the same subquery.

    • The optimizer selects a better join order and better join algorithms based on the statistics.

    • After the join and filter conditions of the subquery are rewritten as the conditions of the parent query, more optimization options are available to the optimizer, such as conditional pushdown.

    Frequently applied methods to rewrite a subquery include view merge, subquery expansion, and rewriting ANY or ALL by using MAX or MIN.

    View merge

    View merge refers to the process of merging a subquery that represents a view into a query that contains the view. The merge provides the optimizer with more options of join order types, access paths, and rewrites, making the selection of a better execution plan much easier.

    OceanBase Database supports the merge of SPJ views. In the following example, Q1 is rewritten to Q2:

    obclient>CREATE TABLE t1 (c1 INT, c2 INT);
    Query OK, 0 rows affected
    
    obclient>CREATE TABLE t2 (c1 INT PRIMARY KEY, c2 INT);
    Query OK, 0 rows affected
    
    obclient>CREATE TABLE t3 (c1 INT PRIMARY KEY, c2 INT);
    Query OK, 0 rows affected
    
    Q1:
    obclient>SELECT t1.c1, v.c1
                    FROM t1, (SELECT t2.c1, t3.c2
                    FROM t2, t3
                    WHERE t2.c1 = t3.c1) v
                    WHERE t1.c2 = v.c2;
    <==>
    Q2:
    obclient>SELECT t1.c1, t2.c1
                    FROM t1, t2, t3
                    WHERE t2.c1 = t3.c1 AND t1.c2 = t3.c2;
    

    Before the rewrite, the types of join orders available for query Q1 are as follows:

    • t1, v(t2,t3)

    • t1, v(t3,t2)

    • v(t2,t3), t1

    • v(t3,t2), t1

    After the rewrite, the following types of join orders are available:

    • t1, t2, t3

    • t1, t3, t2

    • t2, t1, t3

    • t2, t3, t1

    • t3, t1, t2

    • t3, t2, t1

    This indicates that view merge increases options for the join order. For complex queries, view merge provides higher flexibility in selecting access paths and rewrite methods, allowing the optimizer to generate a better plan.

    Subquery unnesting

    Subquery unnesting promotes the subquery in a WHERE condition to the parent query and unnests it to produce a join condition in parallel with the parent query. The rewrite deconstructs the subquery and changes the outer parent query to a multi-table join.

    The benefit is that the optimizer takes into account tables in the subquery when it selects access paths, join methods, and sorting methods, thus obtaining a better execution plan. Relevant subquery unnesting expressions include NOT IN, IN, NOT EXIST, EXIST, ANY, and ALL.

    You can unnest a subquery by using the following methods:

    • Rewrite conditions so that the generated join statement is enabled to return the same rows as the original statement.

    • Unnest a subquery to produce a semi join or anti join.

      In the following example, t2.c2 is not unique and the statement is rewritten to a semi join. The new execution plan is as follows:

      obclient> CREATE TABLE t1 (c1 INT, c2 INT);
      Query OK, 0 rows affected
      
      obclient> CREATE TABLE t2 (c1 INT PRIMARY KEY, c2 INT);
      Query OK, 0 rows affected
      
      obclient> EXPLAIN SELECT * FROM t1 WHERE t1.c1 IN (SELECT t2.c2 FROM t2);
      
      Query Plan:
      =======================================
      |ID|OPERATOR      |NAME|EST. ROWS|COST|
      ---------------------------------------
      |0 |HASH SEMI JOIN|    |495      |3931|
      |1 | TABLE SCAN   |t1  |1000     |499 |
      |2 | TABLE SCAN   |t2  |1000     |433 |
      =======================================
      
      Outputs & filters:
      -------------------------------------
        0 - output([t1.c1], [t1.c2]), filter(nil),
            equal_conds([t1.c1 = t2.c2]), other_conds(nil)
        1 - output([t1.c1], [t1.c2]), filter(nil),
            access([t1.c1], [t1.c2]), partitions(p0)
        2 - output([t2.c2]), filter(nil),
            access([t2.c2]), partitions(p0)
      

      After you change the operator before a query to NOT IN, you can rewrite the query to an anti join. The new execution plan is as follows:

      obclient> EXPLAIN SELECT * FROM t1 WHERE t1.c1 NOT IN (SELECT t2.c2 FROM t2);
      
      Query Plan:
      ================================================
      |ID|OPERATOR             |NAME|EST. ROWS|COST  |
      ------------------------------------------------
      |0 |NESTED-LOOP ANTI JOIN|    |0        |520245|
      |1 | TABLE SCAN          |t1  |1000     |499   |
      |2 | TABLE SCAN          |t2  |22       |517   |
      ================================================
      
      Outputs & filters:
      -------------------------------------
        0 - output([t1.c1], [t1.c2]), filter(nil),
            conds(nil), nl_params_([t1.c1], [(T_OP_IS, t1.c1, NULL, 0)])
        1 - output([t1.c1], [t1.c2], [(T_OP_IS, t1.c1, NULL, 0)]), filter(nil),
            access([t1.c1], [t1.c2]), partitions(p0)
        2 - output([t2.c2]), filter([(T_OP_OR, ? = t2.c2, ?, (T_OP_IS, t2.c2, NULL, 0))]),
            access([t2.c2]), partitions(p0)
      
    • Unnest a subquery to produce an inner join.

      For query Q1 in the preceding example, if t2.c2 is changed to t2.c1, which is the primary key, the output of the subquery is unique, and you can rewrite it to an inner join, as shown in the following example:

      Q1:
      obclient> SELECT * FROM t1 WHERE t1.c1 IN  (SELECT t2.c1 FROM t2);
      <==>
      Q2:
      obclient> SELECT t1.* FROM t1, t2 WHERE t1.c1 = t2.c1;
      

      The execution plan of query Q1 is then rewritten as follows:

      obclient> EXPLAIN SELECT * FROM t1 WHERE t1.c1 IN (SELECT t2.c1 FROM t2);
      
      Query Plan:
       ====================================
      |ID|OPERATOR   |NAME|EST. ROWS|COST|
      ------------------------------------
      |0 |HASH JOIN  |    |1980     |3725|
      |1 | TABLE SCAN|t2  |1000     |411 |
      |2 | TABLE SCAN|t1  |1000     |499 |
      ====================================
      
      Outputs & filters:
      -------------------------------------
        0 - output([t1.c1], [t1.c2]), filter(nil),
            equal_conds([t1.c1 = t2.c1]), other_conds(nil)
        1 - output([t2.c1]), filter(nil),
            access([t2.c1]), partitions(p0)
        2 - output([t1.c1], [t1.c2]), filter(nil),
            access([t1.c1], [t1.c2]), partitions(p0)
      

      You can rewrite subquery expressions, such as NOT IN, IN, NOT EXIST, EXIST, ANY, and ALL, in the same way.

    Rewrite ANY or ALL by using MAX or MIN

    For a subquery with an ANY or ALL expression, if the subquery does not contain a GROUP BY clause, aggregate function, or HAVING clause, the expression shown in the following example can be equivalently converted by the aggregate function MIN or MAX, where col_item is a separate, non-NULL column:

    val > ALL(SELECT col_item ...)  <==> val > (SELECT MAX(col_item) ...);
    val >= ALL(SELECT col_item ...) <==> val >= (SELECT MAX(col_item) ...);
    val < ALL(SELECT col_item ...)  <==> val < (SELECT MIN(col_item) ...);
    val <= ALL(SELECT col_item ...) <==> val <= (SELECT MIN(col_item) ...);
    val > ANY(SELECT col_item ...)  <==> val > (SELECT MIN(col_item) ...);
    val >= ANY(SELECT col_item ...) <==> val >= (SELECT MIN(col_item) ...);
    val < ANY(SELECT col_item ...)  <==> val < (SELECT MAX(col_item) ...);
    val <= ANY(SELECT col_item ...) <==> val <= (SELECT MAX(col_item) ...);
    

    After the subquery is converted to contain MAX or MIN, you can further rewrite MAX or MIN to reduce the number of scanning operations on the right-side table. Here is an example:

    obclient> SELECT c1 FROM t1 WHERE c1 > ANY(SELECT c1 FROM t2);
    <==>
    obclient> SELECT c1 FROM t1 WHERE c1 > (SELECT MIN(c1) FROM t2);
    

    After the MAX/MIN rewrite, the primary key t2.c1 can be used to directly push LIMIT 1 down to TABLE SCAN to generate the MIN value. The execution plan is as follows:

    obclient> EXPLAIN SELECT c1 FROM t1 WHERE c1 > ANY(SELECT c1 FROM t2);
    
    Query Plan:
     ===================================================
    |ID|OPERATOR        |NAME          |EST. ROWS|COST|
    ---------------------------------------------------
    |0 |SUBPLAN FILTER  |              |1        |73  |
    |1 | TABLE SCAN     |t1            |1        |37  |
    |2 | SCALAR GROUP BY|              |1        |37  |
    |3 |  SUBPLAN SCAN  |subquery_table|1        |37  |
    |4 |   TABLE SCAN   |t2            |1        |36  |
    ===================================================
    
    Outputs & filters:
    -------------------------------------
      0 - output([t1.c1]), filter([t1.c1 > ANY(subquery(1))]),
          exec_params_(nil), onetime_exprs_(nil), init_plan_idxs_([1])
      1 - output([t1.c1]), filter(nil),
          access([t1.c1]), partitions(p0)
      2 - output([T_FUN_MIN(subquery_table.c1)]), filter(nil),
          group(nil), agg_func([T_FUN_MIN(subquery_table.c1)])
      3 - output([subquery_table.c1]), filter(nil),
          access([subquery_table.c1])
      4 - output([t2.c1]), filter(nil),
          access([t2.c1]), partitions(p0),
          limit(1), offset(nil)
    

    Outer join elimination

    Outer joins are classified into left outer joins, right outer joins, and full outer joins. The order of an outer join cannot be changed during the operation, the optimizer has limited options of join order. Outer join elimination helps convert an outer join into an inner join, providing the optimizer with more join path options.

    However, the outer join elimination requires a "Reject-NULL" condition, which is contained in WHERE conditions, and exports FALSE when the right-side table generates a NULL value.

    Here is an example:

    obclient>SELECT t1.c1, t2.c2 FROM t1 LEFT JOIN t2 ON t1.c2 = t2.c2;
    

    This is an outer join, whose output row t2.c2 may be NULL. If you add the condition t2.c2 > 5, data is filtered by this condition, and the output of t2.c1 cannot be NULL, so that the outer join can be converted into an inner join.

    obclient> SELECT t1.c1, t2.c2 FROM t1 LEFT JOIN t2 ON t1.c2 = t2.c2 WHERE t2.c2 > 5;
    <==>
    obclient> SELECT t1.c1, t2.c2 FROM t1 INNER JOIN t2 ON t1.c2 = t2.c2
                WHERE t2.c2 > 5;
    

    Condition simplification

    HAVING condition elimination

    If no aggregate or GROUP BY operation exists in the query, the HAVING condition can be merged into the WHERE conditions, with the HAVING condition eliminated. This way, the HAVING condition can be managed with other conditions in WHERE for further optimization.

    obclient>SELECT * FROM t1, t2 WHERE t1.c1 = t2.c1 HAVING t1.c2 > 1;
    <==>
    obclient>SELECT * FROM t1, t2 WHERE t1.c1 = t2.c1 AND t1.c2 > 1;
    

    The rewritten plan is shown in the following example, where the condition t1.c2 > 1 is pushed down to the TABLE SCAN layer.

    obclient> EXPLAIN SELECT * FROM t1, t2 WHERE t1.c1 = t2.c1 HAVING t1.c2 > 1;
    
    Query Plan:
    =========================================
    |ID|OPERATOR        |NAME|EST. ROWS|COST|
    -----------------------------------------
    |0 |NESTED-LOOP JOIN|    |1        |59  |
    |1 | TABLE SCAN     |t1  |1        |37  |
    |2 | TABLE GET      |t2  |1        |36  |
    =========================================
    
    Outputs & filters:
    -------------------------------------
      0 - output([t1.c1], [t1.c2], [t2.c1], [t2.c2]), filter(nil),
          conds(nil), nl_params_([t1.c1])
      1 - output([t1.c1], [t1.c2]), filter([t1.c2 > 1]),
          access([t1.c1], [t1.c2]), partitions(p0)
      2 - output([t2.c1], [t2.c2]), filter(nil),
          access([t2.c1], [t2.c2]), partitions(p0)
    

    Equivalence relation deduction

    In the process of equivalence relation deduction, new conditional expressions are deduced based on the transitivity of comparison operators to reduce the number of rows to be processed or to select a more efficient index.

    OceanBase Database supports the deduction of equi-join conditions. For example, assume that a table contains columns a and b. a = b AND a > 1 AND b > 1 can be deduced from a = b AND a > 1. In this case, if column b is indexed and selectivity of the index is low for the condition b > 1, the performance of accessing the table that contains column b can be significantly improved by using the deduced condition.

    The following example shows that the condition t1.c1 = t2.c2 AND t1.c1 > 2 is equivalently deduced to t1.c1 = t2.c2 AND t1.c1 > 2 AND t2.c2 > 2. You can learn from the plan that t2.c2 is pushed down to TABLE SCAN and the corresponding index of t2.c2 is applied.

    obclient> CREATE TABLE t1(c1 INT PRIMARY KEY, c2 INT);
    Query OK, 0 rows affected
    
    obclient> CREATE TABLE t2(c1 INT PRIMARY KEY, c2 INT, c3 INT, KEY IDX_c2(c2));
    Query OK, 0 rows affected
    /*Run this command in MySQL mode*/
    
    obclient> EXPLAIN EXTENDED_NOADDR SELECT t1.c1, t2.c2 FROM t1, t2 WHERE t1.c1 = t2.c2 AND t1.c1 > 2;
    
    Query Plan:
    ==========================================
    |ID|OPERATOR   |NAME      |EST. ROWS|COST|
    ------------------------------------------
    |0 |MERGE JOIN |          |5        |78  |
    |1 | TABLE SCAN|t2(IDX_c2)|5        |37  |
    |2 | TABLE SCAN|t1        |3        |37  |
    ==========================================
    
    Outputs & filters:
    -------------------------------------
      0 - output([t1.c1], [t2.c2]), filter(nil),
          equal_conds([t1.c1 = t2.c2]), other_conds(nil)
      1 - output([t2.c2]), filter(nil),
          access([t2.c2]), partitions(p0),
          is_index_back=false,
          range_key([t2.c2], [t2.c1]), range(2,MAX ; MAX,MAX),
          range_cond([t2.c2 > 2])
      2 - output([t1.c1]), filter(nil),
          access([t1.c1]), partitions(p0),
          is_index_back=false,
          range_key([t1.c1]), range(2 ; MAX),
          range_cond([t1.c1 > 2])
    

    Identically true/false elimination

    The following identically true and false conditions can be eliminated:

    • false and expr = Identically false

    • true or expr = Identically true

    In the following example, for the condition WHERE 0 > 1 AND c1 = 3, AND is identically false because of the condition 0 > 1. Therefore, the result can be directly returned without the need to execute the SQL query, thereby accelerating the execution process.

    obclient> EXPLAIN EXTENDED_NOADDR SELECT * FROM t1 WHERE 0 > 1 AND c1 = 3;
    
    Query Plan:
    ===================================
    |ID|OPERATOR  |NAME|EST. ROWS|COST|
    -----------------------------------
    |0 |TABLE SCAN|t1  |0        |38  |
    ===================================
    
    Outputs & filters:
    -------------------------------------
      0 - output([t1.c1], [t1.c2]), filter([0], [t1.c1 = 3]), startup_filter([0]),
          access([t1.c1], [t1.c2]), partitions(p0),
          is_index_back=false, filter_before_indexback[false,false],
          range_key([t1.__pk_increment], [t1.__pk_cluster_id], [t1.__pk_partition_id]),
          range(MAX,MAX,MAX ; MIN,MIN,MIN)always false
    

    Non-SPJ rewrite

    Sorting redundancy elimination

    Redundancy elimination for sorting removes redundant ordered items to reduce resource consumption for sorting. Redundancy elimination for sorting is required in the following three circumstances:

    • Duplicate columns in the ORDER BY expression list. You can sort the columns after deduplication.

      obclient> SELECT * FROM t1 WHERE c2 = 5 ORDER BY c1, c1, c2, c3 ;
      <==>
      obclient> SELECT * FROM t1 WHERE c2 = 5 ORDER BY c1, c2, c3;
      
    • A single-valued column in both ORDER BY and WHERE. This column can be deleted before sorting.

      obclient> SELECT * FROM t1 WHERE c2 = 5 ORDER BY c1, c2, c3;
      <==>
      obclient> SELECT * FROM t1 WHERE c2 = 5 ORDER BY c1, c3;
      
    • If a query at the current layer contains ORDER BY but not LIMIT, and the query resides in the set operation of the parent query, then ORDER BY can be eliminated. The reason is that the UNION operation on two ordered sets simply leads to an unordered result set. However, if ORDER BY contains LIMIT, which semantically implies the selection of the greatest or smallest N items, ORDER BY must be retained to avoid semantic errors.

      obclient> (SELECT c1,c2 FROM t1 ORDER BY c1) UNION (SELECT c3,c4 FROM t2 ORDER BY c3);
      <==>
      obclient> (SELECT c1,c2 FROM t1) UNION (SELECT c3,c4 FROM t2);
      

    LIMIT pushdown to subquery

    LIMIT pushdown rewrite means to push the LIMIT down to the subquery. OceanBase Database supports pushing down LIMIT to a view (Example 1) or to a subquery that contains the UNION operator (Example 2), without changing the semantics.

    Example 1: Push LIMIT down to the view.

    obclient> SELECT * FROM (SELECT * FROM t1 ORDER BY c1) a LIMIT 1;
    <==>
    obclient> SELECT * FROM (SELECT * FROM t1 ORDER BY c1 LIMIT 1) a LIMIT 1;
    

    Example 2: Push LIMIT down to the subquery corresponding to UNION.

    obclient> (SELECT c1,c2 FROM t1) UNION ALL (SELECT c3,c4 FROM t2) LIMIT 5;
    <==>
    obclient> (SELECT c1,c2 FROM t1 LIMIT 5) UNION ALL (SELECT c3,c4 FROM t2 limit 5) LIMIT 5;
    

    LIMIT pushdown to outer join or cross join

    If an SQL statement does not contain the WINDOW FUNCTION, DISTINCT, GROUP BY, or HAVING clause and does not contain the WHERE or ORDER BY condition or the WHERE or ORDER BY condition is related only to one side of the join, you can push down the LIMIT statement to one side (outer join) or multiple sides (cross join) of the joined table. This rewrite method is called LIMIT pushdown to outer join or cross join. Pushdown of the LIMIT statement can effectively reduce the number of rows that are joined, thus reducing the overheads of queries.

    When the LIMIT statement is pushed down for an outer join, a view is encapsulated on the table pushed down. The following example shows query Q1 for a left outer join:

    Q1:
    SELECT * FROM t1 LEFT JOIN t2 ON t1.c1 = t2.c1 LIMIT 1;
    ==>
    Q2:
    SELECT * FROM V LEFT JOIN t2 ON t1.c1 = t2.c1 LIMIT 1;
    V: (SELECT * FROM t1 LIMIT 1);
    

    Query Q1 does not contain the preceding clauses or the WHERE or ORDER BY condition. Therefore, view V can be encapsulated on table t1 on the left side, and LIMIT 1 can be encapsulated into view V, thus rewriting query Q1 into query Q2. After the LIMIT statement is pushed down, the original LIMIT statement must be retained. The execution plan is rewritten as follows:

    =============================================
    |ID|OPERATOR         |NAME   |EST. ROWS|COST|
    ---------------------------------------------
    |0 |LIMIT            |       |1        |4   |
    |1 | MERGE OUTER JOIN|       |1        |4   |
    |2 |  SUBPLAN SCAN   |VIEW1  |1        |2   |
    |3 |   TABLE SCAN    |t1(idx)|1        |2   |
    |4 |  SORT           |       |1        |2   |
    |5 |   TABLE SCAN    |t2     |1        |2   |
    =============================================
    Outputs & filters:
    -------------------------------------
    0 - output([1]), filter(nil), rowset=256, limit(1), offset(nil)
    1 - output(nil), filter(nil), rowset=256,
    equal_conds([VIEW1.t1.c1 = t2.c1]), other_conds(nil)
    2 - output([VIEW1.t1.c1]), filter(nil), rowset=256,
    access([VIEW1.t1.c1])
    3 - output([t1.c1]), filter(nil), rowset=256,
    access([t1.c1]), partitions(p0),
    limit(1), offset(nil)
    4 - output([t2.c1]), filter(nil), rowset=256, sort_keys([t2.c1, ASC])
    5 - output([t2.c1]), filter(nil), rowset=256,
    access([t2.c1]), partitions(p0)
    

    Likewise, a right outer join can push down the LIMIT statement to the table on the right side when the preceding conditions are met.

    A cross join can push down the LIMIT statement to both sides. Here is an example:

    Q3:
    SELECT 1 FROM t1, t2 WHERE t1.c1 > 0 ORDER BY t1.c1 LIMIT 1;
    ==>
    Q4:
    SELECT 1 FROM V1, V2 LIMIT 1;
    V1: SELECT 1 FROM t1 WHERE t1.c1 > 0 ORDER BY t1.c1 LIMIT 1;
    V2: SELECT 1 FROM t2 LIMIT 1;
    

    Query Q3 does not contain the conditions in the LIMIT pushdown to outer join and contains only the WHERE and ORDER BY conditions applied to table t1. In this case, you can create views V1 and V2 on tables t1 and t2, respectively. Then, you can push down the LIMIT statement to the views, thus rewriting query Q3 into query Q4. The execution plan is rewritten as follows:

    =====================================================
    |ID|OPERATOR                   |NAME |EST. ROWS|COST|
    -----------------------------------------------------
    |0 |LIMIT                      |     |1        |5   |
    |1 | NESTED-LOOP JOIN CARTESIAN|     |1        |5   |
    |2 |  SUBPLAN SCAN             |VIEW1|1        |3   |
    |3 |   TOP-N SORT              |     |1        |3   |
    |4 |    TABLE SCAN             |t1   |3        |3   |
    |5 |  MATERIAL                 |     |1        |2   |
    |6 |   SUBPLAN SCAN            |VIEW2|1        |2   |
    |7 |    TABLE SCAN             |t2   |1        |2   |
    =====================================================
    Outputs & filters:
    -------------------------------------
      0 - output([1]), filter(nil), rowset=256, limit(1), offset(nil)
      1 - output(nil), filter(nil), rowset=256,
          conds(nil), nl_params_(nil)
      2 - output(nil), filter(nil), rowset=256,
          access(nil)
      3 - output([t1.c1]), filter(nil), rowset=256, sort_keys([t1.c1, ASC]), topn(1)
      4 - output([t1.c1]), filter([t1.c1 > 0]), rowset=256,
          access([t1.c1]), partitions(p0)
      5 - output(nil), filter(nil), rowset=256
      6 - output(nil), filter(nil), rowset=256,
          access(nil)
      7 - output([1]), filter([t2.c1 > 0]), rowset=256,
          access([t2.c1]), partitions(p0),
          limit(1), offset(nil)
    

    An SQL query for a multi-table join that meets the preceding conditions can execute LIMIT pushdown to outer joins and cross joins multiple times to increase the room for query rewrite and achieve better rewrite results.

    DISTINCT elimination

    • If the SELECT statement contains only constants, DISTINCT can be eliminated, with LIMIT 1 added.

      obclient> SELECT DISTINCT 1,2 FROM t1 ;
      <==>
      obclient> SELECT 1,2 FROM t1 LIMIT 1;
      
      obclient> CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 INT);
      Query OK, 0 rows affected
      
      obclient> EXPLAIN EXTENDED_NOADDR SELECT DISTINCT 1,2 FROM t1;
      
      Query Plan:
      ===================================
      |ID|OPERATOR  |NAME|EST. ROWS|COST|
      -----------------------------------
      |0 |TABLE SCAN|t1  |1        |36  |
      ===================================
      
      Outputs & filters:
      -------------------------------------
        0 - output([1], [2]), filter(nil),
            access([t1.c1]), partitions(p0),
            limit(1), offset(nil),
            is_index_back=false,
            range_key([t1.c1]), range(MIN ; MAX)always true
      
    • DISTINCT can be eliminated if the SELECT statement contains a column that ensures uniqueness. In the following example, (c1, c2) is the primary key. It ensures the uniqueness of c1, c2, and c3. Therefore, DISTINCT can be eliminated.

      obclient> CREATE TABLE t2(c1 INT, c2 INT, c3 INT, PRIMARY KEY(c1, c2));
      Query OK, 0 rows affected
      
      obclient> SELECT DISTINCT c1, c2, c3 FROM t2;
      <==>
      obclient> SELECT c1, c2 c3 FROM t2;
      
      obclient> EXPLAIN SELECT DISTINCT c1, c2, c3 FROM t2;
      
      Query Plan:
      ===================================
      |ID|OPERATOR  |NAME|EST. ROWS|COST|
      -----------------------------------
      |0 |TABLE SCAN|t2  |1000     |455 |
      ===================================
      
      Outputs & filters:
      -------------------------------------
        0 - output([t2.c1], [t2.c2], [t2.c3]), filter(nil),
            access([t2.c1], [t2.c2], [t2.c3]), partitions(p0)
      

    MIN/MAX rewrite

    • If the MIN or MAX function contains a parameter serving as the index prefix column and does not contain GROUP BY, SCALAR aggregate can be rewritten into an index scan for scanning only one row, as shown in the following example:

      obclient> CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 INT, c3 INT, KEY IDX_c2_c3(c2,c3));
      Query OK, 0 rows affected
      
      obclient> SELECT MIN(c2) FROM t1;
      <==>
      obclient> SELECT MIN(c2) FROM (SELECT c2 FROM t2 ORDER BY c2 LIMIT 1) AS t;
      
      obclient> EXPLAIN SELECT MIN(c2) FROM t1;
      
      Query Plan:
      ==================================================
      |ID|OPERATOR       |NAME          |EST. ROWS|COST|
      --------------------------------------------------
      |0 |SCALAR GROUP BY|              |1        |37  |
      |1 | SUBPLAN SCAN  |subquery_table|1        |37  |
      |2 |  TABLE SCAN   |t1(idx_c2_c3) |1        |36  |
      ==================================================
      
      Outputs & filters:
      -------------------------------------
        0 - output([T_FUN_MIN(subquery_table.c2)]), filter(nil),
            group(nil), agg_func([T_FUN_MIN(subquery_table.c2)])
        1 - output([subquery_table.c2]), filter(nil),
            access([subquery_table.c2])
        2 - output([t1.c2]), filter([(T_OP_IS_NOT, t1.c2, NULL, 0)]),
            access([t1.c2]), partitions(p0),
            limit(1), offset(nil)
      
    • If all parameters of SELECT MIN or SELECT MAX are constants and GROUP BY is contained, you can rewrite MIN or MAX to a constant to reduce the resource consumption of the MIN or MAX operation.

      obclient> SELECT MAX(1) FROM t1 GROUP BY c1;
      <==>
      obclient> SELECT 1 FROM t1 GROUP BY c1;
      
      obclient> EXPLAIN EXTENDED_NOADDR SELECT MAX(1) FROM t1 GROUP BY c1;
      
      Query Plan:
      ===================================
      |ID|OPERATOR  |NAME|EST. ROWS|COST|
      -----------------------------------
      |0 |TABLE SCAN|t1  |1000     |411 |
      ===================================
      
      Outputs & filters:
      -------------------------------------
        0 - output([1]), filter(nil),
            access([t1.c1]), partitions(p0),
            is_index_back=false,
            range_key([t1.c1]), range(MIN ; MAX)always true
      
    • If all parameters of SELECT MIN or SELECT MAX are constants and GROUP BY is not contained, you can rewrite the query to scan only one row by using the index, as shown in the following example:

      obclient> SELECT MAX(1) FROM t1;
      <==>
      obclient> SELECT MAX(t.a) FROM (SELECT 1 AS a FROM t1 LIMIT 1) t;
      
      obclient> EXPLAIN EXTENDED_NOADDR SELECT MAX(1) FROM t1;
      
      Query Plan:
      ==================================================
      |ID|OPERATOR       |NAME          |EST. ROWS|COST|
      --------------------------------------------------
      |0 |SCALAR GROUP BY|              |1        |37  |
      |1 | SUBPLAN SCAN  |subquery_table|1        |37  |
      |2 |  TABLE SCAN   |t1            |1        |36  |
      ==================================================
      
      Outputs & filters:
      -------------------------------------
        0 - output([T_FUN_MAX(subquery_table.subquery_col_alias)]), filter(nil),
            group(nil), agg_func([T_FUN_MAX(subquery_table.subquery_col_alias)])
        1 - output([subquery_table.subquery_col_alias]), filter(nil),
            access([subquery_table.subquery_col_alias])
        2 - output([1]), filter(nil),
            access([t1.c1]), partitions(p0),
            limit(1), offset(nil),
            is_index_back=false,
            range_key([t1.c1]), range(MIN ; MAX)always true
      

    Previous topic

    Overview
    Last

    Next topic

    Cost-based query rewrite
    Next
    What is on this page
    Subquery related rewrite
    View merge
    Subquery unnesting
    Rewrite ANY or ALL by using MAX or MIN
    Outer join elimination
    Condition simplification
    HAVING condition elimination
    Equivalence relation deduction
    Identically true/false elimination
    Non-SPJ rewrite
    Sorting redundancy elimination
    LIMIT pushdown to subquery
    LIMIT pushdown to outer join or cross join
    DISTINCT elimination
    MIN/MAX rewrite