Unit 1: Best practices for optimizing storage structures and query performance|V4.3.5| docs|Distributed Database

Unit 1: Best practices for optimizing storage structures and query performance

Last Updated：2025-07-15 08:58:43 Updated

In the design and optimization of database systems, data table design is a critically important task. Proper design not only ensures data integrity and consistency but also greatly enhances query performance, storage efficiency, and system scalability. As data volume grows and query patterns become more complex, scientifically designing data tables and selecting appropriate storage structures and indexing strategies have become core issues in database optimization.

This topic introduces best practices for data table design, covering how to choose table storage formats, design single tables without primary keys in columnar storage, design partitions and primary keys, and use indexes in HTAP scenarios.

Choose a table storage format

OceanBase supports columnar storage for table data as well as hybrid row-column storage. You can choose the appropriate data storage format based on your business scenario.

Columnar storage: If your business involves only online analytical processing (OLAP) analytics, you can choose columnar storage.
Hybrid row-column storage: If your business involves both OLAP analytics and online transaction processing (OLTP) transactions and you want to ensure both analytical query performance and point query performance, you can enable hybrid row-column storage.

You can configure the default table storage format for a tenant by using the default_table_store_format parameter.

-- Set the default table storage format for tables in the current tenant to columnar storage.
ALTER SYSTEM SET default_table_store_format = "column";

-- Set the default table storage format for tables in the current tenant to hybrid row-column storage.
ALTER SYSTEM SET default_table_store_format = "compound";

Select a single table without a primary key for columnar storage

In data warehouses or OLAP scenarios, business data tables contain a large volume of data, and most queries involve only specific columns. If the business does not have special requirements such as ensuring data uniqueness, avoiding duplicate data writes, or performing transactional updates or inserts, you can create a table without a primary key for columnar storage:

Data import: The data does not need to be sorted by the primary key, so the import performance is faster compared with that of a table with a primary key.
Reduced costs for primary key management.

show parameters like "%store_format%";

  CREATE TABLE customer (
  user_id bigint NOT NULL,
  login_time timestamp NOT NULL,
  customer_name varchar(100) NOT NULL,    -- Assume that the customer name is up to 100 characters in length.
  phone_num bigint NOT NULL,               -- The phone number is stored as a bigint.
  city_name varchar(50) NOT NULL,          -- Assume that the city name is up to 50 characters in length.
  sex int NOT NULL,                        -- The gender is stored as an int, for example, 0 for female and 1 for male.
  id_number varchar(18) NOT NULL,          -- Assume that the ID number is up to 18 characters in length.
  home_address varchar(255) NOT NULL,      -- Assume that the home address is up to 255 characters in length.
  office_address varchar(255) NOT NULL,    -- Assume that the office address is up to 255 characters in length.
  age int NOT NULL                         -- The age is stored as an int.
);

Select a data table partition

Partitioning technology splits a large table into multiple smaller and easier-to-manage subtables, providing significant advantages in queries and data management. Partitions are typically created based on a specific field of data, such as time or region.

Partitioning scenarios

Large data volume: When a table contains hundreds of millions or even billions of records, single-table operations can lead to poor query performance and may cause resource bottlenecks in the database.
Frequent queries that depend on certain fields: For example, time fields are commonly used as query conditions. Time-based partitioning can improve the efficiency of time-related queries.

Partitioning strategies

RANGE partitioning: suitable for partitioning data by time periods, especially when data is generated based on time and typically filtered by time.
HASH partitioning: involves distributing data evenly across multiple partitions using a hash algorithm. It is suitable for scenarios without a clear query pattern.
LIST partitioning: suitable for discrete data, such as partitioning by regions.

In AP scenarios, multiple dimensions are usually involved in analysis queries. If no single dimension can be used for partitioning that applies to all queries, and if partitioning is needed to distribute data across multiple machines to utilize distributed node computing power, you can select a partitioning key for HASH partitioning to evenly distribute user data across partitions:

The number of distinct values (NDV) of the column is much greater than the number of partitions.
The data in this column does not exhibit significant skewness, or only mild skewness exists, such as transaction IDs, user IDs, or auto-increment columns.
Preferably, choose integer or time columns. If you must choose a character column, prefer VARCHAR or CHAR columns.
Prefer columns that frequently appear in query conditions to facilitate partition pruning.

CREATE TABLE customer (
    user_id BIGINT NOT NULL,
    login_time TIMESTAMP NOT NULL,
    customer_name VARCHAR(100) NOT NULL,
    phone_num BIGINT NOT NULL,
    city_name VARCHAR(50) NOT NULL,
    sex INT NOT NULL,
    id_number VARCHAR(18) NOT NULL,
    home_address VARCHAR(255) NOT NULL,
    office_address VARCHAR(255) NOT NULL,
    age INT NOT NULL
)
PARTITION BY HASH(user_id) PARTITIONS 128;

In distributed databases, data of multiple tables can be distributed across different machines due to partitioning. This can lead to cross-machine communication when executing complex operations like JOIN queries. Table groups can be used to avoid the performance issues caused by cross-machine access.

Create a table group named tg1 with adaptive partitioning, and two partitioned tables named customer and sales. When the conditions for joining the two partitioned tables include join conditions for the partitioning key, you can use a partition-wise join to improve performance.

CREATE TABLEGROUP tg1 SHARDING = 'ADAPTIVE';

-- Customer information table
CREATE TABLE customer (
    user_id BIGINT NOT NULL,
    login_time TIMESTAMP NOT NULL,
    customer_name VARCHAR(100) NOT NULL,
    phone_num BIGINT NOT NULL,
    city_name VARCHAR(50) NOT NULL,
    sex INT NOT NULL,
    id_number VARCHAR(18) NOT NULL,
    home_address VARCHAR(255) NOT NULL,
    office_address VARCHAR(255) NOT NULL,
    age INT NOT NULL)
  TABLEGROUP = tg1
  PARTITION BY HASH(user_id) PARTITIONS 128;
-- Transaction table
CREATE TABLE sales (
  order_id INT,
  user_id INT primary key,
  item_id INT,
  item_count INT)
  TABLEGROUP = tg1
  PARTITION BY HASH(user_id) PARTITIONS 128;


SELECT * FROM customer, sales where customer.user_id = sales.user_id;

If a large amount of business data is involved and the query patterns are clear, you can create subpartitions to further leverage partition pruning for accelerating queries. AP scenarios typically involve querying data from recent days or a month, which are usually time-related queries. Therefore, it is recommended to choose a time-related column or apply time-related functions for subpartitioning, and use RANGE partitioning for convenient range queries.

CREATE TABLE customer (
    user_id BIGINT NOT NULL,
    login_time TIMESTAMP NOT NULL,
    customer_name VARCHAR(100) NOT NULL,
    phone_num BIGINT NOT NULL,
    city_name VARCHAR(50) NOT NULL,
    sex INT NOT NULL,
    id_number VARCHAR(18) NOT NULL,
    home_address VARCHAR(255) NOT NULL,
    office_address VARCHAR(255) NOT NULL,
    age INT NOT NULL,
    -- The primary key contains all partitioning keys (user_id and age).
    PRIMARY KEY (user_id, age, login_time)
)
-- Primary partition: user_id is hashed.
PARTITION BY HASH(user_id)
PARTITIONS 128
SUBPARTITION BY RANGE(age)
SUBPARTITION TEMPLATE (
    -- Example partition: divided by age groups.
    SUBPARTITION p_youth VALUES LESS THAN (25),      -- age <25
    SUBPARTITION p_adult VALUES LESS THAN (40),      -- 25 ≤ age <40
    SUBPARTITION p_middle_aged VALUES LESS THAN (60),-- 40 ≤ age <60
    SUBPARTITION p_senior VALUES LESS THAN (MAXVALUE)-- age ≥60
);

Design a primary key

Primary key design is a fundamental part of relational database table design. In OceanBase, primary keys not only ensure data uniqueness, but also optimize performance for range queries since data is sorted by primary key. Different primary key design strategies are suitable for different business scenarios.

Scenarios

Ensure data uniqueness: If every record in a table must be unique, a primary key must be designed.
Improve query efficiency: The optimizer can generate more efficient query plans when a primary key is available. It is recommended to use the primary key as the partitioning key if it is involved in data queries. This way, the primary key can be used to improve query performance;

CREATE TABLE customer (
    user_id BIGINT NOT NULL,
    login_time TIMESTAMP NOT NULL,
    customer_name VARCHAR(100) NOT NULL,
    phone_num BIGINT NOT NULL,
    city_name VARCHAR(50) NOT NULL,
    sex INT NOT NULL,
    id_number VARCHAR(18) NOT NULL,
    home_address VARCHAR(255) NOT NULL,
    office_address VARCHAR(255) NOT NULL,
    age INT NOT NULL,
    -- The primary key contains all partitioning keys (user_id and age).
    PRIMARY KEY (user_id, age, login_time)
)
-- The primary partitions are hash-distributed by user_id.
PARTITION BY HASH(user_id)
PARTITIONS 128
SUBPARTITION BY RANGE(age)
SUBPARTITION TEMPLATE (
    -- Example partitions are created based on age groups.
    SUBPARTITION p_youth VALUES LESS THAN (25),      -- age <25
    SUBPARTITION p_adult VALUES LESS THAN (40),      -- 25 ≤ age <40
    SUBPARTITION p_middle_aged VALUES LESS THAN (60),-- 40 ≤ age <60
    SUBPARTITION p_senior VALUES LESS THAN (MAXVALUE)-- age ≥60
);

Note:

The data import performance for tables with primary keys is inferior to that for tables without primary keys, because data in the table is sorted by the primary key.
The primary key for partitioned tables must include all partitioning keys.

Optimize queries in HTAP scenarios

OceanBase provides various features to support hybrid transactional and analytical processing (HTAP) scenarios, meeting both transaction processing (TP) and analytical processing (AP) business requirements. In addition to supporting row storage, columnar storage, and hybrid row-column storage formats for data tables, OceanBase also allows you to specify the storage format for index tables. By leveraging the advantages of both row-based and columnar storage, you can efficiently handle corresponding scenarios.

For more information about how to create a columnstore index in MySQL-compatible mode, see Create a columnstore index in MySQL-compatible mode. For more information about how to create a columnstore index in Oracle-compatible mode, see Create a columnstore index in Oracle-compatible mode.

Row-based table with a columnstore index

Scenarios: This solution is suitable for workloads that are primarily transactional but also require some analytical queries. It is also suitable for wide tables. Optimization strategy: Create a columnstore index to improve the performance of analytical queries.

Advantages: A columnstore index redundantly stores only specific columns in a compressed format.
Disadvantages: You must identify suitable fields for a columnstore index based on your query statements. This approach degrades the write performance of the system.

Columnar table with a row-based index

Scenarios: This solution is suitable for workloads that are primarily analytical but must support efficient simple queries such as precise queries. Optimization strategy: Create a covering index to avoid multiple table lookups.

Advantages: A covering index redundantly stores only specific columns. You cannot convert a columnar table to a row-based or hybrid row-based and columnar table online. However, you can add a row-based index to improve query performance.
Disadvantages: You must identify suitable fields for a row-based index based on your query statements. This approach degrades the write performance of the system.

Hybrid row and column storage: row-based table with a columnar table

Scenarios: This solution is suitable for workloads that require both transactional and analytical queries and support multiple query modes. Optimization strategy: You can redundantly store two sets of baseline data in row-based and columnar tables. The system then generates an appropriate execution plan based on the query characteristics. By default, range scans in columnar tables are performed in columnar mode, and precise queries are performed in row-based mode.

Advantages: You do not need to analyze the query characteristics of your business. The query results of analytical queries are strongly consistent.
Disadvantages: This solution consumes more disk space. The execution plan may not be accurate.

Columnstore replicas: 2F1A1C

Scenarios: This solution is suitable for typical HTAP workloads where you do not want analytical queries to affect transactional workloads. However, you can accept weak-consistency reads for analytical queries. Optimization strategy: You can use the isolation feature of columnstore replicas to execute queries of AP and TP workloads without affecting each other.

Advantages:
1. You do not need to store redundant data for multiple replicas. This saves disk space.
2. AP and TP workloads are mutually isolated.
3. The write performance of the system is not affected.
Disadvantages:
1. Columnstore replicas support only weak-consistency reads.
2. Columnstore replicas require independent access entries.