In OceanBase Database, tables are the most fundamental data storage units. Tables contain all data that users can access, and each table consists of multiple rows of records, with each row containing multiple columns. The design and usage of each table must be reasonably planned based on business requirements to ensure the efficiency and scalability of the system. This section covers topics such as table types, update modes, data distribution, indexes, data types, and views, and provides scenario recommendations and combinations based on the AP table design principles.
Overview of table design principles in AP scenarios
In AP (Analytical Processing) scenarios, table design can be guided by the following three principles, which can be combined with other capabilities such as table types, data distribution, indexes, and views:
- Wide tables: In analytical scenarios, wide tables with a large number of columns are commonly used. These tables are queried in a way that accesses only a subset of columns during each query. Using columnstore tables or columnstore indexes allows for columnar storage and scanning, reducing I/O and facilitating vectorized operations and skip indexes. Combined with partitioning and table groups, wide tables can be horizontally distributed and aligned with other tables, making joins and parallel scans more efficient. For more information, see the sections on "Table types" and "Data distribution" in this topic, as well as Data table design and query optimization in AP scenarios.
- Pre-aggregation: For fixed-dimensional aggregations and reports, you can precompute and persist results using materialized views. This reduces the overhead of real-time aggregation during queries. This approach is suitable for scenarios such as reports, dashboards, and fixed analysis dimensions. For more information, see the section on "Views" in this topic, as well as Materialized views (MySQL mode) and Materialized views (Oracle mode).
- Partitioning strategy: Using partitioned tables and partitioning keys (such as time, region, or ID), you can horizontally split data. This allows for partition pruning to reduce the amount of data scanned, parallel computation by partition, and lifecycle management (archiving and cleanup) by partition. In AP scenarios, it is recommended to align the partitioning key with commonly used filtering and aggregation dimensions and use table groups to align partitions across multiple tables for optimized joins. For more information, see the section on "Data distribution" in this topic, as well as Table partitioning design in OLAP scenarios.
Table types
OceanBase Database supports various table types, including partitioned tables, replicated tables, primary key tables, non-primary key tables, and external tables.
- Partitioned tables: OceanBase Database allows you to divide the data of a regular table into different partitions based on specific rules. Data within the same partition is physically stored together. This type of table is called a partitioned table. OceanBase Database supports several basic partitioning strategies, including range, list, and hash partitioning.
- Replicated tables: Replicated tables are a special type of table in OceanBase Database. Data in these tables is replicated across all healthy replicas, allowing you to read the latest modifications from any replica.
- Primary key tables and non-primary key tables: A primary key table is a table that contains a primary key. A non-primary key table is a table that does not specify a primary key.
- External tables: In a database, table data is stored in the database's storage space. In contrast, external table data is stored in external storage services.
- Heap tables (heap-organized tables): In heap tables, data is not sorted by the primary key. Instead, the primary key serves as a secondary unique index, and data is decoupled from the index. The primary key enforces uniqueness, while queries rely on the main table. When data is written in chronological order, heap tables can more effectively utilize skip indexes to improve the efficiency of analytical queries. In AP scenarios, new tables are default heap-organized. If the tenant-level configuration parameter default_table_organization is set to
HEAPin the OLAP parameter template and you do not explicitly specifyORGANIZATION, the new table will be a heap table. This facilitates large-scale data imports and skip index utilization. - Temporary tables: In AP scenarios, temporary tables are commonly used to store intermediate results from ETL processes, temporary data from direct load operations, and intermediate results from complex queries. Data in temporary tables is isolated by session or transaction and is automatically cleared when the session ends or the transaction is committed or rolled back. Temporary tables are distinct from persistent tables. In MySQL mode, temporary tables are session-level. In Oracle mode, global temporary tables are supported.
In AP scenarios, OceanBase Database supports various table types beyond those commonly used in TP scenarios, such as replicated tables, partitioned tables, primary key tables, and non-primary key tables. Based on the data storage method, whether data is stored by row or column, new table types have been introduced: columnstore tables and hybrid row-column tables.
Columnstore tables: Columnstore tables store data by column rather than by row, significantly improving the performance of analytical queries, especially in scenarios involving large data volumes and frequent aggregations. For more information, see Columnstore architecture.
Hybrid row-column tables: Hybrid row-column tables store data both by row and by column. The system automatically determines whether to use row-based or column-based queries based on the query statement. This type of table is suitable for scenarios that require both transactional and analytical capabilities.
Table update modes
OceanBase Database allows you to specify the write and query modes for data when creating a table. When you create a table, you can use the merge_engine parameter in the CREATE TABLE statement to choose between the delete_insert update mode and the partial_update update mode. These two modes are data update strategies designed for different business scenarios.
delete_insert (full-column update mode)
This mode prioritizes query performance. It uses the "Merge-On-Write" mechanism to convert
UPDATEoperations into full-columnDELETEandINSERTrecords, ensuring that each row contains complete column values. This mode significantly improves the efficiency of complex queries and batch processing (such as analytical tasks). However, incremental data requires additional storage space. It is suitable for scenarios with frequent incremental data and the need for quick analysis.partial_update (partial update mode)
This mode only records the values of modified columns, avoiding redundant storage. During queries, multiple data records need to be merged to obtain the latest values, resulting in relatively lower performance. However, it is more suitable for scenarios with frequent updates but low query requirements (such as OLTP operations) or environments sensitive to storage costs.
| Feature Category | delete_insert Update Mode | partial_update Update Mode |
|---|---|---|
| Storage Method | Each update writes two rows (DELETE and INSERT) to the SSTable, containing all column data. |
Each update only records the values of modified columns, saving storage space. |
| Query Efficiency |
|
During queries, multiple Memtable/SSTable records need to be merged to obtain the latest primary key values, which may affect performance. It is suitable for scenarios with frequent updates and storage cost sensitivity. |
| Applicable Scenarios | Scenarios with a high proportion of incremental data that require frequent execution of complex queries or batch processing analysis. | Scenarios with frequent updates but low query requirements. |
For more information, see Create a table in MySQL mode and Create a table in Oracle mode.
Data distribution
Partitioning strategy is a crucial aspect of AP table design. OceanBase Database allows you to create partitioned tables to distribute data across different partitions. Data from different partitions can be distributed across different machines. During queries, partition pruning reduces the amount of data scanned and leverages resources from multiple machines to enhance query performance. By default, data between different tables is randomly distributed and not directly related. Load balancing ensures that data from a single table is evenly distributed across the entire cluster.
In distributed AP systems, tables often contain a large amount of data. When data from different tables is randomly distributed, the overhead of data transfer during table joins can be significant. Using table groups (Table Group) allows partitioned tables with the same partitioning method to align according to specific rules. This ensures that related data is concentrated on the same machine, enabling the use of Partition Wise Join during table joins. This effectively reduces data transfer overhead in join scenarios and improves performance.
OceanBase Database supports both primary and secondary partitioned tables and provides three types of partitioning: RANGE, LIST, and HASH.
OceanBase Database provides three types of table groups: OceanBase Database provides three types of table groups:
NONE: All tables and all partitions within the table group are concentrated on the same machine.
PARTITION: Data from each table in the table group is evenly distributed across partitions. For secondary partitioned tables, all secondary partitions under each primary partition are concentrated.
ADAPTIVE: Data from each table in the table group is evenly distributed based on adaptive methods. For primary partitioned tables, data is evenly distributed across primary partitions. For secondary partitioned tables, data is evenly distributed across secondary partitions under each primary partition.
For more information, see Data distribution, Table groups in MySQL mode, and Table groups in Oracle mode. For more information, see Data distribution, Table groups in MySQL mode, and Table groups in Oracle mode.
Index types
Indexes are key components for improving query performance. OceanBase Database supports various index types in the AP scenario.
- Index distribution: Local indexes are indexes that are partitioned and stored in the same partitions as the base table. Global indexes are indexes that are partitioned independently of the base table and can be organized across partitions. Both types are related to "which partitions the indexes are built on and how they are distributed."
- Index storage format: Rowstore indexes and columnstore indexes refer to indexes that are organized by rows or columns. Columnstore indexes, such as
WITH COLUMN GROUP (each column), are suitable for analytical scans and aggregations. Rowstore indexes are suitable for point queries and random access. A columnstore index is an index that uses the columnar storage format, such as a B-tree index specified as columnar. Index tables organized by columns (e.g.,WITH COLUMN GROUP (each column)) are stored by columns rather than rows. They are suitable for aggregations, wide table scans, and analytical queries in HTAP scenarios and can significantly improve the performance of analytical queries, especially when processing large amounts of data. For more information, see Columnstore and Create an index (MySQL mode)—Columnstore index.
Based on index structure, indexes can be categorized into the following types:
- B-tree index (normal index): The default index structure, including unique indexes, non-unique indexes, and function indexes. It is suitable for equality queries, range queries, sorting, and point queries. A B-tree index can be a local index or global index, and it can use rowstore or columnstore format. A B-tree index using the columnar storage format is commonly referred to as a columnstore index and is suitable for analytical scans in HTAP scenarios. For more information, see Index overview, Create an index (MySQL mode), and Columnstore.
- Full-text index: A full-text index is built on
CHAR,VARCHAR, andTEXTcolumns by using an inverted index and tokenization. It is suitable for log analysis, document/knowledge base retrieval, and relevance ranking. Only local indexes are supported. For more information, see Full-text index. - JSON multi-value index: In MySQL mode of OceanBase Database, you can create a multi-value index on an array or a collection to improve the query efficiency of searches based on JSON array elements. It is suitable for tagging, categorization, many-to-many associations, and predicates such as
MEMBER OF(),JSON_CONTAINS(), andJSON_OVERLAPS(). Only local indexes are supported. For more information, see Multi-value index and Create an index (MySQL mode). - Spatial index: A spatial index is an independent index structure for geographic space data and GIS scenarios. It can quickly retrieve data within a specified coordinate range. Only local indexes are supported.
- Vector index: A vector index is built on a Vector column and includes dense vector indexes and sparse vector indexes. Examples of dense vector indexes include HNSW and IVF. For more information about sparse vector indexes, see Vector index overview. It supports L2, inner product, and cosine distance/similarity calculations and is suitable for vector similarity retrieval queries. It has an independent index structure. Only local indexes are supported.
For more information about combinations such as columnstore indexes and rowstore base tables + columnstore indexes in the AP scenario, see Optimize queries in the HTAP scenario. For more information about special indexes such as full-text indexes, JSON multi-value indexes, spatial indexes, and vector indexes, see Best practices for choosing special indexes. For more information about syntax and implementation details, see Index overview (MySQL mode) and Index overview (Oracle mode).
Data types
Before creating and using tables, database administrators need to plan the table structure and data types based on business requirements. To ensure efficient data storage and query optimization, administrators should follow these principles:
- Normalize the table structure: Design the table structure reasonably to minimize data redundancy and improve query efficiency.
- Choose appropriate SQL data types: Select the most suitable SQL data type for each column to reduce storage space and improve query speed.
Common SQL data types include:
- Basic data types: such as
INT,VARCHAR, andDATE. - Complex data types: such as
JSON,ARRAY, andBITMAP, which are suitable for storing more complex data structures.
For more information about SQL data types, see:
Views
OceanBase Database supports standard views and materialized views. OceanBase Database supports standard views and materialized views.
- Standard views: Standard views, also known as non-materialized views, are the most common type of view. They store only the SQL query that defines the view, not the query results.
- Materialized views: Unlike standard views, materialized views store the results of the query in physical storage. OceanBase Database supports asynchronous materialized views, which means that the materialized view is not immediately updated when the data in the base table changes. This ensures the performance of DML operations on the base table. Materialized views are a common implementation of pre-aggregation, suitable for scenarios with fixed analysis requirements, reports, and dashboards. For more information, see Materialized views in MySQL mode and Materialized views in Oracle mode.
Scenario recommendations and combinations
Based on wide tables, pre-aggregation, and partitioning strategies, the following table provides table design recommendations for typical scenarios to help you choose the right design based on your business needs. Specific partition keys, number of partitions, storage formats, and indexes need to be further refined based on data volume and query patterns. For more information, see Data table design and query optimization in AP scenarios, Table partitioning design in OLAP scenarios, and JSON multi-value indexes and full-text indexes.
| Scenario | Recommended table design combination | Description |
|---|---|---|
| Pure analysis, large wide tables, and multi-column scanning | Columnstore table + Partitioned table (partitioned by time or dimension) + Table group (for multi-table joins) | Wide tables use columnar storage to reduce I/O; partitioning strategies enable pruning and parallelism; table groups ensure that partitioned tables are aligned for Partition Wise Join. |
| Fixed reports/dashboards and repeated aggregation queries | Detail table (partitioned and columnar or hybrid row-column) + Materialized view (pre-aggregated results) | Pre-aggregation: Materialized views store aggregated results for direct query; detail tables can still be used for ad-hoc analysis. |
| HTAP with a focus on TP and some analysis | Rowstore base table + Columnstore index + Partitioned table | Write and point queries use rowstore; analysis uses columnstore indexes, and storage is controllable. For more information, see Rowstore base table + Columnstore index in Data table design and query optimization in AP scenarios. |
| Physical isolation between TP and AP | Rowstore or hybrid row-column table on F/R + Columnstore replica (C replica) | AP traffic uses C replicas for weak reads, while TP uses F/R; columnstore scans can be achieved without building columnstore indexes on F/R for analysis. |
| Ad-hoc analysis with variable query columns | Hybrid row-column table + Partitioned table | The optimizer automatically selects between rowstore and columnstore paths; partitions are pruned and managed for lifecycle. |
| Logs/tracking data and queries/cleansing by time range | Columnstore table or hybrid row-column table + RANGE partitioning (by day or hour) | Partitioning strategy: Time-based partitioning enables pruning and partition-level archiving/deletion; columnstore storage supports columnar scanning and compression. |
These combinations can be used together (e.g., partitioned table + columnstore + materialized view), depending on your business write volume, query patterns, and storage costs.
Create data examples
Create a table that contains partitions, columnstore indexes, and rowstore indexes.
CREATE TABLE salesdata (
sale_id INT,
product_id INT NOT NULL,
saledate DATE NOT NULL,
saledate_int INT, -- define it as a normal column
quantity INT,
price DECIMAL(10, 2),
customer_id INT,
PRIMARY KEY (sale_id, saledate_int) -- a primary key can contain a normal column
)
PARTITION BY RANGE COLUMNS (saledate_int) (
PARTITION p2023_q1 VALUES LESS THAN (202304),
PARTITION p2023_q2 VALUES LESS THAN (202307),
PARTITION p2023_q3 VALUES LESS THAN (202310),
PARTITION p2023_q4 VALUES LESS THAN (202401)
)
WITH COLUMN GROUP (each column);
CREATE INDEX idx_product_id ON salesdata(product_id);
CREATE INDEX idx_customer_id ON salesdata(customer_id);
Partitions: The
salesdatatable uses range partitions.PARTITION BY RANGE COLUMNS (saledate_int): Partitions the table by thesaledate_intcolumn.Defined four partitions:
p2023_q1contains data before202304(the first quarter of 2023).p2023_q2,p2023_q3, andp2023_q4contain data from the second, third, and fourth quarters of 2023, respectively.
Column store: Columnstore storage is specified by using
WITH COLUMN GROUP(each column);. Data in these columns will be stored in columnar format, which is suitable for large-scale data analysis scenarios.Rowstore indexes: The
idx_product_idandidx_customer_idindexes are created on theproduct_idandcustomer_idcolumns, respectively. Rowstore indexes can accelerate queries based on specific columns, especially for frequent small queries.
When you create and use columnstore tables and import a large amount of data, you need to perform a major compaction to improve read performance, and perform statistic collection to optimize execution strategies.
Major compaction: After you batch import data, you are recommended to perform a major compaction. This helps improve read performance because a major compaction organizes fragmented data into contiguous storage, reducing disk I/O during reads. After you import data, trigger a major compaction in the tenant to ensure that all data is merged into the baseline layer. For more information, see MAJOR and MINOR (MySQL mode) and MAJOR and MINOR (Oracle mode).
Statistic collection: After the major compaction is completed, you are recommended to collect statistics. This is crucial for the optimizer to generate efficient query plans and execution strategies. Execute GATHER_SCHEMA_STATS (MySQL mode) or GATHER_SCHEMA_STATS (Oracle mode) to collect statistics for all tables. Monitor the progress by using the GV$OB_OPT_STAT_GATHER_MONITOR (MySQL mode) or GV$OB_OPT_STAT_GATHER_MONITOR (Oracle mode) view.
Keep in mind that as the amount of data in the columnstore table increases, the speed of the major compaction may slow down.
