In OceanBase Database, tables are the fundamental data storage units. Tables contain all data accessible to users. Each table consists of multiple rows, and each row contains multiple columns. The design and usage of each table must be reasonably planned based on business requirements to ensure the system's efficiency and scalability. This section covers table types, update modes, data distribution, indexes, data types, and views. It also provides scenario recommendations and combinations based on the AP table design principles.
Overview of table design principles for AP scenarios
In AP (Analytical Processing) scenarios, table design should follow the following three principles. These principles can be combined with other features such as table types, data distribution, indexes, and views:
- Wide tables: In analytical scenarios, wide tables with many columns are often used. A single query may access only a subset of the columns. Using columnstore tables or columnstore indexes allows data to be stored and scanned by column, reducing I/O and facilitating vectorization and skip indexing. Combined with partitioning and table groups, wide tables can be horizontally distributed and aligned with other tables, facilitating joins and parallel scans. For more information, see the sections on "Table Types" and "Data Distribution" in this topic, and Data table design and query optimization in AP scenarios.
- Pre-aggregation: For fixed-dimensional aggregations and reports, you can precompute and persist the results using materialized views. This allows queries to directly read from the materialized views, reducing the overhead of real-time aggregation. This approach is suitable for reports, dashboards, and fixed analysis scenarios. For more information, see the section on "Views" in this topic, and Materialized views (MySQL mode) and Materialized views (Oracle mode).
- Partitioning strategy: Using partitioned tables, you can horizontally split data based on partitioning keys (such as time, region, or ID). This allows for partition pruning to reduce the amount of data scanned, parallel computation by partition, and lifecycle management (archiving and cleanup) by partition. In AP scenarios, it is recommended to align the partitioning key with commonly used filtering and aggregation dimensions and to use table groups for partition alignment to optimize joins. For more information, see the section on "Data Distribution" in this topic, and Table partitioning design in OLAP scenarios.
Table types
OceanBase Database supports various table types, including partitioned tables, replicated tables, primary key tables, non-primary key tables, and external tables.
- Partitioned tables: OceanBase Database can divide the data of a regular table into different partitions based on specific rules. The data within each partition is physically stored together. This type of table is called a partitioned table. OceanBase Database supports basic partitioning strategies such as range partitioning, list partitioning, and hash partitioning.
- Replicated tables: Replicated tables are a special type of table in OceanBase Database. Data in a replicated table can be read from any "healthy" replica.
- Primary key tables and non-primary key tables: A primary key table is a table that contains a primary key. A non-primary key table is a table that does not specify a primary key.
- External tables: The data of a regular table is stored in the storage space of the database, while the data of an external table is stored in an external storage service.
- Heap tables (heap-organized tables): In heap-organized tables, data is not sorted by the primary key. The primary key serves as a secondary unique index, and the data and index are decoupled. The primary key is used for uniqueness constraints, while queries rely on the main table. When user data is written in chronological order, skip indexing can be more effectively utilized to improve the efficiency of analytical queries. In AP scenarios, new tables are default heap-organized. The tenant-level configuration parameter default_table_organization in the OLAP parameter template is set to
HEAP. IfORGANIZATIONis not explicitly specified, the new table is a heap table, which facilitates large-scale data imports and skip index utilization. - Temporary tables: In AP scenarios, temporary tables are commonly used to store ETL intermediate results, temporary data for direct load, and intermediate results of complex queries. Data is isolated by session or transaction. After a session ends or a transaction is committed or rolled back, the data is automatically cleared, distinguishing it from persistent tables. In MySQL mode, temporary tables are session-level. In Oracle mode, global temporary tables are supported.
In AP scenarios, OceanBase Database supports various table types, including replicated tables, partitioned tables, primary key tables, and non-primary key tables as in TP scenarios. Based on the data storage method, whether data is stored by row or column, new table types are introduced: columnstore tables and hybrid row-column tables.
Columnstore tables: In columnstore tables, data is stored by column rather than by row. This significantly improves the performance of analytical queries, especially in scenarios involving large amounts of data and frequent aggregations. For more information, see Columnstore architecture.
Hybrid row-column tables: In hybrid row-column tables, data is stored both by row and by column. The system automatically determines whether to use row-based or column-based queries based on the query statement, making it suitable for scenarios that require both transactional and analytical operations.
Table update modes
OceanBase Database allows you to specify the write and query modes of data when you create a table. When you create a table, you can use the merge_engine parameter in the CREATE TABLE statement to choose between the delete_insert update mode and the partial_update update mode. These two modes are data update strategies designed for different business scenarios.
delete_insert (full-column update mode)
This mode prioritizes query performance. It uses the "Merge-On-Write" mechanism to convert
UPDATEoperations into full-columnDELETEandINSERTrecords, ensuring that each row contains complete column values. This mode significantly improves the efficiency of complex queries and batch processing (such as analytical tasks), but incremental data requires additional storage space. It is suitable for scenarios with frequent incremental data and the need for rapid analysis.partial_update (partial update mode)
This mode only records the values of modified columns, avoiding redundant storage. During queries, multiple data records need to be merged to obtain the latest values, resulting in relatively lower performance. However, it is more suitable for scenarios with frequent updates but low query requirements (such as OLTP applications) or environments sensitive to storage costs.
| Feature Category | delete_insert Update Mode | partial_update Update Mode |
|---|---|---|
| Storage Method | Each update writes two rows (DELETE and INSERT) to the SSTable, containing all column data. |
Only the values of modified columns are recorded, saving storage space. |
| Query Efficiency |
|
Multiple Memtable/SSTable records need to be merged to obtain the latest values for the primary key, which may affect performance. Suitable for scenarios with frequent updates and high storage costs. |
| Applicable Scenarios | Scenarios with a high proportion of incremental data requiring frequent execution of complex queries or batch processing. | Scenarios with frequent updates but low query requirements. |
For more information, see Create a table in MySQL mode and Create a table in Oracle mode.
Data distribution
Partitioning strategy is a crucial aspect of AP table design. OceanBase Database distributes data across different partitions by creating partitioned tables. This allows data from different partitions to be spread across different machines. During queries, partition pruning reduces the amount of data scanned and leverages multi-machine resources to enhance query performance. By default, data between different tables is randomly distributed without direct relationships. Load balancing ensures that data from a single table is evenly distributed across the entire cluster.
In distributed AP systems, tables often contain large amounts of data. When data from different tables is randomly distributed, the overhead of data transfer during table joins can be significant. Using table groups (Table Group) aligns partitioned tables with the same partitioning strategy according to specific rules. This allows related data to be co-located on the same machine, enabling the use of Partition Wise Join during table joins, which effectively reduces data transfer overhead and improves performance.
OceanBase Database supports both primary and secondary partitioned tables and provides three types of partitioning: RANGE, LIST, and HASH.
OceanBase Database provides three types of table groups:
NONE: All tables in the table group are co-located on the same machine.
PARTITION: Data from each table in the table group is distributed across primary partitions. For secondary partitioned tables, all secondary partitions under each primary partition are co-located.
ADAPTIVE: Data from each table in the table group is distributed adaptively. For primary partitioned tables, data is distributed across primary partitions. For secondary partitioned tables, data is distributed across secondary partitions under each primary partition.
For more information, see Data distribution, Table groups in MySQL mode, and Table groups in Oracle mode.
Index types
Indexes are key components for improving query performance. OceanBase Database supports multiple index types in the AP scenario.
- Index distribution: A local index is one whose partitions are bound to those of the base table. A global index is one whose partitions are independent of those of the base table. Both types are related to how indexes are distributed across partitions.
- Index storage format: A rowstore index and a columnstore index refer to indexes that are organized by rows or columns, respectively. Columnstore indexes (such as
WITH COLUMN GROUP (each column)) are suitable for analytical scans and aggregations, while rowstore indexes are suitable for point queries and random access. A columnstore index is an index that uses the columnstore storage format (such as a B-tree index specified as columnstore). Columnstore indexes organize data by columns (WITH COLUMN GROUP (each column)) and store data by columns rather than rows. They are suitable for aggregations, wide table scans, and analytical queries in HTAP scenarios. They can significantly improve the performance of analytical queries, especially when processing large amounts of data, by reducing the time required for data retrieval and analysis. For more information, see Columnstore and Create an index (MySQL mode)—Columnstore index.
Based on index structure, indexes can be categorized into the following types:
- B-tree index (normal index): The default index type, which includes unique indexes, non-unique indexes, and function indexes. B-tree indexes are suitable for equality queries, range queries, sorting, and point queries. B-tree indexes can be local indexes or global indexes, and they can use rowstore or columnstore storage formats. A columnstore index is a B-tree index that uses the columnstore storage format. Columnstore indexes are suitable for analytical scans in HTAP scenarios. For more information, see Index overview, Create an index (MySQL mode), and Columnstore.
- Full-text index: A full-text index is built on
CHAR,VARCHAR, andTEXTcolumns based on inverted indexes and tokenization. It is suitable for log analysis, document/knowledge base retrieval, and relevance ranking. Only local indexes are supported. For more information, see Full-text indexes. - JSON multivalued index: OceanBase Database in MySQL mode supports multivalued indexes, which can be created on arrays or collections to improve query efficiency for searching JSON array elements. Multivalued indexes are suitable for tagging, classification, many-to-many associations, and predicates such as
MEMBER OF(),JSON_CONTAINS(), andJSON_OVERLAPS(). Only local indexes are supported. For more information, see Multivalued indexes and Create an index (MySQL mode). - Spatial index: A spatial index is an independent index structure for geographic space data and GIS scenarios. It enables fast retrieval of data within a specified coordinate range. Only local indexes are supported.
- Vector index: A vector index is created on a Vector column and includes dense vector indexes and sparse vector indexes. Examples of dense vector indexes include HNSW and IVF. For more information about sparse vector indexes, see Vector indexes. Vector indexes support L2, inner product, and cosine distance/similarity calculations and are suitable for vector similarity retrieval queries. They are independent index structures. Only local indexes are supported.
For combinations such as columnstore indexes and rowstore base tables + columnstore indexes in AP scenarios, see Optimize queries in HTAP scenarios in Data table design and query optimization in AP scenarios. For special index types such as full-text indexes, JSON multivalued indexes, spatial indexes, and vector indexes, see Best practices for choosing special indexes. For more information about syntax and implementation details, see Index overview (MySQL mode) and Index overview (Oracle mode).
Data types
Before creating and using tables, database administrators need to plan the table structure and data types based on business requirements. To ensure efficient data storage and query optimization, administrators should follow these principles:
- Normalize the table structure: Design the table structure reasonably to minimize data redundancy and improve query efficiency.
- Select appropriate SQL data types: Choose the most suitable SQL data type for each column to reduce storage space and improve query speed.
Common SQL data types include:
- Basic data types: such as
INT,VARCHAR, andDATE. - Complex data types: such as
JSON,ARRAY, andBITMAP, which are suitable for storing more complex data structures.
For detailed descriptions of SQL data types, see:
Views
OceanBase Database supports standard views and materialized views.
- Standard views: Standard views, also known as non-materialized views, are the most common type of view. They store only the SQL query that defines the view, not the query results.
- Materialized views: Unlike standard views, materialized views store the results of the query in physical storage. OceanBase Database supports asynchronous materialized views, where the materialized view is not immediately updated when the base table data changes. This ensures the performance of DML operations on the base table. Materialized views are a common implementation of pre-aggregation, suitable for scenarios with fixed analysis requirements, reports, and dashboards. For more information, see Materialized views in MySQL mode and Materialized views in Oracle mode.
Recommended scenarios and combinations
Combining wide tables, pre-aggregation, and partitioning strategies, the following table provides recommended table design combinations for typical scenarios, helping you choose the right design based on your business needs. Specific partition keys, number of partitions, storage formats, and indexes should be further refined based on data volume and query patterns. For more information, see Data table design and query optimization practices in AP scenarios, Table partitioning practices in OLAP scenarios, and JSON multi-value indexes and full-text indexes.
| Typical scenario | Recommended table design combination | Description |
|---|---|---|
| Pure analysis, large wide tables, and multi-column scans | Columnar table + Partitioned table (by time/dimension) + Table group (for multi-table joins) | Wide tables use columnar storage to reduce I/O. Partitioning strategies enable pruning and parallel processing. Table groups ensure aligned partitions for associated tables, facilitating Partition Wise Join. |
| Fixed reports/dashboards and repeated aggregation queries | Detail table (partitioned and columnar or hybrid) + Materialized view (pre-aggregated results) | Pre-aggregation: Materialized views store aggregated results for direct query access; detail tables can still support ad-hoc analysis. |
| HTAP, primarily TP with some analysis | Rowstore base table + Columnar index + Partitioned table | Writes and point queries use rowstore; analysis uses columnar indexes, with controlled storage. For more information, see Rowstore base table + Columnar index in Data table design and query optimization practices in AP scenarios. |
| Physical isolation required for TP and AP | Rowstore or hybrid table (F/R) + Columnar replica (C replica) | AP traffic uses C replicas for weak reads; TP uses F/R. No need to create columnar indexes on F/R for analysis to achieve columnar scanning. |
| Ad-hoc analysis with variable query columns | Hybrid table + Partitioned table | Optimizer automatically selects rowstore or columnar paths; partitions enable pruning and lifecycle management. |
| Logs/tracking data, queries and cleanup by time range | Columnar table or hybrid table + RANGE partitioning (by day/hour) | Partitioning strategy: Time-based partitioning enables pruning and partition-level archiving/deletion; columnar storage facilitates columnar scanning and compression. |
These combinations can be used together (e.g., partitioned table + columnar storage + materialized view), depending on business write volume, query patterns, and storage costs.
Example of creating a table with partitions, columnar storage, and indexes
The following example shows how to create a table that includes partitions, columnar storage, and indexes:
CREATE TABLE salesdata (
sale_id INT,
product_id INT NOT NULL,
saledate DATE NOT NULL,
saledate_int INT, -- Defined as a regular column
quantity INT,
price DECIMAL(10, 2),
customer_id INT,
PRIMARY KEY (sale_id, saledate_int) -- The primary key can include regular columns
)
PARTITION BY RANGE COLUMNS (saledate_int) (
PARTITION p2023_q1 VALUES LESS THAN (202304),
PARTITION p2023_q2 VALUES LESS THAN (202307),
PARTITION p2023_q3 VALUES LESS THAN (202310),
PARTITION p2023_q4 VALUES LESS THAN (202401)
)
WITH COLUMN GROUP(each column);
CREATE INDEX idx_product_id ON salesdata(product_id);
CREATE INDEX idx_customer_id ON salesdata(customer_id);
Partitioning: The
salesdatatable uses range partitioning.PARTITION BY RANGE COLUMNS (saledate_int): The table is partitioned based on thesaledate_intcolumn.Four partitions are defined:
p2023_q1contains all data prior to 202304 (i.e., data from Q1 2023).p2023_q2,p2023_q3, andp2023_q4contain data from Q2, Q3, and Q4 of 2023 respectively.
Columnar Storage: Columnar storage is specified through the
WITH COLUMN GROUP(each column);directive. Data in these columns is stored in a columnar format, which is ideal for large-scale data analysis.Row-based Indexes: Indexes named
idx_product_idandidx_customer_idare created on theproduct_idandcustomer_idcolumns respectively. Row-based indexes can speed up queries that target specific columns, especially for frequent small queries.
When creating and using a columnar storage table, if you import a large amount of data, it is recommended to perform a major compaction to improve read performance and to perform a statistics collection to adjust the execution strategy.
Major Compaction: After a large amount of data is imported, it is recommended to perform a major compaction. This helps improve read performance by reorganizing fragmented data into more continuous physical storage, thereby reducing disk I/O during reads. After importing data, trigger a major compaction within the tenant to ensure all data is compacted into the baseline layer. For more information, see MAJOR and MINOR (MySQL mode) and MAJOR and MINOR (Oracle mode).
Statistics Collection: After the major compaction, it is recommended to collect statistics. This is crucial for the optimizer to generate effective query plans and execution strategies. For more information, see GATHER_SCHEMA_STATS (MySQL mode) and GATHER_SCHEMA_STATS (Oracle mode). You can also monitor the progress of statistics collection by querying the views GV$OB_OPT_STAT_GATHER_MONITOR (MySQL mode) and GV$OB_OPT_STAT_GATHER_MONITOR (Oracle mode).
As the amount of data in columnar storage tables increases, the speed of major compactions may slow down.
