Can I add or delete columns in a columnstore table?
You can add or delete columns in a columnstore table.
You can increase or decrease the value length of columns of the VARCHAR data type.
Like rowstore tables, columstore tables also support multiple types of offline DDL operations.
For more information about the conversion between rowstore and columnstore tables, see Convert a rowstore table to a columnstore table (MySQL mode) or Convert a rowstore table to a columnstore table (Oracle mode).
What are the characteristics of columnstore table queries?
In a hybrid rowstore-columnstore table, by default, the columnstore mode is used for range scans and the rowstore mode is used for point get queries.
In a pure columnstore table, the columnstore mode is used for any queries.
Do columnstore tables support transactions? Do they have any limit on the transaction size?
Like rowstore tables, columnstore tables also support transactions and do not have any limit on the transaction size. Columnstore tables also support high consistency.
Do log synchronization and backup and restore for columnstore tables have any special characteristics compared to those for rowstore tables?
No. Log synchronization and backup and restore for columnstore tables are consistent with those for rowstore tables. Synchronized logs are stored in the rowstore format.
Can I convert a rowstore table into a columnstore table by using DDL statements?
Yes. You can execute DDL statements to add columns and drop rows to convert a rowstore table into a columnstore table. The syntax is as follows:
create table t1( pk1 int, c2 int, primary key (pk1));
alter table t1 add column group(all columns, each column);
alter table t1 drop column group(all columns, each column);
alter table t1 add column group(each column);
alter table t1 drop column group(each column);
Note
After alter table t1 drop column group(all columns, each column); is executed, all columns will be put in the default group named DEFAUTL COLUMN GROUP for storing data.
Can I store multiple columns as a whole in a columnstore table?
In OceanBase Database V4.3.0, you can store each column separately or store all columns as a whole in the rowstore format. At present, you cannot store specific columns together.
Can I update a columnstore table? What is the data structure in MemTables?
In OceanBase Database, the addition, deletion, and modification of data are performed in the memory. Data is stored in the rowstore format in MemTables. Baseline data is read-only and stored in the columnstore format on the disk. When you read data from a column, the data returned is a combination of the rowstore data in the MemTable and the columnstore data on the disk. This means that OceanBase Database supports strong-consistency columnstore data read without a latency. Data written to MemTables supports minor compactions. Compacted data is still stored in the rowstore format. After a major compaction, rowstore data and baseline columnstore data are integrated to form new baseline columnstore data.
Notice
If you perform a large number of update operations on a columnstore table without performing a major compaction in a timely manner, the query performance will be compromised. Therefore, we recommend that you initiate a major compaction after batch data import to achieve the optimal query performance. A small number of update operations will not affect the query performance.
Can I create an index on a specific column in a columnstore table?
Yes. You can create indexes of the same index structure on columnstore tables and rowstore tables.
You can create an index on one or more columns of a columnstore table to form a covering index to improve the query performance, or to sort specific columns to improve the sorting performance.
What is a columnstore index?
OceanBase Database also supports columnstore indexes. A columnstore index is different from an index on a columnstore table. For a columnstore index, the index table is in the columnstore format.
For example, if you want to calculate the sum of values in the c3 column of the rowstore table t6 while ensuring the optimal performance, you can create a columnstore index on the c3 column.
create table t6(
c1 TINYINT,
c2 SMALLINT,
c3 MEDIUMINT
);
create /*+ parallel(2) */ index idx1 on t6(c3) with column group (each column);
You can also create indexes in other ways. Here are some examples:
Redundant rowstore in an index
create index idx1 on t1(c2) storing(c1) with column group(all columns, each column); alter table t1 add index idx1 (c2) storing(c1) with column group(all columns, each column);Pure columnstore in an index
create index idx1 on t1(c2) storing(c1) with column group(each column); alter table t1 add index idx1 (c2) storing(c1) with column group(each column);
You can use the STORING clause to store data of non-index columns in an index. This can improve the performance of specific queries by avoiding table access and reducing the index sorting cost. The performance can be significantly improved for a query that only needs to access columns stored in an index, without the need to query the original rows of the table.
What is the difference between a columnstore table and a columnstore replica?
A columnstore table refers to a table where both the leader and follower partitions' schemas are in columnstore format, allowing OLAP queries to be strongly consistent. In contrast, a columnstore replica means that while the leader and follower partitions' schemas are in rowstore format, the read-only replica (learner) is in columnstore format, and OLAP queries can only be eventually consistent (weak read).
What is columnstore replica auto-selection?
OceanBase Database V4.6.0 introduces the columnstore replica auto-selection feature. Simply put, this feature allows the system to:
- Automatically identify: The system automatically identifies which queries are suitable for columnstore replicas (AP queries).
- Intelligent routing: Based on query characteristics and cost, the system automatically routes suitable queries to columnstore replicas for execution.
- Unified entry point: TP and AP traffic use the same entry point, eliminating the need for manual switching.
Why is columnstore replica auto-selection needed?
In HTAP (hybrid transactional/analytical processing) scenarios, business typically involves two types of queries:
| Query Type | Features | Suitable Storage |
|---|---|---|
| TP queries (transaction processing) | Short queries, high QPS, latency-sensitive, small data access volume | Rowstore replicas (accelerated by indexes) |
| AP queries (analytical processing) | Large queries, low QPS, latency-insensitive, large data access volume | Columnstore replicas (accelerated by columnar storage and parallelization) |
In versions of OceanBase Database earlier than V4.6.0, to route queries to columnstore replicas, you needed to manually set ob_route_policy = 'COLUMN_STORE_ONLY'. This approach had the following issues:
- High threshold: Users had to determine which queries were AP queries and manually configure them.
- Non-adaptive: The system could not automatically determine based on query characteristics, and all queries would be routed to columnstore replicas after configuration.
- Strong read limitation: Strongly consistent reads were not supported, only weak reads were allowed, limiting the use cases.
How to determine if a query is routed to a columnstore replica?
Check the execution plan: If a table does not have a columnstore replica in the F replica, but the scan operator is
COLUMN TABLE FULL SCAN, it indicates that the query was routed to a columnstore replica.Check the outline: A more accurate method is to check
outline data. If the hintOPT_PARAM('AP_QUERY_ROUTE_POLICY' 'FORCE')appears, it confirms that the query was routed to a columnstore replica. For example:/*+ ... OPT_PARAM('AP_QUERY_ROUTE_POLICY' 'FORCE') ... */
