This topic highlights only the core changes related to analytical processing (AP) features. For a complete list of all feature enhancements and updates in OceanBase Database, see the Release notes.
V4.3.5 BP4
Version information
- Release date: September 10, 2025
- Version: V4.3.5 BP4
New features and enhancements
Materialized view improvements
V4.3.5 BP4 continues to refine materialized view functionality. In terms of functionality, it now supports incremental refreshes where dimension tables can be excluded from the refresh process. When creating a materialized view, you can use the new AS OF PROCTIME() syntax to specify tables that should not be refreshed. During incremental refreshes, these tables' data will remain unchanged. For incremental refreshes of single-table aggregates, the MIN() and MAX() functions are now supported. In terms of usability, MLOG tables are now managed automatically: when you create an incremental refresh materialized view, the required MLOG tables for base tables are automatically created or replaced, and any redundant MLOG tables are regularly cleaned up in the background. The content of materialized views and error messages during creation have also been optimized.
The direct load feature is now disabled by default in the AP parameter template.
V4.3.5 BP3
Version information
- Release date: July 21, 2025
- Version: V4.3.5 BP3
New features and enhancements
External data sources and integration enhancements
ODPS external table support for Storage API
Previously, ODPS external tables using the Tunnel API required opening a separate session for scanning each partition during execution. Since opening a session can take several seconds, even queries that should be very fast could spend most of their time in session preparation. ODPS also offers an open Storage API that allows direct access to its underlying storage, enabling more optimization strategies and better read performance. This release adds support for the ODPS Storage API, further improving the performance of external table access.
Support for Azure Blob storage
You can now access Azure Blob object storage using the Azure Blob protocol.
Comprehensive materialized view enhancements
Materialized view improvements
This version continues to strengthen materialized view functionality. Major updates include support for renaming materialized views, incremental refresh for outer joins, UNION ALL, and non-aggregate single tables, materialized view logs that support adding columns, and nested materialized views with cascading refresh capabilities. Several stability issues have also been addressed, such as resolving cases where dropping a materialized view could get stuck.
Storage and partition management optimizations
Partition exchange support for partitioned and subpartitioned tables
The partition exchange feature now allows exchanges between partitions of dubpartitioned tables and partitioned tables.
Optimizer and query performance improvements
Enhanced statistics collection
In earlier versions, statistics collection required gathering aggregate data (MIN, MAX, NULL COUNT, NDV, etc.) for each column at both the partition and table levels, with data distribution estimated via sampling, which could lead to significant errors. The new version leverages skip index capabilities at the storage layer to collect more accurate block-level aggregate data for all columns in columnar storage, helping the optimizer generate and refine plans more precisely. Progressive partition-wise statistics collection has also been introduced, solving timeout issues caused by lengthy statistics gathering on large partitioned tables, and improving both system stability and data freshness.
Accelerated data import
Direct load performance optimizations
This release includes performance improvements for direct load operations, such as optimizations to the sorting algorithm, vectorized processing path, CSV parsing, column_conv expressions, and CS encoding. In scenarios involving columnar non-partitioned heap tables, these enhancements can deliver up to a 10% increase in performance.
V4.3.5 BP2
Version information
- Release date: May 15, 2025
- Version: V4.3.5 BP2
Product offering
- Introduction of shared-storage AP deployment
Starting with V4.3.5 BP2, OceanBase Database now supports deploying the database in a shared-storage architecture for AP workloads. This new shared-storage AP model brings key AP features—such as columnstore tables, incremental direct load, materialized views, full-text indexes, and multi-valued indexes—that were previously only available in shared-nothing mode. I/O read operations have also been specifically optimized for shared-storage environments. This deployment is recommended for AP workloads with distinct hot and cold data patterns, especially when reducing storage costs is a priority.
Feature enhancements
Data recovery improvements
Support for restoring columnstore tables
Previously, table-level restore only restored tables in row-store format. This release improves table restore in shared-nothing mode, allowing restoration to columnstore tables or hybrid row-column tables.
Performance enhancements
Direct load performance optimization for heap tables
The process for validating uniqueness when creating unique indexes during direct load into heap tables has been streamlined. As a result, full direct loads into non-partitioned heap tables now achieve performance comparable to index-organized primary key tables.
Hard parse performance improvements
In some AP scenarios, or with some mid-sized and small accounts, disabling the plan cache and relying on hard parsing helps avoid suboptimal execution plans. However, turning off the plan cache increases the demands on hard parse performance. This release boosts optimizer efficiency by optimizing hot functions in the resolver phase, switching unnecessary global cost validations to local ones, reducing memory usage for expressions, and eliminating redundant type inference operations.
Materialized view enhancements
Full refresh materialized views can use external tables as base tables
Previously, materialized views could only be created from user tables or other materialized views. This release expands support to allow external tables as base tables for full refresh materialized views, broadening use cases.
Enhanced diagnostics for materialized views
New system views make it easier to monitor and troubleshoot materialized view operations:
CDB/DBA_MVIEW_RUNNING_JOBSdisplays active materialized view tasks such as refreshes and MLOG cleanups.DBA_MVIEW_DEPSshows dependencies for materialized views. TheCDB/DBA/ALL/USER_MVIEWSviews now record refresh checkpoints and latency.CDB/DBA/ALL/USER_MVIEW_LOGSadds columns for parallelism and duration in MLOG cleanup operations.Resource isolation for materialized view tasks
Refresh and purge operations for materialized views can consume significant system resources, potentially affecting foreground performance. With this release, the resource manager enables resource isolation for incremental refresh and MLOG purge tasks, allowing users to set resource usage limits for these operations.
Partition management
Dynamic partition management
Many businesses use date or time as a range partition key, segmenting data by time periods. DBAs often need to pre-create future partitions to ensure uninterrupted data writes. If future partitions are not created in time, data writes can fail and disrupt operations. This release introduces dynamic partition management, which automatically pre-creates future partitions at the kernel level based on table settings, reducing DBA workload. Expired partitions can also be automatically deleted, helping save storage space.
Indexes and query optimization
NGRAM2 tokenizer for full-text indexes
A new built-in
NGRAM2tokenizer is available for full-text indexing, supportingNGRAMtokenization within a configurable range (min_ngram_sizetomax_ngram_size). This tokenizer is ideal for scenarios where you need to search tokens of varying lengths and are less sensitive to performance and storage overhead. For fixed-length tokens, the originalNGRAMtokenizer remains available.
Data types and storage optimization
Map data type
The map data type allows storage of unordered key-value pairs, such as {a:1, b:2, c:3}. Typical use cases include configuration options, user attributes, and product details. In V4.3.5 BP2, OceanBase Database introduces map data types built on the array framework, supporting map constructors and functions like map_keys, map_values, as well as operators like = and !=.
Semi-structured JSON storage in MySQL-compatible mode
Semi-structured data is usually stored as binary strings, which limits encoding-based compression and leads to redundant metadata storage. JSON, as a semi-structured format, can often be logically split into multiple columns for basic types, with unsplittable parts stored in a binary column. This approach—"structured storage for semi-structured columns"—allows structured columns to benefit from encoding and higher compression, while unsplittable data takes up less space thanks to extracted metadata. In V4.3.5 BP2, OceanBase Database supports semi-structured storage for JSON types with corresponding encoding, reducing storage usage and improving query performance on JSON subcolumns. However, splitting JSON into multiple subcolumns introduces merge overhead for scenarios that frequently read raw JSON, which may impact performance. Enable this feature based on your workload needs.
Query execution
Adaptive plan cache
The plan cache is essential for transactional processing (TP) workloads and is enabled by default. In some analytical processing (AP) scenarios, however, execution time far exceeds plan generation time, and generating a new plan for each query can improve performance. In hybrid HTAP workloads, it is important to let AP-oriented SQL use hard parsing, while TP-oriented SQL continues to benefit from plan reuse. This release introduces adaptive plan cache, which decides whether to use the plan cache for a given SQL statement based on query duration and execution profile.
External data source integration
ODPS (MaxCompute) catalog support in MySQL-compatible mode
To simplify querying data in external sources, this release introduces an external catalog framework, with initial support for ODPS catalogs. OceanBase internal objects belong to the internal catalog, while users can create external catalogs to connect to external sources and access their metadata. You can directly query tables in external catalogs (such as
select col1 from catalog1.database1.table1;), and, with the right permissions, perform cross-catalog joins. There is no need to import data or create external tables one at a time, making access to external data much easier.
Data import
Load Data supports error logging for CSV external tables
Previously, importing data via Load Data would fail immediately if there were issues like type mismatches or precision errors. This release adds the
LOG ERRORSoption for fault-tolerant imports. In MySQL-compatible mode, failed rows are logged and can be reviewed using show warnings for error diagnosis.
V4.3.5 BP1
Version information
- Release date: March 18, 2025
- Version: V4.3.5 BP1
New features and enhancements
Data import and export
Incremental direct load now supports heap tables with unique indexes
Direct load offers both full and incremental options. When a table already contains data, incremental direct load is more efficient than a full import. Previously, incremental direct load could not be used with heap tables that have unique indexes. This release adds support for importing data incrementally into heap tables with a single local unique index.
Direct load now supports HASH partitions and concurrent partition imports
Starting with this version, direct load supports HASH partition-level imports. Incremental direct load also allows for concurrent imports across multiple partitions. However, if multiple import jobs overlap on the same partition, concurrent imports for those partitions are not supported.
Direct load performance improvements
With enhancements in vectorized processing, streamlined handling of temporary files, and optimizations at each stage, importing data into clickbench tables is now up to 17% faster compared to V4.3.5.
External data source integration
External tables can now read HDFS files
HDFS is a key storage medium in data lake architectures and enables data sharing across engines. OceanBase Database now supports reading files directly from HDFS through external tables, including clusters with Kerberos authentication via configuration.
External tables can read and write to MaxCompute (ODPS) data sources
You can now use external tables or URL external tables to read data from MaxCompute and also write OceanBase data back to MaxCompute.
URL external tables
Previously, accessing external data in OceanBase Database required you to first create an external table before running SQL queries. This release introduces URL external tables, allowing you to query external CSV, Parquet, ORC, and MaxCompute data directly with SELECT statements—no need to create an external table first. The LOAD DATA syntax also now supports direct imports from URL external tables, making data import operations simpler and more efficient.
Index optimization
Full-text index enhancements
OceanBase Database has supported full-text search in MySQL-compatible mode since V4.3.1. As the version evolved, additional features and enhancements were gradually introduced. V4.3.5 BP1 introduces an embedded Chinese
IKtokenizer, supportingsmartandmax_wordtokenization modes. It also supports adding new tokenizers via plugins. The new version introduces thePARSER_PROPERTIESattribute for full-text indexes, allowing table-level configuration. In addition to the existing natural language mode, the new version supports Boolean mode, which can be enabled using theIN BOOLEAN MODEkeyword inMATCH AGAINSTqueries. Boolean operators such as+,-,(), and no operator, as well as nested operations, are supported. The new version also supports full-text indexes in index merge operations, improving query performance through index union merge. It optimizes the update and deletion scenarios for full-text indexes on the main table, enhancing update and deletion performance.
Materialized view enhancements
Materialized view improvements
The efficiency of materialized view refreshes depends heavily on parallelism. This release introduces robust parallelism controls: you can set refresh parallelism with the
mview_refresh_dopsystem variable, theDBMS_MVIEW.refreshprocedure, or specifyPARALLELwhen creating the view. Incremental materialized views now support theLOBdata type, and you can directly modify views or their properties usingALTER MATERIALIZED VIEWorALTER MATERIALIZED VIEW LOG. You can adjust parallelism, background refresh intervals, MLOG table parallelism, MLOG cleanup schedules, and LOB inline storage thresholds. Previously, adding columns to a base table with an MLOG required dropping the MLOG first. Now, you can add columns directly to MLOG base tables, whether at the end or in the middle. Additional improvements include faster MLOG cleanup and clearer error reporting for incremental refreshes.
Table structure and storage optimization
Heap table organization model
OceanBase Database's traditional clustered index table model is well-suited for OLTP workloads, as it stores primary keys and data together, enabling fast queries and efficient uniqueness checks. However, for OLAP workloads that require efficient data imports and complex analysis, this model presents some limitations—such as the need to sort all data during import and merge primary key rows during queries, which can impact performance. This release introduces a heap table organization model, where the primary key enforces uniqueness but queries are executed against the main table. When data is time-ordered, skip indexes can be leveraged for faster query performance. Separating the primary key from the main data also means imports no longer require sorting the main table, resulting in improved import efficiency.
Data type and storage optimization
New string data type in MySQL-compatible mode
Some analytical workloads need flexible column lengths and often use variable-length strings as primary or index keys. In OceanBase Database, char/varchar types require a defined length, while mediumtext/text (LOB types) support variable lengths but cannot be used as keys. To address this, a new string data type is introduced—no length required, with a default maximum of 16 MB. If the column length is less than 16 KB and within the
lob_inrow_threshold, it can be used as a primary or index key.Expanded ARRAY type functions
OceanBase Database supports the ARRAY data type and some common functions and operators since V4.3.3. This release adds a wide range of new functions, including:
array_prepend,array_concat,array_compact,array_filter,array_sort,array_sortby,array_length,array_range,array_sum,array_first,array_difference,array_min,array_max,array_avg,array_position,array_slice, andreverse.
Parallel query optimization
Decoupling PX compute nodes from data nodes
Previously, compute and storage resources were closely coupled. The PX scheduling mechanism assigned compute tasks for non-leaf DFOs exclusively to machines that stored the relevant data, which sometimes restricted efficient use of available resources. This release introduces the ability to decouple PX compute nodes from data nodes. You can configure or use hints (
PX_NODE_POLICY) to specify candidate resource pools for non-leaf DFOs, and use hints (PX_NODE_ADDRS,PX_NODE_COUNT) to designate particular machines or the number of machines for task allocation.
Query optimization and pushdown
Query pushdown for rowstore tables with added columns
During rowstore SCAN operations, if there are no overlapping rowids between the major sstable and other sources, the scan is pushed down to the storage layer. After merging and adding new columns, pushdown should still be possible, but previously failed due to parsing errors, causing a fallback to less efficient row-by-row reads. This release optimizes this scenario, enabling query pushdown to the storage layer and improving performance.
Columnar storage optimization
Columnstore replica enhancements
The columnstore replica feature introduced in V4.3.3 had limitations, such as supporting only a single C replica and not allowing full direct loads into C replicas, which affected usability. This release removes those restrictions, allowing multiple C replicas per cluster and direct full direct loads into C replicas. A new system view,
CDB/DBA_OB_CS_REPLICA_STATS, displays progress for partition-level and log stream-level columnstore replica conversions, making it easier to monitor replica status.
Behavior changes
AP parameter template disables nested loop join (NLJ)
The optimizer's nested loop join plan is highly sensitive to the row count of the driving table. If the estimate is too low, actual execution performance may suffer. For OLAP scenarios with large data volumes, nested loop joins offer limited benefits. As a result, the AP scenario now defaults to generating hash join plans, only using nested loop joins for non-equi-join cases.
V4.3.5
Version information
- Release date: December 31, 2024
- Version number: V4.3.5
New features and enhancements
Materialized view improvements
Nested materialized views
Previously, materialized views could only be created on regular user tables. In data warehousing scenarios, materialized views are often used for data transformation, and lightweight real-time data warehouses need more flexibility. With V4.3.5, you can now create materialized views based on existing materialized views—enabling nested materialized views. These support both full and incremental refresh, just like non-nested views, and you can also create materialized view logs for them. The freshness of a nested materialized view depends on its underlying base views; to keep upper-level views up to date, you’ll need to refresh the lower-level materialized views first.
For more information, see the section on creating nested materialized views in Create a materialized view (MySQL-compatible mode) or Create a materialized view (Oracle-compatible mode).
Data import and export
Direct load for specific partitions
Before V4.3.5, OBServer only supported direct load for entire tables, which sped up imports but made it difficult to import data into just certain partitions. Users could work around this by first importing partition data into a non-partitioned table, then swapping it with the target partition—a process that was complicated and inconvenient. V4.3.5 streamlines partition-level imports by allowing you to specify partitions directly in the
LOAD DATAandINSERT INTO SELECTstatements. Note that specifying the last-level partition as hash or key type is not supported.For more information, see Full direct load or Incremental direct load.
Enhanced
select into outfilefeatureV4.3.5 expands the capabilities of
select intoexports. You can now export data in Parquet and ORC formats, and specify compression algorithms when exporting CSV files. The syntax has been enhanced to offer more flexible export options:- Use the format syntax to set export options.
- Specify compression algorithms for CSV exports.
- Export data in Parquet format.
- Export data in ORC format.
Index enhancements
Full-text index improvements
Building on the improvements in V4.3.4, V4.3.5 further enhances full-text indexing with the following features:
- Create full-text indexes using
CREATE FULLTEXT INDEXorALTER TABLE ADD FULLTEXT INDEX - Cost-based plan selection for full-text indexes: For SQL statements with multiple
MATCH AGAINSTfilters, the optimizer selects the most efficient index scan based on cost - Functional lookup support:
- Support for multiple
MATCH AGAINSTexpressions in a single query - Ability to use full-text indexes alongside other indexes
- Use
MATCH AGAINSTexpressions without filter semantics as projection columns - Use filter semantics in
MATCH AGAINSTwith operators like <= and < - Combine filter semantics in
MATCH AGAINSTwith other filters using AND/OR logic
- Support for multiple
For more information, see MATCH AGAINST.
- Create full-text indexes using
External data source integration
External table enhancements
In V4.3.5, external tables now support reading ORC format files.
For more information, see Create an external table (MySQL-compatible mode) or Create an external table (Oracle-compatible mode).
Columnar storage optimization
- Read-only columnstore replicas
The read-only columnstore replica feature introduced in V4.3.3 was experimental. After further development in V4.3.4 and V4.3.5, it now meets general availability (GA) standards.
Asynchronous online DDL for row-to-column store conversion
V4.3.0 introduced three storage formats: rowstore, pure columnstore, and hybrid row-column store. Previously, converting a rowstore table to a hybrid- or columnstore required offline DDL, which could disrupt user operations. V4.3.5 now supports asynchronous online conversion from rowstore to columnstore or hybridstore. By specifying the delayed keyword during DDL, schema changes take effect immediately without blocking data writes, enabling true online conversion. The baseline columnstore data reorganization then happens asynchronously during baseline data merge. Both row-to-column and row-to-hybrid store conversions now support online DDL.
Performance optimization
Vectorized write path for direct load
Prior to V4.3.5, direct load processed writes row by row, resulting in significant function call overhead. Writing to sstable was handled in the background, but some operations were not fully optimized. V4.3.5 introduces vectorized processing to the direct load path and optimizes columnstore encoding, roughly doubling the performance of direct load for columnstore tables without primary keys.
Data type and storage optimization
JSON multi-valued indexes support complex DML
V4.3.5 improves support for complex DML operations with JSON multi-valued indexes. Now,
updateanddeletestatements can use multi-valued predicates, and complex DML operations with multi-valued index back-table access are supported.Expanded array expressions
OceanBase Database introduced the ARRAY data type in V4.3.3. V4.3.5 further enriches ARRAY functionality, introducing a broader range of array expressions to support diverse business scenarios. New expressions include:
array_append,array_distinct,arrayMap,array_remove,cardinality,element_at,array_contains_all,array_overlaps,array_to_string,array_agg,unnest, andrb_build.
V4.3.4
Version information
- Release date: October 31, 2024
- Version number: V4.3.4
New features and enhancements
Storage architecture optimization
Shared storage (experimental feature)
To deliver more cost-effective database services in public cloud environments, OceanBase Database V4.3.4 introduces a shared-storage cluster architecture built on general-purpose object storage. Tenant replicas now share baseline data and log data on standard object storage. With automatic hot and cold data separation, hot data is stored on cloud disks or local SSDs, while cold data remains in object storage—helping reduce overall storage costs. Since each tenant replica only caches hot data and logs locally, compute nodes can be quickly scaled up or down. This architecture is ideal for transactional workloads with clear separation between hot and cold data, and where latency isn’t a major concern, such as historical databases, backup databases, as well as key-value and time-series scenarios.
This feature is experimental in V4.3.4 and will be further refined in future releases before becoming a production feature.
Index enhancements
Full-text index improvements
Since V4.3.1, OceanBase Database has supported full-text indexes, which use keyword indexing to speed up text searches. However, there were limitations when using full-text indexes with complex DML operations, which could be inconvenient for some business scenarios. V4.3.4 addresses this by adding support for complex DML on tables with full-text indexes, including INSERT INTO ON DUPLICATE KEY, REPLACE INTO, multi-table updates and deletes, and updatable views. The new version also improves full-text search performance with vectorized execution and an enhanced TAAT strategy, removes the token count limit for MATCH AGAINST predicates, and allows full-text indexes on partitioned tables without primary keys. Additionally, you can now use the tokenize function to view tokenization results, making it easier to debug tokenization systems.
Multi-valued index improvements
Previously, multi-valued indexes only supported pre-creation and had restrictions on DML and DDL operations. V4.3.4 lifts some of these restrictions, making multi-valued indexes easier to use. The main enhancements include support for complex DML operations, such as
INSERT INTO ON DUPLICATE KEY,REPLACE INTO, multi-table updates and deletes, and updatable views.
Data import optimization
Incremental direct load support for non-unique local indexes
OceanBase Database supports both full and incremental direct load. When a table already contains data, incremental direct load is more efficient than full direct load. Previously, incremental direct load could not be used with tables that have indexes. V4.3.4 adds support for incremental direct load into tables with non-unique local indexes.
Direct load performance improvements
For tables with primary keys, direct load previously required sorting the data by writing it to temporary files, then loading it into memory for sorting and merging before writing to sstable. This split the process into two stages—writing and sorting/merging. If all data was not written, sorting could not begin, and write I/O bottlenecks could slow down imports. In V4.3.4, the process is optimized so data is written directly into memory for sorting and merging, eliminating the need for temporary files. This pipelined approach allows data import and sorting to happen concurrently, improving overall efficiency. For tables with primary keys and 1 TB of data, overall import performance improves by 35%. The more memory and CPU resources available, the greater the performance boost—making this especially beneficial for large tenants, with no noticeable performance drop for small tenants.
Data types and storage optimization
Bitmap enhancements
Bitmaps are data structures that efficiently track the presence of elements in a set. OceanBase Database V4.3.2 and V4.3.3 added partial support for roaring bitmap features. V4.3.4 builds on this by introducing the
rb_iterateandrb_selectexpressions.rb_iterateexpands a roaring bitmap into multiple rows based on the number of elements, whilerb_selectfilters elements from a roaring bitmap according to specified conditions and returns a new roaring bitmap.
V4.3.3
Version information
- Release date: September 30, 2024
- Version number: V4.3.3
New features and enhancements
Columnar storage optimization
Read-only columnstore replicas (experimental feature)
Starting with OceanBase Database V4.3.0, columnar storage has been supported, allowing users to create columnstore tables, rowstore tables, or hybrid row-column tables based on their business needs. Regardless of table type, replica modes remain consistent across all tenant zones. For example, if tenant X has a unit distribution of 1:1:1 and T1 is a hybrid row-column table, then T1 will have both a rowstore replica and a columnstore replica in each of the three zones. To better support HTAP workloads and ensure strong physical isolation between TP and AP resources, V4.3.3 introduces a new deployment model that lets you add a dedicated zone for read-only columnstore replicas (known as "C replicas"). In this zone, all user tables are stored in columnar format, and AP workloads can use a separate ODP instance and set the session variable
ob_route_policytoCOLUMN_STORE_ONLYto route queries to columnstore replicas for analytical queries—without impacting TP workloads. Additionally, a 3+1 zone deployment saves on storage costs compared to hybrid row-column storage. This feature requires ODP V4.3.2 or above.This feature is experimental in V4.3.3 and will be further improved for production use in future releases.
Materialized view enhancements
Materialized view improvements
To reduce the need for manual query rewriting, OceanBase Database V4.3.1 introduced automatic materialized view query rewriting. When the system variable
QUERY_REWRITE_ENABLEDis set toTrue, you can useENABLE QUERY REWRITEwhen creating a materialized view, allowing the system to automatically rewrite queries on base tables to use materialized views instead. V4.3.1 supported rewriting for non-aggregate materialized views when theFROMclause matched exactly and theWHEREclause matched partially. V4.3.3 expands support to include compatible joins in theFROMclause, rewriting queries that reference tables not present in the materialized view, aggregate materialized view rewriting, and aggregate roll-up rewriting.The new version also expands incremental refresh and real-time materialized view support for more SQL types. Earlier versions already supported incremental refresh and real-time queries for single-table aggregates and multi-table joins; V4.3.3 adds support for join-aggregate scenarios.
Previously, only rowstore materialized views were supported. Starting with V4.3.3, columnstore materialized views are also available, providing better query performance in complex analytical scenarios that partially reference materialized views.
Data import and export optimization
External table import performance improvement
V4.3.3 improves the performance of reading external table data during direct load, with about a 15% speed increase compared to the previous version.
Enhanced INSERT OVERWRITE capabilities
Since OceanBase Database V4.3.2, table-level overwrite writes (INSERT OVERWRITE) have been supported, enabling atomic clearing of old data and writing of new data—ideal for periodic data refresh, transformation, and cleaning in AP scenarios. However, V4.3.2 only supported overwriting entire tables, without the ability to overwrite specific partitions or selected columns. V4.3.3 enhances this feature by allowing you to specify target partitions or subpartitions and select specific columns for overwrite in INSERT OVERWRITE statements. This makes data replacement much more flexible and suitable for a wider range of business scenarios.
Load Data and external tables support compressed files
Load Data is a common way to import data, supporting server-side, direct load, and client-side scenarios—all of which can import formatted text files into the database. Previous versions only supported plain text files, so compressed files like GZIP had to be decompressed before import, which was inconvenient. The new version supports importing compressed files directly, including GZIP, DEFLATE, and ZSTD formats, allowing loading, decompression, and writing to happen simultaneously. Additionally, external CSV tables now support compressed files, so you can query data from compressed files directly via external tables.
Data type and storage optimization
ARRAY data type
ARRAY is a widely used complex data type in AP scenarios, allowing you to store multiple elements of the same type. It is especially useful for managing or querying multi-valued attributes that are not easily represented in relational models. OceanBase Database V4.3.3 introduces ARRAY type support in MySQL-compatible mode, letting you define columns as numeric or character arrays, including nested arrays. You can use constructor expressions for querying or writing array objects, use the array_contains expression and the ANY operator to check if an element exists in an array, and perform calculations or comparisons with operators such as +, -, =, and !=.
RoaringBitmap performance optimization
V4.3.2 introduced the RoaringBitmap data type and related expressions to support multidimensional analysis for user profiling, personalized recommendations, and targeted marketing, but performance in some scenarios was suboptimal. The new version focuses on optimizing RoaringBitmap calculations, improving memory allocation and expression execution logic, and streamlining unnecessary overhead. This results in significant performance improvements for cardinality, and/or/xor/andnot, and aggregation scenarios.
Resource management and isolation
Query-level resource group settings
OceanBase Database already supports configuring resource groups for users, background tasks, and column parameters via the DBMS_RESOURCE_MANAGER package, enabling CPU and IOPS isolation. V4.3.3 adds query-level resource group binding, allowing you to specify the resource group for a SQL statement using the
/*+ resource_group('group_name') */hint. If the resource group does not exist, the default group is used. Sessions must reconnect for resource group changes to take effect.
V4.3.2
Version information
- Release date: July 16, 2024
- Version: V4.3.2 Beta
New features and enhancements
Data import and export optimization
Enhanced incremental direct load
Incremental direct load was first introduced as an experimental feature in V4.3.1, but it came with some limitations—it did not support importing
outrow lobdata and only worked with theinc_replaceload mode (usingreplacesemantics). In V4.3.2, support has been added for incremental direct load withoutrow lobdata, and theload_modeoption now includesinc, which defaults toinsertsemantics or switches toignoresemantics if you specify theignorekeyword in SQL. This feature is now officially released.Full direct load performance improvements
In full direct load scenarios, V4.3.2 speeds up imports by reducing column type conversions, lowering CPU usage for statistics collection, removing primary key generation logic from the
SELECTside ofINSERT INTO SELECT, and disabling the default sum skip index (which pre-aggregatesSUMvalues for specified columns in storage). Disabling micro block verification (micro_block_merge_verify_levelset to 0) also contributes to roughly a 20% boost in import performance.Partitioned export using
SELECT INTO OUTFILEOceanBase Database already supports exporting to multiple files with
SELECT INTO OUTFILE, but previously did not allow partition-based exports. The new version adds the ability to export data by partition, creating a clearer directory structure. These directories can also be used as partitioned external tables, enabling partition pruning and improving query efficiency.
External data source integration
External table enhancements
OceanBase Database has long supported CSV external tables, but as analytics workloads have grown, the need to read Parquet files from data lakes has become increasingly common. V4.3.2 introduces support for Parquet external tables, allowing you to import data from files into OceanBase tables or use external tables for cross-source analytical queries. To keep file listings up to date, the new version adds automatic directory refresh options—when creating an external table, you can use the
AUTO_REFRESHoption to specify how file lists are updated (manually, in real time, or on a schedule), and scheduled refresh tasks can be managed via theDBMS_EXTERNAL_TABLE.REFRESH_ALL_TABLE(interval int) procedure.
Performance optimizations
Optimizations for analytical workloads
V4.3.2 delivers comprehensive improvements for data block prefetching, vectorized batch processing, filter pushdown, comparison and aggregation calculations, DTL shuffle, and monotonic filters. In benchmark tests with 100 GB of data for AP scenarios like TPCH, TPCDS, and ClickBench, performance is up by 10–15% compared to V4.3.1.
Data type and storage optimizations
RoaringBitmap
As data analysis needs grow, RoaringBitmap stands out for its space efficiency and fast computation, making it ideal for user profiling, personalized recommendations, and targeted marketing. OceanBase Database V4.3.2 introduces RoaringBitmap support in MySQL-compatible mode, enabling efficient storage and operations on sets of unsigned integers, which greatly improves performance for large-scale set operations and deduplication. This release supports more than twenty expressions for cardinality calculation, set operations, bitmap checks and construction, bitmap output, and aggregation.
Data operation optimizations
Table-level overwrite writes
Overwriting tables is a common requirement in data warehousing for periodic refreshes, data transformation, and cleaning. OceanBase Database V4.3.2 adds support for table-level overwrites (
INSERT OVERWRITE), allowing you to atomically clear old data and write new data. Combined with full direct load,INSERT OVERWRITEis highly efficient. V4.3.2 supports syntax likeINSERT OVERWRITE tablename SELECT * FROM tablename;partition-level overwrites are not yet available but are planned for future releases.
Task scheduling and asynchronous execution
Asynchronous task scheduling
OceanBase Database already supports various data import commands, such as
insert overwrite,insert select,create table as, andload data, for real-time data loading. However, real-time imports require you to wait for the process to finish and do not allow session interruption, which can be inconvenient for large-scale imports. The new version introduces asynchronous task scheduling viaDBMS_SCHEDULER, so you can create, monitor, and cancel import jobs using commands likesubmit job,show job status, andcancel job.
V4.3.1
Version information
- Release date: May 17, 2024
- Version number: V4.3.1 Beta
New features and enhancements
Materialized view enhancements
Enhanced materialized views
OceanBase Database V4.3.0 introduced materialized views, which precompute and store query results to reduce real-time computation, boost query performance, and simplify complex query logic for AP scenarios. Building on this, V4.3.1 adds real-time materialized views, enabling on-the-fly calculations that combine data from both the materialized view and its MLOG (change log)—ideal for analytics that require up-to-date information. This release also introduces primary key constraints for materialized views, letting users specify primary keys to optimize single-row lookups, range queries, and join performance. Additionally, incremental refresh is now supported for materialized views with inner joins, improving refresh efficiency in certain scenarios.
Previously, using materialized views required manually rewriting application queries to reference the view instead of the base table, increasing development effort. In this version, OceanBase Database introduces automatic query rewriting for materialized views in specific cases. When the system variable
QUERY_REWRITE_ENABLEDis set toTrue, you can enable automatic rewriting by specifyingENABLE QUERY REWRITEwhen creating a materialized view. The system will then automatically redirect applicable queries from the base table to the materialized view, reducing manual intervention.
Data import and export optimizations
Incremental direct load (experimental feature)
Since OceanBase Database V4.1.0, direct load has dramatically improved data import efficiency by bypassing SQL, transaction, and memtable layers, and writing data straight to SSTables. However, when importing data multiple times into the same table, each import previously required rewriting all existing data, which slowed down incremental loads. V4.3.1 optimizes this by allowing incremental direct load to only process new data, so repeated imports remain as fast as the initial load. You can use the
/*+ direct(need_sort, max_errors_allowed, load_mode)*/hint inLOAD DATAandINSERT INTO SELECTstatements to specify incremental direct load. Ifload_modeis omitted or set tofull, the standard full direct load is used; settingload_modetoinc_replaceenables incremental direct load. This feature is experimental in V4.3.1 and will be further enhanced in future releases.Performance optimization of SELECT INTO OUTFILE
The
SELECT INTO OUTFILEfeature previously supported parallel reading from tables but only allowed serial writing to external files, creating a bottleneck for data exports. V4.3.1 introduces parallel export capabilities, addingsingleandmax_file_sizeoptions to control how data is written. Thesingleoption lets you choose between exporting to a single file or multiple files; if you set parallelism greater than1andsingle = false, data can be exported to multiple files, allowing both parallel reading and writing. Themax_file_sizeoption lets you set a size limit for each output file.
Index enhancements
Full-text index for MySQL-compatible mode (experimental feature)
Traditional relational databases use indexes for fast, exact-match queries, but standard B-Tree indexes are not effective for fuzzy searches or queries over large text fields—these often require full table scans, resulting in poor performance with big data volumes. More complex queries, like approximate matching or relevance ranking, are also hard to achieve with basic SQL.
To address these challenges, OceanBase Database V4.3.1 introduces full-text indexing, which preprocesses text and builds keyword indexes to significantly improve search efficiency. This feature is experimental in V4.3.1 and will be expanded in future releases. Highlights include:
Full-text indexes are supported in MySQL-compatible mode, compatible with basic MySQL syntax;
- Full-text indexes can be prebuilt for CHAR, VARCHAR, and TEXT columns during table creation
- Full-text indexes are supported for partitioned tables
- Multiple full-text indexes can be created for the main table
- Three built-in tokenizers are supported: SPACE (space), NGRAM, and BENG (basic English)
- A single
MATCH AGAINSTclause can be used to perform full-text searches across multiple columns - The NATURAL LANGUAGE MODE is supported
Partition management
Partition exchange
As tables grow over time, they accumulate historical data that may not be frequently accessed but must be retained for compliance or analytics. To improve query performance, businesses often separate active and inactive data, archiving the latter. While SQL-based data migration is possible, it is inefficient for large volumes. OceanBase Database V4.3.1 introduces partition exchange, which instantly moves data from one table to a partition in another by updating metadata—no physical data copy is needed, greatly speeding up migrations. Key features:
- Exchange data between a partition in a partitioned table and a regular table
- Support for partitioned tables with range (range columns) partitioning
- Subpartitioned tables are supported, with partitions of any type and subpartitions of RANGE (RANGE COLUMNS) type
- Indexes included—local indexes are swapped and remain usable after the exchange
- Support for WITHOUT VALIDATION mode (user must ensure data matches the partition key range)
External data source integration
Partitioned external tables
OceanBase Database has supported external tables since V4.2.0, but only for non-partitioned tables. When dealing with many files, non-partitioned external tables had to scan all files for every query, which hurt performance. V4.3.1 introduces partitioned external tables, supporting list partitioning similar to regular tables and offering both automatic and manual partitioning syntax. With automatic partitioning, the system groups files based on the partition key; with manual partitioning, users specify the file path for each partition. This allows queries to prune partitions and scan only relevant files, significantly improving performance.
Data type and storage optimization
JSON multi-valued index in MySQL-compatible mode (experimental feature)
MySQL 8.0 introduced multi-valued indexes for JSON documents and other collections, enabling efficient element retrieval by indexing arrays and collections. OceanBase Database V4.3.1 brings JSON multi-valued index compatibility to MySQL-compatible mode, allowing secondary indexes on JSON array fields with multiple elements. This boosts query performance on complex JSON structures while keeping the data model flexible. This feature is experimental in V4.3.1 and will be further enhanced for production use. Key features:
- Pre-creation of multi-valued and composite multi-valued indexes
- Support for unique and non-unique multi-valued indexes
- Indexing JSON array elements of types INT, UINT, DOUBLE, FLOAT, CHAR, and more
- Support for partitioned tables
- Use of MEMBER_OF(), JSON_CONTAINS(), and JSON_OVERLAPS() functions with multi-valued index queries
MySQL JSON partial update
Many users store business data in JSON documents. Previously, updating these documents required reading and rewriting the entire document, which was inefficient for large payloads. V4.3.1 introduces JSON partial update, allowing users to update only specific fields using expressions like json_set, json_replace, or json_remove—no need to overwrite the whole document, greatly improving update performance. This feature must be enabled via the log_row_value_options configuration.
Query execution and resource management
SQL temporary result compression
Large SQL queries can run into memory limitations, requiring some operators to materialize intermediate results. If the materialized data is too large and fills up disk space, SQL execution may fail. V4.3.1 introduces SQL temporary result compression, allowing you to specify whether and how to compress temporary results using the tenant-level
spill_compression_codecsetting or a SQL-level hint like/*+opt_param('spill_compression_codec', 'lz4') */. Compressing temporary results helps reduce disk usage and supports larger queries.
V4.3.0 Beta
Version information
- Release date: March 22, 2024
- Version number: V4.3.0 Beta
New features
Columnar storage engine
For large-scale data analysis and ad-hoc queries over massive datasets, columnar storage is essential for analytical databases. Unlike row-based storage, columnar storage organizes data by columns, so queries only need to scan the relevant columns, not entire rows. This approach reduces I/O and memory consumption, accelerating query performance. Columnar storage also naturally enables better data compression, saving storage space and network bandwidth.
Traditional columnar engines are typically designed for mostly static data and do not handle frequent random updates well, which can lead to performance issues. OceanBase Database's LSM-Tree architecture solves this by processing baseline and incremental data separately. Building on this foundation, V4.3.0 introduces a columnar storage engine that seamlessly integrates both row and column storage within a single architecture and OBServer instance, delivering strong performance for both TP and AP workloads.
To make it easier for existing customers and AP workloads to adopt the new engine, OceanBase Database has optimized multiple components—from the optimizer and executor to DDL and transaction processing. Enhancements include a new cost model and vectorized engine for columnar storage, improved query pushdown, skip index support, new column encoding algorithms, and adaptive compaction.
Depending on workload needs, users can flexibly define tables as row-based, column-based, or hybrid (row and column redundant).
New vectorized engine
Earlier versions of OceanBase Database featured a vectorized engine based on a uniform data format, which significantly improved performance compared to non-vectorized processing. However, there were still performance gaps for deep analytical scenarios. V4.3.0 introduces vectorized engine 2.0, which uses a columnar data format to eliminate the memory, serialization, and read/write overhead associated with maintaining ObDatum. With this redesign, a number of core operators and expressions have been reimplemented, including HashJoin, AGGR, HashGroupBy, Exchange (DTL Shuffle), and over 20 MySQL expressions covering relational, logical, and arithmetic operations. Future V4.3.x releases will continue to expand support for more operators and expressions, further boosting analytical performance.
Materialized views
V4.3.0 introduces materialized views, a key feature for analytical workloads. Materialized views precompute and store query results, reducing the need for real-time computation, speeding up queries, and simplifying complex logic. They are especially useful for generating reports and supporting data analysis.
Since materialized views store query results for performance, they rely on the underlying base tables. Whenever the base table data changes, the materialized view needs to be updated to stay in sync. This release introduces a refresh mechanism for materialized views, supporting both full and incremental refresh strategies. Full refresh is straightforward—the system reruns the view’s query and fully replaces the stored results, which works well for smaller datasets. Incremental refresh processes only the changes since the last refresh. To make this possible, OceanBase Database now includes a materialized view log (similar to Oracle’s MLOG), which tracks updates to the base table. This enables fast, precise incremental refreshes, which are especially beneficial for large, frequently changing datasets.