A
Access Path
An access path is the way in which data in a database table is accessed. In most cases, data is queried through the primary key index or a secondary index.
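For example, in the following hedged sketch (the table and column names are hypothetical), the first query can be served through the primary key, while the second typically uses a secondary index:

-- Hypothetical table with a primary key on id and a secondary index on c2.
CREATE TABLE orders (id INT PRIMARY KEY, c2 INT, payload VARCHAR(100), INDEX idx_c2 (c2));
SELECT payload FROM orders WHERE id = 42;  -- access path: primary key
SELECT payload FROM orders WHERE c2 = 7;   -- access path: secondary index idx_c2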
ACID
ACID is an acronym for atomicity, consistency, isolation, and durability. OceanBase Database ensures data reliability and consistency during data updates by using transactions with ACID properties.
Active Geo-redundancy
Active geo-redundancy is a technology designed to enhance system availability and disaster recovery capabilities. It replicates and synchronizes data across multiple geographic locations, guaranteeing both high availability and data consistency. In practice, active geo-redundancy is used to duplicate primary business data across multiple data centers or regions.
Active MemTable
An active MemTable refers to the MemTable that is currently active and can receive incremental data writes. It is the counterpart to the Frozen MemTable.
Active Session History
Active Session History (ASH) is a diagnostic tool that records information about all active sessions in OceanBase Database.
Active Session History Report
Active Session History (ASH) reports provide an analytical overview that helps you identify immediate anomalies. Performance reports typically cover snapshot information at the hourly level and do not allow deep analysis at the session level, so it can be difficult to obtain execution details for transient jitter from them. ASH reports address this issue by providing fine-grained, session-level diagnostic information.
Active Transaction
An active transaction is a transaction that has been started but not committed or rolled back. The modifications made by an active transaction are temporary until the transaction is committed. These modifications are also not visible to other transactions.
Adaptive Cursor Sharing
Adaptive cursor sharing is a mechanism that allows the optimizer to store multiple execution plans for each parameterized SQL statement and select an appropriate plan based on the selectivity of predicates in an SQL statement.
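As a hedged illustration with hypothetical names, the same parameterized statement may deserve different plans depending on how selective the bound value is; adaptive cursor sharing lets the optimizer keep several plans for the statement and choose one per execution:

-- status = 'CLOSED' matches most rows, so a full table scan may be cheaper;
-- status = 'PENDING' matches few rows, so an index on status may be preferred.
SELECT * FROM orders WHERE status = ?;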
agentrestore.jar
agentrestore.jar is a restore tool and a resident process. It is a JAR package that is written in Java. agentrestore.jar queries the oceanbase_restore table in the MetaDB at a specified interval and controls the initiation of all restore tasks. agentrestore.jar also updates the status of the four tables that are used for restore tasks when tasks proceed.
This term is deprecated in OceanBase Database V4.x.
AgentServer
AgentServer is a backup tool and a resident process. AgentServer queries the base_data_backup table in the MetaDB at a specified interval for backup tasks. AgentServer controls the initiation and cancellation of backup tasks for baseline and incremental data. AgentServer also updates the status of the four tables that are used for backup tasks when tasks proceed.
This term is deprecated in OceanBase Database V4.x.
Antman
Antman is a CLI tool (also known as oat-cli) that supports quick and easy installation of OceanBase Database tools, such as OceanBase Cloud Platform (OCP), OceanBase Migration Service (OMS), and OceanBase Developer Center (ODC).
Arbitration Service
The arbitration service is an independent process that provides arbitration capabilities for OceanBase clusters and improves their availability. A single arbitration service can serve multiple OceanBase clusters. The service runs as an arbitration server process, and currently each arbitration service consists of only one such process.
Arbitration Server
An arbitration server refers to a machine where the arbitration service is deployed.
Arbitration Member
For tenants with the arbitration service enabled, each log stream corresponds to an instance in the arbitration service, which is referred to as an arbitration member. Arbitration members and log stream replicas are equivalent. For example, if a tenant's deployment scheme is 2F+1A (two full-featured replicas with one arbitration service), then all log streams for that tenant will have two full-featured (F) replicas and one arbitration member.
Arbitration Replica
An arbitration replica is equivalent to an arbitration member.
Arbitration Downgrade
Arbitration downgrade is a disaster recovery behavior where the arbitration service participates in voting when more than half of the full-featured replicas fail. It changes the failed replicas to learners, which no longer participate in log synchronization.
Arbitration Upgrade
Arbitration upgrade is a recovery behavior in which the arbitration service participates in voting after a previously degraded replica recovers. It changes the replica from a learner back to a Paxos acceptor, so the replica participates in log synchronization again.
Availability Zone
An availability zone (AZ) is one or more IDCs in a geographic region that are isolated from other AZs in the same region. Generally, an AZ uses independent power, network, and cooling systems to ensure high availability and redundancy.
AZs are generally used in cloud computing platforms, such as Amazon Web Services (AWS) and Microsoft Azure, to ensure high availability and fault tolerance. To ensure service continuity in the case of unavoidable failures such as hardware failures and natural disasters, you can deploy applications and data across different AZs in the same region.
An AWS AZ is one or more discrete data centers in an AWS region. This design ensures that the AZs are independent of each other, thus improving the availability and fault tolerance of services running across the AZs. In AWS, each AZ has its dedicated power, network, and cooling systems and is isolated from other AZs. You can deploy your application across two or more AZs in the same region to ensure high availability and reduce the impact of rare but possible AZ failures.
B
Backup MetaDB
The backup MetaDB stores one parameter table backup_base_profile and four tables that are used for backup tasks: base_data_backup, base_data_backup_task, base_data_backup_task_history, and inc_data_backup. In most cases, the backup MetaDB and restore MetaDB are deployed in the same database.
This term is deprecated in OceanBase Database V4.x.
Baseline Data
Baseline data is read-only ordered data that is generated in major compactions and stored on persistent media.
Baseline Data Version
A baseline data version is a version of baseline data.
Block Cache
A block cache is a memory-based cache that stores micro blocks. It is used to minimize the I/O operations involved in frequently accessing micro blocks, thereby enhancing query performance.
Block Index Cache
Block index cache is the in-memory cache of micro block indexes and improves query performance for frequently accessed data. Each SSTable is built from macro blocks, but the 2 MB granularity is too coarse for user queries, so the required micro blocks must be located within a macro block based on the query range. Micro block indexes describe the ranges of all micro blocks in a macro block and must be loaded before the micro blocks of that macro block can be accessed. Because prefix compression is applied, micro block indexes are small. In addition, they are cached with a high priority in OceanBase Database, which results in a high hit rate.
Bloom Filter
A Bloom filter quickly determines whether a row exists in baseline data or minor compaction data. If the row does not exist, disk I/O and CPU consumption can be reduced.
Bloom Filter Cache
In OceanBase Database, Bloom filters are created based on the actual empty query rate on macro blocks. If the number of empty queries on a macro block exceeds the specified threshold, a Bloom filter is created and placed into the cache.
Business Continuity
Business continuity refers to the ability and planning required to keep business operations running and stable in the case of internal or external risks, threats, or disasters.
C
Cascading Replication
Cascading replication is a data replication and synchronization technique that can automatically replicate the update operations in a database to multiple followers to implement data distribution and synchronization.
Change Data
Change data refers to the data updated or inserted within a specific period.
Change Data Capture
Change data capture (CDC) is a service that captures and records data changes in the database. When data is inserted, updated, or deleted in your database, CDC captures and records the changes based on the sequence of these changes. In this way, other systems can promptly process and use the latest data in your database.
Change Synchronization
Change synchronization is to transmit only data or file changes, rather than all data or files.
Cloud Server
A cloud server is a virtual server that is built and maintained by the service provider and leased by users on demand. A cloud server provides secure and reliable elastic computing services on the cloud and supports computing resource scaling based on business needs.
Cluster
An OceanBase cluster spans one or more regions. A region consists of one or more zones, and one or more OBServer nodes are deployed in each zone. Each node can have multiple units. Each unit can have multiple log stream replicas. Each log stream can contain multiple tablets.
Cluster Instance
OceanBase Database provides cloud-based database services in clusters and supports the following specifications: 4C, 8C, 14C, 24C, 30C, and 62C. By default, an OceanBase cluster has three replicas, which can be three full-featured replicas, or two full-featured replicas and one log replica. The MySQL and Oracle modes are supported.
Commit Log
OceanBase Database records all modifications to the database status, including transactions, as redo logs, and then persists them in multiple replicas across the cluster through the Multi-Paxos protocol. This set of redo logs is referred to as commit logs (clogs) in OceanBase Database.
Clogs are organized based on log streams, with each log stream having its own set of clog files. Additionally, the clog files of each tenant are isolated, and each tenant can set the clog space size that they require.
Clogs are used to ensure the persistence and atomicity of database transactions in the event of a database instance failure. They are also used for multiple database features such as standby nodes, physical standby databases, and change data capture, which enable real-time querying of status change operations in the database.
Container Service
The container service provides high-performance and scalable management services for containerized applications. It supports the lifecycle management of applications by using containers, providing various deployment methods and continuous delivery capabilities. This service supports microservices architectures and is associated with essential cloud resources such as server nodes, load balancing, and dedicated networks. The service offers secure, high-performance deployment solutions that support hybrid cloud environments. Additionally, it integrates load balancing to provide accessibility to containers. It also streamlines the upstream and downstream delivery processes by using the high availability scheduling strategy.
D
Data Balancing
OceanBase Database uses the Root Service to manage load balancing among the units of a tenant. The resources required vary according to the replica type. Root Service performs load balancing based on the CPU utilization, disk usage, memory usage, and IOPS of each unit. To make full use of resources available on each OBServer node, Root Service balances the usage of various resources among all OBServer nodes after load balancing.
Database
A database is a repository that organizes, stores, and manages data by using data structures. A database contains tables, indexes, and metadata of database objects.
Database as a Service
Database as a service (DBaaS) means that you can purchase cloud-based database services in software-as-a-service (SaaS) mode. The cloud-based database service provider owns all resources, including underlying infrastructure-as-a-service (IaaS) resources and database software. Underlying resources are unperceivable to you. The cloud-based database service provider is responsible for the O&M of the entire database product. DBaaS is a type of SaaS.
Database (Object)
A database can contain database objects such as tables and views.
Data Flow Object
Using data redistribution points as boundaries, a parallel plan is divided into multiple logical subplans that can be executed concurrently. Each subplan is encapsulated in a data flow object (DFO).
Data Ingestion
Data ingestion is to import data from different sources to a specific system or application for processing, analysis, visualization, or storage.
Data Integration
Data integration is a process that integrates data in different formats and locations from different sources into the same view.
Data Migration
Data migration is to migrate data from one storage location or system to another location or system.
Data Replication
Data replication is a process of replicating data from one database to another database. During data replication, data is extracted from the source database, processed and transmitted, and replicated to the destination database. This ensures synchronization and consistency of data between the systems.
Data Skew
Data skew means that one or more values frequently appear in the data and account for a large proportion of the data. In distributed execution, data skew causes long tails, and execution threads assigned to these values take more time in execution.
Data Subscription
Data subscription is a mechanism that allows you to obtain updated data in real time or periodically by subscribing to specific database objects such as tables, views, or queries.
Data Transfer Layer
Data transfer layer (DTL) is a network transmission framework that you can use to transmit data between execution threads in a distributed parallel execution framework.
Data Transformation
Data transformation is to process and convert the original data based on specific rules to generate new data or extract data characteristics for analysis, modeling, and visualization.
Direct Input/Output
Direct input/output (DIO) is a file read/write method that bypasses the page cache of the operating system.
Distributed Execution
When an execution plan is executed in a distributed manner, the plan is executed on multiple OBServer nodes, each of which contributes to the execution process.
Distributed Plan
When an execution plan involves multiple tables or partitions, the plan is defined as a distributed plan.
Distributed Transaction
Transactions in OceanBase Database are classified into distributed transactions and single-log-stream transactions. The type of a transaction varies based on the session position of the transaction and the number of log stream leaders that are involved in the transaction.
A transaction is called a distributed transaction when it meets either of the following conditions:
The transaction involves more than one log stream.
The transaction involves only one partition, but the partition leader does not reside on the same server as the transaction session.
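As a hedged illustration with a hypothetical table, a transaction that updates rows whose log stream leaders reside on different OBServer nodes is committed as a distributed transaction:

BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;  -- leader assumed on node A
UPDATE accounts SET balance = balance + 100 WHERE id = 9;  -- leader assumed on node B
COMMIT;  -- two log streams are involved, so the two-phase commit protocol is used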
DOOBA
DOOBA is an internal O&M script of OceanBase Database that is used for performance monitoring. DOOBA is developed in Python and supports only Python 2.7. After you connect to the sys tenant of OceanBase Database by running a MySQL command, you can run DOOBA to display the queries per second (QPS) and the average response time (RT) of the SQL statements executed in the tenant in real time. The supported SQL types include SELECT, UPDATE, INSERT, DELETE, and COMMIT. You can also view the QPS and RT of SQL statements executed on each OBServer node.
E
Election Without Leader
If no leader exists for a partition, a leader is elected from multiple replicas of the partition. This process is known as election without leader. This process is triggered when a leader needs to be elected for a partition after the cluster is restarted or when the original leader of the partition fails. If a partition already has a leader, you can initiate this process for the partition only after the lease of the original leader expires.
Embedded SQL in C for OceanBase
Embedded SQL in C for OceanBase (ECOB) is an OceanBase precompiler that provides features compatible with Oracle ProC. ECOB comprises the ecob precompiler and the ecoblib library (libecob.so), a runtime dynamic link library.
Encoding
Based on general compression, OceanBase Database provides a hybrid row-column storage encoding method for databases. In contrast to general compression, encoding is a process in which a compression algorithm compresses data blocks based on the format and semantics of the data in the blocks.
OceanBase Database is a relational database in which data is organized by table. Each column of a table represents a fixed type of data, so data in the same column is similar in a logical sense. In specific scenarios, data in adjacent rows of a business table is also similar. Therefore, data can be compressed and stored by column to improve compression performance. To compress data by column, OceanBase Database introduces micro blocks in the encoding format. Unlike micro blocks in the flat format, in which all data is serialized by row, micro blocks in the encoding format use hybrid row-column storage. Logically, a micro block still stores a set of row data, but the data is encoded by column. Fixed-length encoded data is stored in the column store area of the micro block, and variable-length data is stored in the variable-length area by row.
Data in micro blocks in the encoding format can be randomly accessed. If you want to read a specific row in a micro block, you can decode only that row. This avoids decompressing the entire data block, which general decompression algorithms require even when only part of the data is needed. To reduce projection overhead, you can also decode only the specified columns during vectorized execution.
End-to-End Tracing
End-to-end tracing is performed in two paths. In one path, the application sends a request to ODP by using a client, such as JDBC or OCI, to access an OBServer node, which then returns the result to the application. In the other path, the application directly accesses the OBServer node by using a client, which then returns the result to the application. End-to-end tracing locates issues that occur in all components that are involved in the entire data access process.
Execution Plan
An execution plan, or plan for short, is a collection of physical code for executing SQL requests in a database. An execution plan is generally an execution tree that consists of operators.
Execution Plan Binding
Execution plan binding is a process in which execution plans for SQL statements are specified by using outlines without the need to use an optimizer. Execution plan binding is suitable for scenarios in which an execution plan generated by the optimizer is invalid or inefficient.
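A hedged sketch of binding a plan through an outline follows; the object names are hypothetical, and the exact outline syntax may differ between versions:

-- Bind a plan that forces the index idx_c2 for the matching statement.
CREATE OUTLINE ol_orders_c2 ON
  SELECT /*+ INDEX(orders idx_c2) */ * FROM orders WHERE c2 = 1;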
Execution Plan Matching
Execution plan matching is a process in which a database selects an appropriate execution plan from the plan cache to execute your SQL statement.
F
Failover
A failover switches the roles of the primary and standby databases. If the primary database cannot provide services due to a failure or interruption, you can perform a failover to switch services to the physical standby database. This process is irreversible.
Fast Parsing
Fast parsing is a process in which real parameters are extracted from SQL statements at a high speed. This process is exclusive to OceanBase Database. Based on inherent characteristics of the plan cache of OceanBase Database, fast parsing prevents semantic analysis during the re-execution of an input SQL statement by adding constraints. This improves plan matching efficiency.
Fault Tolerance
Fault tolerance refers to the ability of a system, device, or software solution to continue to run properly or minimize the impact on the system in the case of a fault or error.
Flashback Query
OceanBase Database supports record-specific flashback queries, which allow you to obtain data of a specific historical version.
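For example, in Oracle mode a flashback query might look like the following hedged sketch; the table name is hypothetical, and the syntax and retention limits depend on the version and the undo retention settings:

-- Read the rows of the table as they were 10 minutes ago.
SELECT * FROM orders AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '10' MINUTE);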
Forward Shifting
Forward shifting is the process of gradually transferring data to the new system or platform while data is being migrated from one system or platform to another.
Frozen MemTable
When the memory usage of the active MemTable reaches the specified threshold, a freeze is performed to generate a frozen MemTable. No incremental data is written to the frozen MemTable.
Frozen Version
A frozen version is the version number that is used when a freezing operation is performed.
Full Compaction
In full compaction, all macro blocks of partitions are rebuilt regardless of whether the macro blocks are modified. You can specify whether to enable full compaction for a table. If full compaction is enabled for a table, the system initiates full compaction when the schema changes (such as column addition and deletion) or the storage properties change (such as the modification of the compression level). During the full compaction of a table, progressive compaction is enabled to reduce the time spent on a single compaction.
During a full major compaction, all the current static data is read and compacted with the dynamic data in the memory. The compacted data is written as the new static data to a disk. In this process, all data is rewritten. A full major compaction consumes significant disk I/O and space. Therefore, unless specified by a database administrator (DBA), OceanBase Database does not initiate full major compactions.
Full Verification
Full verification is the process of comparing and verifying the backup data and original data during data backup and restore. During full verification, the backup data and original data are compared row by row to ensure the integrity and accuracy of the backup data.
Fuse Row Cache
In the LSM tree architecture, the modifications of the same row may be stored in different SSTables. To optimize storage space usage, OceanBase Database stores only incremental data for each update. Therefore, the query results of various SSTables need to be fused. Before a new update is triggered, the existing fusion result is always valid for queries. OceanBase Database provides a fuse row cache for fusion results to facilitate hotspot row queries. The fuse row cache is similar to the result cache in Oracle and the query cache in MySQL. It can quickly return the result when you query the same data. This reduces the database load and improves the query performance.
G
Geo-redundancy
In geo-redundancy, the same infrastructure, such as computing devices, storage devices, and network devices, is deployed in different geographic locations to ensure that when a disastrous fault occurs in one location, the other location can continue to provide services and ensure data security and reliability.
Global Consistent Snapshot
If global consistent snapshots do not exist, distributed databases cannot support cross-node consistency read or ensure causal sequences. In OceanBase Database V1.4.x, application system designers and developers must ensure that multiple tables and partitions accessed in one SQL statement are located on one OceanBase Database node. In business systems that rely on operation sequences, OceanBase Database V1.4.x cannot ensure that two sequential transactions can respectively modify two tables on two nodes. To resolve these issues, OceanBase Database V2.0 provides the global consistent snapshot feature. Compared with atomic clock-based Google TrueTime, the Global Timestamp Service (GTS) of OceanBase Database is fully dependent on software. GTS does not rely on specific hardware devices or set additional requirements for the deployment environments of customers. This allows OceanBase Database to serve more Apsara Stack customers. After GTS is enabled, OceanBase Database V2.x can support cross-node reads/writes and causal sequences in the same way as that in a standalone database.
Global Index
A global index is a cross-partition data index in OceanBase Database. A global index allows you to quickly locate the partition to which the specified data belongs based on the global key values and primary keys of stored data.
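As a hedged example with hypothetical names, a global index can be created on a partitioned table so that lookups on a non-partition-key column do not need to probe every partition; the GLOBAL keyword shown here is assumed, so check the syntax for your version:

CREATE TABLE orders (id INT PRIMARY KEY, c2 INT) PARTITION BY HASH(id) PARTITIONS 8;
CREATE INDEX idx_orders_c2 ON orders (c2) GLOBAL;  -- cross-partition (global) index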
Global Timestamp Service
In OceanBase Database, a global timestamp service (GTS) is started for each tenant. When you commit a transaction, the transaction version number is queried by using the GTS of the current tenant. This ensures a global transaction order.
Granule
A granule is the minimum task granularity that is used to scan tables and indexes in a distributed parallel execution plan. A granule can be a partition or a query range.
Group Commit
Group commit is a process in which logs of multiple transactions are persistently stored to a disk during an I/O operation. Group commit helps improve the log writing efficiency.
H
HBaseAPI
Based on the basic APIs provided by TableAPI, the HBaseAPI client of OceanBase Database encapsulates APIs compatible with HBase. Currently, the features of HBase 0.94 are supported. If your business system uses the native HBase data operation logic, you can deploy an OceanBase cluster, create HBase tables on OBServer nodes, and perform data operations by using HBaseAPI.
High Availability
High availability is the ability of a system or service to run continuously and provide the required services over a given period of time.
Hint
A hint is a user-specified directive embedded in an SQL statement that instructs the optimizer how to generate the execution plan for that statement.
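For example, the following hedged sketch (table name hypothetical) uses a hint to ask the optimizer for a full table scan instead of an index access:

SELECT /*+ FULL(orders) */ * FROM orders WHERE c2 = 7;  -- request a full table scan of orders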
I
Incremental Compaction
On the storage engine of OceanBase Database, a macro block is the basic I/O write unit. Not all macro blocks are modified in many cases. If a macro block has no incremental modifications, the data of this macro block can be directly reused during a major compaction. This mode is called incremental compaction in OceanBase Database. This mode greatly reduces the compaction workload. Therefore, it is the default compaction mode in OceanBase Database. OceanBase Database splits each macro block into multiple micro blocks. Not all micro blocks are modified in many cases. Therefore, users can reuse the data of specific micro blocks without the need to rewrite data to them. Microblock-level incremental compactions further reduce the compaction time.
Incremental Data
Incremental data, including the data in the MemTable and the minor compaction data, is the data modified by the INSERT, UPDATE, and DELETE operations. Such data is not merged with the baseline data.
Initial Load
Initial load is the process of replicating a complete dataset from a source data source to the destination location.
Initial Load Extract
Initial load extract is the process of extracting a complete dataset from the source database. Generally, an initial load extract is performed during data backup, data replication, or data migration.
J
Join Algorithm
A join algorithm is the method used to join two tables. Common join algorithms include NESTED LOOP JOIN, MERGE JOIN, and HASH JOIN.
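Join-method hints can suggest a specific algorithm to the optimizer. The following hedged sketch with hypothetical tables asks for a hash join; the USE_HASH hint is assumed to be available, and USE_NL and USE_MERGE play the corresponding roles for the other algorithms:

SELECT /*+ USE_HASH(c) */ o.id, c.name
FROM orders o JOIN customers c ON o.cust_id = c.id;  -- join orders to customers with a hash join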
Join Order
Join order refers to the order in which tables are joined in a multi-table join.
L
Label Security
Label security is a forcible access control method. A label column is added to the table to record the label values of each row. When you access the table data, your labels are compared against the data labels to limit data access.
Latency
Latency refers to the time required by a computer or network system to process tasks or data. It is the interval from the point in time when a request is sent to the point in time when the system returns a response.
Leader/Follower
The concepts of leader and follower define the roles of table partition replicas at a specific point in time. In OceanBase Database, each partition has at least three replicas. Redo logs are synchronized from the leader to followers based on the Paxos protocol. A transaction on the leader can be committed only after the redo logs are received by a majority of the members and written to disk. All replicas (such as three or five replicas) of a partition constitute a Paxos group, and a leader is autonomously elected within the group. If an OceanBase node fails and the observer process hosting a leader is interrupted, the OceanBase cluster automatically elects a new leader.
Learner
A learner refers to a replica that does not participate in the majority-based synchronization of logs, such as a read-only replica.
Liboblog
Liboblog is an incremental data synchronization tool of OceanBase Database. It pulls the redo logs from each partition of an OceanBase database by using remote procedure calls (RPCs), converts the redo logs into an intermediate data format based on the schema information of each table and column, and then outputs the modified data as a transaction.
Load Balance
Load balance is a process in which the system dynamically adjusts the locations of units and the locations of replicas in the units based on the specified strategy, to achieve a balanced resource usage on all servers in the same zone. Load balancing strategies are created based on several factors. OceanBase Database achieves load balancing by using two levels of scheduling.
For more information, see Partition Scheduling and Unit Scheduling described in this glossary.
Local Execution
In local execution, the OBServer node that receives the client request generates the execution plan and executes the plan on the same server.
Local Plan
If an execution plan involves only a single table or a single partition of a partitioned table, and the table or partition is on the current node, the plan is defined as a local plan.
Local Transaction
A local transaction is the opposite of a cross-server distributed transaction. In a local transaction, the leaders of all log streams of the tables that the transaction operates on reside on the same server as the transaction session.
Locality
Locality describes the replica types and locations of a table. The basic syntax is replicas@location, which consists of the following elements:
replicas: F indicates a full-featured replica, and L indicates a log replica.
location: This element contains a collection of enumerated values that are known to the system. The value of this element is the name of a zone, such as hz1 and bj2.
count: If the count value is not specified, only one replica exists. {n} indicates that n replicas exist. The special value {all_server} indicates that the number of replicas is the same as the number of available servers.
For the sake of implementation, a partition can have at most one full-featured replica and one log replica in a zone. These replicas are part of the Paxos replica group. A partition can contain several read-only replicas in a zone.
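For example, a locality of two full-featured replicas plus one log replica across three zones might be declared as in the following hedged sketch; the tenant name, resource pool, and zone names are hypothetical, and the exact statement form depends on the version:

CREATE TENANT t1
  RESOURCE_POOL_LIST = ('pool1'),
  LOCALITY = 'F@hz1, F@hz2, L@bj2';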
Location Cache
OceanBase Database organizes user data by log stream and tablet. Each log stream has multiple replicas for disaster recovery. To execute an SQL statement, OceanBase Database must obtain the locations of partition data. Then, OceanBase Database can locate the specific server to read data from or write data to the corresponding replica. Each observer process provides a service for refreshing and caching the partition locations required by the local server. The service is called a location cache service.
Log Archive
Log archiving refers to the automatic backup of log data. OBServer nodes regularly archive log data to the specified backup path without manual triggering.
Log Group
In OceanBase Database, each archive log is a log set that comprises multiple log entries. This archive log is referred to as a log group. Each log entry is associated with a system change number (SCN). A log group also has an SCN, which is the largest among the SCNs of all log entries. Log archiving is a process that manages and organizes log groups in the archive media.
Log Service
The log service is intended for log data. With the log service, you can complete log data collection, consumption, delivery, query, and analysis without development procedures. The log service improves O&M and business efficiency and provides massive log processing capabilities in the DT era. General log system solutions include self-managed systems based on Elasticsearch, Logstash, and Kibana (ELK Stack), and mature cloud products such as Simple Log Service (SLS) of Alibaba Cloud.
Log Stream
A log stream (LS) is an entity that is automatically created and managed by OceanBase Database. It is a collection of data and contains several tablets and ordered redo logs. It uses the Paxos protocol to synchronize logs between replicas to ensure data consistency between the replicas and thereby implement high availability of data. Log stream replicas can be migrated and replicated between servers for server management and system disaster recovery.
From the perspective of data storage, a log stream can be abstracted as a tablet container that supports the addition and management of tablet data and the transfer of tablets between log streams for data balancing and horizontal scale-out.
From the perspective of transactions, a log stream is a unit for committing transactions. If the modification in a transaction is completed within a single log stream, the transaction can be committed by using the one-phase atomic commit logic. If the modification in the transaction is completed across multiple log streams, the transaction can be committed by using the two-phase atomic commit protocol of OceanBase Database. Log streams are participants of distributed transactions.
Log Stream Group
A log stream group (LS group) is an independent Paxos group that consists of a log stream and its replicas. The Paxos consensus protocol is used to ensure strong consistency among the members. In each LS group, one log stream serves as the leader and the rest as followers. The leader supports strong-consistency reads and writes, and the followers support weak-consistency reads.
Logical Data Center
A logical data center (LDC) is the logical representation of an Internet data center (IDC). If OceanBase Database is deployed in multiple IDCs across regions, all OBServer nodes are grouped by region and zone. In this case, client routing of OceanBase Database and remote procedure call (RPC) routing within OceanBase Database are known as LDC routing.
M
Macro Block
OceanBase Database splits the disk into several 2 MB fixed-length data blocks called macro blocks. A macro block is the basic unit of write I/O operations. Each SSTable comprises several macro blocks. The 2 MB length of a macro block is fixed and cannot be changed. Minor compactions, major compactions, and data replication and migration are all carried out at the granularity of macro blocks.
Macro Block Merge
After data is deleted from macro blocks, all rows in several adjacent macro blocks can be stored in one macro block. Macro block merge refers to the process that combines several adjacent macro blocks into one macro block.
Macro Block Prefetch
Macro block prefetch is a process in which adjacent macro blocks are pre-read as required during range queries.
Macro Block Recycle
During major compactions, new baseline data is generated based on the original baseline data and modified data. Reusing the macro blocks of the original baseline data is referred to as macro block recycle.
Macro Block Split
When data is inserted or updated in a macro block, the storage capacity of the macro block becomes insufficient. To resolve this issue, the data of the macro block is split and stored in multiple macro blocks. This process is referred to as macro block split.
Major Compaction
In OceanBase Database, the concept of major compaction, also known as daily compaction, has a minor difference from other LSM-tree-based databases. This concept was first proposed to indicate an overall compaction of the entire cluster at about 2:00 a.m. every day. Major compactions are initiated by Root Service of each tenant based on the write status or user settings. Root Service selects a global snapshot in each major compaction of each tenant and performs major compactions for all partitions of the tenant by using the snapshot data. In this way, each major compaction of all data of the tenant generates an SSTable based on the unified snapshot. This helps you integrate incremental data regularly, improves read performance, and provides a natural data verification point. With global consistency snapshots, OceanBase Database supports physical data verification between multiple replicas and between primary tables and index tables.
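In addition to the scheduled daily major compaction, a major compaction can typically be triggered manually. A hedged sketch follows; whether the statement is issued from the sys tenant or the user tenant, and the exact statement form, depend on the version:

ALTER SYSTEM MAJOR FREEZE;  -- trigger a major freeze and the subsequent major compaction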
Major Freeze
In a major freeze, all nodes in the cluster freeze the current active MemTables at the same snapshot point and no longer accept write operations of new transactions. Write operations of new transactions are performed in new active MemTables.
Major Freeze Version
A major freeze version is the version number of a major freeze.
Major SSTable
The L2 layer is where the baseline major SSTable is placed. The major SSTable is read-only and does not participate in actual compaction operations during routine minor compactions. This ensures that the baseline data is consistent among the replicas.
Management as a Service
You can deploy a database software solution in your own infrastructure and purchase management service from a cloud-based database service provider. This is called Management as a Service (MaaS).
Maximum Availability
This protection mode maximizes data protection without compromising cluster availability. By default, a transaction can be committed only after the persistence of redo logs in the primary and the standby clusters in SYNC mode is completed. However, if the primary cluster detects a failure of the standby cluster in SYNC mode, the primary cluster no longer waits for the synchronization to complete. Instead, the services of the primary cluster are resumed in the same way as that in maximum performance mode to ensure the availability of the cluster. After the services of the standby cluster in SYNC mode are resumed, the primary cluster continues to synchronously transfer redo logs to this standby cluster to provide maximum data protection.
Maximum Performance
It is the default protection mode. It protects user data and maximizes the performance of the primary cluster. In this mode, OceanBase Database immediately commits transactions after redo logs are persisted in the primary cluster. OceanBase Database asynchronously synchronizes redo logs to the standby clusters without blocking the transaction commit of the primary cluster. Therefore, the performance of the primary cluster is not affected by the synchronization latency of standby clusters.
Maximum Protection
This mode provides the highest level of data protection and prevents data loss when the primary cluster fails. In this mode, transactions can be committed only after the successful persistence of redo logs on the primary cluster and the standby cluster in SYNC mode.
In this protection mode, you can configure only one standby cluster in SYNC mode, and other standby clusters must be in ASYNC mode. If the standby cluster in SYNC mode is unavailable, the primary cluster stops the write service.
Membership Log
Membership logs are a special type of commit logs that record the member group changes of a partition.
MemTable
A MemTable stores all incremental modification records in the memory.
Meta Tenant
Meta tenants are used for internal management in OceanBase Database. When you create a user tenant, a corresponding meta tenant is automatically created. The lifecycle of a meta tenant is the same as that of its user tenant. You can use a meta tenant to store and manage cluster-related private data of the corresponding user tenant. This private data, such as parameters and information about locations, replicas, log stream status, backup and recovery, and major compaction, does not require cross-database physical synchronization or physical backup and recovery. You cannot log on to a meta tenant. You can only query the data in a meta tenant from views in the sys tenant. A meta tenant has no independent resource units. When a meta tenant is created, resources are reserved for it by default. The resources are deducted from those of the corresponding user tenant.
Micro Block
Data in a macro block is grouped into multiple variable-length data blocks of approximately 16 KB. These blocks are called micro blocks. A micro block contains several rows and is the smallest unit of read I/O operations. Each micro block is compressed based on a specified compression algorithm during construction. Therefore, a macro block actually stores compressed micro blocks. When a micro block is read from the disk, it is decompressed in the background, and the decompressed data is stored in the data block cache. You can specify the micro block size in the statement when you create a table. By default, a micro block is 16 KB in size. The specified micro block size cannot exceed the size of a macro block.
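For example, the micro block size can usually be set through a table option, as in the following hedged sketch; the BLOCK_SIZE option name and the value shown are assumptions, so check the documentation for your version:

CREATE TABLE orders (id INT PRIMARY KEY, payload VARCHAR(100)) BLOCK_SIZE = 16384;  -- 16 KB micro blocks (assumed option name)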
Migration
Migration is a process in which a replica of a partition is migrated from one node to another node. The replica is added to the target node and then deleted from the source node.
Mini SSTable
A mini SSTable is an SSTable at the L0 layer. The SSTables at the L0 layer can be empty based on the parameter settings of different minor compaction strategies. For the L0 layer, server-level parameters are provided to specify the number of sublayers and the maximum number of SSTables allowed per sublayer. The sublayers of the L0 layer are numbered from level-0 to level-n, and the maximum number of SSTables allowed is the same for each sublayer. If the number of SSTables at the level-n sublayer reaches the upper limit, these SSTables are compacted into one SSTable and written to the level-n+1 sublayer. If the number of SSTables at the lowest sublayer of the L0 layer reaches the upper limit, a compaction from L0 to L1 is performed to release the memory space. If the L0 layer exists, the frozen MemTables are compacted to generate a new mini SSTable for the level-0 sublayer of the L0 layer. The multiple SSTables at each sublayer of L0 are sorted by base_version. The versions of SSTables involved in subsequent intra-layer or inter-layer major compactions must be adjacent. In this way, the SSTables are arranged in order by version, which simplifies the operation logic of subsequent reads and major compactions.
The internal sublayers of L0 slow down the compaction to L1 and reduce write amplification, but they cause read amplification. For example, if L0 contains n sublayers with up to m SSTables per sublayer, L0 can contain up to (n × m + 2) SSTables. Therefore, the number of sublayers and the maximum number of SSTables allowed per sublayer must be kept within a reasonable range.
Minimal Downtime Migration
Minimal downtime migration is a strategy to minimize service downtime during data migration. Generally, services must be interrupted for a period during data migration or system upgrade to ensure data integrity and consistency. However, a long service downtime may cause a significant impact on businesses. The minimal downtime migration strategy aims to shorten the service downtime as much as possible to reduce the impact on businesses.
Minor Compaction
The concept of minor compaction in OceanBase Database is similar to that of compaction in other LSM-tree-based databases. During minor compactions, data in MemTables is written to SSTables, and data in multiple SSTables is compacted. OceanBase Database adopts the leveled and size-tiered compaction strategy. A database is divided into three layers. The L1 and L2 layers are at a fixed level, and the L0 layer is size-tiered. In the L0 layer, compactions are performed based on the write amplification factor and the number of SSTables.
Minor Freeze
If the incremental data size of a partition in the memory exceeds a specific threshold, the system freezes the current active MemTable of the partition. The MemTable no longer accepts the write operations of new transactions. The write operations of new transactions are performed in a new active MemTable of the partition.
Minor SSTable
The L1 layer is where minor SSTables are placed. The minor SSTables at the L1 layer are sorted in order by rowkey. When the number of mini SSTables at the L0 layer reaches the threshold, the minor SSTables are involved in the compactions at the L0 layer. L1-layer compactions are scheduled only when the ratio of the total size of mini SSTables at the L0 layer to that of minor SSTables at the L1 layer reaches a specified threshold. Otherwise, compactions are performed only within the L0 layer. This improves the compaction efficiency and reduces the overall write amplification.
Monitoring System
A general monitoring system monitors the resource usage and application status. In resource monitoring, basic cloud resources, such as CPU, memory, and I/O of Elastic Compute Service (ECS) or Relational Database Service (RDS), are monitored. In application monitoring, the health status of applications and the specific metrics of the application system, such as remote procedure calls (RPCs), error count, and PV activeness, are monitored.
Move In
A move-in operation adds a table group attribute to a table without a table group attribute by changing the schema. The database checks the partitioning methods of the table and the target table group. The DDL operation succeeds only if the same partitioning methods are used.
Move Out
Move-out is the reverse operation of move-in. This operation deletes the table group attribute from a table by changing the schema. This is equivalent to removing the table from the table group. If you switch the table group attribute of a table from A to B by changing the schema, the table is removed from A and then added to B.
Multi-level Cache
To improve performance, OceanBase Database supports multi-level caches: the block cache for data micro blocks in queries, the row cache for each SSTable, the fuse row cache for fusion results, and the Bloom filter cache for checking row existence during insertion. All caches in a tenant share memory. When data is written to the MemTable at a very high rate, memory occupied by cached objects can be flexibly reclaimed and scheduled for data writes.
Multi-Paxos
Multi-Paxos is an optimization protocol that runs multiple Paxos instances. OceanBase Database uses the Multi-Paxos protocol to implement multi-IDC persistence of commit logs.
Multi-tenant
In OceanBase Database, each tenant is an instance (similar to a MySQL instance). In an OceanBase database, you can create multiple instances or tenants.
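A hedged sketch of creating a tenant (that is, an instance) follows; the unit specifications, pool name, zone names, and tenant name are hypothetical, and the exact options vary by version:

CREATE RESOURCE UNIT unit1 MAX_CPU 2, MEMORY_SIZE '4G';
CREATE RESOURCE POOL pool1 UNIT = 'unit1', UNIT_NUM = 1, ZONE_LIST = ('z1','z2','z3');
CREATE TENANT t1 RESOURCE_POOL_LIST = ('pool1') SET ob_tcp_invited_nodes = '%';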
Multi-version Concurrency Control
Multi-version concurrency control (MVCC) manages concurrent operations on multiple versions of data.
MySQL Mode
OceanBase Database supports the MySQL-compatible mode, in which you can create MySQL tenants. This mode reduces the costs of business system transformation due to the migration from MySQL databases to OceanBase Database. This mode provides database designers, developers, and administrators a quick start with OceanBase Database with the knowledge and experience they gained from using MySQL databases. OceanBase Database in MySQL-compatible mode is highly compatible with MySQL syntax and uses the same system tables and functions as MySQL databases.
mysqltest
Mysqltest is a test tool used to test whether the database runs as expected. It consists of a group of test cases and related runtime programs.
N
Nop Log
Nop logs are a specific type of commit logs that indicate null operations. Nop logs are generated in the recovery phase of the Multi-Paxos protocol. If a log fails to be persisted on a majority of replicas, a nop log can be written as the content of that log position during the recovery phase.
O
ob_admin
Ob_admin is an O&M tool of OceanBase Database. Ob_admin provides the slog_tool, clog_tool, dumpsst, and dump_backup features for troubleshooting issues such as data inconsistency, data loss, and data error.
ob_error
Ob_error is an error code parsing tool of OceanBase Database. Ob_error can return causes and solutions based on the error code that you entered.
OBAgent
OBAgent is a data monitoring and collection framework. OBAgent supports data pushing and pulling for different application scenarios. By default, OBAgent supports plug-ins for server data collection, OceanBase Database metrics collection, monitoring data label processing, and HTTP service of the Prometheus protocol. To enable data collection from other sources or customize the data processing flow, you only need to develop the corresponding plug-ins.
Object Storage
Object storage is a data storage architecture used to store unstructured data. It divides data into units (or objects) and stores the units in a flat-structured data environment. Each object contains data, as well as the metadata and a unique identifier that can be used to easily access and retrieve the object. You can use object storage together with cloud-based applications that require high scalability and flexibility.
OBServer Host
An OBServer host is a physical server that is used to deploy one or more OBServer nodes.
OBServer Node
An OBServer node, or simply node, is a server where OceanBase Database is deployed, that is, a server that runs the observer process. One or more OBServer nodes can be deployed on one physical server. In OceanBase Database, a node is uniquely identified by its IP address and service port.
observer Process
OceanBase Database is a single-process software. Its process is named observer. Generally, one physical server or virtual machine (VM) runs one observer process that is uniquely identified by an IP address and a port number. This server is called an OBServer node or simply a node. The observer process is the core component of OceanBase Database and provides all database kernel features, such as the SQL engine, storage engine, and transaction engine. The process also provides distributed features such as RPC communication, partition management, and load balancing.
OceanBase Admin Toolkit
OceanBase Admin Toolkit (OAT) is a visual platform that allows you to install and manage products and components in the OceanBase ecosystem. OceanBase ecosystem products include OceanBase Cloud Platform (OCP), OceanBase Development Center (ODC), OB Sharding, and OceanBase Migration Service (OMS). OceanBase components include MetaDB, OBDNS, InfluxDB, and NLB.
OceanBase Call Interface
The OceanBase Call Interface (OBCI) driver is a C-language driver compatible with Oracle Call Interface (OCI). Applications developed based on OCI can use OBCI to access OceanBase Database in Oracle mode.
OceanBase Cloud Platform
OceanBase Cloud Platform (OCP) is an enterprise-grade database management platform for OceanBase. OCP provides full-lifecycle management of components such as OceanBase clusters and tenants, and manages OceanBase resources such as hosts, networks, and software packages. It enables you to manage OceanBase clusters more efficiently and reduces the IT O&M costs for enterprises.
OceanBase Cloud
OceanBase Cloud is built on global public cloud infrastructures such as Alibaba Cloud and AWS. Based on the native distributed database independently developed by Ant Group, OceanBase Cloud provides cost-effective cloud-based database services featuring auto scaling, superior performance, and compatibility with popular database services. OceanBase Cloud provides an end-to-end database service solution that covers monitoring, diagnostics, development, migration, backup, and restore on clouds.
OceanBase Command-Line Client
OceanBase Command-Line Client (OBClient) is a CLI client of OceanBase Database. You can use it to access MySQL and Oracle tenants of OceanBase Database.
OceanBase Connector/C
OceanBase Connector/C is a client development component of OceanBase Database that is based on C/C++. OceanBase Connector/C supports C API libraries. OceanBase Connector/C allows C/C++ applications to access distributed OceanBase clusters from the underlying layer. Then, the applications can perform operations such as database connection, data access, error processing, and prepared statement processing.
OceanBase Connector/J
Java Database Connectivity (JDBC) is a standard API for Java applications to access databases. The database driver adapts the JDBC API to the SQL APIs of the corresponding database providers. As the JDBC driver of OceanBase Database, OceanBase Connector/J is compatible with JDBC 4.0, 4.1, and 4.2. With OceanBase Connector/J, you can access OceanBase Database in both MySQL and Oracle modes.
OceanBase Connector/ODBC
Open Database Connectivity (ODBC) is developed to share data between heterogeneous databases. It has become a main component of the Windows Open System Architecture (WOSA) and a standard for database access interfaces in Windows. ODBC provides a unified interface for accessing heterogeneous databases. It allows applications to access data managed by different database management systems (DBMSs) based on SQL. Therefore, applications can directly manipulate data in databases regardless of the database design. ODBC allows you to access database files on all types of computers, as well as non-database objects such as Excel files and ASCII data files.
OceanBase Database Community Edition
OceanBase Database Community Edition, compatible with MySQL, is an open-source database with an integrated architecture for standalone and distributed modes. It is built on the native distributed architecture and supports enterprise-grade features such as financial-grade high availability, transparent horizontal scaling, distributed transactions, multitenancy, and syntax compatibility. OceanBase Database has been used in large-scale business scenarios and serves customers from various industries. Looking ahead, OceanBase Database is hoping to work with community partners and establish an open and sustainable database ecosystem.
OceanBase Database Enterprise Edition
OceanBase Database Enterprise Edition is a native distributed database independently developed by Ant Group for enterprises. It provides financial-grade high availability by using regular hardware. Its ground-breaking deployment mode of "Five IDCs across Three Regions" creates a new standard for automatic lossless disaster recovery at the region level. It refreshes the TPC-C benchmark, supporting more than 1,500 nodes in one cluster. It is cloud-native, highly consistent, and highly compatible with Oracle and MySQL.
OceanBase Database Proxy
OceanBase Database Proxy (ODP) is a proxy server dedicated for OceanBase Database. OceanBase Database stores replicas of user data on multiple OBServer nodes. When ODP receives an SQL statement from a user, it forwards the statement to the optimal OBServer node and returns the execution result to the user.
OceanBase Database
OceanBase Database is a native distributed database independently developed by Ant Group for enterprises. It provides financial-grade high availability by using regular hardware. Its ground-breaking deployment mode of "Five IDCs across Three Regions" creates a new standard for automatic lossless disaster recovery at the region level. It refreshes the TPC-C benchmark, supporting more than 1,500 nodes in one cluster. It is cloud-native, highly consistent, and highly compatible with Oracle and MySQL.
OceanBase Deployer
OceanBase Deployer (OBD) allows you to install and deploy an OceanBase cluster on the CLI or GUI. OBD standardizes the complex configuration process to simplify cluster deployment.
OceanBase Developer Center
OceanBase Developer Center (ODC) is an enterprise-grade database development platform that is designed for OceanBase Database. ODC is connected to OceanBase Database in MySQL or Oracle mode. It also provides database developers with various features, such as daily development operations, WebSQL-based workspace, SQL diagnostics, session management, and data import and export.
OceanBase Loader and Dumper
OceanBase Loader (OBLOADER) is a client-based import tool that is developed in Java. OBLOADER provides extensive command-line options that allow you to import definitions and data to OceanBase Database in many complex scenarios. We recommend that you use OBLOADER together with OceanBase Dumper (OBDUMPER), but OBLOADER can also import CSV files exported by third-party tools such as Navicat, MyDumper, and SQL Developer. OBLOADER fully uses the distributed features of OceanBase Database. The tool is particularly optimized for import performance and stability. It is also enhanced to provide more operation monitoring information to improve user experience.
OceanBase Migration Assessment
OceanBase Migration Assessment (OMA) is a database compatibility assessment tool provided by OceanBase Database. OMA provides precise compatibility assessment, efficient performance assessment, and application reconstruction suggestions for data migration to OceanBase Database. OMA can assess the compatibility of OceanBase Database with various databases such as Oracle, DB2 LUW, and PostgreSQL, and provide profile analysis and automatic conversion solutions. OMA supports application load replay to help you predict performance risks after migration and provide optimization solutions. OMA can also assess the compatibility with C and Java business code and drivers to help you efficiently migrate data to OceanBase Database at low costs.
OceanBase Migration Service
OceanBase Migration Service (OMS) supports data exchange between a homogeneous or heterogeneous data source and OceanBase Database. OMS provides the capabilities for online migration of existing data and real-time synchronization of incremental data.
OCP Express
OceanBase Cloud Platform (OCP) Express is a web-based management tool for OceanBase Database V4.x. Integrated with OceanBase clusters, OCP Express allows you to view key performance metrics and perform basic database management operations on OceanBase clusters. OCP Express is derived from OCP. It retains the core capabilities of OCP and adjusts the overall layout of features to provide a brand-new user experience. Feature configurations are also rearranged in OCP Express so that OCP Express can be deployed on any database node with the minimum resource consumption. OCP Express allows you to gain extensive control over OceanBase Database V4.x at minimum costs.
OLAP
OLAP stands for online analytical processing.
OLTP
OLTP stands for online transaction processing.
Operator
An operator is the basic unit of an execution plan. In most cases, multiple operators constitute an execution tree to respond to your SQL requests.
Optimizer
The optimizer is the core module that determines your query execution plan. The optimizer generates the optimal execution plan for your query based on the statistics, the built-in rules, and the cost model of OceanBase Database.
Oracle Mode
OceanBase Database supports an Oracle-compatible mode, in which you can create Oracle tenants. This mode reduces the cost of transforming business systems when migrating from Oracle databases to OceanBase Database, and provides database designers, developers, and administrators with a quick start with OceanBase Database based on the knowledge and experience they have gained with Oracle databases. OceanBase Database in Oracle-compatible mode is highly compatible with Oracle syntax and uses the same system tables and functions as Oracle databases.
P
Parallel Compaction
Major compactions are performed in parallel for different data partitions. However, some partitions may contain a large amount of data. Although incremental compaction remarkably reduces the amount of data to be compacted, some frequently updated services may still involve a large amount of data to be compacted. To resolve this issue, OceanBase Database introduces the parallel compaction mode for different partitions. In this mode, data is distributed to different threads for parallel major compaction. The compaction efficiency is significantly improved.
Parallel Execution
Before an execution plan is executed, the plan is split into one or more tasks based on the partitions to be accessed. During execution, the scheduler can run these tasks sequentially or concurrently. Running multiple tasks at the same time is known as parallel execution.
Parallel Query
A parallel query refers to the process of restructuring an execution plan to increase its CPU resources and I/O processing capacity, and thus reduce the system response time to the corresponding query. The parallel query technology applies to both distributed and local execution plans.
Partition
The concept of partition in OceanBase Database is the same as that in Oracle Database. OceanBase Database supports only horizontal partitioning. Each partition of a table stores a part of data. A table can be partitioned by specified partitioning methods such as hash partitioning, range partitioning, and list partitioning based on the mapping relationships between data and partitions. Each partition can be divided into several subpartitions from different dimensions. For example, you can create multiple hash partitions for a transaction table based on user IDs. You can further divide each hash partition into multiple range subpartitions based on the transaction time.
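As an illustration of the example above, the following sketch (MySQL mode) creates a hypothetical trade_order table with hash partitions on the user ID and range subpartitions on the transaction time. The table name, columns, and partition bounds are illustrative, and exact syntax may vary by version.

```sql
-- Hypothetical transaction table: hash-partitioned by user ID,
-- with each hash partition subdivided into range subpartitions by time.
CREATE TABLE trade_order (
  user_id    BIGINT NOT NULL,
  order_id   BIGINT NOT NULL,
  trans_time DATETIME NOT NULL,
  amount     DECIMAL(16, 2),
  PRIMARY KEY (user_id, order_id, trans_time)
)
PARTITION BY HASH(user_id)
SUBPARTITION BY RANGE COLUMNS(trans_time)
SUBPARTITION TEMPLATE (
  SUBPARTITION sp2023 VALUES LESS THAN ('2024-01-01'),
  SUBPARTITION sp2024 VALUES LESS THAN ('2025-01-01'),
  SUBPARTITION spmax  VALUES LESS THAN (MAXVALUE)
)
PARTITIONS 4;
```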
Partition Merge
Merging is the reverse operation of splitting. You can decrease the number of partitions of a table by changing the schema. The system merges multiple partitions based on a new partitioning method. You can perform a merge operation on a table or a table group. If you perform a merge operation on a table group, the partitions of all tables in the table group are merged in the same manner.
Partition Pruning
Partition pruning is an optimization process that prevents a database from accessing irrelevant partitions based on query conditions. Partition pruning can be static or dynamic.
Partition Replica
In a cluster consisting of OBServer nodes, all data is stored based on partitions. Each partition has multiple replicas to ensure high availability.
Partition Scheduling
Partition scheduling is a process in which replicas are migrated between resource units in each zone of a tenant to balance the resource usage of the resource units. Partition scheduling follows these rules:
- Multiple partitions of a partitioned table are evenly distributed across different resource units, so that each resource unit holds the same number of partitions.
- Multiple partitions in one partition group are deployed in one resource unit.
- If each partition group contains the same number of partitions, you can exchange partitions between partition groups to reduce the disk usage of servers.
- You can migrate the partitions of non-partitioned tables to reduce the disk usage of servers.
Partition Split
This operation allows you to adjust the number of partitions of a table by changing the schema. You can change a single-partition table into a multi-partition table or increase the number of partitions of a multi-partition table. This way, data in existing partitions can be reorganized based on new partitions. You can perform a split operation on a table or a table group. If you perform a split operation on a table group, all tables in the table group are split in the same manner.
Partition Table
An OBServer node can divide data in a regular table into different blocks based on specific rules. Data in the same block is physically stored together. Such a table is a partition table.
Partition Key
In each row of a table, a column is used to determine to which partition the row belongs. The collection of such columns is a partition key. Partition tables support partitions and subpartitions. The expression that contains a partition key and determines to which partition a row belongs is a partition expression.
- A non-partitioned table has one partition, whereas a partitioned table has multiple partitions.
- OceanBase Database supports key, hash, list, and range partitioning.
- A partition key supports the following data types: numeric, string, date, timestamp, binary, and ROWID.
Pass-through Mode
Pass-through mode means that during data integration, data is directly transmitted from the source system to the destination system without conversion, cleansing, or processing.
Physical Standby Database
Physical standby database is an important part of the high availability solution of OceanBase Database. If the primary database becomes unavailable due to factors such as the failure of the majority of replicas, a standby database takes over the services. Lossless switchover with a recovery point objective (RPO) of 0 and lossy switchover with an RPO greater than 0 are supported to minimize service downtime.
OceanBase Database has supported tenant-level physical standby databases since V4.1. You can create one or more standby tenants for a primary tenant. You can also offload resource-intensive reporting workloads to the standby tenant to improve overall performance and resource utilization.
Piece
OceanBase Database organizes and manages archived logs by piece. A piece is a complete collection of logs of a tenant within a consecutive period. The range of SCNs of logs in a piece is a left-closed and right-open interval.
Pipeline
A pipeline concatenates multiple data processing operations to implement data conversion, processing, and transmission.
Plan Cache
The plan cache is a per-server cache of execution plans. SQL statement optimization is time-consuming, so to avoid repeated optimization, execution plans are stored in the plan cache and reused the next time the corresponding SQL statements are executed. Each tenant has an independent plan cache on each server that caches the execution plans processed on that server.
Pre-check
Before data migration, a pre-check is performed on data integrity, constraints, correlations, and privileges to ensure security and reliability of the migration operation.
Primary Tenant and Standby Tenant
Tenants are assigned roles since OceanBase Database V4.1. Generally, two roles exist: primary tenant and standby tenant. A primary tenant can provide complete database service capabilities such as queries, DML operations, and DDL operations. A standby tenant only serves as a backup to provide disaster recovery and read-only services. To become a physical hot backup database for a primary tenant, a standby tenant can synchronize and restore redo logs from the configured log restore source. You can configure one or more standby tenants for a primary tenant to implement a tenant-level high availability solution based on physical standby databases.
Primary Zone
A primary zone is a zone in which the leader of a partition is deployed. You can specify a list of zones for a partition. When the leader of the partition needs to be changed, the disaster recovery strategy determines the preferred location of a new leader based on the order of the list.
If you do not specify a primary zone, the system automatically selects one of the full-featured replicas as the leader based on the load balancing strategy.
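A minimal sketch of configuring the primary zone at the tenant level, assuming a hypothetical tenant named my_tenant and zones zone1 through zone3; in the zone list, semicolons separate priority levels and commas separate zones of equal priority. Statements of this kind are typically run from the sys tenant, and syntax may vary by version.

```sql
-- Prefer leaders in zone1; fall back to zone2, then zone3.
ALTER TENANT my_tenant PRIMARY_ZONE = 'zone1;zone2;zone3';

-- zone1 and zone2 share the highest priority; zone3 is lower.
ALTER TENANT my_tenant PRIMARY_ZONE = 'zone1,zone2;zone3';
```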
Progressive Compaction
To reduce the impact of full compaction on the system, a round of compaction merges only a part of macro blocks. All macro blocks can be merged after several rounds of compaction. Progressive compaction reduces the time spent on a single compaction. When the schema of a table changes, such as column addition and compression algorithm change, all rows must be updated during compaction. If the table is large, updating all rows significantly increases the compaction time. In this case, progressive compaction can be used to significantly reduce the time spent on each compaction.
Protection Mode
OceanBase Database V3.x and earlier provide three protection modes: maximum performance, maximum protection, and maximum availability. You can switch between the three modes. Starting from OceanBase Database V4.x, the tenant-level physical standby database solution supports only the maximum performance mode. The maximum protection and maximum availability modes are replaced by arbitration-based disaster recovery capabilities.
For more information about the protection modes, see Maximum Protection, Maximum Availability, and Maximum Performance described in this glossary.
PX Worker
A Parallel eXecution (PX) worker is a worker thread that is used to execute a distributed plan on a server during distributed parallel execution.
Q
Query Coordinator
Query coordinator (QC) is a thread on the primary node that is used to schedule and coordinate the execution of a distributed parallel execution plan.
Query Rewrite
Query rewrite is the process of transforming a user query into an equivalent form so that the optimizer can generate a better execution plan.
R
Read/Write Zone
In a traditional read/write splitting architecture, the write database and the read database are separate database entities that are naturally isolated and kept in sync by data synchronization tools. OceanBase Database supports read/write splitting within a single database entity through read-only replicas and read-only zones. A read/write zone can accept any read or write request.
Read-only Zone
A read-only zone is a special type of zone in which only read-only replicas are deployed. Normally, if a majority of replicas fail, OceanBase Database stops providing services; in this case, however, the read-only zone can continue to serve weak-consistency reads, acting as the read database in a read/write splitting setup. This is one of the read/write splitting solutions provided by OceanBase Database. A read-only zone accepts only login authentication requests, weak-consistency read requests, SELECT requests that do not specify a table name, and the USE DATABASE and SET SESSION VARIABLES statements.
The read-only zone feature is no longer supported since OceanBase Database V2.2.3.
Rebuild
Rebuild refers to the process of catching up when a replica falls behind and cannot fetch the missing logs (because the source logs have been recycled). It involves copying the baseline data to catch up with the current state.
Reconfirm
Reconfirm is a step in leader election under the Multi-Paxos protocol. After a leader is elected for a partition, the logs persisted on a majority of the partition's replicas must be reconfirmed, and the reconfirmed logs must be synchronized to all healthy replicas. The reconfirm operation succeeds only after all of these logs are persisted on a majority of replicas.
Redo Log Index
A redo log index is a data structure that records the position of each operation log (commit log) in the log file. In the operation log file, logs are stored in the order they were written rather than in order of log ID. The redo log index is used to quickly locate a specific operation log: its records are sorted by log ID, so the position of a given log can be found quickly through the index, which in turn makes it possible to read the operation logs sequentially by log ID.
Re-election with Leader
If a leader exists for a partition, the replica of a partition on a specified OBServer node is re-elected as the leader. This process is known as re-election with leader. You can initiate this process without waiting for the lease of the original leader to expire.
Region
A region is a geographic region or city that contains one or more zones. Different regions are usually far away from each other. In OceanBase Database, you can distribute the same set of data in different replicas across multiple regions to ensure high availability and fault tolerance. Each region indicates the geographic location of a physical data center. Different regions are fully isolated. This ensures high stability and fault tolerance of the system in different regions.
Region/IDC
Each OBServer node belongs to a region and an Internet data center (IDC). A region is the geographic location of the OceanBase cluster. In most cases, a region represents a city. An IDC is a data center of an OceanBase cluster. An OceanBase cluster can span several regions, each region can contain multiple IDCs, and an IDC can host multiple OBServer nodes. Based on the deployment modes of OceanBase clusters in regions and IDCs, the following types of location relationships exist between OceanBase clients and OBServer nodes, or between OBServer nodes: same IDC and same region, different IDCs and same region, and different regions. Priorities of the three location relationships decrease in sequence.
Remote Execution
Remote execution refers to a scenario in which the database server that receives user requests and generates execution plans is different from the database server that actually executes the plan. In this case, only one database server executes the plan.
Remote Plan
When an execution plan involves only a single table or a single partition of a partitioned table, and the table or partition is in a node other than the current one, the plan is defined as a "remote plan".
Replica
Each partition is stored as multiple physical copies to ensure data security and high availability of data services. Each copy is called a replica of the partition. Each replica contains three major types of data: static data stored in the SSTable on the disk, incremental data stored in the MemTable in the memory, and logs that record transactions. Several replica types are available depending on the types of data stored. This is to support the different business preferences in terms of data security, performance scalability, availability, and costs.
- Full-featured replica: a regular replica that contains all data and provides full features, including transaction logs, a MemTable, and an SSTable. A full-featured replica can quickly switch to the leader role to provide services.
- Log replica: a replica that contains only logs and has no MemTable or SSTable. It provides log services for external applications and participates in log voting. It can help other replicas recover, but it cannot become the leader to provide database services.
- Read-only replica: a replica that contains complete logs, a MemTable, and an SSTable. However, its logs are handled specially: it does not participate in log voting as a member of the Paxos group. Instead, it works as an observer that synchronizes logs from the Paxos group members and replays them locally. If an application does not require strong consistency for reads, this type of replica can provide read-only services. Because read-only replicas are not part of the Paxos group, they do not increase transaction commit latency, as the voting membership is not expanded.
The following table describes the types of replicas that are supported by OceanBase Database.
| Type | Log | MemTable | SSTable | Data security | Time for regaining leadership | Resource cost | Service | Name (abbreviation) |
|---|---|---|---|---|---|---|---|---|
| Full-featured replica | It has logs and participates in voting (SYNC_CLOG). | Yes (WITH_MEMSTORE) | Yes (WITH_SSSTORE) | High | Short | High | Provides data read and write services as the leader and non-consistent read services as a follower. | FULL (F) |
| Log replica | It has logs and participates in voting (SYNC_CLOG). | No (WITHOUT_MEMSTORE) | No (WITHOUT_SSSTORE) | Low | Not supported | Low | Not readable or writable. | LOGONLY (L) |
| Read-only replica | It has asynchronous logs. It is only a listener instead of a member of the Paxos group (ASYNC_CLOG). | Yes (WITH_MEMSTORE) | Yes (WITH_SSSTORE) | Medium | Not supported | High | Supports non-consistent read. | READONLY (R) |
The system automatically distributes the replicas across multiple servers based on the system load and specific rules. You can migrate, replicate, create, and delete replicas. You can also convert the types of the replicas.
Notice
At present, OceanBase Database V4.0.0 and V4.1.0 support only full-featured replicas.
Replicated Table
Specific applications may frequently access small tables that are infrequently updated. If you want to read the most recent data and ensure data consistency, the applications must access the leaders in strong-consistency read mode. However, the leaders may cause performance bottlenecks due to high access frequency. To broadcast a small table, OceanBase Database provides the replicated table feature that copies the replicas of the small table to all OBServer nodes in the tenant to which the table belongs. The table is a replicated table, and the replicas are replicated replicas. When an update transaction for a replicated table is committed, data is synchronized to all full-featured replicas and replicated replicas of the table. This way, you can read the data that is modified by the transaction on a specified OBServer node in the tenant after the transaction is committed.
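A minimal sketch of creating a replicated table, assuming MySQL mode and the DUPLICATE_SCOPE table option; the table and its columns are hypothetical, and the exact option name and values may differ by version.

```sql
-- A small, rarely updated dictionary table whose replicas are
-- copied to every OBServer node in the tenant.
CREATE TABLE currency_rate (
  currency VARCHAR(8) PRIMARY KEY,
  rate     DECIMAL(12, 6)
) DUPLICATE_SCOPE = 'cluster';
```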
Replication
Replication is a technology that can replicate the write operations in the database to multiple followers to improve the read performance and availability of the system.
Resource Group
A resource group is a collection of sessions that are grouped based on resource requirements. The system allocates resources to a resource group rather than to individual sessions.
Resource Management Plan
A resource management plan is a container of resource management plan configs and specifies how resources are allocated to resource groups. You can activate a specific resource management plan to control the allocation of resources. A resource management plan can contain multiple resource management plan configs. However, a resource management plan must not contain two identical resource management plan configs.
Resource Management Plan Config
A resource management plan config associates a resource group with a resource management plan and specifies how resources are allocated to this resource group.
Resource Pool
A tenant has several resource pools, which contain all resources available to the tenant. A resource pool consists of several resource units with the same specification (unit config). A resource pool belongs to only one tenant. A resource unit describes a group of computing and storage resources on a server. You can take it as a lightweight virtual machine with CPU, memory, and disk resources.
A tenant has at most one resource unit on the same server. Replicas are stored in resource units, which means that the resource units are containers of replicas.
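The following sketch shows how a unit config, a resource pool, and a tenant fit together, assuming sys-tenant privileges; the object names and resource specifications are hypothetical, and option names may vary by version.

```sql
-- 1. Define a unit config (the specification of one resource unit).
CREATE RESOURCE UNIT small_unit MAX_CPU 4, MEMORY_SIZE '8G';

-- 2. Create a resource pool: one unit of this spec in each listed zone.
CREATE RESOURCE POOL pool_a
  UNIT 'small_unit',
  UNIT_NUM 1,
  ZONE_LIST ('zone1', 'zone2', 'zone3');

-- 3. Create a tenant that owns the resource pool.
CREATE TENANT tenant_a
  RESOURCE_POOL_LIST = ('pool_a')
  SET ob_tcp_invited_nodes = '%';
```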
Resource Unit
A resource unit, or unit for short, is a container of the resources, such as CPU and memory resources, for a tenant on a node. A tenant has at most one unit on a node.
Restore MetaDB
The restore MetaDB stores four tables that are used for restore tasks: oceanbase_restore, base_data_restore, inc_data_restore, and oceanbase_restore_history. In most cases, the backup MetaDB and restore MetaDB are deployed in the same database.
This term is deprecated in OceanBase Database V4.x.
Reverse Incremental
Reverse incremental refers to a data transmission approach that, during data migration, synchronizes changes made in the destination back to the source data source.
Root Service
Root Service (RS) of an OceanBase cluster runs on an OBServer node. When the OBServer node where Root Service resides fails, a new Root Service node is elected. Root Service provides resource management, disaster recovery, load balancing, and schema management.
Rotating Major Compaction
To reduce the impact of compaction on your business, the system merges zones in a specified order. A zone that is being merged cannot provide services.
Row
A row consists of several columns. Certain columns can constitute a rowkey, and the entire table is stored in rowkey order. Some tables, such as the result sets of SELECT statements, may not contain rowkeys.
Row Cache
OceanBase Database caches data rows for each SSTable. During a Get or MultiGet query, the queried rows can be placed in the row cache to avoid locating the same rows again by binary search.
Row Compact
Row compact is a process in which multiple versions of incremental row data are combined into a single row.
Row Merge
Row merge is a process in which baseline row data and incremental row data are merged into a new version of row data.
Rowid
A rowid is stored in a baseline data row of a local partitioned index. It indicates the location of row data in the primary table and helps you quickly locate the corresponding row in the primary table.
Rowkey
Like a primary key in traditional relational databases, a rowkey is a unique identifier of each row of data in a table. Data in the table is sorted by rowkey.
RPC
RPC stands for remote procedure call.
RS List
A Root Service (RS) list records the IP addresses of the servers that run Root Service in an OceanBase cluster. In most cases, each zone has an RS list.
RS lists are involved in the creation of clusters and are closely related to the sys tenant.
OceanBase Database supports the following types of RS lists: RS lists that are directly obtained from the config server and RS lists that are updated based on the server list.
S
Savepoint
In OceanBase Database, a savepoint is a user-defined execution mark in a transaction. You can define multiple savepoints in a transaction. In this way, the transaction can be rolled back to a specified savepoint when necessary.
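A minimal usage sketch with a hypothetical accounts table: the transaction rolls back only the work done after the savepoint and then commits the rest.

```sql
BEGIN;
INSERT INTO accounts (id, balance) VALUES (1, 100);
SAVEPOINT before_adjust;
UPDATE accounts SET balance = balance - 50 WHERE id = 1;
ROLLBACK TO SAVEPOINT before_adjust;  -- undoes only the UPDATE
COMMIT;                               -- the INSERT is still committed
```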
Schema
In most cases, a schema indicates a specific database object such as a table, a view, and an index. On an OBServer node, a schema is a collection of database objects.
Schema Refresh
In OceanBase Database, all changes to system objects caused by DDL operations occur on Root Service. After a DDL operation is performed, Root Service notifies each node in the cluster of the latest schema version. Each node compares the received schema version with the schema version stored in its local cache. If the local schema version lags behind the global version, the node queries the system tables for the changes and updates the system object information in its local cache. This process is called a schema refresh. Each node also compares schema versions in the background at a specified interval and automatically triggers a schema refresh if the local schema version falls behind.
Schema Version
An OBServer node maintains a global schema version. The global schema version increases each time a database object is modified. Each database object also has a version number, which is the global schema version that corresponds to the last modification (such as creation and change) of the database object.
Secondary Index
A secondary index is an auxiliary data structure that is used to access data tables. Compared with a primary key, a secondary index contains a set of key values that are explicitly or implicitly specified by users. In OceanBase Database, secondary indexes are implemented as data tables that are associated with the primary table.
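A minimal sketch with a hypothetical orders table: the secondary index on user_id gives queries that filter on that column an access path other than the primary key.

```sql
CREATE TABLE orders (
  order_id   BIGINT PRIMARY KEY,
  user_id    BIGINT,
  created_at DATETIME
);

CREATE INDEX idx_orders_user_id ON orders (user_id);

-- This query can use the secondary index instead of scanning by primary key.
SELECT order_id FROM orders WHERE user_id = 42;
```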
Server List
A server list records the IP addresses of all servers in an OceanBase cluster.
Silent Data Corruption
Silent data corruption means that the storage system provides corrupted data to applications, but no alerts are triggered. For example, bad blocks appear in disks due to medium damage. In this case, when an application reads a bad block, the application reads the wrong data.
Slow Query
A slow query is an SQL query that is not completed in a specified period.
Software as a Service
Software as a service (SaaS) is a software application mode in which software services are provided based on the Internet. In OceanBase Database, SaaS refers to the application mode in which third-party software provides software services for customers in public clouds.
Sorted Strings Table
A sorted string table (SSTable) stores baseline data or minor compaction data. It stores row data in order.
Source Connection Profile
A source connection profile is a set of configuration information and parameters for establishing a connection with the data source. The configuration information includes the data source type, host name, port number, username, password, and database name.
SQL Diagnoser
SQL Diagnoser is an agile SQL diagnostics tool that can directly analyze business clusters to identify common suspicious SQL statements and hidden performance problems.
SQL Plan Management
SQL Plan Management (SPM) is a plan evolution mechanism. With the evolution mechanism, the optimizer verifies if any new plan will cause a performance regression. In the case of a performance regression, the plan is rejected.
SSTable Log
SSTable logs (slogs) are used to maintain the consistency of baseline data on a node.
Standalone Edition
The standalone edition of OceanBase Database is built on the integrated architecture for standalone and distributed modes.
Statistics
Statistics are a collection of data that describes the information about tables and columns in a database. OceanBase Database supports table-level statistics and column-level statistics.
Stored Procedure
A stored procedure is a programming method that is provided by the server.
Strong Read Consistency/Weak Read Consistency
Strong read consistency is a default SQL execution method in OceanBase Database. In a strong read consistency operation, the SQL statement must be forwarded to the OBServer node where the leader of the involved partition is located. This method allows you to obtain the latest data in real time. Weak read consistency is the opposite of strong read consistency. In a weak read consistency operation, the SQL statement is forwarded to an OBServer node in which a replica of the involved partition is located, regardless of whether the replica is a leader. You can use one of the following methods to set the read consistency level to weak: Execute the SELECT statement with the read_consistency(weak) hint, or set the ob_read_consistency variable of the current session to weak.
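The two methods mentioned above, sketched against a hypothetical orders table:

```sql
-- Per-statement weak read via the hint.
SELECT /*+ READ_CONSISTENCY(WEAK) */ * FROM orders WHERE user_id = 42;

-- Session-level weak read via the system variable.
SET @@session.ob_read_consistency = 'weak';
SELECT * FROM orders WHERE user_id = 42;
```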
Sub Query Coordinator
In a distributed parallel execution plan, each participating server runs a sub query coordinator (SQC). The SQC receives scheduling instructions from the QC, obtains local worker threads, generates the granules of local tasks, and coordinates local execution.
Sub plan
A sub plan is equivalent to a DFO in distributed parallel execution.
Switchover
The primary and standby databases perform a switchover to change their roles. If the primary database cannot provide services due to a failure or interruption, you can perform a switchover to switch services to the physical standby database. This process is reversible.
sys Tenant
A sys tenant is automatically created when you create an OceanBase cluster. Its lifecycle is consistent with that of the cluster. It manages the lifecycles of the cluster and all user tenants in the cluster. A sys tenant has only one log stream with the ID 1, supports only single-point writes, and does not support scaling. You can create user tables in a sys tenant. All user tables and system tables are served by the No.1 log stream. The data of the sys tenant is the private data of the cluster. The physical synchronization and physical backup and recovery of the sys tenant data between the primary and standby clusters are not supported.
Application systems access OceanBase Database through the sys tenant. The client parses the configuration file of the application system and obtains the IP address list of the sys tenant from the Config Server. The client then accesses the sys tenant to obtain metadata and connects to the target tenant. The stability of the sys tenant is constrained by its capacity: when many application systems restart at the same time, the resulting surge of connection requests can exhaust the worker threads of the sys tenant and cause connection failures. The sys tenant does not support horizontal scaling; you can only scale it vertically or adjust cluster parameters.
System Parameter
OceanBase Database provides cluster-level and tenant-level parameters. You can set these parameters to control the load balancing, major compaction time, major compaction mode, resource allocation, and module switches of the entire cluster.
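A brief sketch of viewing and changing a cluster-level parameter; enable_rebalance is used here only as an example, and the available parameters and accepted values depend on your version.

```sql
-- View the current value of a parameter.
SHOW PARAMETERS LIKE 'enable_rebalance';

-- Change it for the cluster (typically run from the sys tenant).
ALTER SYSTEM SET enable_rebalance = true;
```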
System Variable
You can set system variables to ensure that the behaviors of OceanBase Database meet your business requirements. The system variables of OceanBase Database can be categorized into global variables and session-level variables.
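A brief sketch of the two scopes, using ob_query_timeout (in microseconds) as an example variable; the values shown are illustrative.

```sql
-- Affects new sessions of the current tenant.
SET GLOBAL ob_query_timeout = 10000000;

-- Affects only the current session.
SET SESSION ob_query_timeout = 30000000;

SHOW VARIABLES LIKE 'ob_query_timeout';
```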
T
Table Group
If a collection of tables is frequently accessed at the same time, you can store the same type of replicas of these tables on the same OBServer node to optimize query performance. To achieve this purpose, you can define a table group and add the set of tables to the table group. A table group contains multiple tables that have the same number of partitions and follow the same partitioning rules. If each table in a table group has N partitions, the ith partitions of all the tables constitute a partition group. Leaders of partitions in the same partition group are located on the same OBServer node.
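A minimal sketch, assuming MySQL mode and hypothetical table names: two tables with identical partitioning are placed in the same table group so that their corresponding partitions form partition groups and are co-located. Exact table group syntax may vary by version.

```sql
CREATE TABLEGROUP tg_trade;

CREATE TABLE orders_tg (
  user_id  BIGINT,
  order_id BIGINT,
  PRIMARY KEY (user_id, order_id)
) TABLEGROUP = tg_trade
  PARTITION BY HASH(user_id) PARTITIONS 4;

CREATE TABLE payments_tg (
  user_id    BIGINT,
  payment_id BIGINT,
  PRIMARY KEY (user_id, payment_id)
) TABLEGROUP = tg_trade
  PARTITION BY HASH(user_id) PARTITIONS 4;
```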
TableAPI
TableAPI is a set of APIs provided by OceanBase Database for reading and writing table model data. OceanBase Database defines a group of general protocols for interaction between the TableAPI client and the database server. You can use the SDK provided by TableAPI to read and write data in relational tables of OceanBase Database. TableAPI directly accesses the storage layer and transaction layer of OceanBase Database to provide efficient data reads and writes.
Tablespace
OceanBase Database provides data encryption methods that are compatible with Oracle databases and encrypts data in tablespaces. OceanBase Database does not support multiple files, and the concept of tablespaces is designed for compatibility. A tablespace is a collection of tables.
Table
A table is a basic unit for data storage in OceanBase Database. Each table consists of several rows of records, and each row has the same pre-defined columns. You can use SQL statements to create, retrieve, update, and delete (CRUD) data in a table. Generally, several columns of a table make up a primary key, which is unique among the datasets of the table.
Tablet
OceanBase Database V4.0.0 introduces the concept of tablet to represent actual data storage objects. A tablet can store data and can be transferred among servers. It is the smallest unit in data balancing. Each tablet corresponds to a partition. A single-partition table will have one tablet, and a multi-partition table will have one tablet for each partition. Each partition of an index table also corresponds to a tablet, including local and global index tables. Specifically, the tablet of a local index table is forcibly bound to the tablet of the primary table to ensure storage on a single machine.
Tenant
OceanBase Database achieves resource isolation by using tenants and supports multiple tenants in a single cluster. A tenant in an OceanBase cluster is equivalent to a MySQL or Oracle instance. OceanBase Database isolates resources and data among tenants. Each tenant owns a group of computing and storage resources and independently provides a complete set of database services.
OceanBase Database supports system tenants and user tenants. A system tenant stores internal metadata managed by OceanBase Database. A user tenant stores user data and database metadata.
Tenant Instance
A tenant instance provides OceanBase Cloud services in tenant mode. At present, only the MySQL mode is supported.
TPC-C
TPC-C is one of the performance benchmarks proposed by the Transaction Processing Performance Council (TPC). TPC-C is an important performance benchmark for databases.
TPC-H
TPC-H is a commonly used benchmark that measures the analysis and decision support capabilities of database systems by using a series of complex queries on massive amounts of data.
U
Unit Config
A unit config is a configuration file that specifies the specifications of the computing and storage resources (including memory, CPU, and I/O resources) required for a resource unit.
Unit Group
OceanBase Database V4.0 and later versions require that all zones of the same tenant have the same number of units. The system assigns an ID to each unit in each zone. Units with the same ID form a unit group. Unit groups have the following characteristics:
- Each unit group has a unique ID, which you can query from the UNIT_GROUP_ID field in the oceanbase.DBA_OB_UNITS view (see the query sketch at the end of this entry).
- A log stream belongs to only one unit group and is distributed only on the units in that unit group. Therefore, the same data partitions are distributed across all units in a unit group based on log streams, which defines a group of data. In this case, all zones must have equivalent service capabilities.
OceanBase Database V4.0 allows you to scale resources only by unit group, but does not support configuration of the number of units for a tenant by zone. For example, if you want to scale out resources for a tenant, you can only increase the number of units for all zones in a unified manner. Correspondingly, if you want to scale in resources for a tenant, you can delete units only by unit group. The unit group mechanism ensures homogeneous data distribution in different zones.
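A query sketch for the view mentioned above; only a few columns are selected for illustration, and the exact column set depends on your version.

```sql
-- Which unit group each resource unit belongs to (run as the sys tenant).
SELECT UNIT_ID, UNIT_GROUP_ID, TENANT_ID, ZONE
FROM oceanbase.DBA_OB_UNITS;
```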
Unit Number
Unit number refers to the number of resource units.
Unit Scheduling
In each zone, resource units are dynamically scheduled to achieve load balancing. Unit scheduling follows these rules:
- Multiple resource units that belong to one tenant are evenly distributed across different servers.
- Multiple resource units that belong to one tenant group are evenly distributed across different servers whenever possible.
- If the overall disk usage of the servers in a zone exceeds a specific threshold, you can exchange or migrate resource units to reduce disk usage.
- You can also exchange or migrate resource units based on CPU and memory specifications to reduce the average CPU utilization and memory usage.
Universe
Universe indicates all OceanBase databases deployed across regions.
User Tenant
A user tenant is the opposite of a sys tenant. A user tenant is a tenant created by a user. It provides complete database features and supports two compatibility modes: MySQL and Oracle. A user tenant can distribute its service capabilities on multiple servers and supports dynamic scaling. Log streams are automatically created and deleted based on user configurations. The data of a user tenant, such as the schema data, user table data, and transaction data, requires stronger data protection and higher availability. Physical synchronization and physical backup and restore of user tenant data across clusters are supported.
V
Version Upgrade
Version upgrade refers to the upgrade of the OceanBase Database software version and should be distinguished from the concept of "arbitration upgrade".
Virtual Private Cloud
A virtual private cloud (VPC) is a user-created private network. Different VPCs are logically isolated. You can create and manage cloud resources in a VPC created by yourself.
vSwitch
A vSwitch is a basic network device that connects different cloud resources in a VPC. When you create a cloud resource in a VPC, you must specify the vSwitch to which the cloud resource is connected.
W
Worker Thread
A worker thread is a thread that is used to process tenant requests in OceanBase Database. Worker threads that belong to the same tenant share a task queue to process tenant requests.
X
XA Transaction
The eXtended Architecture (XA) standard is a specification released in 1991 by X/Open. XA ensures the atomicity of global transactions in heterogeneous systems.
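A minimal sketch of one transaction branch in MySQL-compatible XA syntax; the transaction ID and the accounts table are hypothetical.

```sql
XA START 'xid_1';
UPDATE accounts SET balance = balance - 50 WHERE id = 1;
XA END 'xid_1';
XA PREPARE 'xid_1';   -- phase one: the branch is prepared
XA COMMIT 'xid_1';    -- phase two: commit after all branches are prepared
```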
Z
Zone
In OceanBase Database, a zone is a logical concept that usually corresponds to an IDC or a physical region. A zone generally contains multiple storage nodes that are physically distributed across different IDCs, racks, or servers. A zone can contain multiple IDCs, but an IDC belongs to only one zone.
In OceanBase Database, zones are used to implement data redundancy across IDCs. OceanBase distributes data to different zones based on specific rules to implement data redundancy. When one zone fails, the system automatically switches to the standby zone to ensure data availability.
In addition to data redundancy, OceanBase Database also uses zones as containers for data shards. Data sharding is a technology that divides data into multiple shards and stores the shards on different nodes. This improves the system throughput and performance. In OceanBase Database, different zones serve as the primary zones of different data shards to implement distributed data storage and processing.