Appendix: Basic concepts of OceanBase Database

Last Updated: 2023-10-24 09:23:04

A

access path

An access path is a method that is used to access a database table. In most cases, data is queried based on a primary key index or a secondary index.

active MemTable

An active MemTable is a MemTable that accepts writes of incremental data. An active MemTable is the opposite of a frozen MemTable.

adaptive cursor sharing

Adaptive cursor sharing is a mechanism that allows the optimizer to store multiple execution plans for each parameterized SQL statement and select an appropriate plan based on the selectivity of predicates in an SQL statement.

AgentServer/agentrestore.jar

AgentServer is a backup tool and a resident process. AgentServer queries whether the base_data_backup table in the MetaDB contains backup tasks at a specified interval. Then, AgentServer can control the initiation and cancellation of backup for baseline and incremental data. AgentServer updates the status of the four tables that are used for backup tasks as tasks progress. agentrestore.jar is a restoration tool and a resident process. It is a JAR package that is written in Java. agentrestore.jar queries the control tables in the MetaDB at a specified interval and controls the initiation of all restoration tasks. agentrestore.jar updates the status of the four tables that are used for restoration tasks as tasks progress.

availability zone (zone)

Zone is short for availability zone. An OceanBase cluster spans several zones. Generally, a zone consists of several servers in an IDC. Multiple replicas of data are distributed across different zones to ensure data security and high availability. This design ensures that the failure of a single zone does not affect the database service.

B

backup MetaDB/restore MetaDB

The backup MetaDB stores one parameter table (backup_base_profile) and four tables that are used for backup tasks: base_data_backup, base_data_backup_task, base_data_backup_task_history, and inc_data_backup. The restore MetaDB stores four tables that are used for restoration tasks: oceanbase_restore, base_data_restore, inc_data_restore, and oceanbase_restore_history. In most cases, the backup MetaDB and restore MetaDB are deployed in the same database.

baseline data

Baseline data is read-only ordered data that is generated in major compactions and is stored on persistent media.

baseline data version

A baseline data version is a version of baseline data.

bloom filter cache

The bloom filter cache quickly determines whether a row exists in baseline data or minor compaction data. If the row does not exist, disk I/O and CPU consumption can be reduced.
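
The existence check described above can be illustrated with a minimal Bloom filter sketch. This is not OceanBase code: the class, the bit-array size, the hash scheme, and the read_from_disk callback are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def read_row(row_key, bloom, read_from_disk):
    """Skip the disk read when the filter proves that the row is absent."""
    if not bloom.might_contain(row_key):
        return None
    return read_from_disk(row_key)
```

A negative answer from the filter is always correct, so the disk read can be skipped safely. A positive answer may be a false positive, so the row must still be looked up.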

C

CLOG

Commit logs (CLOGs) are operation logs that record modifications to database objects. The atomicity and durability of transactions are ensured by using the write-ahead logging (WAL) protocol. When a transaction is executed in a partition, one or more CLOG entries are generated and copied to other replicas of the partition by using the Multi-Paxos protocol. The complete CLOG sequence of a partition is the complete modification history of the partition.

copy table (OceanBase Database V2.x)

Specific applications may frequently access small tables that are infrequently updated. If you want to read the most recent data and ensure data consistency, the applications must access the leaders in strong-consistency read mode. However, the leaders may cause performance bottlenecks due to high access frequency. To broadcast a small table, OceanBase Database V2.x provides the copy table feature to copy the replicas of the small table to all OBServers under the tenant to which the table belongs. The table is a copy table, and the replicas are copy replicas. When an update transaction for a copy table is committed, data is synchronized to all full-featured replicas and copy replicas of the table. This way, you can read the data that is modified by the transaction on a specified OBServer under the tenant after the transaction is committed.

D

database

A database is a repository that organizes, stores, and manages data by using data structures. A database contains tables, indexes, and metadata of database objects.

data skew

Data skew means that one or more values frequently appear in the data and account for a large proportion of the data. In distributed execution, data skew causes long tails, and execution threads assigned to these values take more time in execution.
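
A small sketch makes the long-tail effect concrete. It assumes integer keys and a modulo-based hash distribution across workers; the function name and the sample data are hypothetical.

```python
from collections import Counter


def rows_per_worker(join_keys, num_workers):
    """Count how many rows each worker receives under hash distribution."""
    load = Counter({worker: 0 for worker in range(num_workers)})
    for key in join_keys:
        load[key % num_workers] += 1  # integer keys: modulo stands in for the hash
    return dict(load)


# 90 rows share the key 7; the other 10 rows have the distinct keys 0..9.
skewed_keys = [7] * 90 + list(range(10))
load = rows_per_worker(skewed_keys, 4)
```

With this input, the worker that owns key 7 receives far more rows than the others and determines the total execution time.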

DFO

A data flow object (DFO), also known as a sub-plan, is a collection of operators that require pipelined execution in a distributed parallel execution plan.

DIO

Direct input/output (DIO) is an I/O mode in which data is read from and written to a disk directly, without passing through the file cache of the operating system.

distributed execution

When an execution plan is executed in a distributed manner, the plan is executed on multiple OBServers, each of which contributes to the execution process.

DTL

The data transfer layer (DTL) is a network transmission framework that you can use to transmit data between execution threads in a distributed parallel execution framework.

E

election without leader

If no leader exists for a partition, a leader is elected from multiple replicas of the partition. This process is known as election without leader. This process is triggered when a leader needs to be elected for a partition after the cluster is restarted or when the original leader of the partition fails. If a partition already has a leader, you can initiate this process for the partition only after the lease of the original leader expires.

execution plan

An execution plan is a collection of physical code for executing SQL requests in a database. An execution plan is generally an execution tree that consists of operators.

execution plan binding

Execution plan binding is a process in which execution plans for SQL statements are specified by using outlines without the need to use an optimizer. Execution plan binding is suitable for scenarios in which an execution plan generated by the optimizer is invalid or inefficient.

execution plan matching

Execution plan matching is a process in which a database selects an appropriate execution plan from the plan cache to execute your SQL statement.

F

fast parameterization

Fast parameterization is a process in which real parameters are extracted from SQL statements at a high speed. This process is exclusive to OceanBase Database. Based on inherent characteristics of the plan cache of OceanBase Database, fast parameterization prevents semantic analysis during the re-execution of an input SQL statement by adding constraints. This improves plan matching efficiency.
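
The following sketch shows the general idea of extracting literals from an SQL statement and using the resulting template as the plan cache key. It is a simplification: the regular expression, the parameterize function, and the PlanCache class are hypothetical, and the constraint checks that OceanBase Database adds are not modeled.

```python
import re

# Matches quoted strings or standalone numbers (illustrative; a real lexer is richer).
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def parameterize(sql):
    """Replace literals with '?' and return (template, extracted parameters)."""
    params = []

    def _swap(match):
        params.append(match.group(0))
        return "?"

    return _LITERAL.sub(_swap, sql), params


class PlanCache:
    def __init__(self, optimize):
        self._optimize = optimize  # expensive step: builds a plan from a template
        self._plans = {}
        self.hits = 0

    def get_plan(self, sql):
        template, params = parameterize(sql)
        if template in self._plans:
            self.hits += 1  # matched an existing plan; optimization is skipped
        else:
            self._plans[template] = self._optimize(template)
        return self._plans[template], params
```

Two statements that differ only in their literal values map to the same template, so the second one reuses the cached plan.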

frozen MemTable

If an active MemTable reaches a specific memory threshold, the system freezes the active MemTable. Incremental data cannot be written to frozen MemTables.

frozen version

A frozen version is the version number that is used when a freezing operation is performed.

full compaction

In full compaction, all macroblocks of partitions are rebuilt regardless of whether the macroblocks are modified. You can specify whether to enable full compaction for a table. If full compaction is enabled for a table, the system initiates full compaction when the schema changes (such as column addition and deletion) or the storage properties change (such as the modification of the compression level). During the full compaction of a table, progressive compaction is enabled to reduce the time spent on a single compaction.

G

granule

A granule is the minimum task granularity that is used to scan tables and indexes in a distributed parallel execution plan. A granule can be a partition or a query range.

global index

A global index is a cross-partition data index in OceanBase Database. A global index allows you to quickly locate the partition to which the specified data belongs based on the global key values and primary keys of stored data.

global consistent snapshot (OceanBase Database V2.x)

If global consistent snapshots do not exist, distributed databases cannot support cross-node consistency read or ensure causal sequences. In OceanBase Database V1.4.x, application system designers and developers must ensure that multiple tables and partitions accessed in one SQL statement are located on one OceanBase Database node. In business systems that rely on operation sequences, OceanBase Database V1.4.x cannot ensure that two sequential transactions can respectively modify two tables on two nodes. To resolve these issues, OceanBase Database V2.0 provides the global consistent snapshot feature. Compared with atomic clock-based Google TrueTime, the Global Timestamp Service (GTS) of OceanBase Database is fully dependent on software. GTS does not rely on specific hardware devices or set additional requirements for the deployment environments of customers. This allows OceanBase Database to serve more Apsara Stack customers. After GTS is enabled, OceanBase Database V2.x can support cross-node reads/writes and causal sequences in the same way as a standalone database.

group commit

Group commit is a process in which the logs of multiple transactions are persistently stored to a disk in a single I/O operation. Group commit helps improve the log writing efficiency.
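
A minimal sketch of the batching idea, assuming a write_to_disk callback in which one call stands for one I/O operation. The class and method names are hypothetical and do not reflect the actual log writer.

```python
class GroupCommitLog:
    """Buffer log records from many transactions and persist them in one write."""

    def __init__(self, write_to_disk):
        self._write_to_disk = write_to_disk  # one call represents one I/O operation
        self._pending = []

    def append(self, txn_id, record):
        self._pending.append((txn_id, record))

    def flush(self):
        """Persist all pending records with a single I/O; return the committed txn ids."""
        if not self._pending:
            return []
        batch, self._pending = self._pending, []
        self._write_to_disk(batch)
        return [txn_id for txn_id, _ in batch]
```

Three transactions that append before a flush share one write, so the per-transaction I/O cost drops as the batch grows.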

H

hint

A hint is a user-defined primitive that is used to specify the optimizer behavior in a database.

I

incremental compaction

During incremental compaction, the system regenerates only macroblocks that contain modified rows.

incremental data

Incremental data, including the data in the MemTable and the minor compaction data, is the data modified by the INSERT, UPDATE, and DELETE operations. Such data is not merged with the baseline data.

J

join order

A join order specifies the sequence that is used to join multiple tables in an execution plan.

L

LDC

A logical data center (LDC) is a logical division of IDCs. If OceanBase Database is deployed in multiple IDCs across regions, all OBServers are grouped by region and zone. In this case, client routing of OceanBase Database and remote procedure call (RPC) routing within OceanBase Database are known as LDC routing.

leader/follower

The concepts of leader and follower define the roles of table partition replicas at a specific point in time. In OceanBase Database, each partition has at least three replicas. Redo logs are synchronized from the leader to followers based on the Paxos protocol. A transaction on the leader can be committed only after redo logs are received by more than half of the three members and written to the disk. All replicas (such as three or five replicas) of a partition constitute a Paxos group, and a leader is autonomously elected in the group. If an OceanBase node fails, the observer process of the leader on the node is interrupted, and the OceanBase cluster automatically elects a new leader.
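
The majority rule for committing a transaction can be written as a short calculation. This is a sketch of the arithmetic only; the function names are hypothetical.

```python
def majority(num_replicas):
    """Smallest number of replicas that forms a majority of the Paxos group."""
    return num_replicas // 2 + 1


def can_commit(acks, num_replicas):
    """acks counts the replicas (leader included) that persisted the redo logs."""
    return acks >= majority(num_replicas)
```

For a three-replica group the majority is two, so the transaction can be committed even if one replica is unavailable. For a five-replica group the majority is three.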

local execution

In local execution, the OBServer that receives the client request generates the execution plan and executes the plan on the same server.

location cache

Location cache stores location information about partitions. Each node maintains location information about partitions. In the plan generation phase, the system generates different types of execution plans based on location information. In the plan execution phase, the system sends the plans to the corresponding nodes for execution based on location information. Location cache is updated as needed. If an error occurs during statement execution because of a location failure, the location information in the corresponding partition is refreshed.

load balancing

The system dynamically adjusts the locations of resource units and replicas within the resource units based on specific strategies to balance the resource usage of all OBServers in a zone. Load balancing strategies are created based on several factors. OceanBase Database achieves load balancing by using two levels of scheduling. For more information, see the description of partition scheduling and unit scheduling in this topic.

locality

Locality describes the replica types and locations of a table. The basic syntax is replicas@location, which consists of the following elements:

replicas: F indicates a full-featured replica, and L indicates a log replica.
location: This element contains a collection of enumerated values that are known to the system. The value of this element is the name of a zone, such as hz1 or bj2.
count: If the count value is not specified, only one replica exists. {n} indicates that n replicas exist. The special value {all_server} indicates that the number of replicas is the same as the number of available servers.

For ease of implementation, a partition can have at most one full-featured replica and one log replica in a zone. These replicas are part of the Paxos replica group. A partition can contain several read-only replicas in a zone.
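
The elements above can be illustrated with a small parser. The exact grammar is an assumption: entries are taken to be comma-separated, the count is assumed to follow the replica type in braces, and R is assumed to denote a read-only replica. The parse_locality function is hypothetical.

```python
import re

# Assumed entry shape: TYPE[{count}]@zone, for example F@hz1 or R{all_server}@hz1.
_ENTRY = re.compile(r"^(?P<type>[FLR])(?:\{(?P<count>\w+)\})?@(?P<zone>\w+)$")
_TYPE_NAMES = {"F": "full-featured", "L": "log", "R": "read-only"}


def parse_locality(locality):
    """Parse a string such as 'F@hz1,L@bj2' into (replica_type, count, zone) tuples."""
    result = []
    for entry in locality.split(","):
        match = _ENTRY.match(entry.strip())
        if match is None:
            raise ValueError(f"unrecognized locality entry: {entry!r}")
        raw = match.group("count")
        if raw is None:
            count = 1  # count omitted: one replica
        elif raw.isdigit():
            count = int(raw)
        else:
            count = raw  # for example, the special value all_server
        result.append((_TYPE_NAMES[match.group("type")], count, match.group("zone")))
    return result
```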

M

macroblock

A macroblock is an internal unit that is used to manage data files in OceanBase Database. A macroblock consists of several microblocks and is the minimum unit that is used to write data to the storage system of OceanBase Database.

macroblock split

When data is inserted or updated in a macroblock, the storage capacity of the macroblock may become insufficient. To resolve this issue, the macroblock split process stores the data of the macroblock to multiple macroblocks.

macroblock merge

After data is deleted from several adjacent macroblocks, all remaining rows in the macroblocks may fit in one macroblock. In this case, the macroblock merge process combines the adjacent macroblocks into one macroblock.

macroblock recycle

During major compactions, new baseline data is generated based on the original baseline data and modified data. The macroblock recycle process reuses the macroblocks of the original baseline data.

macroblock prefetch

Macroblock prefetch is a process in which adjacent macroblocks are pre-read as required during range queries.

major compaction

Major compaction is a process in which incremental data (the data in MemTables and minor compaction data) is combined with baseline data in persistent storage to generate new baseline data.

major freeze

A major freeze is an operation in which all nodes in a cluster freeze their most recent active MemTables at a unified snapshot point. The frozen MemTables no longer accept the write operations of new transactions, which are performed in new active MemTables.

major freeze version

A major freeze version is the version number of a major freeze.

membership log

Membership logs are special CLOGs that record the changes of a partition member group.

MemTable

A MemTable stores all incremental modification records in the memory.

merge partition

Merging is the reverse operation of splitting. You can decrease the number of partitions of a table by changing the schema. The system merges multiple partitions based on a new partitioning method. You can perform a merge operation on a table or a table group. If you perform a merge operation on a table group, the partitions of all tables in the table group are merged in the same manner.

migration

Migration is a process in which a replica of a partition is migrated from one node to another node. The replica is added to the target node and then deleted from the source node.

microblock

A microblock is an internal unit that is used to manage macroblocks. It is the minimum unit that is used to read data.

microblock cache

Microblock cache refers to the cache of microblocks in the memory. Microblock cache is used to reduce the number of disk reads of microblocks and improve query performance.

microblock index cache

Microblock index cache refers to the cache of microblock indexes in the memory. Microblock index cache is used to improve query performance in frequent accesses.

minor freeze

If the incremental data size of a partition in the memory exceeds a specific threshold, the system freezes the current active MemTable of the partition. The MemTable no longer accepts the write operations of new transactions, which are performed in a new active MemTable of the partition.
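
The freeze-and-switch behavior can be sketched as follows. The threshold is counted in rows here purely for simplicity, whereas the text describes a memory-size threshold. The Partition class and its fields are hypothetical.

```python
class Partition:
    def __init__(self, freeze_threshold):
        self.freeze_threshold = freeze_threshold  # hypothetical limit, counted in rows
        self.active = {}   # active MemTable: accepts writes
        self.frozen = []   # frozen MemTables: read-only, awaiting minor compaction

    def write(self, row_key, value):
        self.active[row_key] = value
        if len(self.active) >= self.freeze_threshold:
            self.minor_freeze()

    def minor_freeze(self):
        # The current MemTable stops accepting writes; new writes go to a fresh one.
        self.frozen.append(self.active)
        self.active = {}
```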

minor compaction

Minor compaction is a process in which incremental data in a MemTable frozen in a minor freeze operation and minor compaction data of an earlier version (if any) are merged and persistently stored to a disk. Minor compaction data is not merged with baseline data in this process. New baseline data is generated only in major compactions.

minor compaction version

A minor compaction version is a version number of minor compaction data.

move in

This operation adds a table group attribute to a table without a table group attribute by changing the schema. The database checks the partitioning methods of the table and the target table group. The DDL operation succeeds only if the same partitioning methods are used.

move out

Move-out is the reverse operation of move-in. This operation deletes the table group attribute from a table by changing the schema. This is equivalent to removing the table from the table group. If you switch the table group attribute of a table from A to B by changing the schema, the table is removed from A and then added to B.

Multi-Paxos

Multi-Paxos is an optimization protocol that runs multiple Paxos instances. OceanBase Database uses the Multi-Paxos protocol to implement multi-IDC persistence of CLOGs.

MVCC

Multi-version concurrency control (MVCC) manages concurrent operations on multiple versions of data.
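
A minimal sketch of a snapshot read over multiple row versions, assuming that each committed version carries a monotonically increasing commit version number. The VersionedRow class is hypothetical and ignores uncommitted data and garbage collection.

```python
import bisect


class VersionedRow:
    """Keeps every committed version of a row, ordered by commit version."""

    def __init__(self):
        self._versions = []  # commit versions, ascending
        self._values = []

    def write(self, commit_version, value):
        # Assumes commits arrive in increasing version order.
        self._versions.append(commit_version)
        self._values.append(value)

    def read(self, snapshot_version):
        """Return the newest value committed at or before the snapshot."""
        idx = bisect.bisect_right(self._versions, snapshot_version)
        return self._values[idx - 1] if idx else None
```

A reader with an older snapshot keeps seeing the older version while writers add newer ones, so reads and writes do not block each other.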

MySQL mode

OceanBase Database supports the MySQL-compatible mode, in which you can create MySQL tenants. This mode reduces the costs of business system transformation due to the migration from MySQL databases to OceanBase Database. This mode provides database designers, developers, and administrators a quick start with OceanBase Database with the knowledge and experience they gained from using MySQL databases. OceanBase Database in MySQL-compatible mode is highly compatible with MySQL syntax and uses the same system tables and functions as MySQL databases.

N

nop log

Nop logs are special CLOGs that record empty operations. Nop logs are generated in the recovery phase of the Multi-Paxos protocol. If the persistence of a log fails on the majority of replicas, a nop log can be generated as content of the log in the recovery phase.

O

OceanBase Database

OceanBase Database is a financial-grade distributed relational database that is independently developed by Ant Group and Alibaba Group.

ODP

OceanBase Database Proxy (ODP), also known as OBProxy, is a high-performance reverse proxy server for OceanBase Database. It receives requests from application clients and forwards the requests to an OBServer. The OBServer returns data to the ODP. Then, the ODP forwards the data to the application clients. ODPs can prevent transient disconnections and shield applications from backend exceptions such as downtime, upgrade errors, and network jitters. ODPs are compatible with the MySQL protocol and support strong verification, hot upgrades, and multi-cluster deployment.

OBServer

An OBServer is a server that runs the observer process of OceanBase Database. One or more OBServers can be deployed on one physical server. In OceanBase Database, each OBServer is uniquely identified by its IP address and service port.

OCP

OceanBase Cloud Platform (OCP) provides the visual monitoring, O&M, and alerting features.

OLAP

OLAP stands for online analytical processing.

OLTP

OLTP stands for online transaction processing.

operator

An operator is the basic unit of an execution plan. In most cases, multiple operators constitute an execution tree to respond to your SQL requests.

optimizer

The optimizer is the core module that determines your query execution plan. The optimizer generates the optimal execution plan for your query based on the statistics, the built-in rules, and the cost model of OceanBase Database.

Oracle mode (OceanBase Database V2.x)

OceanBase Database supports the Oracle-compatible mode, in which you can create Oracle tenants. This mode reduces the costs of business system transformation due to the migration from Oracle databases to OceanBase Database. This mode provides database designers, developers, and administrators a quick start with OceanBase Database with the knowledge and experience they gained from using Oracle databases. OceanBase Database in Oracle-compatible mode is highly compatible with Oracle syntax and uses the same system tables and functions as Oracle databases.

P

parallel compaction

Parallel compaction is suitable only for a single partition.

parallel execution

Before you execute an execution plan, the plan is split into one or more tasks based on the partitions that you want to access. During the execution, the scheduler can execute multiple tasks in sequence or at the same time. The process in which multiple tasks are executed at the same time is known as parallel execution.

partition scheduling (partition load balancing)

Partition scheduling is a process in which replicas between resource units in each zone of a tenant are migrated to balance the resource usage of the resource units.

Multiple partitions that belong to one partitioned table are evenly distributed to different resource units. This ensures that the number of partitions in each resource unit is the same.
Multiple partitions in one partition group are deployed in one resource unit.
If each partition group contains the same number of partitions, you can exchange partitions between partition groups to reduce the disk usage of servers.
You can migrate the partitions of non-partitioned tables to reduce the disk usage of servers.

partition split

This operation allows you to adjust the number of partitions of a table by changing the schema. You can change a single-partition table into a multi-partition table or increase the number of partitions of a multi-partition table. This way, data in existing partitions can be reorganized based on new partitions. You can perform a split operation on a table or a table group. If you perform a split operation on a table group, all tables in the table group are split in the same manner.

partition

Oracle Database and OceanBase Database support only horizontal partitioning. Each partition of a table stores a part of data. Partitions are classified into the following types based on the mapping relationship between row data and partitions: hash partitions, range partitions, and list partitions. Each partition can be divided into several subpartitions from different dimensions. For example, you can create multiple hash partitions for a transaction table based on the user ID. You can further divide each hash partition into multiple secondary range partitions based on the transaction time.
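
The transaction table example can be sketched as a two-level lookup: a hash partition chosen by user ID, then a range subpartition chosen by transaction time. The number of hash partitions, the range bounds, and the locate function are assumptions for illustration, and the modulo operation stands in for the real hash function.

```python
import bisect
from datetime import date

NUM_HASH_PARTITIONS = 4  # hypothetical
# Exclusive upper bounds of the range subpartitions, in ascending order (hypothetical).
RANGE_UPPER_BOUNDS = [date(2023, 1, 1), date(2024, 1, 1)]


def locate(user_id, txn_date):
    """Return (hash partition index, range subpartition index) for a row."""
    part = user_id % NUM_HASH_PARTITIONS
    subpart = bisect.bisect_right(RANGE_UPPER_BOUNDS, txn_date)
    if subpart == len(RANGE_UPPER_BOUNDS):
        raise ValueError("txn_date is beyond the last range subpartition")
    return part, subpart
```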

partitioned table

An OBServer can divide data in a regular table into different blocks based on specific rules. Data in the same block is physically stored together. Such a table is a partitioned table.

partition pruning

Partition pruning is an optimization process that prevents a database from accessing irrelevant partitions based on query conditions. Partition pruning can be static or dynamic.
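
For range partitions, static pruning amounts to finding the partitions whose ranges overlap the query range. The sketch below assumes that each partition is described by an exclusive upper bound and that the query condition is a closed interval on the partition key. The function name is hypothetical.

```python
import bisect


def prune_range_partitions(upper_bounds, low, high):
    """Return the indexes of partitions that can hold keys in [low, high].

    upper_bounds[i] is the exclusive upper bound of partition i, in ascending order.
    """
    first = bisect.bisect_right(upper_bounds, low)
    last = bisect.bisect_right(upper_bounds, high)
    last = min(last, len(upper_bounds) - 1)  # keys above the last bound match nothing
    return list(range(first, last + 1))
```

With upper bounds of 100, 200, and 300, a condition such as "key between 150 and 180" touches only the second partition, and the other partitions are never scanned.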

partition key

In each row of a table, a column is used to determine to which partition the row belongs. The collection of such columns is a partition key. Partitioned tables support partitions and subpartitions. The expression that contains a partition key and determines to which partition a row belongs is a partitioning expression.

A non-partitioned table has one partition. A partitioned table has multiple partitions.

OceanBase Database supports key, hash, list, and range partitions.

A partition key supports the following data types: numeric, string, date, timestamp, binary, and ROWID.

plan cache

A plan cache refers to the cache of execution plans on a server. SQL statement optimization is a time-consuming process. To prevent repeated optimization, execution plans are stored in plan caches so that they can be reused the next time the corresponding SQL statements are executed. Each tenant has an independent plan cache on each server to cache the execution plans processed on the server.

primary zone

A primary zone is a zone in which the leader of a partition is deployed. You can specify a list of zones for a partition. When the leader of the partition needs to be changed, the disaster recovery strategy determines the preferred location of a new leader based on the order of the list.

If you do not specify a primary zone, the system automatically selects one of the full-featured replicas as the leader based on the load balancing strategy.

progressive compaction

To reduce the impact of full compaction on the system, a round of compaction merges only a part of the macroblocks. All macroblocks are merged after several rounds of compaction. Progressive compaction reduces the time spent on a single compaction. When the schema of a table changes, such as column addition and deletion, all rows must be updated during compaction. If the table is large, updating all rows significantly increases the compaction time. In this case, progressive compaction can be used to significantly reduce the time spent on each compaction.

PX worker

A PX worker is a worker thread that is used to execute a distributed plan on a server during distributed parallel execution.

Q

QC

Query Coordinator (QC) is a thread on the master node that is used to schedule and coordinate the execution of a distributed parallel execution plan.

query rewrite/transformation

Query rewrite/transformation is a process in which a query is transformed to an equivalent alternative. This way, the optimizer can generate an execution plan that can meet your business requirements.

R

read-only zone

A read-only zone is a special zone in which only read-only replicas are deployed. In most cases, if the majority of the replicas fail, OceanBase Database stops providing services. However, read-only zones can still support weak-consistency read. This achieves read/write splitting in OceanBase Database.

read/write zone and read-only zone

In a traditional read/write splitting architecture, the write database and read database are different database entities. The databases are isolated, and real-time data can be synchronized to the databases only by using data synchronization tools. OceanBase Database supports read/write splitting in a single database entity based on read-only replicas and zones. A read/write zone accepts all read and write requests. A read-only zone accepts only logon authentication requests, weak-consistency read requests, SELECT requests without specifying table names, and the use database and set session variables statements.

reconfirm

The Multi-Paxos protocol requires reconfirmation on the election of leaders. After a leader is elected for a partition, persistent logs on the majority of replicas of the partition must be reconfirmed, and the reconfirmed logs must be synchronized to all regular replicas. After all the logs are persistently stored on the majority of replicas, the reconfirm operation succeeds.

redo log index

Operation logs are not stored in the log file based on the order of log IDs. To allow you to read operation logs in order, index logs (ILOGs) record the locations of operation logs in the log file in order of log IDs.

re-election with leader

If a leader exists for a partition, the replica of a partition on a specified OBServer is re-elected as the leader. This process is known as re-election with leader. You can initiate this process without waiting for the lease of the original leader to expire.

region

A region is a geographic location or a city such as Hangzhou, Shanghai, and Shenzhen. A region has one or more zones, and different regions are usually far away from each other. Cross-region deployment of multiple replicas of the same data is supported in OceanBase Database.

region/IDC

Each OBServer belongs to a region and an IDC. A region is the geographic location of the OceanBase cluster. In most cases, a region represents a city. The IDC is the data center of the OceanBase cluster. An OceanBase cluster spans several regions. A region contains several IDCs. An IDC deploys several OBServers. Based on various regions and IDCs, the following types of location relationships exist between OceanBase clients and OBServers, or between OBServers: belonging to the same IDC in the same region, belonging to different IDCs in the same region, and belonging to different regions. Priorities of the three location relationships decrease in sequence.
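
The three location relationships and their priorities can be expressed as a simple ranking. This sketch assumes that each client and server is described by a (region, IDC) pair. The function names and the server dictionaries are hypothetical.

```python
def location_priority(client, server):
    """Lower value means higher priority. Both arguments are (region, idc) pairs."""
    if client == server:
        return 0  # same IDC in the same region
    if client[0] == server[0]:
        return 1  # different IDCs in the same region
    return 2      # different regions


def pick_server(client, servers):
    """Choose the server with the best location relationship to the client."""
    return min(servers, key=lambda s: location_priority(client, s["location"]))
```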

remote execution

In remote execution, the OBServer that receives your request and generates an execution plan is different from the OBServer that executes the plan. In addition, only one OBServer executes the plan.

replica

Each partition is stored as multiple physical copies to ensure data security and high availability of data services. Each copy is called a replica of the partition. Each replica contains three major types of data: static data stored in the SSTable on the disk, incremental data stored in the MemTable in the memory, and logs that record transactions. Several replica types are available depending on the types of data stored. This is to support the different business preferences in terms of data security, performance scalability, availability, and costs.

Full-featured replica: a regular replica that contains all data and provides full features, including transaction logs, a MemTable, and an SSTable. A full-featured replica can quickly switch to the leader to provide services.
Log replica: a replica that contains only logs. It does not have a MemTable or an SSTable. It provides log services for external applications and participates in log voting. It can facilitate the recovery of other replicas, but it cannot become the leader to provide database services.
Read-only replica: a replica that contains complete logs, a MemTable, and an SSTable. However, its logs are special. It does not participate in log voting as a member of the Paxos group. Instead, it works as an observer that synchronizes logs from the Paxos group members and then locally replays the logs. If an application does not require high consistency in data reading, this type of replicas can provide read-only services. They are not part of the Paxos group. Therefore, they do not increase the latency of transaction commit because the voting membership is not expanded.

The following table describes the types of replicas that are supported by OceanBase Database.

Type	Log	MemTable	SSTable	Data security	Time to become the leader	Resource cost	Service	Name (abbreviation)
Full-featured replica	It has logs and participates in voting (SYNC_CLOG).	Yes (WITH_MEMSTORE)	Yes (WITH_SSSTORE)	High	Short	High	Provides data read and write services as the leader and non-consistent read services as a follower.	FULL (F)
Log replica	It has logs and participates in voting (SYNC_CLOG).	No (WITHOUT_MEMSTORE)	No (WITHOUT_SSSTORE)	Low	Not supported	Low	Not readable or writable.	LOGONLY (L)
Read-only replica	It has asynchronous logs. It is only a listener instead of a member of the Paxos group (ASYNC_CLOG).	Yes (WITH_MEMSTORE)	Yes (WITH_SSSTORE)	Medium	Not supported	High	Non-consistent read.	READONLY (R)

The system automatically distributes the replicas across multiple servers based on the system load and specific rules. You can migrate, replicate, create, and delete replicas. You can also convert the types of the replicas.

resource pool

A tenant has several resource pools, which contain all resources available to the tenant. A resource pool consists of several resource units with the same unit configurations. A resource pool belongs to only one tenant. A resource unit is a group of computing and storage resources on a server. You can take it as a lightweight VM with CPU, memory, and disk resources.

A tenant has at most one resource unit on the same server. Replicas are stored in resource units, which means that the resource units are containers of replicas.

rotating compaction

To reduce the impact of compaction on your business, the system merges zones in a specified order. A zone that is being merged cannot provide services.

row

A row consists of several columns. Specific columns can constitute a row key, and the entire table is stored in row key order. Specific tables, such as the result tables of a SELECT statement, may not contain row keys.

row compaction

Row compaction is a process in which multiple versions of incremental row data are combined into a single row.

row cache

Row cache refers to cached row data of baseline data and minor compaction data. Row cache helps improve query performance.

row key

Like a primary key in traditional relational databases, a row key is a unique identifier of each row of data in a table. Data in the table is sorted by row key.

Rowid

A Rowid is stored in a baseline data row of a local partitioned index. It indicates the location of row data in the primary table and helps you quickly locate the corresponding row in the primary table.

row merge

Row merge is a process in which baseline row data and incremental row data are merged into a new version of row data.

row purge/range purge

This operation deletes rows.

RPC

RPC stands for remote procedure call.

RS

RootServer (RS) is the server that runs RootService. It manages clusters, data distribution, and replicas.

RS list

An RS list records the IP addresses of servers that run RootService in an OceanBase cluster. In most cases, each zone has an RS list.

RS lists are involved in the creation of clusters and are closely related to the SYS tenant.

OceanBase Database supports the following types of RS lists: RS lists that are directly obtained from the config server and RS lists that are updated based on the server list.

S

schema

In most cases, a schema indicates a specific database object such as a table, a view, or an index. On an OBServer, a schema is a collection of database objects.

schema version

An OBServer maintains a global schema version. The global schema version increases each time a database object is modified. Each database object also has a version number, which is the global schema version that corresponds to the last modification (such as creation and change) of the database object.

schema refresh

In OceanBase Database, all changes to system objects that are caused by DDL operations occur on RootServer. To allow each node in a cluster to obtain the latest schema information, RootServer regularly broadcasts the latest schema version to each node in the cluster. After a node receives the schema version, the node compares the schema version with the schema version in the local cache. If the schema version in the local cache lags behind the received schema version, the node obtains the changes from the system table to update the system objects in the local cache. This process is known as schema refresh.
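
The comparison step described above can be sketched in a few lines of Python. This is a minimal illustration and not OceanBase code: the Node class, the on_broadcast method, and the change_log list (which stands in for the system table) are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A cluster node with a locally cached schema (hypothetical model)."""
    local_version: int = 0
    cache: dict = field(default_factory=dict)

    def on_broadcast(self, received_version, change_log):
        """Refresh the local cache if it lags behind the broadcast version.

        change_log stands in for the system table: a list of
        (version, object_name, definition) tuples sorted by version.
        Returns True if a refresh happened, False otherwise.
        """
        if self.local_version >= received_version:
            return False  # the local cache is already up to date
        for version, name, definition in change_log:
            # Apply only the changes the node has not seen yet.
            if self.local_version < version <= received_version:
                self.cache[name] = definition
        self.local_version = received_version
        return True


# Assumed sample data: three DDL changes, two of them on table t1.
change_log = [(1, "t1", "t1@v1"), (2, "t2", "t2@v2"), (3, "t1", "t1@v3")]
node = Node()
node.on_broadcast(2, change_log)  # node lags behind, so it refreshes
node.on_broadcast(2, change_log)  # same version again, nothing to do
```

A node that receives a version it already holds skips the lookup entirely, which is why a regular broadcast is cheap when no DDL operation has occurred.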

secondary index

A secondary index is an auxiliary data structure that is used to access data tables. Compared with a primary key, a secondary index contains a set of key values that are explicitly or implicitly specified by users. In OceanBase Database, secondary indexes are implemented as data tables that are associated with the primary table.

server list

A server list records the IP addresses of all servers in an OceanBase cluster.

Slog/SSTable log

Slogs/SSTable logs are used to maintain the consistency of baseline data on a node.

slow query

A slow query is an SQL query that is not completed in a specified period of time.

statistics

Statistics are a collection of data that describes the information about tables and columns in a database. OceanBase Database supports table-level statistics and column-level statistics.

strong-consistency read and weak-consistency read

Strong-consistency read is the default SQL execution method in OceanBase Database. In a strong-consistency read operation, the SQL statement must be forwarded to the OBServer where the leader of the involved partition is located. This method allows you to obtain the latest data in real time. Weak-consistency read is the opposite of strong-consistency read. In a weak-consistency read operation, the SQL statement is forwarded to an OBServer where a replica of the involved partition is located, regardless of whether the replica is the leader. You can use one of the following methods to set the read consistency level to weak: execute the SELECT statement with the read_consistency(weak) hint, or set the ob_read_consistency variable of the current session to weak.

stored procedure

A stored procedure is a programming method that is provided by the server.

SQC

In a distributed parallel execution plan, each participating server runs a Sub Query Coordinator (SQC). The SQC receives scheduling instructions from the QC, obtains local worker threads, generates the granules of local tasks, and coordinates local execution.

SSTable

An SSTable stores baseline data or minor compaction data. It stores row data in order.

start working log

After a leader is elected for a partition, the system generates a log before the leader processes new transactions. This log is known as the start working log. It records the point in time at which the leader is elected.

sub-plan

A sub-plan is equivalent to a DFO in distributed parallel execution.

T

table

A table is a basic unit of data storage in OceanBase Database. Each table consists of several rows of records, and each row has the same pre-defined columns. You can use SQL statements to create, retrieve, update, and delete (CRUD) data in a table. Generally, several columns of a table make up a primary key, which is unique among the datasets of the table.

table group

If a collection of tables is frequently accessed at the same time, you can store the same type of replicas of these tables on the same OBServer to optimize query performance. To achieve this purpose, you can define a table group and add the collection of tables to the table group. A table group contains multiple tables that have the same number of partitions and follow the same partitioning rules. If each table in a table group has N partitions, the ith partitions of all the tables constitute a partition group. The leaders of partitions in a partition group are stored on the same OBServer.
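
The relationship between a table group and its partition groups can be illustrated with a short Python sketch. The function names, the table names, and the server names are hypothetical. The round-robin choice of server for each partition group is an assumption made for this example only; the text above states only that the leaders of one partition group share an OBServer.

```python
def partition_groups(table_group, num_partitions):
    """Return one partition group per partition index.

    Group i holds the ith partition of every table in the table group,
    represented here as (table_name, partition_index) pairs.
    """
    return [
        [(table, i) for table in table_group]
        for i in range(num_partitions)
    ]


def place_leaders(groups, servers):
    """Map each partition to the server that holds its leader.

    All partitions of one group get the same server. Picking servers
    round-robin across groups is an assumption of this sketch.
    """
    placement = {}
    for i, group in enumerate(groups):
        server = servers[i % len(servers)]
        for partition in group:
            placement[partition] = server
    return placement


# Assumed sample data: two tables with three partitions each.
groups = partition_groups(["orders", "order_items"], 3)
placement = place_leaders(groups, ["observer1", "observer2"])
```

Because the ith partitions of both tables have their leaders on the same server, a query that touches matching partitions of the two tables can be served without crossing servers.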

tenant

OceanBase Database achieves resource isolation by using tenants and supports multiple tenants in a single cluster. A tenant in an OceanBase cluster is equivalent to a MySQL or Oracle instance. OceanBase Database isolates resources and data among tenants. Each tenant owns a group of computing and storage resources and independently provides a complete set of database services.

OceanBase Database supports system tenants and user tenants. A system tenant stores internal metadata managed by OceanBase Database. A user tenant stores user data and database metadata.

U

unit scheduling (unit load balancing)

Unit scheduling is a strategy that dynamically schedules resource units in each zone to achieve load balancing.

Multiple resource units that belong to one tenant are evenly distributed to different servers.
Multiple resource units that belong to one tenant group are evenly distributed to different servers if possible.
If the overall disk usage of servers in a zone exceeds a specific threshold, you can exchange or migrate resource units to reduce the disk usage.
You can also exchange or migrate resource units based on CPU and memory specifications to reduce the average CPU utilization and memory usage.
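
The first rule above, even distribution of a tenant's resource units, can be sketched as follows. This is a simplified greedy heuristic written for illustration and not the scheduling algorithm of OceanBase Database: it ignores disk, CPU, and memory usage, and the assign_units function and its inputs are hypothetical. It also enforces the constraint that a tenant has at most one resource unit on the same server.

```python
def assign_units(tenant_units, servers):
    """Place resource units of each tenant on distinct servers in one zone.

    tenant_units maps a tenant name to its number of units in the zone.
    Returns a dict that maps each server to the tenants that have a unit
    on it. Each tenant gets the currently least-loaded servers.
    """
    load = {server: [] for server in servers}
    for tenant, count in tenant_units.items():
        if count > len(servers):
            raise ValueError(f"tenant {tenant} needs more servers than available")
        # sorted() is stable, so servers with equal load keep their order.
        chosen = sorted(servers, key=lambda s: len(load[s]))[:count]
        for server in chosen:
            load[server].append(tenant)
    return load


# Assumed sample data: two tenants with two units each, three servers.
layout = assign_units({"tenant_a": 2, "tenant_b": 2}, ["s1", "s2", "s3"])
```

In this sample, no server ends up with more than one unit above any other, and no tenant has two units on the same server.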

universe

The universe indicates all OceanBase databases deployed across regions.

W

worker thread

A worker thread is a thread that is used to process tenant requests in OceanBase Database. Worker threads that belong to the same tenant share a task queue to process tenant requests.
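
The shared task queue can be illustrated with the Python standard library. This is a generic sketch of the pattern, not the threading model of OceanBase Database: the run_tenant function, the number of workers, and the upper-casing that stands in for request processing are all assumptions.

```python
import queue
import threading


def run_tenant(requests, num_workers=3):
    """Process requests with worker threads that share one task queue."""
    tasks = queue.Queue()
    results = []
    lock = threading.Lock()  # protects the shared results list

    def worker():
        while True:
            request = tasks.get()
            if request is None:  # sentinel: no more work for this worker
                break
            processed = request.upper()  # stand-in for real request handling
            with lock:
                results.append(processed)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for request in requests:
        tasks.put(request)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return results


done = run_tenant(["select 1", "select 2", "select 3", "select 4"])
```

Whichever worker is idle picks up the next request, so the order of the results is not guaranteed to match the order of submission.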

Z

zone

Zone is short for availability zone.

An OceanBase cluster consists of several zones. A zone usually refers to an IDC. Multiple replicas of data are distributed across different zones to ensure data security and high availability. This design ensures that the failure of a single zone does not affect the database service. A zone consists of multiple physical servers.