This topic describes the physical backup modes and architecture.
Physical backup
OceanBase Database provides the online physical backup feature, which consists of log archiving and data backup. Log archiving continuously backs up the logs generated by a tenant, whereas data backup backs up snapshots of the tenant's data. Together, the two features allow data to be restored to any point in time after the backup checkpoint.
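The way the two features combine for point-in-time restore can be sketched as follows. This is a minimal illustration, not OceanBase code: the names BackupSet, checkpoint, and plan_restore are hypothetical, timestamps are plain integers, and the archived logs are assumed to form one continuous range.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class BackupSet:
    set_id: int
    checkpoint: int  # hypothetical: time at which the snapshot is consistent


def plan_restore(
    backup_sets: List[BackupSet],
    archive_start: int,
    archive_end: int,
    target: int,
) -> Optional[Tuple[BackupSet, Tuple[int, int]]]:
    """Pick the snapshot to restore and the log range to replay on top of it.

    Assumes the archived logs cover [archive_start, archive_end] without gaps.
    Returns None when the target time cannot be reached.
    """
    if not (archive_start <= target <= archive_end):
        return None
    # Only snapshots taken at or before the target, and covered by the
    # archived logs, can serve as the starting point.
    candidates = [
        b for b in backup_sets if archive_start <= b.checkpoint <= target
    ]
    if not candidates:
        return None
    # The latest usable snapshot minimizes the amount of log to replay.
    base = max(candidates, key=lambda b: b.checkpoint)
    return base, (base.checkpoint, target)
```

For example, with snapshots at checkpoints 100 and 200 and logs archived over 50 to 300, a restore to time 250 starts from the second snapshot and replays logs from 200 to 250.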
Log archiving
OceanBase Database provides tenant-level log archiving capabilities.
The leader of a log stream is responsible for log archiving. Logs are physically backed up based on log streams at the log entry level.
The default log backup interval is 2 minutes, so logs are backed up in quasi-real time. By default, the archive directories are split into 24-hour intervals to facilitate backup data management.
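The 24-hour directory split can be pictured as bucketing each log by its timestamp. The sketch below is only an illustration: the function name archive_dir_index is hypothetical, timestamps are seconds, and the intervals are assumed to be counted from the time archiving started, which the text above does not specify.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def archive_dir_index(archive_start_ts: int, log_ts: int) -> int:
    """Return the zero-based index of the 24-hour directory for a log.

    Assumption: intervals are measured from archive_start_ts rather than
    from calendar-day boundaries.
    """
    if log_ts < archive_start_ts:
        raise ValueError("log timestamp precedes the start of archiving")
    return (log_ts - archive_start_ts) // SECONDS_PER_DAY
```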
Unlike in earlier versions of OceanBase Database, such as V2.2.x and V3.x, the log archiving feature in OceanBase Database V4.0.0 is no longer based on partition-level log management. This significantly reduces the number of I/O operations incurred in log archiving and effectively lowers the performance requirements on the backup media.
Data backup
OceanBase Database provides tenant-level data backup capabilities.
The data backup process is orchestrated by RootService and is carried out per log stream. The backup data includes partition metadata and macro block data. Macro block data is backed up physically, whereas metadata is backed up as values serialized from memory.
Each baseline macro block in OceanBase Database has a globally unique logical identifier. This logical identifier enables incremental backups to reuse macro blocks. In OceanBase Database, an incremental backup consists of a full backup of metadata plus a backup of only the incremental data macro blocks. The process and performance of restoring from an incremental backup are essentially the same as restoring from a full backup; the only difference is that macro blocks are read from different backup sets based on their logical identifiers.
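The reuse of macro blocks by logical identifier can be sketched as follows. This is a simplified model under stated assumptions, not the actual backup format: logical identifiers are plain strings, each backup set carries a full index from logical identifier to the backup set that physically stores the block, and the name plan_incremental is hypothetical.

```python
from typing import Dict, Set, Tuple


def plan_incremental(
    current_blocks: Set[str],
    previous_index: Dict[str, int],
    new_set_id: int,
) -> Tuple[Dict[str, int], Set[str]]:
    """Build the index for a new incremental backup set.

    Returns (index, blocks_to_copy):
      index maps every current logical ID to the backup set holding its data
      (the full metadata), and blocks_to_copy lists only the blocks that no
      earlier backup set already stores (the incremental data).
    """
    index: Dict[str, int] = {}
    blocks_to_copy: Set[str] = set()
    for logical_id in current_blocks:
        if logical_id in previous_index:
            # Unchanged block: point at the backup set that already has it.
            index[logical_id] = previous_index[logical_id]
        else:
            # New block: it must be copied into the new backup set.
            index[logical_id] = new_set_id
            blocks_to_copy.add(logical_id)
    return index, blocks_to_copy
```

In this model a restore walks the index of the chosen backup set and reads each block from the set it names, which is why restoring from an incremental backup follows the same steps as restoring from a full one.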
Compared with earlier versions (V2.2.x and V3.x) of OceanBase Database, V4.x removes the dependency on retaining data snapshot points for data backup. Initiating a major freeze during the backup process no longer causes node storage space to bloat.
In V4.x, when selecting backup servers, OceanBase Database prioritizes the follower nodes of the log stream: the system first considers all eligible follower nodes and chooses the leader node only if necessary. This reduces the load on the log stream’s leader node and makes the backup process more efficient and stable.
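The follower-first selection described above can be sketched as follows. The Replica type, its fields, and choose_backup_server are hypothetical names for illustration; how eligibility is decided and how ties between followers are broken are not covered by the text, so the sketch simply takes the first eligible follower.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Replica:
    server: str
    is_leader: bool
    eligible: bool  # hypothetical flag: the replica can currently serve a backup


def choose_backup_server(replicas: List[Replica]) -> Optional[str]:
    """Prefer an eligible follower; fall back to the leader only if needed."""
    followers = [r for r in replicas if r.eligible and not r.is_leader]
    if followers:
        return followers[0].server
    leaders = [r for r in replicas if r.eligible and r.is_leader]
    return leaders[0].server if leaders else None
```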
Data cleanup
OceanBase Database supports automatic cleanup for the specified backup path. RootService periodically checks the specified backup cleanup strategy and deletes unnecessary data backups. The system also deletes unnecessary log backups based on the earliest replay checkpoint of the retained data backups.
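The rule for cleaning up log backups can be sketched as follows. This is an illustration under stated assumptions: archived logs are modeled as half-open time ranges (start, end), checkpoints are plain integers, the function names are hypothetical, and the sketch keeps all logs when no data backup is retained, which the text above does not specify.

```python
from typing import List, Optional, Tuple


def log_cleanup_boundary(retained_checkpoints: List[int]) -> Optional[int]:
    """Earliest replay checkpoint among the retained data backups."""
    return min(retained_checkpoints) if retained_checkpoints else None


def removable_log_ranges(
    log_ranges: List[Tuple[int, int]],
    retained_checkpoints: List[int],
) -> List[Tuple[int, int]]:
    """Return the archived log ranges that no retained data backup needs.

    A range [start, end) is removable only if it ends at or before the
    earliest replay checkpoint, so every retained backup can still replay
    logs from its own checkpoint onward.
    """
    boundary = log_cleanup_boundary(retained_checkpoints)
    if boundary is None:
        # Assumption: with no retained data backup, delete nothing.
        return []
    return [r for r in log_ranges if r[1] <= boundary]
```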