Structure of log archive directories
You can specify the time interval for splitting log archive directories. By default, the directories are split by day. In other words, the log data directory of one day corresponds to a backup_piece.
When you use Network File System (NFS), Alibaba Cloud Object Storage Service (OSS), or Azure Blob Storage as the backup media, the structure of log archive directories is as follows:
log_archive_dest
├── check_file
│ └── 1002_connect_file_20230111T193049.obbak // The connectivity check file.
├── format.obbak // The format information of the backup path.
├── rounds // The round placeholder directory.
│ └── round_d1002r1_start.obarc // The round start placeholder.
├── pieces // The piece placeholder directory.
│ ├── piece_d1002r1p1_start_20230111T193049.obarc // The piece start placeholder, in the format of piece_DESTID_ROUNDID_PIECEID_start_DATE.
│ └── piece_d1002r1p1_end_20230111T193249.obarc // The piece end placeholder, in the format of piece_DESTID_ROUNDID_PIECEID_end_DATE.
└── piece_d1002r1p1 // The piece directory, named in the format of piece_DESTID_ROUNDID_PIECEID.
├── piece_d1002r1p1_20230111T193049_20230111T193249.obarc // The contiguous interval of a piece.
├── checkpoint
│ └── checkpoint.1673436649723677822.obarc // The checkpoint_scn of the piece. This file is named in the format of `checkpoint.checkpoint_scn`.
│ └── checkpoint_info.0.obarc // The metadata of the checkpoint.
├── single_piece_info.obarc // The metadata of the piece.
├── tenant_archive_piece_infos.obarc // The metadata of all frozen pieces before this piece.
├── file_info.obarc // The list of files in all log streams.
├── logstream_1 // Log stream 1.
│ ├── file_info.obarc // The list of files in log stream 1.
│ ├── log
│ │ └── 1.obarc // The archive file in log stream 1.
│ └── schema_meta // The metadata of data dictionaries. This file is generated only in log stream 1.
│ └── 1677588501408765915.obarc
└── logstream_1001 // Log stream 1001.
├── file_info.obarc // The list of files in log stream 1001.
└── log
└── 1.obarc // The archive file in log stream 1001.
In the above directory structure, the top-level directory contains the following types of data:
format.obbak: records the metadata of the archive path, including the tenant that uses the path.check_file: checks the connectivity to the root archive directory.rounds: the summary directory that records all rounds of log archiving.pieces: the summary directory that records all pieces of log archiving.piece_d1002r1p1: the piece directory for log archiving, named in the format ofpiece_DESTID_ROUNDID_PIECEID. Here,DESTIDrefers to the ID corresponding tolog_archive_dest,ROUNDIDrefers to the ID of the log archiving round, which is a monotonically increasing integer, andPIECEIDrefers to the ID of the log archiving piece, which is also a monotonically increasing integer.In a log archiving piece directory, the following data is included:
piece_d1002r1p1_20230111T193049_20230111T193249.obarc: This file displays the ID, start time, and end time of the current piece and is used only for display purposes.checkpoint: This directory is used to record the archiving timestamps of active pieces. The ObArchiveScheduler module periodically updates the timestamp information in this directory. Files in this directory are as follows:checkpoint.1673436649723677822.obarc: The name of this file contains the checkpoint_scn, which is1673436649723677822in this case, of the piece.checkpoint_info.0.obarc: This file records the checkpoint metadata of an active piece, including tenant_id, dest_id, round_id, and piece_id. The metadata in a piece remains unchanged.
single_piece_info.obarc: This file records the metadata of the current piece.tenant_archive_piece_infos.obarc: This file records the metadata of all frozen pieces in the current tenant.file_info.obarc: This file records the list of log streams within the piece.logstream_1: This directory records the log files of log stream 1, which is the system log stream of an OceanBase Database tenant.logstream_1001: This directory records the log files of log stream 1001, where log streams with numbers greater than 1000 are user log streams of an OceanBase Database tenant.
Additionally, each log stream backup contains three types of data:
file_info.obarc: This file records the list of files in the log stream.log: This directory contains all the archive files of the current log stream, with file names consistent with those in the source cluster.schema_meta: This directory records the metadata of data dictionaries. It is only present in the system log stream but not in user log streams.
For AWS S3 and backup media compatible with the S3 protocol, the structure of log archive directories is different from that of OSS, NFS, and Azure Blob Storage. A single archive file consists of multiple small files and corresponding metadata files. The directory structure is as follows.
Note
Although the log archive directory structure of AWS S3 differs from that of backup media such as NFS, OSS, and Azure Blob Storage, backup files on AWS S3 can still be restored after being copied across clouds to other backup media. For example, if you copy archived data from AWS S3 to OSS, you can still successfully restore using the OSS path.
log_archive_dest
├── ......
└── piece_d1002r1p1 // The piece directory, named in the format of piece_DESTID_ROUNDID_PIECEID.
├── ...... // The list of files in all log streams.
├── logstream_1 // Log stream 1.
│ ├── file_info.obarc // The list of files in log stream 1.
│ ├── log
│ │ └── 1.obarc // The archive file in log stream 1, which is identified by a prefix.
| | └── @APD_PART@0-32472973.obarc // The actual data in the archive file, including the data from byte 0 to byte 32472973 in the log file.
| | └── ......
| | └── @APD_PART@FORMAT_META.obarc // The format of the archive file.
| | └── @APD_PART@SEAL_META.obarc // The metadata of the archive file.
│ └── schema_meta // The metadata of the data dictionary. This file is generated only in log stream 1.
│ └── 1677588501408765915.obarc
└── logstream_1001 // Log stream 1001.
├── file_info.obarc // The list of files in log stream 1001.
└── log
└── 1.obarc // The archive file in log stream 1001.
In the preceding directory structure, 1.obarc indicates a single archive file that is identified by a prefix. The prefix and the name of the archive file are the same. An archive file contains the following three types of data:
@APD_PART@FORMAT_META.obarc: When data is written to the archive file for the first time, theformat_metafile is generated in this directory to record the format of the archive file.@APD_PART@0-32472973.obarc: The actual data in the archive file is written to the file named with this prefix, and the start offset and the end offset of each write are recorded in the file name.@APD_PART@SEAL_META.obarc: After data is written to the archive file for the last time, theseal_metafile is generated in this directory to record the metadata in the archive file.
Structure of data backup directories
Each data backup corresponds to a backup_set directory. Each time the ALTER SYSTEM BACKUP DATABASE statement is executed, a new backup_set directory is generated, containing all data that is backed up this time.
The structure of data backup directories is as follows:
data_backup_dest
├── format.obbak // The format information of the backup path.
├── check_file
│ └── 1002_connect_file_20230111T193020.obbak // The connectivity check file.
├── backup_sets // The summary directory that contains all data backup sets.
│ ├── backup_set_1_full_end_success_20230111T193420.obbak // The end placeholder for a full backup.
│ ├── backup_set_1_full_start.obbak // The start placeholder for a full backup.
│ ├── backup_set_2_inc_start.obbak // The start placeholder for an incremental backup.
│ └── backup_set_2_inc_end_success_20230111T194420.obbak // The end placeholder for an incremental backup.
└── backup_set_1_full // A full backup set. A directory whose name ends with "full" is a full backup set, and a directory whose name ends with "inc" is an incremental backup set.
├── backup_set_1_full_20230111T193330_20230111T193420.obbak // The placeholder that represents the start and end time of a full backup.
├── single_backup_set_info.obbak // The metadata of the current backup set.
├── tenant_backup_set_infos.obbak // The full backup set information of the current tenant.
├── infos
│ ├── table_list // The list of table name files.
│ │ ├── table_list.1702352553000000000.1.obbak // Table name file 1.
│ │ ├── table_list.1702352553000000000.2.obbak // Table name file 2.
│ │ └── table_list_meta_info.1702352553000000000.obbak // The table name metadata file.
│ ├── major_data_info_turn_1 // The tenant-level baseline data backup file when the turn ID is 1.
│ │ ├── tablet_log_stream_info.obbak // The file describing the mapping between tablets and log streams.
│ │ ├── tenant_major_data_macro_range_index.0.obbak // The macroblock index in the baseline data backup file.
│ │ ├── tenant_major_data_meta_index.0.obbak // The meta index in the baseline data backup file.
│ │ └── tenant_major_data_sec_meta_index.0.obbak // The mapping between the logic ID and the physical ID of the meta index in the baseline data backup file.
│ ├── minor_data_info_turn_1 // The tenant-level minor-compacted backup file when the turn ID is 1.
│ │ ├── tablet_log_stream_info.obbak // The file describing the mapping between tablets and log streams.
│ │ ├── tenant_minor_data_macro_range_index.0.obbak // The macroblock index in the minor-compacted backup file.
│ │ ├── tenant_minor_data_meta_index.0.obbak // The meta index in the minor-compacted backup file.
│ │ └── tenant_minor_data_sec_meta_index.0.obbak // The mapping between the logic ID and the physical ID of the meta index in the minor-compacted backup file.
│ ├── diagnose_info.obbak // The diagnosis information file of the backup set.
| ├── tenant_parameter.obbak // The tenant-level parameters that are not default settings of the current tenant.
│ ├── locality_info.obbak // The locality information of the current backup set, including the resource configuration and replica distribution information of the tenant.
│ └── meta_info // The tenant-level log stream metadata file, which contains the metadata of all log streams.
│ ├── ls_attr_info.1.obbak // The snapshot of log streams during backup.
│ └── ls_meta_infos.obbak // The collection of metadata of all log streams.
├── logstream_1 // Log stream 1.
│ ├── major_data_turn_1_retry_0 // The baseline data when the turn ID is 1 and retry ID is 0.
│ │ ├── macro_block_data.0.obbak // A data file sized between 512 MB and 4 GB.
│ │ ├── macro_range_index.obbak // The macro index.
│ │ ├── meta_index.obbak // The meta index.
│ │ └── sec_meta_index.obbak // The file describing the mapping between the logical ID and the physical ID.
│ ├── meta_info_turn_1_retry_0 // The metadata file of the log stream when the turn ID is 1 and retry ID is 0.
│ │ ├── ls_meta_info.obbak // The metadata of the log stream.
│ │ └── tablet_info.1.obbak // The metadata of log stream tablets.
│ ├── minor_data_turn_1_retry_0 // The minor-compacted data when the turn ID is 1 and retry ID is 0.
│ │ ├── macro_block_data.0.obbak
│ │ ├── macro_range_index.obbak
│ │ ├── meta_index.obbak
│ │ └── sec_meta_index.obbak
│ └── sys_data_turn_1_retry_0 // The system tablet data when the turn ID is 1 and retry ID is 0.
│ ├── macro_block_data.0.obbak
│ ├── macro_range_index.obbak
│ ├── meta_index.obbak
│ └── sec_meta_index.obbak
└── logstream_1001 // Log stream 1001.
├── major_data_turn_1_retry_0
│ ├── macro_block_data.0.obbak
│ ├── macro_range_index.obbak
│ ├── meta_index.obbak
│ └── sec_meta_index.obbak
├── meta_info_turn_1_retry_0
│ ├── ls_meta_info.obbak
│ └── tablet_info.1.obbak
├── minor_data_turn_1_retry_0
│ ├── macro_block_data.0.obbak
│ ├── macro_range_index.obbak
│ ├── meta_index.obbak
│ └── sec_meta_index.obbak
└── sys_data_turn_1_retry_0
├── macro_block_data.0.obbak
├── macro_range_index.obbak
├── meta_index.obbak
└── sec_meta_index.obbak
In the backup directory structure, the top-level directory contains the following types of data:
format.obbak: records the metadata of the backup path.check_file: checks the connectivity to the root backup directory.backup_sets: the summary directory that contains all data backup sets.backup_set_1_full: a directory that represents a data backup set. If the directory name ends withfull, the backup set is a full backup set. If the directory name ends withinc, the backup set is an incremental backup set.In a data backup set, the following data is included:
backup_set_1_full_20230111T193330_20230111T193420.obbak: This file records the ID, start time, and end time of the current backup set. This file is used only for display purposes.single_backup_set_info.obbak: This file records the metadata of the current backup set, including the backup timestamp, dependent logs, and other information.tenant_backup_set_infos.obbak: This file records the metadata of all existing backup sets for the current tenant.infos: This directory records the metadata of the data backup set.logstream_1: This directory records all the data of log stream 1, which is the system log stream of an OceanBase Database tenant.logstream_1001: This directory records all the data of log stream 1001, where log streams with numbers greater than 1000 are user log streams of an OceanBase Database tenant.
In addition, each log stream backup has four types of directories. Directories whose names contain
retryrecord information about log stream-level retries, and directories whose names containturnrecord information about tenant-level retries.meta_info_xx: This directory records log stream metadata and tablet metadata.sys_data_xx: This directory records the data of internal system tablets in log streams.minor_data_xx: This directory records the minor-compacted data of regular tablets.major_data_xx: This directory records the baseline data of regular tablets.
Directory for storing cluster-level configuration item backups
Each time you back up cluster-level parameters, the system generates a backup file of the cluster-level parameters in the specified directory. The directory structure is as follows.
cluster_parameters_backup_dest
├── cluster_parameter.20240710T103610.obbak # Information about non-default cluster-level configurations, file naming format: `cluster_parameter.[timestamp]`
└── cluster_parameter.20241018T140609.obbak