OceanBase Database supports automatic scale-out of disk space available for data files based on the actual situation.
Background information
In OceanBase Database, the files used to store data are named block_file. These block_file files are located in the /store/sstable directory under the installation directory of OBServer nodes. They are created after each OBServer node is started. The size of the block_file files is controlled by the datafile_size or datafile_disk_percentage parameter. see Structure of the OBServer installation directory.
Prior to OceanBase Database V4.2.0, the system adopted a pre-allocation approach where a portion of disk space was reserved for data files. This ensured that the data files occupied a contiguous section of the disk space, preventing insufficient disk resources caused by other applications preempting the disk. However, this pre-allocation method occupied a significant amount of disk space, and even if the disk space was not being used, it could not be released.
To address this issue, OceanBase Database introduced several parameters for progressive disk space utilization for data files. The system first pre-allocates a reasonable disk size for the data files and then automatically scales up based on the actual disk usage and user configuration.
Related parameters
OceanBase Database automatically scales out the disk space available for the data file on an OBServer node based on the following parameters:
datafile_size: the size of disk space available for the data file. For more information about the cluster-level parameterdatafile_size, see datafile_size.datafile_disk_percentage: the percentage of disk space that can be occupied by the data file. The default value is0. For more information about the cluster-level parameterdatafile_disk_percentage, see datafile_disk_percentage.datafile_next: the automatic scale-out step for the disk space for the data file. The default value is0M, indicating that automatic scale-out is disabled. To enable automatic scale-out, set the parameter to a nonzero value. For more information about the cluster-level parameterdatafile_next, see datafile_next.datafile_maxsize: the maximum size of disk space allowed for automatic scale-out for the data file. The default value is0M, which indicates that automatic scale-out is disabled. For more information about the cluster-level parameterdatafile_maxsize, see datafile_maxsize.
Assume that 1 TB of disk space is required for your business volume. You can set datafile_size to 1 TB or calculate the value of datafile_disk_percentage based on the proportion of 1 TB to the total space of the disk. Then, set datafile_maxsize to an acceptable maximum disk size 10 TB, and set datafile_next to 1 TB. This way, the system automatically scales out the disk space for the data file based on the actual situation without wasting the disk space.
Notice
If you do not want to enable automatic scale-out, you can set datafile_next or datafile_maxsize to 0M, or retain the default value 0M. Then, the system will reserve a part of the disk space for the data file based on the original pre-allocation mechanism.
Considerations and limitations
When you configure automatic scale-out of disk space for the data file, we recommend that you set the initial value of
datafile_nextto 20% of the value ofdatafile_maxsize, to avoid frequent scale-out.When you specify a value for
datafile_maxsize, make sure that the value is greater than the current size of the data file in the disk. If you specify a value smaller than the current size of the data file in the disk, automatic scale-out will not be triggered.If the value of
datafile_maxsizeexceeds the maximum available space of the current disk, the actual size of available space is used as the maximum value.After you enable automatic scale-out of disk space for data files, you must make a proper capacity plan for other applications deployed on the same server, and make sure that the maximum size of disk space available for scale-out is no smaller than the value of
datafile_maxsize.At present, automatic scale-in of disk space for data files is not supported.
Configure automatic scale-out of disk space for data files
This section provides three scenarios to describe how to implement automatic scale-out of disk space for the data file on an OBServer node by modifying parameter values.
datafile_size (or datafile_disk_percentage), datafile_next, and datafile_maxsize are set when your cluster is started
For OceanBase Database Community Edition, when you modify the configuration file before you deploy an OceanBase cluster, if you set datafile_size (or datafile_disk_percentage), datafile_next, and datafile_maxsize to nonzero values, and the value of datafile_maxsize is greater than that of datafile_size, or greater than the disk size calculated based on the value of datafile_disk_percentage, automatic scale-out takes effect after the cluster is started. For more information about how to deploy an OceanBase cluster, see Deploy OceanBase Database Community Edition.
For OceanBase Database Enterprise Edition, when you start the observer process for the first time after you deploy an OceanBase cluster, if you set datafile_size (ordatafile_disk_percentage), datafile_next, and datafile_maxsize to nonzero values, and the value of datafile_maxsize is greater than that of datafile_size, or greater than the disk size calculated based on the value of datafile_disk_percentage, automatic scale-out takes effect after the cluster is started. For more information about how to deploy an OceanBase cluster, see Deploy OceanBase Database Enterprise Edition.
Only datafile_size (or datafile_disk_percentage) is set when the cluster is started
For OceanBase Database Community Edition and OceanBase Database Enterprise Edition, if you set only datafile_size or datafile_disk_percentage when you deploy and start an OceanBase cluster, you need to manually set only dataflie_maxsize and datafile_next to implement automatic scale-out.
To do so, perform the following steps:
Log on to the
systenant of the cluster as therootuser.Execute the following statement to query the size of the pre-allocated disk space:
SELECT data_disk_allocated/1024/1024/1024 AS datafile_G FROM oceanbase.GV$OB_SERVERS;A sample query result is as follows:
+-------------------+ | datafile_G | +-------------------+ | 2048.000000000000 | +-------------------+ 1 row in setThe query result shows that the size of the disk space pre-allocated (
data_disk_allocated) to the data file is 2,048 GB (2 TB).For more information about columns in the
GV$OB_SERVERSview, see GV$OB_SERVERS.Set
datafile_maxsizeto a value greater than that ofdata_disk_allocated, and setdatafile_nextto a nonzero value.Here is an example:
ALTER SYSTEM SET datafile_maxsize='4TB';ALTER SYSTEM SET datafile_next='1TB';Execute the following statement to verify whether the automatic scale-out settings take effect:
SELECT data_disk_allocated/1024/1024/1024 AS datafile_G, data_disk_capacity/1024/1024/1024 AS datafile_max_G FROM oceanbase.GV$OB_SERVERS;A sample query result is as follows:
+-------------------+-------------------+ | datafile_G | datafile_max_G | +-------------------+-------------------+ | 2048.000000000000 | 4096.000000000000 | +-------------------+-------------------+ 1 row in setThe query result shows that the maximum capacity (
data_disk_capacity) for automatic scale-out of disk space for the data file is the value ofdatafile_maxsize, and the value ofdata_disk_capacityis greater than that ofdata_disk_allocated. This indicates that the automatic scale-out settings take effect.
datafile_size (or datafile_disk_percentage), datafile_next, and datafile_maxsize are not set when the cluster is started
For OceanBase Database Community Edition and OceanBase Database Enterprise Edition, if you do not set datafile_size (or datafile_disk_percentage), datafile_next, or datafile_maxsize when you deploy and start an OceanBase cluster, you can enable automatic scale-out by referring to the following procedure.
Notice
Whether the datafile_size (or datafile_disk_percentage) parameter is set when the cluster is started for the first time affects the initial size of disk space for the data file.
To do so, perform the following steps:
Log on to the
systenant of the cluster as therootuser.Execute the following statement to query the size of the pre-allocated disk space:
SELECT data_disk_allocated/1024/1024/1024 AS datafile_G FROM oceanbase.GV$OB_SERVERS;A sample query result is as follows:
+-------------------+ | datafile_G | +-------------------+ | 1770.000000000000 | +-------------------+ 1 row in setThe query result shows that the size of the disk space pre-allocated (
data_disk_allocated) to the data file is 1,770 GB (about 1.7 TB).For more information about columns in the
GV$OB_SERVERSview, see GV$OB_SERVERS.Set
datafile_maxsizeto a value greater than that ofdata_disk_allocated, and setdatafile_nextto a nonzero value.Here is an example:
ALTER SYSTEM SET datafile_maxsize='4TB';ALTER SYSTEM SET datafile_next='1TB';Execute the following statement to verify whether the automatic scale-out settings take effect:
SELECT data_disk_allocated/1024/1024/1024 AS datafile_G, data_disk_capacity/1024/1024/1024 AS datafile_max_G FROM oceanbase.GV$OB_SERVERS;A sample query result is as follows:
+-------------------+-------------------+ | datafile_G | datafile_max_G | +-------------------+-------------------+ | 1770.000000000000 | 4096.000000000000 | +-------------------+-------------------+ 1 row in setThe query result shows that the maximum capacity (
data_disk_capacity) for automatic scale-out of disk space for the data file is the value ofdatafile_maxsize, and the value ofdata_disk_capacityis greater than that ofdata_disk_allocated. This indicates that the automatic scale-out settings take effect.
What to do next
After the configuration, the system automatically scales out the disk space for data files based on the actual situation and the values of datafile_next and datafile_maxsize.