This topic describes how to create a table-level hot and cold storage strategy (STORAGE_CACHE_POLICY) by using SQL statements:
- Create a hot and cold storage strategy for a regular table.
- Create a hot and cold storage strategy for an index table.
Considerations
In shared storage mode, you can specify the cold-hot storage strategy when you create a table or an index.
Create a cold-warm storage policy for a regular table
You can specify a table-level cold-warm storage policy when you create a table by adding the STORAGE_CACHE_POLICY option to the CREATE TABLE statement. The syntax is as follows:
CREATE TABLE table_name (table_definition_list)
STORAGE_CACHE_POLICY (storage_cache_policy_option);
storage_cache_policy_option:
GLOBAL = {"hot" | "auto" | "cold"}
| timeline_strategy_list
timeline_strategy_list:
BOUNDARY_COLUMN = column_name
| BOUNDARY_COLUMN_UNIT = {"s" | "ms"}
| HOT_RETENTION = intnum retention_time_unit
| MIXED_RETENTION = intnum retention_time_unit
| GRANULARITY = {"partition" | "block"}
retention_time_unit:
YEAR
| MONTH
| WEEK
| DAY
| HOUR
| MINUTE
Parameters
STORAGE_CACHE_POLICY is an optional parameter. If you do not specify the STORAGE_CACHE_POLICY option, the value of the tenant-level parameter default_storage_cache_policy is used by default. The properties in storage_cache_policy_option are defined in the key-value format. The semantics of each property are as follows:
GLOBAL = {"hot" | "auto" | "cold"}: specifies the cold/hot storage strategy for all data in the table. Valid values:hot,auto, andcold. The semantics of these values are as follows:hot: specifies that all data in the table is hot data. If the cache disk space is sufficient, all data in the table is cached to the local cache disk.auto: specifies that the hot data in the table is automatically identified by the system.cold: specifies that the data in the table is cold data. The data will not be cached to the local cache disk. For data that has been persisted to the cache, the system will actively trigger cache eviction. Subsequent accesses to the table data can be stored in the macroblock memory cache, without being persisted to the macroblock cache and microblock cache.
timeline_strategy_list: specifies the list of time-axis strategy parameters. Parameters are separated by commas. The time-axis strategy supports a mechanism to determine the cold/hot storage of partition/macrobloc data based on time. The system automatically adjusts the data on the local cache disk based on the configured strategy.Note
A time-axis: specifies the range of partitions defined by the Range partitioning. When the specified time condition is met, the data in the partition is considered hot data. When you use the time-axis strategy, note the following points:
- Only Range-partitioned tables (Range partitioning is either at the primary or secondary level) are supported. This is because the system needs to determine whether the data has expired based on time. However, double Range partitioning is not supported, as it is unclear which Range partition to use for time determination.
- When using the time-axis strategy, the partition expression can only contain column names and does not support expression conversion. For example,
PARTITION BY RANGE COLUMNS(expr(col3))is not supported. BOUNDARY_COLUMNmust be a partition key. If the partition key consists of multiple columns,BOUNDARY_COLUMNmust be the first column, used to determine whether the partition has expired.
BOUNDARY_COLUMN = column_name: specifies the column used to determine hot data. Supported data types: integer (BIGINTorINT, in Unix timestamp format) and time (TIMESTAMP,DATE,DATETIME, orYEAR).- If
BOUNDARY_COLUMNis of the integer type, the table supports Range or Range Columns partitioning. - If
BOUNDARY_COLUMNis of the time type, the table must use Range Columns partitioning.
- If
BOUNDARY_COLUMN_UNIT = {"s" | "ms"}: specifies the time unit for theBOUNDARY_COLUMNparameter. This parameter can only be set whenBOUNDARY_COLUMNis of the integer type. When using integer values as timestamps, you must specify the timestamp unit to avoid incorrect parsing. Valid values:If the partition column is of the
INTtype,BOUNDARY_COLUMN_UNITmust bes.If the partition column is of the
BIGINTtype,BOUNDARY_COLUMN_UNITcan besorms.Notice
If the format is not a Unix timestamp, the system cannot correctly identify the time represented by the
INTtype.
HOT_RETENTION = intnum retention_time_unit: specifies the time range for hot data.intnum: specifies an integer.retention_time_unit: specifies the time unit. Valid values:YEAR: specifies years.MONTH: specifies months.WEEK: specifies weeks.DAY: specifies days.HOUR: specifies hours.MINUTE: specifies minutes.
Notice
In the time-axis syntax, at least one of the
HOT_RETENTIONandMIXED_RETENTIONparameters must be specified.MIXED_RETENTION = intnum retention_time_unit:- If you specify
MIXED_RETENTION, data after the time range ofHOT_RETENTION + MIXED_RETENTIONis considered cold data. - If you do not specify
MIXED_RETENTION, data after the time range ofHOT_RETENTIONis considered auto data. - If you do not specify
HOT_RETENTION, data in the time range of [0, MIXED_RETENTION] is considered auto data, and data after the time range ofMIXED_RETENTIONis considered cold data.
- If you specify
GRANULARITY = {"partition" | "block"}: specifies the granularity of data cold/hot storage operations. Valid values:partition: specifies that the cache strategy is applied at the partition level.block: specifies that the cache strategy is applied at the macroblock level.Note
If the table is not partitioned, you can specify
blockto manually specify a period of time during which the data is hot data. For more information about the macroblock cold/hot strategy, see Create a macroblock-level cold/hot storage strategy.
Examples
Manually specify hot data.
obclient> CREATE TABLE tbl1 ( col1 INT, col2 INT) STORAGE_CACHE_POLICY (GLOBAL = "hot");Create a table named
tbl2_dateand specify a column of theDATEtype (DATE,DATETIME,TIMESTAMP,YEAR, orDAY). After the hot data is maintained for 1 month, the data becomes cold data.obclient> CREATE TABLE tbl2_date ( col1 BIGINT NOT NULL, col2 VARCHAR(50), col3 DATE NOT NULL) STORAGE_CACHE_POLICY ( BOUNDARY_COLUMN = col3, HOT_RETENTION = 1 MONTH, MIXED_RETENTION = 0 HOUR) PARTITION BY RANGE COLUMNS(col3) ( PARTITION M202001 VALUES LESS THAN('2020/02/01'), PARTITION M202002 VALUES LESS THAN('2020/03/01'), PARTITION M202003 VALUES LESS THAN('2020/04/01'), PARTITION MMAX VALUES LESS THAN MAXVALUE);Create a table named
tbl2_intand specify a column of theBIGINTtype (INTorBIGINT). You must specifyBOUNDARY_COLUMN_UNIT, which is processed by default as a UTC timestamp. After the hot data is maintained for 1 month, the data becomes auto data. After 1 week, the data becomes cold data.obclient> CREATE TABLE tbl2_int ( col1 BIGINT NOT NULL, col2 VARCHAR(50), col3 BIGINT NOT NULL) STORAGE_CACHE_POLICY ( BOUNDARY_COLUMN = col3, BOUNDARY_COLUMN_UNIT = "ms", HOT_RETENTION = 1 MONTH, MIXED_RETENTION = 1 WEEK) PARTITION BY RANGE COLUMNS(col3) ( PARTITION M202001 VALUES LESS THAN('2020/02/01'), PARTITION M202002 VALUES LESS THAN('2020/03/01'), PARTITION M202003 VALUES LESS THAN('2020/04/01'), PARTITION MMAX VALUES LESS THAN MAXVALUE);
Create a cold-hot storage strategy for an index table
You can specify the cold-hot storage strategy for an index table in the following ways:
- When you create an index, specify the table-level cold-hot storage strategy for the index.
- When you create a table, create an index and specify the table-level cold-hot storage strategy for the index.
- When you add an index to an existing table, specify the table-level cold-hot storage strategy for the index.
For more information about how to create an index, see Create an index.
Specify the table-level hot and cold storage strategy for an index when you create the index
Syntax:
CREATE INDEX index_name ON table_name (column_name)
STORAGE_CACHE_POLICY (storage_cache_policy_option);
storage_cache_policy_option:
GLOBAL = {"hot" | "auto" | "cold" | "none"}
| timeline_strategy_list
timeline_strategy_list:
BOUNDARY_COLUMN = column_name
| BOUNDARY_COLUMN_UNIT = {"s" | "ms"}
| HOT_RETENTION = intnum retention_time_unit
| MIXED_RETENTION = intnum retention_time_unit
| GRANULARITY = {"partition" | "block"}
retention_time_unit:
YEAR
| MONTH
| WEEK
| DAY
| HOUR
| MINUTE
Specify the table-level hot and cold storage strategy for an index when you create a table
Syntax:
CREATE TABLE table_name (
column_name column_definition,
[column_name column_definition,...],
INDEX index_name(column_name) STORAGE_CACHE_POLICY (storage_cache_policy_option)
);
storage_cache_policy_option:
GLOBAL = {"hot" | "auto" | "cold" | "none"}
| timeline_strategy_list
timeline_strategy_list:
BOUNDARY_COLUMN = column_name
| BOUNDARY_COLUMN_UNIT = {"s" | "ms"}
| HOT_RETENTION = intnum retention_time_unit
| MIXED_RETENTION = intnum retention_time_unit
| GRANULARITY = {"partition" | "block"}
retention_time_unit:
YEAR
| MONTH
| WEEK
| DAY
| HOUR
| MINUTE
Specify the table-level hot and cold storage strategy for an index when you add an index to an existing table
Syntax:
ALTER TABLE table_name ADD INDEX index_name(column_name)
STORAGE_CACHE_POLICY (storage_cache_policy_option);
storage_cache_policy_option:
GLOBAL = {"hot" | "auto" | "cold" | "none"}
| timeline_strategy_list
timeline_strategy_list:
BOUNDARY_COLUMN = column_name
| BOUNDARY_COLUMN_UNIT = {"s" | "ms"}
| HOT_RETENTION = intnum retention_time_unit
| MIXED_RETENTION = intnum retention_time_unit
| GRANULARITY = {"partition" | "block"}
retention_time_unit:
YEAR
| MONTH
| WEEK
| DAY
| HOUR
| MINUTE
Parameters
You can configure the hot and cold storage strategy for an index table:
For a local index, you can specify the table-level
STORAGE_CACHE_POLICY. If you do not specify the table-levelSTORAGE_CACHE_POLICY, the default value isnone, which indicates that the hot and cold storage strategy for the partitions of the local index follows that of the data table.For a global index, you can specify the table-level
STORAGE_CACHE_POLICYand, if the index table is partitioned, the partition-levelSTORAGE_CACHE_POLICY. If you do not specify the table-levelSTORAGE_CACHE_POLICY, the default value isnone.Note
Starting from V4.4.1, if you do not specify the table-level
STORAGE_CACHE_POLICYfor a global index, the hot and cold storage strategy for the global index follows that of the primary table. If the primary table is configured with the Global strategy, the hot and cold storage strategy for the global index follows that of the primary table. If the primary table is configured with the TimeLine strategy, the hot and cold storage strategy for the global index ishot.
When you specify the table-level STORAGE_CACHE_POLICY, different attributes in the storage_cache_policy_option parameter are defined in the Key-Value format. The following table describes the attributes.
GLOBAL = {"hot" | "auto" | "cold" | "none"}: specifies the hot and cold storage strategy for the data of the index. Valid values:hot: specifies the index as a hot storage. If the cache disk space is sufficient, all data in the index table is cached to the local cache disk.auto: specifies the hot data of the index table as automatically identified by the system.cold: specifies the data of the index table as cold data. The data will not be cached to the local cache disk. For data that has been persisted to the cache, the system will actively trigger cache eviction. Subsequent accesses to the table data can be stored in macroblock memory cache, but will not be persisted to macroblock cache or microblock cache.none: the default value. The hot and cold storage strategy for the index follows that of the primary tableSTORAGE_CACHE_POLICY.
timeline_strategy_list: specifies the list of time-based strategies. Parameters are separated by commas. The time-based strategy determines the hot and cold storage strategy for partitions or macroblocks based on time. The system automatically adjusts data on the local cache disk based on the configured strategy.Note
A time-based strategy is based on the partitioning range. The system caches hot data in the partition range. When a certain time condition is met, the data in the partition is considered hot data. When you use a time-based strategy, note the following:
- Only range-partitioned tables are supported. You can use either a range partition or a range columns partition. A double range partition is not supported because it is impossible to determine which range partition to use as the time-based strategy.
- When you use a time-based strategy, the partition expression can contain only column names. Expressions are not supported. For example,
PARTITION BY RANGE COLUMNS(expr(col3))is not supported. BOUNDARY_COLUMNmust be a partition key. If the partition key consists of multiple columns,BOUNDARY_COLUMNmust be the first column, which is used to determine whether the partition has expired.
BOUNDARY_COLUMN = column_name: specifies the column used to determine hot data. Supported types include integer types (BIGINTorINT, in the Unix timestamp format) and time types (TIMESTAMP,DATE,DATETIME, orYEAR).- If the
BOUNDARY_COLUMNis of the integer type, the table can be range-partitioned or range columns-partitioned. - If the
BOUNDARY_COLUMNis of the time type, the table can be range columns-partitioned only.
- If the
BOUNDARY_COLUMN_UNIT = {"s" | "ms"}: specifies the time unit of theBOUNDARY_COLUMN. This parameter is supported only when theBOUNDARY_COLUMNis of the integer type. If you use an integer value as a timestamp, you must specify the timestamp unit. Otherwise, the system may incorrectly parse the timestamp. Valid values:- When the partition column is of the
INTtype,BOUNDARY_COLUMN_UNITcan be set tosonly. - When the partition column is of the
BIGINTtype,BOUNDARY_COLUMN_UNITcan be set tosorms.
- When the partition column is of the
HOT_RETENTION = intnum retention_time_unit: specifies the time range for hot data.intnum: specifies an integer.retention_time_unit: specifies the time unit. Valid values:YEAR: specifies years.MONTH: specifies months.WEEK: specifies weeks.DAY: specifies days.HOUR: specifies hours.MINUTE: specifies minutes.
MIXED_RETENTION = intnum retention_time_unit:- If you specify
MIXED_RETENTION, data that is older thanHOT_RETENTION + MIXED_RETENTIONis considered cold data. - If you do not specify
MIXED_RETENTION, data that is older thanHOT_RETENTIONis considered auto data. - If you do not specify
HOT_RETENTION, data that is older thanMIXED_RETENTIONis considered cold data, and data that is within the [0, MIXED_RETENTION] time range is considered auto data.
- If you specify
GRANULARITY = {"partition" | "block"}: specifies the granularity of the hot and cold storage strategy. Valid values:partition: specifies the partition as the granularity.block: specifies the macroblock as the granularity.Note
If you do not partition a table and want to manually specify a period of time as hot data, you can specify
block. For more information, see Create a macroblock-level hot and cold storage strategy.
Examples
Specify the
STORAGE_CACHE_POLICYat the table level when you create an index:Create a table named
tbl3.CREATE TABLE tbl3 (col1 INT, col2 INT, col3 INT);Create an index named
idx1_tbl3and specify that the data in the index table is hot data. The index table is automatically created when you create the index. The index table is namedidx1_tbl3by default. You can change the name of the index table by using theNAMEclause. You can also specify theSTORAGE_CACHE_POLICYfor the index table. For more information, see Create an index.CREATE INDEX idx1_tbl3 ON tbl3 (col1) LOCAL STORAGE_CACHE_POLICY (GLOBAL = "hot");
Specify that the data in the index table is hot data when you create an index for a table. The index table is automatically created when you create the index. The index table is named
idx1_tbl3by default. You can change the name of the index table by using theNAMEclause. You can also specify theSTORAGE_CACHE_POLICYfor the index table. For more information, see Create an index.CREATE TABLE tbl4 (col1 INT, col2 INT, INDEX idx2 ((col1 + 1)) STORAGE_CACHE_POLICY (GLOBAL = "hot"), UNIQUE KEY ((col1 + col2)));Specify that the data in the index table is hot data when you add an index to an existing table. The index table is automatically created when you create the index. The index table is named
idx1_tbl3by default. You can change the name of the index table by using theNAMEclause. You can also specify theSTORAGE_CACHE_POLICYfor the index table. For more information, see Create an index.ALTER TABLE tbl4 ADD INDEX idx3(col1, col2) GLOBAL STORAGE_CACHE_POLICY (GLOBAL = "hot");
References
- Overview of custom hot and cold storage strategies
- Create a partition-level hot and cold storage strategy
- Modify a table-level hot and cold storage strategy
- Modify a partition-level hot and cold storage strategy
- CREATE TABLE
- ALTER TABLE
- CREATE INDEX
- O&M guide of custom hot and cold storage strategies
