Clean up data

2025-02-26 10:30:37  Updated

This topic describes how to clean up archived data by submitting a ticket.

Background information

After you archive data in the source database to the destination database, OceanBase Developer Center (ODC) allows you to delete the data from the source database to improve the query performance and reduce the online storage costs.

The example in this topic describes how to create a data cleanup task to clean up the employee table in the test2 database in ODC.

Note

The data used in this topic are examples. You can replace them with actual data as needed.

Prerequisites

  • The table to be cleaned up has a primary key.

Considerations

  • Pay attention to the following rules:

    • CPU and memory exhaustion prevention is not supported for a MySQL data source.
  • Data sources supported for data cleanup are as follows:

    • OceanBase MySQL data sources

    • OceanBase Oracle data sources

    • MySQL data sources

    • Oracle data sources

    • PostgreSQL data sources

  • Data cleanup is not supported in the following cases:

    • The source table in the MySQL or OceanBase MySQL data source does not have a primary key or non-null unique index.

    • The source table in the OceanBase MySQL data source contains fields of the XMLType data type.

    • The source table in the Oracle, OceanBase Oracle, or PostgreSQL data source does not have a primary key.

    • The source table in the OceanBase Oracle data source contains fields of the JSON or XMLType data type.

    • The archiving condition contains a LIMIT clause.

    • The table contains a foreign key.

Create a data cleanup task

  1. In the SQL window, create a table named employee by using an SQL statement.

    1

    CREATE TABLE `employee` (
    `emp_no` int(120) NOT NULL COMMENT 'employee number',
     `birthday` date DEFAULT NULL COMMENT 'employee birthday',
     `name` varchar(120) DEFAULT NULL COMMENT 'employee name',
     `time` date NOT NULL COMMENT 'time' COMMENT 'create time',
     PRIMARY KEY (`time`)
    ) partition by range columns(time) (
     partition p2023_01
     values
      less than ('2023-01-01'),
      partition p2023_02
    values
      less than ('2023-02-01')
    )
    
  2. On the Tickets tab in the left-side navigation pane of the homepage, click Data Cleanup and then click Create New .

    2

  3. On the Create Data Cleanup Task page, configure the following parameters.

    3

    Parameter Description
    Need Verification Before Cleanup You can select this option to choose to verify whether the data in the table to be cleaned up is consistent with that in the destination table. If not, the cleanup will not be executed.
    Source Database The database to which the table belongs.
    Target Database The database to which the destination table belongs.
    Cleanup Scope
    • Partially Cleaned : specifies to clean up only tables that meet filtering conditions in the database.
      • You can configure filtering conditions by using constants or referencing variables defined in Custom Variable . For example, in time<'${create_time}', create_time is the name of a variable configured in Custom Variable and time is a field in the table to be cleaned up.
      • You can select Specify Partition and specify the partitions to be cleaned up.
    • Whole Library Cleaning: specifies to clean up all tables in the database.
    Custom Variable Optional. You can define variables and set time offsets to filter rows to be cleaned up.
    Execution Mode The execution mode of the task. Valid values: Execute Now , Scheduled Execution , and Periodic Execution .
    Task Settings Configure a throttling strategy.
    • Specify Task Duration : You can select this option and specify a duration for the task. If the task is not completed within the specified duration, it will be suspended and wait for the next scheduling.
    • Search strategy : You can select full table scan or condition matching.
    • Row Limit : limits the total number of data rows operated per second.
    • Data Size Limit : limits the total size of data operated per second.
    • Use Primary Key for Cleanup : specifies whether to use the primary key to clean up data.
    Description Optional. Additional information about the task, which cannot exceed 200 characters in length.
  4. Click Create , preview the SQL statement for data cleanup, and click OK .

    4

  5. After the task is generated, choose Tickets > Data Cleanup to view the task.

    5

View a data cleanup task

Task information

  1. In the data cleanup task list on the Tickets tab, click View in the Actions column of a task.

  2. In the task details panel, click the Task Information tab and view information such as the task type, source database, destination database, variable configurations, cleanup scope, and execution mode.

    6

Execution records

In the task details panel, click the Execution Records tab and view the task status and execution details.

7

Operation records

In the task details panel, click the Operating Records tab and view the task approval status and operation records.

8

References

Contact Us