This topic describes the optimal data archiving configurations for tables with consistent data, assuming that no other factors affect the archiving process.
Optimal configuration for archiving a large table
Scenario 1: 4C8G tenant specification (small)
Table size: 70 million rows (about 85 GB), archiving 50 million rows
- Throttling configuration:
  - Row throttling: 100,000 rows/s
  - Bandwidth throttling: 40 MB/s
- Thread configuration:
  - Read threads: 6
  - Write threads: 12 (Read:Write = 1:2)
- Performance:
  - Time consumed: 35 minutes and 38 seconds (minimum)
  - Read performance: 32,580 rows/s
  - Write performance: 26,097 rows/s
- Resource usage:
  - CPU: 100%
  - Memory: 100%
- Target and source clusters:
  - Cluster status: Normal, no jitter
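
The figures in this scenario suggest that the bandwidth limit, not the row limit, is the binding constraint. The following sketch converts the bandwidth limit into rows per second by using the average row size. It assumes decimal units (1 GB = 10^9 bytes, 1 MB = 10^6 bytes) and uniformly sized rows. The function name `effective_row_cap` is hypothetical and is not part of ODC.

```python
# Figures taken from Scenario 1. Decimal units are an assumption.
TABLE_BYTES = 85 * 10**9          # "about 85 GB"
TABLE_ROWS = 70_000_000           # 70 million rows
ROW_LIMIT = 100_000               # row throttling, rows/s
BANDWIDTH_LIMIT = 40 * 10**6      # bandwidth throttling, bytes/s


def effective_row_cap(row_limit, bandwidth_limit, avg_row_bytes):
    """Return the tighter of the two throttles, expressed in rows/s."""
    return min(row_limit, bandwidth_limit / avg_row_bytes)


avg_row_bytes = TABLE_BYTES / TABLE_ROWS
cap = effective_row_cap(ROW_LIMIT, BANDWIDTH_LIMIT, avg_row_bytes)

print(f"average row size: {avg_row_bytes:.0f} bytes")  # 1214 bytes
print(f"effective cap: {cap:,.0f} rows/s")             # 32,941 rows/s
```

Under these assumptions the estimate is close to the measured read performance of 32,580 rows/s, which is consistent with the read side running at the bandwidth limit.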
Scenario 2: 8C16G tenant specification (medium)
Table size: 110 million rows, archiving 50 million rows
Throttling configuration:
- Row throttling: 800,000 rows/s
- Bandwidth throttling: 80 MB/s
Thread configuration:
- Read threads: 36
- Write threads: 72 (Read:Write = 1:2)
Performance:
- Time consumed: 1 hour and 20 minutes
Note: Further adjustments are not possible due to the tenant specification.
Scenario 3: 16C32G tenant specification (large)
Table size: 110 million rows, archiving 50 million rows
Throttling configuration:
- Row throttling: 300,000 rows/s
- Bandwidth throttling: 80 MB/s
Thread configuration:
- Read threads: 12
- Write threads: 24 (Read:Write = 1:2)
Performance:
- Time consumed: 29 minutes (optimal)
Note: When the source tenant specification remains unchanged, adjusting the target tenant specification to 16C32G can reduce the time by half.
Optimal configuration for archiving a small table
Scenario 1: 4C8G tenant specification
Table size: 110 million rows, archiving 50 million rows
Throttling configuration:
- Row throttling: 100,000 rows/s
- Bandwidth throttling: 10 MB/s
Thread configuration:
- Read threads: 6
- Write threads: 12 (Read:Write = 1:2)
Performance:
- Time consumed: 21 minutes and 5 seconds (minimum)
- Read performance: 101,040 rows/s
- Write performance: 43,968 rows/s
Scenario 2: 8C16G tenant specification
Table size: 110 million rows, archiving 50 million rows
- Throttling configuration:
  - Row throttling: 800,000 rows/s
  - Bandwidth throttling: 80 MB/s
- Thread configuration:
  - Read threads: 24
  - Write threads: 56 (Read:Write ≈ 1:2.3)
- Performance:
  - Time consumed: less than 10 minutes
  - Maximum read performance: 286,000 rows/s
  - Maximum write performance: 110,000 rows/s
- Note: Further adjustments are not possible due to the shard speed limit.
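
As a sanity check on the figures above, the peak write rate gives a lower bound on the archiving time, because every archived row must be written to the target. The following is a minimal sketch; `min_archive_seconds` is a hypothetical helper, and the calculation assumes that the peak rate is sustained for the whole task.

```python
def min_archive_seconds(rows_to_archive, max_rows_per_second):
    """Lower bound on duration if the peak rate were sustained throughout."""
    if max_rows_per_second <= 0:
        raise ValueError("max_rows_per_second must be positive")
    return rows_to_archive / max_rows_per_second


# 50 million rows at the peak write rate of 110,000 rows/s.
seconds = min_archive_seconds(50_000_000, 110_000)
print(f"{seconds / 60:.1f} minutes")  # 7.6 minutes
```

The result is in line with the reported time of less than 10 minutes.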
Optimal configuration for archiving a table with large fields
Scenario: 8C16G tenant specification
- Table structure: Contains large fields (varchar(4000), varchar(1024), etc.)
- Table size: 110 million rows, archiving 50 million rows
- Performance:
  - Maximum read performance: 35,000 rows/s
  - Maximum write performance: 20,000 rows/s
  - Time consumed: 1 hour and 20 minutes (optimal)
- Note: The performance of a table with large fields is significantly lower than that of a regular table, and is affected by the size of the fields.
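
The maximum rates listed here are peaks, so the sustained rate over the whole task is lower. The sketch below derives the average rate from the reported elapsed time. `average_rows_per_second` is a hypothetical helper, not an ODC function.

```python
from datetime import timedelta


def average_rows_per_second(rows_archived, elapsed):
    """Average throughput over the whole task, in rows/s."""
    return rows_archived / elapsed.total_seconds()


# 50 million rows archived in 1 hour and 20 minutes.
avg = average_rows_per_second(50_000_000, timedelta(hours=1, minutes=20))
print(f"{avg:,.0f} rows/s")  # 10,417 rows/s
```

By this estimate, the sustained rate is roughly half of the 20,000 rows/s peak write performance.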
Recommended configuration
General configuration principles
- Thread configuration: Read threads : Write threads = 1:2.
- Throttling configuration: Adjust based on the tenant specification and table type.
- Resource usage: CPU and memory usage can reach 100%, but the cluster must remain stable.
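
The 1:2 thread ratio can be expressed as a small helper that derives the write thread count from the read thread count. This is an illustrative sketch only. The function and key names are hypothetical and do not correspond to ODC parameter names.

```python
def thread_counts(read_threads, write_ratio=2):
    """Derive write threads from read threads using a Read:Write = 1:ratio rule."""
    if read_threads < 1:
        raise ValueError("read_threads must be at least 1")
    return {
        "read_threads": read_threads,
        "write_threads": read_threads * write_ratio,
    }


print(thread_counts(6))   # {'read_threads': 6, 'write_threads': 12}
print(thread_counts(12))  # {'read_threads': 12, 'write_threads': 24}
```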
Recommended configuration based on tenant specification
| Tenant specification | Table type | Table rows, archived rows | Row throttling | Bandwidth throttling | Read threads | Write threads | Expected time consumed |
|---|---|---|---|---|---|---|---|
| 4C8G | Large table | 110 million, 50 million | 100,000 rows/s | 40 MB/s | 6 | 12 | ~1 hour and 25 minutes |
| 4C8G | Small table | 110 million, 50 million | 100,000 rows/s | 40 MB/s | 6 | 12 | ~21 minutes |
| 8C16G | Large table | 110 million, 50 million | 100,000 rows/s | 40 MB/s | 36 | 72 | ~1 hour and 20 minutes |
| 8C16G | Small table | 110 million, 50 million | 800,000 rows/s | 80 MB/s | 24 | 56 | <10 minutes |
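
For scripted use, the table can be encoded as a lookup keyed by tenant specification and table type. The sketch below copies the throttling and thread values from the table as written. The type, field, and function names are hypothetical and are not ODC identifiers.

```python
from typing import NamedTuple


class ArchiveSettings(NamedTuple):
    row_limit_per_s: int      # row throttling, rows/s
    bandwidth_mb_per_s: int   # bandwidth throttling, MB/s
    read_threads: int
    write_threads: int


# Values copied from the recommendation table.
RECOMMENDED = {
    ("4C8G", "large"): ArchiveSettings(100_000, 40, 6, 12),
    ("4C8G", "small"): ArchiveSettings(100_000, 40, 6, 12),
    ("8C16G", "large"): ArchiveSettings(100_000, 40, 36, 72),
    ("8C16G", "small"): ArchiveSettings(800_000, 80, 24, 56),
}


def recommended_settings(spec, table_type):
    """Look up the table row for a tenant specification and table type."""
    try:
        return RECOMMENDED[(spec, table_type)]
    except KeyError:
        raise KeyError(f"no recommendation for {spec} / {table_type}") from None


print(recommended_settings("8C16G", "small"))
```

Combinations that the table does not cover, such as 16C32G, raise a `KeyError` rather than returning a guessed value.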
Summary
The optimal data archiving configuration depends on the following factors:
- Tenant specification (4C8G / 8C16G / 16C32G)
- Table type (large table / small table / table with large fields)
- Thread configuration (Read:Write = 1:2)
- Throttling configuration (row throttling + bandwidth throttling)
For a 4C8G cluster, ODC recommends row throttling of 100,000 rows/s, bandwidth throttling of 40 MB/s, and 6 read threads with 12 write threads (Read:Write = 1:2).
Core principle: Maximize the use of tenant resources while ensuring cluster stability, and achieve optimal archiving performance through reasonable thread and throttling configurations.
