Related views
In the sys tenant, you can query the oceanbase.DBA_OB_ROOTSERVICE_EVENT_HISTORY view to check the migration, replication, and rebuild tasks scheduled by RootService.
obclient(root@sys)[oceanbase]>
select TIMESTAMP,MODULE,EVENT,VALUE1 tenant_id,VALUE2 ls_id,NAME3,VALUE3,NAME4,VALUE4,NAME5,VALUE5,NAME6,VALUE6 from DBA_OB_ROOTSERVICE_EVENT_HISTORY where module like '%disaster%';
Query result:
+----------------------------+-------------------+-----------------------+-----------+-------+---------+-----------------------------------+-------------+------------------+-------------+-------------------+----------------+-----------------------------------------+
| TIMESTAMP | MODULE | EVENT | tenant_id | ls_id | NAME3 | VALUE3 | NAME4 | VALUE4 | NAME5 | VALUE5 | NAME6 | VALUE6 |
+----------------------------+-------------------+-----------------------+-----------+-------+---------+-----------------------------------+-------------+------------------+-------------+-------------------+----------------+-----------------------------------------+
| 2023-06-07 16:12:58.254734 | disaster_recovery | start_add_ls_replica | 1 | 1 | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | destination | "x.x.xx.43:2882" | data_source | "x.x.xx.106:2882" | comment | add paxos replica according to locality |
| 2023-06-07 16:14:04.202678 | disaster_recovery | finish_add_ls_replica | 1 | 1 | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | destination | "x.x.xx.43:2882" | data_source | "x.x.xx.106:2882" | execute_result | ret:0; elapsed:65947960; |
+----------------------------+-------------------+-----------------------+-----------+-------+---------+-----------------------------------+-------------+------------------+-------------+-------------------+----------------+-----------------------------------------+
2 rows in set (0.04 sec)
The table displays the tasks that have been scheduled by RootService. Pay attention to key fields such as event, source, and destination.
You can view the execution result of a task in the value6 field. In the execution result, ret indicates whether the task succeeded; a non-zero value indicates the error code of the failed task. elapsed indicates the execution duration of the task. If the task fails, a ret_comment field may be displayed to indicate the failed step.
The oceanbase.DBA_OB_SERVER_EVENT_HISTORY view allows you to check whether the tasks scheduled by RootService are scheduled and executed on the corresponding nodes. For example, query the task whose task_id is YB4206013B6A-0005FC571364EFE0-0-0:
obclient(root@sys)[oceanbase]> select * from DBA_OB_SERVER_EVENT_HISTORY where VALUE5='YB4206013B6A-0005FC571364EFE0-0-0';
Query result:
+----------------------------+-----------+----------+------------+---------------------------------+-----------+--------+-------+--------+-------+-------------------+-------+------------------+---------+-----------------------------------+-------------------+--------+------------+
| TIMESTAMP | SVR_IP | SVR_PORT | MODULE | EVENT | NAME1 | VALUE1 | NAME2 | VALUE2 | NAME3 | VALUE3 | NAME4 | VALUE4 | NAME5 | VALUE5 | NAME6 | VALUE6 | EXTRA_INFO |
+----------------------------+-----------+----------+------------+---------------------------------+-----------+--------+-------+--------+-------+-------------------+-------+------------------+---------+-----------------------------------+-------------------+--------+------------+
| 2023-06-07 16:12:58.324433 | x.x.xx.43 | 2882 | storage_ha | ls_ha_start | tenant_id | 1 | ls_id | 1 | src | "x.x.xx.106:2882" | dst | "x.x.xx.43:2882" | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | is_failed | 0 | ADD_LS_OP |
| 2023-06-07 16:12:58.325473 | x.x.xx.43 | 2882 | storage_ha | initial_migration_task | tenant_id | 1 | ls_id | 1 | src | "x.x.xx.106:2882" | dst | "x.x.xx.43:2882" | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | is_failed | 0 | ADD_LS_OP |
| 2023-06-07 16:12:58.616840 | x.x.xx.43 | 2882 | storage_ha | start_migration_task | tenant_id | 1 | ls_id | 1 | src | "x.x.xx.106:2882" | dst | "x.x.xx.43:2882" | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | data_tablet_count | 830 | ADD_LS_OP |
| 2023-06-07 16:12:58.642320 | x.x.xx.43 | 2882 | storage_ha | sys_tablets_migration_task | tenant_id | 1 | ls_id | 1 | src | "x.x.xx.106:2882" | dst | "x.x.xx.43:2882" | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | is_failed | 0 | ADD_LS_OP |
| 2023-06-07 16:13:03.754195 | x.x.xx.43 | 2882 | storage_ha | data_tablets_migration_task | tenant_id | 1 | ls_id | 1 | src | "x.x.xx.106:2882" | dst | "x.x.xx.43:2882" | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | is_failed | 0 | ADD_LS_OP |
| 2023-06-07 16:13:04.348244 | x.x.xx.43 | 2882 | storage_ha | tablet_group_migration_task | tenant_id | 1 | ls_id | 1 | src | "x.x.xx.106:2882" | dst | "x.x.xx.43:2882" | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | tablet_count | 101 | ADD_LS_OP |
| 2023-06-07 16:13:04.360507 | x.x.xx.43 | 2882 | storage_ha | tablet_group_migration_task | tenant_id | 1 | ls_id | 1 | src | "x.x.xx.106:2882" | dst | "x.x.xx.43:2882" | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | tablet_count | 101 | ADD_LS_OP |
| 2023-06-07 16:13:04.382052 | x.x.xx.43 | 2882 | storage_ha | tablet_group_migration_task | tenant_id | 1 | ls_id | 1 | src | "x.x.xx.106:2882" | dst | "x.x.xx.43:2882" | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | tablet_count | 101 | ADD_LS_OP |
| 2023-06-07 16:13:04.422866 | x.x.xx.43 | 2882 | storage_ha | tablet_group_migration_task | tenant_id | 1 | ls_id | 1 | src | "x.x.xx.106:2882" | dst | "x.x.xx.43:2882" | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | tablet_count | 101 | ADD_LS_OP |
| 2023-06-07 16:13:04.425482 | x.x.xx.43 | 2882 | storage_ha | tablet_group_migration_task | tenant_id | 1 | ls_id | 1 | src | "x.x.xx.106:2882" | dst | "x.x.xx.43:2882" | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | tablet_count | 101 | ADD_LS_OP |
| 2023-06-07 16:13:04.426223 | x.x.xx.43 | 2882 | storage_ha | tablet_group_migration_task | tenant_id | 1 | ls_id | 1 | src | "x.x.xx.106:2882" | dst | "x.x.xx.43:2882" | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | tablet_count | 101 | ADD_LS_OP |
| 2023-06-07 16:13:04.432973 | x.x.xx.43 | 2882 | storage_ha | tablet_group_migration_task | tenant_id | 1 | ls_id | 1 | src | "x.x.xx.106:2882" | dst | "x.x.xx.43:2882" | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | tablet_count | 101 | ADD_LS_OP |
| 2023-06-07 16:13:04.433154 | x.x.xx.43 | 2882 | storage_ha | tablet_group_migration_task | tenant_id | 1 | ls_id | 1 | src | "x.x.xx.106:2882" | dst | "x.x.xx.43:2882" | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | tablet_count | 101 | ADD_LS_OP |
| 2023-06-07 16:13:04.519100 | x.x.xx.43 | 2882 | storage_ha | tablet_group_migration_task | tenant_id | 1 | ls_id | 1 | src | "x.x.xx.106:2882" | dst | "x.x.xx.43:2882" | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | tablet_count | 22 | ADD_LS_OP |
| 2023-06-07 16:13:20.152015 | x.x.xx.43 | 2882 | storage_ha | migration_finish_task | tenant_id | 1 | ls_id | 1 | src | "x.x.xx.106:2882" | dst | "x.x.xx.43:2882" | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | is_failed | 0 | ADD_LS_OP |
| 2023-06-07 16:13:20.152624 | x.x.xx.43 | 2882 | storage_ha | initial_complete_migration_task | tenant_id | 1 | ls_id | 1 | src | "x.x.xx.106:2882" | dst | "x.x.xx.43:2882" | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | is_failed | 0 | ADD_LS_OP |
| 2023-06-07 16:14:04.198183 | x.x.xx.43 | 2882 | storage_ha | start_complete_migration_task | tenant_id | 1 | ls_id | 1 | src | "x.x.xx.106:2882" | dst | "x.x.xx.43:2882" | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | is_failed | 0 | ADD_LS_OP |
| 2023-06-07 16:14:04.198518 | x.x.xx.43 | 2882 | storage_ha | finish_complete_migration_task | tenant_id | 1 | ls_id | 1 | src | "x.x.xx.106:2882" | dst | "x.x.xx.43:2882" | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | is_failed | 0 | ADD_LS_OP |
| 2023-06-07 16:14:04.203593 | x.x.xx.43 | 2882 | storage_ha | ls_ha_finish | tenant_id | 1 | ls_id | 1 | src | "x.x.xx.106:2882" | dst | "x.x.xx.43:2882" | task_id | YB4206013B6A-0005FC571364EFE0-0-0 | is_failed | 0 | ADD_LS_OP |
+----------------------------+-----------+----------+------------+---------------------------------+-----------+--------+-------+--------+-------+-------------------+-------+------------------+---------+-----------------------------------+-------------------+--------+------------+
19 rows in set (0.40 sec)
For example, use the oceanbase.DBA_OB_SERVER_EVENT_HISTORY view to check whether a task is stuck.
Obtain the list of tasks that have been scheduled by RootService but have not been completed (
tenant_id,ls_id):obclient(root@sys)[oceanbase]> (select value1, value2 from DBA_OB_ROOTSERVICE_EVENT_HISTORY where event like '%add_ls%') except (select value1, value2 from DBA_OB_SERVER_EVENT_HISTORY where module like 'storage_ha' and event like '%finish_complete%');Obtain and analyze the migration or replication tasks that have reported errors:
obclient(root@sys)[oceanbase]> select * from DBA_OB_SERVER_EVENT_HISTORY where event like '%migrat%' and name6 like '%fail%' and value6=1;
The DBA_OB_LS_REPLICA_TASKS view records all running migration, replication, and rebuild tasks. If you suspect an issue with a task, you can use its task_id to search for logs during the corresponding period.
Troubleshoot the error reported by a historical task
You can execute the following SQL statement to view the execution history of migration, replication, and rebuild tasks for a log stream:
obclient(root@sys)[oceanbase]> select * from DBA_OB_ROOTSERVICE_EVENT_HISTORY
where module like "%disaster%" and value1 = $TENANT_ID and value2 = $LS_ID;
The name6 and value6 fields of a finish event record the execution result of the task: a ret value of 0 indicates that the task succeeded; a non-zero value indicates the error code of the failed task. The elapsed field records the execution duration of the task. The ret_comment field describes the cause of the failure. If no failure information is displayed, you can search for relevant logs on the RootService node using the corresponding task_id:
grep "$TASK_ID" rootservice.log*