OceanBase Database's log module generates three types of log files: observer.log, election.log, and rootservice.log. By default, logs with a severity level of WDIAG or higher are recorded. Each log file also has a corresponding WARNING log file with a .wf suffix (such as observer.log.wf, election.log.wf, rootservice.log.wf) that only contains logs with a severity level of WARN or higher.
This topic provides guidance on how to interpret commonly used logs. The log files observer.log, election.log, and rootservice.log are all stored in the $work_dir/log directory on the OBServer node.
Applicable versions
This topic applies to OceanBase Database V2.x and V3.x.
Log levels
OceanBase Database V2.x and V3.2.3
OceanBase Database V2.x and V3.2.3 support six log levels, as shown in the table below. The levels are listed in descending order of severity.
Log level Description ERROR Critical errors. Logs at this level record information about system faults that must be troubleshoot. Otherwise, the system will be unavailable. USER_ERROR Errors caused by user input. WARN Warnings. Logs at this level record potential errors that may occur. INFO General information. Logs at this level record the current operation status of the system. TRACE More detailed event information compared with logs at the INFO level. DEBUG Debugging information. Logs at this level record details about the operation status of the system during debugging, including the names, parameters, variables, and return values of the functions that are called. OceanBase Database V3.2.4
OceanBase Database V3.2.4 supports seven log levels, as shown in the table below. The levels are listed in descending order of severity.
Log level Description ERROR Critical errors. Logs at this level record information about system faults that must be troubleshoot. Otherwise, the system will be unavailable. WARN Warnings. Logs at this level record potential errors that may occur. INFO General information. Logs at this level record the current operation status of the system. EDIAG Error diagnostic information, which usually corresponds to a bug of OceanBase Database. WDIAG Warning diagnostic information indicating expected errors that OceanBase Database can tolerate. TRACE More detailed event information compared with logs at the INFO level. DEBUG Debugging information. Logs at this level record details about the operation status of the system during debugging, including the names, parameters, variables, and return values of the functions that are called.
Query logs
You can use the grep command to query logs. The syntax is as follows:
[admin@OB /home/admin/oceanbase/log] $grep [keyword] observer.log
[admin@OB /home/admin/oceanbase/log] $grep [keyword] rootservice.log
[admin@OB /home/admin/oceanbase/log] $grep [keyword] election.log
Log formats
The log data format is as follows. Note that the exact format may vary depending on the context.
[time] log_level [module_name] (file_name:fine_no) [thread_id][Y_trace_id0-trace_id1] [lt=last_log_print_time] [dc=dropped_log_count] log_data_
[time] log_level [module_name] function_name (file_name:fine_no) [thread_id][Y_trace_id0-trace_id1] [lt=last_log_print_time] [dc=] log_data_
Here are some examples:
observer.log
[admin@OB /home/admin/oceanbase/log] $tail -f -n 1 observer.logThe return result is as follows:
[2016-07-17 14:18:04.845802] INFO [RPC.OBMYSQL] obsm_handler.cpp:191 [9543][Y0-0] [lt=47] [dc=0] connection close(easy_connection_str(c)="xxx.xxx.xxx.xxx:56854_-1_0x7fb8a9171b68", version=0, sessid=2147562562, tenant_id=1, server_id=1, is_need_clear_sessid_=true, ret=0)observer.log.wf
[admin@OB /home/admin/oceanbase/log] $tail -f -n 1 observer.log.wfThe return result is as follows:
[2016-07-17 14:18:28.431351] WARN [SQL.SESSION] set_conn (ob_basic_session_info.cpp:2568) [8541][xxxxxxxxxxxx-xxxxxxxx] [lt=16] [dc=0] debug for set_conn(conn=0x7fb8a9171b68, lbt()="0x4efe71 0x818afd 0xe9ea5b 0x721fc8 0x13747bc 0x2636db0 0x2637d68 0x5054e9 0x7fb98705aaa1 0x7fb9852cc93d ", magic_num_=324478056, sessid_=2147562617, version_=0)rootservice.log
[admin@OB /home/admin/oceanbase/log] $tail -f -n 1 rootservice.logThe return result is as follows:
[2016-07-17 14:18:53.701463] INFO [RS] ob_server_table_operator.cpp:345 [8564][Y0-0] [lt=11] [dc=0] svr_status(svr_status="active", display_status=1)rootservice.log.wf
[admin@OB /home/admin/oceanbase/log] $tail -f -n 1 rootservice.log.wfThe return result is as follows:
[2016-07-16 02:02:12.847602] WARN [RS] choose_leader (ob_leader_coordinator.cpp:2067) [8570][xxxxxxxxxxxx-xxxxxxxx] [lt=8] [dc=0] choose leader info with not same candidate num(tenant_id=1005, server="xxx.xxx.xxx.xxx:2882", info={original_leader_count:0, primary_zone_count:0, cur_leader_count:1, candidate_count:1, in_normal_unit_count:1})
The following table describes the parameters in logs.
| Parameter | Description |
|---|---|
| time | The time when the log was generated. |
| log_level | The level of the log. |
| module_name | The module that contains the statement corresponding to the log. |
| function_name | The function that contains the statement corresponding to the log. |
| file_name | The file that contains the statement corresponding to the log. |
| file_no | The row number of the statement corresponding to the log. |
| thread_id | The ID of the thread corresponding to the log. |
| trace_id0-trace_id1 | The trace ID of the log, which consists of trace_id0 and trace_id1. A trace ID can be transmitted between OBServer nodes through remote procedure calls (RPCs). Therefore, you can obtain associated logs of OBServer nodes based on trace IDs. |
| last_log_print_time | OBServer nodes support synchronous and asynchronous logs. When synchronous logs are enabled, this parameter indicates the amount of time spent on recording the previous log. When asynchronous logs are enabled, this parameter indicates the amount of time spent on formatting the current log. |
| dropped_log_count | The number of dropped logs. |
| log_data | The detailed log data. |
Examples of log interpretation
observer.log
Search keyword: [NOTICE]
Log description: key log information during the startup of an OBServer node.
[admin@OB /home/admin/oceanbase/log] $grep "NOTICE" observer.log
The return result is as follows:
[2023-05-11 14:19:09.703272] INFO [SERVER] ob_server.cpp:533 [95303][0][Y0-0000000000000000-0-0] [lt=9] [dc=0] [NOTICE] server instance start succeed
[2023-05-11 14:19:25.555836] INFO [SERVER] ob_server.cpp:609 [95303][0][Y0-0000000000000000-0-0] [lt=16] [dc=0] [NOTICE] observer start service(start_service_time=1683785965555827)
The following table describes the information in the logs.
| Log information | Description |
|---|---|
| server instance start succeed | The observer process is successfully started. |
| observer start service | The OBServer node starts to provide services. start_service_time indicates the timestamp when the OBServer node starts to provide services. |
Process memory statistics
Search keyword:
[CHUNK_MGR]Log description: memory usage statistics of the observer process from the system perspective, which are printed every 10s.
[admin@OB /home/admin/oceanbase/log] $grep "CHUNK_MGR" observer.logThe return result is as follows:
[2023-05-15 16:24:19.728623] INFO [COMMON] ob_tenant_mgr.cpp:643 [87101][0][Y0-0000000000000000-0-0] [lt=7] [dc=0] [CHUNK_MGR] free=40593 pushes=8989641 pops=8949048 limit=107,978,014,720 hold=106,384,326,656 used= 21,254,635,520 freelist_hold= 85,129,691,136 maps=602,804 unmaps=587,963 large_maps=599,637 large_unmaps=587,963 memalign=0 virtual_memory_used=111,853,719,552Log interpretation
Log information Description limit The total memory available for the OBServer node, in bytes. hold The total memory that the OBServer node obtains from the operating system, in bytes. used The memory that is used by the OBServer node, including the cache memory, in bytes. freelist_hold The size of unused memory on the OBServer node, in bytes. maps and unmaps The mapsandunmapsvalues, in bytes, monitored by the OBServer node.large_maps and large_unmaps The mapsandunmapsvalues, in bytes, monitored by the OBServer node for data blocks sized 4 MB to 20 MB.Tenant memory statistics
Search keyword:
[MEMORY]orob_malloc_allocator.cppLog description: memory usage statistics of the tenant from the system perspective, which are printed every 10s.
[admin@OB /home/admin/oceanbase/log] $grep "MEMORY" observer.logor
[admin@OB /home/admin/oceanbase/log] $grep "ob_malloc_allocator" observer.logThe return result is as follows:
[2023-06-05 12:09:21.701235] INFO [LIB] ob_malloc_allocator.cpp:495 [88518][0][Y0-0000000000000000-0-0] [lt=5] [dc=0] [MEMORY] tenant: 1, limit: 12,884,901,888 hold: 8,524,922,880 rpc_hold: 0 cache_hold: 6,773,800,960 cache_used: 6,773,800,960 cache_item_count: 3,230 [MEMORY] ctx_id= DEFAULT_CTX_ID hold_bytes= 658,505,728 [MEMORY] ctx_id= MEMSTORE_CTX_ID hold_bytes= 383,778,816 [MEMORY] ctx_id= EXECUTE_CTX_ID hold_bytes= 16,777,216 [MEMORY] ctx_id= SQL_EXEC_CTX_ID hold_bytes= 77,594,624 [MEMORY] ctx_id= TRANS_CTX_MGR_ID hold_bytes= 8,388,608 [MEMORY] ctx_id= REPLAY_STATUS_CTX_ID hold_bytes= 6,291,456 [MEMORY] ctx_id= PLAN_CACHE_CTX_ID hold_bytes= 553,648,128 [MEMORY] ctx_id= WORK_AREA hold_bytes= 27,262,976 [MEMORY] ctx_id= TX_CALLBACK_CTX_ID hold_bytes= 18,874,368 [2023-06-05 12:09:21.701307] INFO [LIB] ob_tenant_ctx_allocator.cpp:167 [88518][0][Y0-0000000000000000-0-0] [lt=16] [dc=0] [MEMORY] tenant_id= 1 ctx_id= DEFAULT_CTX_ID hold= 658,505,728 used= 553,301,808 [MEMORY] hold= 371,125,248 used= 368,276,480 count= 139 avg_used= 2,649,471 mod=MysqlRequesReco [MEMORY] hold= 52,416,000 used= 52,403,200 count= 25 avg_used= 2,096,128 mod=OB_KVSTORE_CACHE [MEMORY] hold= 49,029,760 used= 48,846,080 count= 24 avg_used= 2,035,253 mod=SqlPlanMon [MEMORY] hold= 35,543,040 used= 35,192,202 count= 534 avg_used= 65,903 mod=SqlDtl [MEMORY] hold= 14,696,192 used= 12,591,876 count= 14 avg_used= 899,419 mod=LOCALDEVICE [MEMORY] hold= 8,386,560 used= 8,384,512 count= 4 avg_used= 2,096,128 mod=Election [MEMORY] hold= 6,298,560 used= 6,281,216 count= 257 avg_used= 24,440 mod=BlockMap [MEMORY] hold= 6,289,920 used= 6,288,384 count= 3 avg_used= 2,096,128 mod=ElectionGroup [MEMORY] hold= 2,154,240 used= 408,600 count= 24,952 avg_used= 16 mod=NumberLog interpretation
Log information Description tenant The ID of the tenant. limit The maximum memory for the tenant, in bytes. hold The total memory that the tenant obtains from the operating system, in bytes. rpc_hold The memory occupied by RPC threads, in bytes. cache_hold The memory occupied by KV caches, in bytes. cache_hold The memory used by KV cache, in bytes. cache_item_count The number of KV caches. Tenant disk statistics
Search keyword:
ob_tenant_disk_usage_mgr.cppordump tenant disk usage update infoLog description: disk usage information of the tenant, which is printed every 10s.
[admin@OB /home/admin/oceanbase/log] $grep "ob_tenant_disk_usage_mgr.cpp" observer.logor
[admin@OB /home/admin/oceanbase/log] $grep "dump tenant disk usage update info" observer.logThe return result is as follows:
[2023-05-15 17:30:59.055858] INFO [STORAGE] ob_tenant_disk_usage_mgr.cpp:69 [87424][0][Y0-0000000000000000-0-0] [lt=11] [dc=0] dump tenant disk usage update info(disk_usage={tenant_id:1, max_disk_size:0, used_data_size:935329792, used_meta_size:56623104})Log interpretation
Log information Description tenant_id The ID of the tenant. max_disk_size The maximum disk usage allowed for the tenant. It is not limited in OceanBase Database of versions earlier than V4.x. used_data_size The size of disk space occupied by user data in the tenant. used_meta_size The size of disk space occupied by management data in the tenant. Tenant thread queue statistics
Search keyword:
ob_multi_tenant.cppordump tenant infoLog description: thread queue statistics of the tenant, which are printed every 10s.
[admin@OB /home/admin/oceanbase/log] $grep "ob_multi_tenant.cpp" observer.logor
[admin@OB /home/admin/oceanbase/log] $grep "dump tenant info" observer.logThe return result is as follows:
[2023-05-15 17:31:43.710008] INFO [SERVER.OMT] ob_multi_tenant.cpp:812 [88299][0][Y0-0000000000000000-0-0] [lt=8] [dc=0] dump tenant info(tenant={id:1001, compat_mode:1, unit_min_cpu:"3.000000000000000000e+00", unit_max_cpu:"3.000000000000000000e+00", slice:"0.000000000000000000e+00", slice_remain:"0.000000000000000000e+00", token_cnt:12, ass_token_cnt:12, lq_tokens:1, used_lq_tokens:0, stopped:false, idle_us:9509657, recv_hp_rpc_cnt:0, recv_np_rpc_cnt:0, recv_lp_rpc_cnt:0, recv_mysql_cnt:0, recv_task_cnt:0, recv_large_req_cnt:0, tt_large_quries:0, actives:12, workers:12, nesting workers:7, lq waiting workers:0, req_queue:total_size=0 queue[0]=0 queue[1]=0 queue[2]=0 queue[3]=0 queue[4]=0 queue[5]=0 , large queued:0, multi_level_queue:total_size=0 queue[0]=0 queue[1]=0 queue[2]=0 queue[3]=0 queue[4]=0 queue[5]=0 queue[6]=0 queue[7]=0 })Log interpretation
Log information Description compat_mode The compatibility mode of the tenant. unit_min_cpu The minimum number of CPU cores. unit_max_cpu The maximum number of CPU cores. slice The size of slices for resource units of the tenant. slice_remain The size of remaining slices. token_cnt The number of tokens allocated by the scheduler. Each token is converted into a worker thread. ass_token_cnt The number of tokens confirmed by the tenant. You can confirm the number of tokens based on the token_cntparameter. Typically, the values of the two parameters are the same.lq_tokens The number of large query tokens. used_lq_tokens The number of worker threads that hold large query tokens. stopped Indicates whether resource units of the tenant are being deleted. idle_us The total idle time of the worker threads in one round (10s). The idle time includes only the wait time in queues. recv_hp_rpc_cnt
recv_np_rpc_cnt
recv_lp_rpc_cntThe cumulative numbers of RPC requests received by the tenant at different levels, including hp(High), np(Normal), and lp(Low). recv_mysql_cnt The cumulative number of MySQL requests received by the tenant. recv_task_cnt The cumulative number of internal tasks received by the tenant. recv_large_req_cnt The cumulative number of large requests predicted by the tenant. The value of this parameter is only incremented and will not be cleared. The value is incremented upon retries. tt_large_quries The cumulative number of large requests processed by the tenant. The value of this parameter is only incremented and will not be cleared. The value is incremented upon checks. actives The number of active worker threads. workers The number of SQL worker threads of the tenant. nesting workers The number of dedicated threads held by the tenant for nested requests. A total of seven threads are available, which correspond to seven nesting levels. lq waiting workers The number of worker threads waiting for scheduling. req_queue The total number of requests waiting in the queue. This field is equivalent to total_size.large queued The number of large query requests waiting in the queue. multi_level_queue The worker queues to accommodate nested requests. queue[1]toqueue[7]correspond to the seven nesting levels and each indicate the number of queued requests at the corresponding level.queue[0]is not used at present.RPC thread statistics
Search keyword:
ob_net_easy.cppor[RPC EASY STAT]Log description: RPC thread statistics, which are printed every 1s. Each log represents the statistics of one RPC thread.
[admin@OB /home/admin/oceanbase/log] $grep "ob_net_easy.cpp" observer.logor
[admin@OB /home/admin/oceanbase/log] $grep "RPC EASY STAT" observer.logThe return result is as follows:
[2023-05-15 18:28:17.681483] INFO [RPC.FRAME] ob_net_easy.cpp:601 [88307][0][Y0-0000000000000000-0-0] [lt=18] [dc=0] [RPC EASY STAT](log_str="conn count=3/4, request done=7404635/5179373, request doing=0/0") [2023-05-15 18:28:17.682056] INFO [RPC.FRAME] ob_net_easy.cpp:601 [88310][0][Y0-0000000000000000-0-0] [lt=8] [dc=0] [RPC EASY STAT](log_str="conn count=3/3, request done=7402021/5162953, request doing=0/0") [2023-05-15 18:28:17.682108] INFO [RPC.FRAME] ob_net_easy.cpp:601 [88309][0][Y0-0000000000000000-0-0] [lt=11] [dc=0] [RPC EASY STAT](log_str="conn count=3/3, request done=7405199/5164208, request doing=0/0") [2023-05-15 18:28:17.682108] INFO [RPC.FRAME] ob_net_easy.cpp:601 [88308][0][Y0-0000000000000000-0-0] [lt=9] [dc=0] [RPC EASY STAT](log_str="conn count=3/3, request done=7402609/5163322, request doing=0/0")Log interpretation
Log information Description conn count The number of inbound and outbound connections served by the RPC thread. request done The number of inbound and outbound requests completed by the RPC thread. request doing The number of inbound and outbound requests being processed by the RPC thread. If no RPC thread is available (request doing=0/0), all RPC threads are busy.
Transaction context statistics
Search keyword:
ObTransCtx statisticsLog description: statistics of the numbers of allocated transaction contexts, which are printed every 5s.
[admin@OB /home/admin/oceanbase/log] $grep "ObTransCtx statistics" observer.logThe return result is as follows:
[2023-05-15 17:29:07.955826] INFO [STORAGE.TRANS] ob_trans_factory.cpp:175 [88157][0][YB420BA6506F-0005FB92633E1702-0-0] [lt=11] [dc=0] ObTransCtx statistics(active_sche_ctx_count=0, active_coord_ctx_count=0, active_part_ctx_count=1, total_release_part_ctx_count=156)Log interpretation
Log information Description active_sche_ctx_count The number of active transaction contexts of the scheduler. active_coord_ctx_count The number of active transaction contexts of the coordinator. active_part_ctx_count The number of active transaction contexts of the participant. total_release_part_ctx_count The number of released transaction contexts. Each OBServer node allows for a maximum of 0.1 million scheduler contexts and 0.7 million participant contexts.
Disk I/O statistics
Search keyword:
Current io stat(in OceanBase Database V2.x) orcurrent io status(in OceanBase Database V3.x)Log description: I/O statistics, which are printed every 10s.
Log interpretation in OceanBase Database V2.x
[admin@OB /home/admin/oceanbase/log] $grep "Current io stat" observer.logThe return result is as follows:
[2022-12-27 17:28:19.298670] INFO [COMMON] ob_io_disk.cpp:497 [47844][406][Y0-0000000000000000] [lt=59] [dc=0] Current io stat, (fd={fd:875, disk_id:{disk_idx:0, install_seq:0}}, io_conf={sys_io_low_percent:90, sys_io_high_percent:90, user_iort_up_percent:2000, cpu_high_water_level:7680, write_failure_detect_interval:60000000, read_failure_black_list_interval:300000000, data_storage_warning_tolerance_time:10000000, data_storage_error_tolerance_time:300000000, disk_io_thread_count:8, callback_thread_count:8, large_query_io_percent:0, data_storage_io_timeout_ms:10000}, sys_io_percent=90, sys_iops_up_limit=266, user_max_rt=1.151363996887207031e+04, cpu_estimator={avg_usage:378}, user_read_io_stat={average_size:18245, average_rt_us:"4.766969696969696884e+02", old_stat:{io_cnt:5435154700, io_bytes:92266120699904, io_rt_us:2896942021380}, new_stat:{io_cnt:5435154733, io_bytes:92266121302016, io_rt_us:2896942037111}}, user_write_io_stat={average_size:0, average_rt_us:"0.000000000000000000e+00", old_stat:{io_cnt:790, io_bytes:1568509952, io_rt_us:1369408}, new_stat:{io_cnt:790, io_bytes:1568509952, io_rt_us:1369408}}, sys_read_io_stat={average_size:0, average_rt_us:"0.000000000000000000e+00", old_stat:{io_cnt:29673855, io_bytes:62197411557376, io_rt_us:62528978207}, new_stat:{io_cnt:29673855, io_bytes:62197411557376, io_rt_us:62528978207}}, sys_write_io_stat={average_size:0, average_rt_us:"0.000000000000000000e+00", old_stat:{io_cnt:24420687, io_bytes:51213892583424, io_rt_us:56376648298}, new_stat:{io_cnt:24420687, io_bytes:51213892583424, io_rt_us:56376648298}}, large_query_read_io_stat={average_size:0, average_rt_us:"0.000000000000000000e+00", old_stat:{io_cnt:0, io_bytes:0, io_rt_us:0}, new_stat:{io_cnt:0, io_bytes:0, io_rt_us:0}}, large_query_write_io_stat={average_size:0, average_rt_us:"0.000000000000000000e+00", old_stat:{io_cnt:0, io_bytes:0, io_rt_us:0}, new_stat:{io_cnt:0, io_bytes:0, io_rt_us:0}})Log interpretation
Log information Description fd and disk_id This part includes the file descriptor number ( fd), the disk index number (disk_idx), and the installation serial number (install_seq).io_config This part includes system I/O parameters, such as sys_io_low_percent(minimum system I/O percentage),sys_io_high_percent(maximum system I/O percentage),user_iort_up_percent(maximum user I/O),cpu_high_water_level(maximum CPU utilization),disk_io_thread_count(number of disk I/O threads),callback_thread_count(number of callback threads),large_query_io_percent(percentage of large query I/Os), anddata_storage_io_timeout_ms(timeout period for data storage I/Os, in ms).sys_io_percent The maximum I/O resource usage. sys_iops_up_limit The maximum IOPS of I/O operations on data blocks sized 2 MB, which must be consistent with io_resource.confin the I/O bench test result.user_max_rt The maximum response time of user I/O operations. cpu_estimator(avg_usage) The average CPU utilization of user I/O operations. - user_read_io_stat
- user_write_io_stat
- sys_read_io_stat sys_write_io_stat
- large_query_read_io_stat
- large_query_write_io_stat
This part includes different types of I/O statistics, such as the average I/O size ( average_size), average response time (average_rt_us), old and new statistics, including the number of I/O operations (io_cnt), number of bytes involved in I/O operations (io_bytes), and response time of I/O operations (io_rt_us).io_ability This part includes the I/O performance test results, such as the IOPS and response time of I/O operations in different modes (read or write) and of different sizes. - log_io
- user_io sys_io
- prewarm_io
- large_query_io
This part includes the status and device capabilities of different types of I/O operations, such as whether the system is initialized ( is_inited_), device capabilities (device_ability_), the percentage of shared I/O resources (share_percentage_), and the deadline timestamp (last_deadline_ts_).- log_read
- log_write user_read
- user_write sys_read
- sys_write prewarm_read
- prewarm_write
- large_query_read
- large_query_write
This part includes statistics of different types of I/O operations, such as the start IOPS ( start_iops), IOPS of canceled I/O operations (cancel_iops), IOPS of failed I/O operations (fail_iops), IOPS of successful I/O operations (succ_iops), average byte size (avg_byte), device response time (device_rt_us), and queue response time (queue_rt_us).avg_cpu The average CPU utilization of the system. OceanBase Database allows you to control I/O throttling by using the following two parameters:
sys_bkgd_io_low_percentageandsys_bkgd_io_high_percentage, which respectively specify the lower and upper limits of thesys_io_percentfield.You can determine whether I/O operations are throttled based on the preceding I/O statistics as follows:
If the values of
sys_io_high_percentandsys_io_low_percentare inconsistent with those ofsys_bkgd_io_high_percentageandsys_bkgd_io_low_percentage, I/O throttling is enabled.If the
average_sizevalue inuser_read_io_statis0, I/O operations are throttled.
Log interpretation in OceanBase Database V3.x
[admin@OB /home/admin/oceanbase/log] $grep "current io status" observer.logThe return result is as follows:
[2023-06-21 16:54:14.212314] INFO [COMMON] ob_io_struct.cpp:2129 [89617][0][Y0-0000000000000000-0-0] [lt=19] [dc=0] current io status(io_config={sys_io_low_percent_:70, sys_io_high_percent_:90, user_iort_up_percent_:100, cpu_high_water_level_:2560, write_failure_detect_interval_:60000000, read_failure_black_list_interval_:300000000, data_storage_warning_tolerance_time_:30000000, data_storage_error_tolerance_time_:300000000, disk_io_thread_count_:8, callback_thread_count_:8, large_query_io_percent_:0, data_storage_io_timeout_ms_:120000, io_memory_limit_:2147483648, enable_io_tracer_:false}, io_ability={measure_items:[[{mode:0, size:4096, iops:120191.00, rt_us:180.58}, {mode:0, size:8192, iops:155470.00, rt_us:198.67}, {mode:0, size:16384, iops:122040.00, rt_us:259.74}, {mode:0, size:32768, iops:76429.00, rt_us:416.30}, {mode:0, size:65536, iops:41826.00, rt_us:762.89}, {mode:0, size:131072, iops:21994.00, rt_us:1452.21}, {mode:0, size:262144, iops:11247.00, rt_us:2841.93}, {mode:0, size:524288, iops:5649.00, rt_us:5658.40}, {mode:0, size:1048576, iops:2828.00, rt_us:11302.68}], [{mode:1, size:4096, iops:127929.00, rt_us:247.71}, {mode:1, size:8192, iops:120343.00, rt_us:263.40}, {mode:1, size:16384, iops:115208.00, rt_us:275.45}, {mode:1, size:32768, iops:76882.00, rt_us:414.05}, {mode:1, size:65536, iops:39596.00, rt_us:805.95}, {mode:1, size:131072, iops:20989.00, rt_us:1522.00}, {mode:1, size:262144, iops:10634.00, rt_us:3004.64}, {mode:1, size:524288, iops:5355.00, rt_us:5970.53}, {mode:1, size:1048576, iops:2692.00, rt_us:11877.67}, {mode:1, size:2097152, iops:1309.00, rt_us:24402.57}]]}, log_io={is_inited_:true, device_ability_:null, share_percentage_:100, last_deadline_ts_:0}, user_io={is_inited_:true, device_ability_:null, share_percentage_:100, last_deadline_ts_:0}, sys_io={is_inited_:true, device_ability_:0x32bb7ba8, share_percentage_:90, last_deadline_ts_:1687334746495760}, prewarm_io={is_inited_:true, device_ability_:0x32bb7ba8, share_percentage_:100, last_deadline_ts_:0}, large_query_io={is_inited_:true, device_ability_:0x32bb7ba8, share_percentage_:100, last_deadline_ts_:0}, log_read={}, log_write={start_iops:29.93, cancel_iops:0.00, fail_iops:0.00, succ_iops:29.93, avg_byte:5461.33, device_rt_us:140.00, queue_rt_us:7.33}, user_read={}, user_write={}, sys_read={}, sys_write={}, prewarm_read={}, prewarm_write={}, large_query_read={}, large_query_write={}, avg_cpu=2.126034912718204453e+02)Log interpretation
Log information Description io_config This part includes system I/O parameters, such as sys_io_low_percent_(minimum system I/O percentage),sys_io_high_percent_(maximum system I/O percentage),user_iort_up_percent_(maximum user I/O percentage),cpu_high_water_level_(maximum CPU utilization),disk_io_thread_count_(number of disk I/O threads),callback_thread_count_(number of callback threads),large_query_io_percent_(percentage of large query I/Os),data_storage_io_timeout_ms_(timeout period for data storage I/Os, in ms), andio_memory_limit_(maximum memory for I/O operations).io_ability This part includes the I/O performance test results, such as the IOPS and response time of I/O operations in different modes (read or write) and of different sizes. - log_io
- user_io sys_io
- prewarm_io
- large_query_io
This part includes the status and device capabilities of different types of I/O operations, such as whether the system is initialized ( is_inited_), device capabilities (device_ability_), the percentage of shared I/O resources (share_percentage_), and the deadline timestamp (last_deadline_ts_).- log_read
- log_write user_read
- user_write sys_read
- sys_write prewarm_read
- prewarm_write
- large_query_read
- large_query_write
This part includes statistics of different types of I/O operations, such as the start IOPS ( start_iops), IOPS of canceled I/O operations (cancel_iops), IOPS of failed I/O operations (fail_iops), IOPS of successful I/O operations (succ_iops), average byte size (avg_byte), device response time (device_rt_us), and queue response time (queue_rt_us).avg_cpu The average CPU utilization of the system. You can determine whether I/O operations are throttled based on the preceding I/O statistics as follows:
If the values of
sys_io_high_percentandsys_io_low_percentare inconsistent with those ofsys_bkgd_io_high_percentageandsys_bkgd_io_low_percentage, I/O throttling is enabled.If the value of
last_deadline_ts_is later than the time when the log is printed, I/O operations are throttled.
Archiving thread statistics
Search keyword:
print_archive_statusLog description: log archiving statistics, which are printed every 10s.
[admin@OB /home/admin/oceanbase/log] $grep "print_archive_status" observer.logThe return result is as follows:
[2023-08-24 13:59:53.565584] INFO [ARCHIVE] ob_archive_mgr.cpp:830 [11357][0][Y0-0000000000000000-0-0] [lt=13] [dc=0] print_archive_status(clog_split_task_num=0, send_task_num=2, pre_send_task_capacity=458752, pg_archive_task_count=903, archive_incarnation=1, archive_round=6, piece_id=0, piece_create_date=20230810, log_archive_status=2, has_encount_error=false)Log interpretation
Log information Description clog_split_task_num The number of ongoing clog splitting tasks. send_task_num The number of ongoing clog sending tasks. pre_send_task_capacity The amount of data to be archived. pg_archive_task_count The number of archiving tasks associated with the partition group. log_archive_status The archiving status. Valid values: 0: an invalid state. </li><li>1: Archiving is being started. </li><li>2: Archiving is in progress. </li><li>3: Archiving is being stopped. </li><li>4`: Archiving has been stopped.
Search keyword:
archive_clog_split_engine statisticsLog description: statistics of clog splitting threads, which are printed every 10s.
[admin@OB /home/admin/oceanbase/log] $grep "archive_clog_split_engine statistics" observer.logThe return result is as follows:
[2023-08-24 14:46:13.179165] INFO [ARCHIVE] ob_archive_clog_split_engine.cpp:1120 [11389][0][Y0-0000000000000000-0-0] [lt=11] [dc=0] archive_clog_split_engine statistics(avg_split_task_used=51, avg_read_log_used=40, avg_read_log_size=3058, avg_get_send_task_used=11, avg_send_task_per_task=1, split_task_count=17)Log interpretation
Log information Description avg_split_task_used The average time consumed by a clog splitting task, in microseconds. avg_read_log_used The average time spent in reading log data at a time, in microseconds. avg_read_log_size The average size of log data read at a time. avg_get_send_task_used The average time spent in obtaining a clog sending task, in microseconds. avg_send_task_per_task The average number of clog sending tasks corresponding to a clog splitting task. split_task_count The number of clog splitting tasks. Search keyword:
archive_sender statisticLog description: statistics of each clog sending thread, which are printed every 10s.
[admin@OB /home/admin/oceanbase/log] $grep "archive_sender statistic" observer.logThe return result is as follows:
[2023-06-12 16:15:50.311863] INFO [ARCHIVE] ob_archive_sender.cpp:1589 [25879][0][Y0-0000000000000000-0-0] [lt=5] [dc=0] archive_sender statistic in 10s(total_send_buf_size=29384, total_send_log_count=157, avg_log_size=187, total_send_task_count=2, total_send_cost_ts=2809, avg_send_task_cost_ts=1404)Log interpretation
Log information Description total_send_buf_size The size of log data sent by the archiving thread within 10s. If log compression is enabled, the size of sent log data is one third that of uncompressed logs. total_send_log_count The number of log records sent by the archiving thread within 10s. total_send_task_count The total number of network I/Os of the archiving thread within 10s. total_send_cost_ts The total time consumed by the archiving thread within 10s, in microseconds. If the value approaches or exceeds 10s, the maximum I/O processing capability of the system has been reached. avg_send_task_cost_ts The average time consumed by a network I/O, in microseconds. The value ranges from several milliseconds to tens of milliseconds.
DAG thread statistics
Search keyword:
dump_dag_statusLog description: statistics of directed acyclic graph (DAG) threads, which are printed every 10s.
[admin@OB /home/admin/oceanbase/log] $grep "dump_dag_status" observer.logThe return result is as follows:
[2023-09-04 17:15:25.494000] INFO [COMMON] ob_dag_scheduler.cpp:1314 [9777][0][Y0-0000000000000000-0-0] [lt=5] [dc=0] dump_dag_status(priority="DAG_PRIO_SSTABLE_MINI_MERGE", low_limit=4, running_task=0) [2023-09-04 17:15:25.494065] INFO [COMMON] ob_dag_scheduler.cpp:1318 [9777][0][Y0-0000000000000000-0-0] [lt=3] [dc=0] dump_dag_status(up_limit_type="DAG_ULT_MINI_MERGE", up_limit=8, running_task_per_ult=0) [2023-09-04 17:15:25.494121] INFO [COMMON] ob_dag_scheduler.cpp:1322 [9777][0][Y0-0000000000000000-0-0] [lt=4] [dc=0] dump_dag_status(type="DAG_MINOR_MERGE", dag_count=0) [2023-09-04 17:15:25.494125] INFO [COMMON] ob_dag_scheduler.cpp:1322 [9777][0][Y0-0000000000000000-0-0] [lt=3] [dc=0] dump_dag_status(type="DAG_MAJOR_MERGE", dag_count=0) [2023-09-04 17:15:25.494210] INFO [COMMON] ob_dag_scheduler.cpp:1324 [9777][0][Y0-0000000000000000-0-0] [lt=3] [dc=0] dump_dag_status(total_worker_cnt=52, total_running_task_cnt=0, work_thread_num=113)Log interpretation
Log information Description priority The priority of background tasks. OceanBase Database preferentially schedules tasks with a high priority. low_limit The minimum number of threads for background tasks. up_limit The maximum number of threads for background tasks. type The type of background tasks. dag_count The total number of requests being queued. total_worker_cnt The total number of active threads for background tasks of various types. total_running_task_cnt The total number of ongoing background tasks of various types. work_thread_num The maximum number of threads for background tasks of various types.
Memory allocation failure
Search keyword:
[OOPS],alloc failed,-4013, or-4030Log description: memory allocation failure.
[admin@OB /home/admin/oceanbase/log] $grep "OOPS|alloc failed|-4013|-4030" observer.logThe return result is as follows:
[2019-11-29 13:32:19.702557] WARN alloc_chunk (achunk_mgr.cpp:156) [116413][1492][XXXXXXXXXXXX-XXXXXXXXXXXXXXXX] [lt=10] [dc=0] oops, over total memory limit, hold=241078108160 limit=243230035968 [2019-11-29 13:32:19.703452] WARN alloc (ob_tenant_ctx_allocator.h:111) [116413][1492][XXXXXXXXXXXX-XXXXXXXXXXXXXXXX] [lt=15] [dc=0] oops, alloc failed, tenant_id=500 ctx_id=0 ctx_name=DEFAULT_CTX_ID ctx_hold=13245612032 ctx_limit=9223372036854775807 tenant_hold=26003955712 tenant_limit=9223372036854775807 [2019-11-29 13:32:19.710755] ERROR [LIB] ob_malloc (ob_malloc.h:49) [116413][1492][XXXXXXXXXXXX-XXXXXXXXXXXXXXXX] [lt=364] [dc=0] allocate memory fail(attr=tenant_id=500, mod_id=253, ctx_id=0, prio=0, nbyte=2097152)Log interpretation
Log information Description ctx_name The memory module with memory overrun. ctx_hold The size of memory held by the memory module. ctx_limit The maximum size of memory for the memory module. alloc_size The size of memory allocated to the current memory allocation request. MemStore throttling
Search keyword:
report write throttle infoLog description: statistics of thresholds for triggering MemStore write throttling.
[admin@OB /home/admin/oceanbase/log] $grep "report write throttle info" observer.logThe return result is as follows:
2022-12-29 04:46:44.841893] INFO [COMMON] ob_fifo_arena.cpp:333 [28948][0][XXXXXXXXXXXX-XXXXXXXXXXXXXXXXXX-X-X] [lt=16] [dc=0] report write throttle info(alloc_size=24, throttling_interval=1000, attr_=tenant_id=1005, label=15, ctx_id=1, prio=0, freed memory(MB):=0, last_base_ts=1672260404842042, cur_mem_hold=11970543616, trigger_mem_limit=21990232512, trigger_mem_remain=5497558128, tenant_memory_remain=5448400896, throttle_info_={decay_factor_:"3.002618459977128736e-04", alloc_duration_:3600000000, trigger_percentage_:80, memstore_threshold_:27487790640, period_throttled_count_:45, period_throttled_time_:315460, total_throttled_count_:181984128, total_throttled_time_:72058595126473244}, mem_overused=49157232)Log interpretation
Log information Description cur_mem_hold The current MemStore usage. trigger_mem_limit The Memstore usage threshold that triggers throttling. trigger_mem_remain The remaining space of the MemStore that triggers throttling. tenant_memory_remain The current remaining space of the MemStore. mem_overused The size of overused space of the MemStore. Lock contention
Search keyword
Write-write conflict:
on_wlock_retryorlock_for_write conflictRead-write conflict:
lock_for_read retryorlock_for_read fail
Log description: records of row lock contention and retries.
[2022-09-03 11:33:06.624063] WARN [STORAGE.TRANS] on_wlock_retry (ob_memtable_context.cpp:345) [4198][0][XXXXXXXXXXXX-XXXXXXXXXXXXXXXXXX-X-X] [lt=16] [dc=0] lock_for_write conflict(*this=alloc_type=0 ctx_descriptor=45479 … abs_expired_time=1662179586523672 stmt_timeout=3599900482 … pkey={tid:1100611139453777, partition_id:0, part_cnt:0} trans_id={hash:16499054864673064020, inc:284227, addr:"xxx.xxx.xxx.xxx:2882", t:1662175982313772} checksum_log_ts=0, key=table_id=1100611139453777 rowkey_object=[{"NUMBER":"1"}] , conflict_ctx="alloc_type=0 ctx_descriptor=45354 … stmt_start_time=1662175972255438 abs_expired_time=1662179572155922 stmt_timeout=3599900484 … pkey={tid:1100611139453777, partition_id:0, part_cnt:0} trans_id={hash:15878544715862172232, inc:283677, addr:"xxx.xxx.xxx.xxx:2882", t:1662175972255438} checksum_log_ts=0")Log interpretation
Log information Description *this The waiter, namely, the transaction waiting for a lock. conflict_ctx The blocker, which can be the lock holder. pkey The table ID and partition ID. rowkey_object The row key. trans_id The transaction ID. Slow queries
Search keyword:
[slow query]Log description: detailed SQL trace information.
[admin@OB /home/admin/oceanbase/log] $grep "slow query" observer.logThe return result is as follows:
[2022-02-18 11:49:32.557473] TRACE [TRACE]obmp_base.cpp:948 [1837144][3090][XXXXXXXXXX-XXXXXXXXXXXXXX] [lt=2] [dc=0] [slow query] (TRACE=begin_ts=1645156047499895 2022-02-18 03:47:27.499895 |[process_begin] u=0 in_queue_time:16, receive_ts:1645156047499878, enqueue_ts:1645156047499879 |[query_begin] u=1 trace_id:XXXXXXXXXX-XXXXXXXXXXXXXX |[session] u=4 sid:3221502589, tenant_id:1| ...... |[parse_begin] u=17 stmt:"select count(*) from TEST1 fetch first 1000 rows only", stmt_len:58 |[cache_get_plan_begin] u=3 |[cache_get_plan_end] u=18 ...... |[cg_begin] u=1 |[cg_end] u=137 |[plan_id] u=96 plan_id:91455 |[exec_begin] u=2 arg1:false, end_trans_cb:false |[do_open_plan_begin] u=2 plan_id:91455 ...... |[result_set_close] u=0 ret:-5066, arg1:-5066, arg2:0, arg3:-5066, async:false |[session] u=34 sid:3226669564, tenant_id:1034 |[exec_end] u=21 |[query_end] u=125 |[process_end] u=38 run_ts:1645156047499900 |total_timeu=125057508)Log interpretation
Log information Description [eventbegin] and [eventend] The start and end of internal events during the execution of the SQL statement. u=xxx The interval between two trace events. in_queue_time The queuing time before the SQL statement starts to be executed. trace_id The trace ID of the SQL statement. sid The ID of the session where the SQL statement is executed. stmt The SQL statement. plan_id The ID of the execution plan used for the SQL statement. ret The return code (error code). total_timeu The total time spent in executing the SQL statement. Slow transactions
Search keyword:
[slow trans]Log description: the call link and time consumption information of the execution of the transaction.
[admin@OB /home/admin/oceanbase/log] $grep "slow trans" observer.logThe return result is as follows:
observer.log:[2021-10-08 11:45:44.313295] TRACE [TRACE]:0 [9382][2334][XXXXXXXXXX-XXXXXXXXXXXXXXXXX] [lt=23] [dc=39] [slow trans] (TRACE=begin_ts=1633664744167732 2021-10-08 03:45:44.167732 [init] u=0 arg1:140542173395392, ctx_type:"scheduler", trans_type:2, trans_id:{hash:7103234888693285108, inc:532846, addr:{ip:"192.168.1.1", port:40000}, t:1633664744167658}, pkey:{tid:1000, partition_id:1000, part_cnt:1000}, arg1:false, uref:1073741824 [set_stc] u=1 stc:1633664744167733 [start_trans] u=2 ret:0, uref:1073741825 [start_stmt] u=7 ret:0, tenant_id:1003, sql_no:4294967297, phy_plan_type:3, stmt_type:2, sql_id:"AA2155E24549E97", trace_id:Y9C4064586A6B-0005CDCF253FBE70, start_time:1633664744166256, uref:1073741825 [end_stmt] u=30832 ret:0, is_rollback:false, rollback_to:4294967297, rollback_from:4294967297, uref:1073741825 [start_stmt] u=50000 ret:0, tenant_id:1003, sql_no:4294967298, phy_plan_type:3, stmt_type:2, sql_id:"AA2155E24549E97", trace_id:Y9C4064586A6B-0005CDCF253FBE70, start_time:1633664744166256, uref:1073741825 [end_stmt] u=10 ret:0, is_rollback:false, rollback_to:4294967298, rollback_from:4294967298, uref:1073741825 [submit_commit] u=52 ret:0, uref:1073741824 [end_trans] u=35 ret:0, is_rollback:false, uref:1073741825 [exiting] u=114618 uref:1073741824, arg1:3221531139, arg2:7230647234361492284 [handle_message] u=1 ret:0, msg_type:1, uref:1 [end_trans_cb] u=13 ret:0, param:0, param:true [destroy] u=1 uref:0 total_timeu=195572)Log interpretation
Log information Description ctx_type The type of the transaction context. Valid values: scheduler,coordinator, andparticipant.trans_type The type of the transaction. Valid values: 0: SP_TRANS (single partition)1: MINI_SP_TRANS (read-only single partition)2: DIST_TRANS (distributed)
trans_id The ID of the transaction. pkey The table ID and partition ID of the object being modified in the current transaction context. [start_trans]/[end_trans] Information about the start or end of the transaction context. [start_stmt]/[end_stmt] Information about the start or end of the execution of the SQL statement, including the SQL ID and trace ID. u=xxx The interval between two trace events. total_timeu The total time spent in processing the transaction. Slow network transmission
Search keyword:
packet fly cost too much timeLog description: too much time spent by the RPC thread in sending the packet.
[admin@OB /home/admin/oceanbase/log] $grep "packet fly cost too much time" observer.logThe return result is as follows:
[2022-10-27 23:09:58.317505] WARN [RPC.OBRPC] decode (ob_rpc_net_handler.cpp:119) [31273] [Y00000000000000000] [lt=8] packet fly cost too much time(pcode=2311, fly_ts=769195, send_timestamp=1666883397548305)Log interpretation
Log information Description pcode The processing event code. You can query __all_virtual_obrpc_statfor the meaning of the code.fly_ts The duration from when the source encodes and sends the packet to when the destination receives and decodes the packet. send_timestamp The timestamp when the source sends the packet. Note
Slow RPC transmission may be caused by the network latency, clock offset, system load, and busy status of RPC threads.
Network throttling
Search keyword:
network speed changed,reload_bandwidth_throttle_limit, orinit_bandwidth_throttleLog description: network throttling settings for system tasks on an OBServer node.
[admin@OB /home/admin/oceanbase/log] $grep "network speed changed|reload_bandwidth_throttle_limit|init_bandwidth_throttle" observer.logThe return result is as follows:
[2021-06-10 11:10:49.397019] INFO [SERVER] ob_server.cpp:2018 [72580][0][Y0-0000000000000000-0-0] [lt=8] [dc=0] succeed to init_bandwidth_throttle(sys_bkgd_net_percentage_=60, network_speed=1310720000, rate=786432000): [2021-06-10 11:14:44.905396] INFO [SERVER] ob_server.cpp:2385 [72582][0][Y0-0000000000000000-0-0] [lt=11] [dc=0] network speed changed(from=1310720000, to=1048576000) [2021-06-10 11:14:44.905421] INFO [SERVER] ob_server.cpp:2055 [72582][0][Y0-0000000000000000-0-0] [lt=3] [dc=0] succeed to reload_bandwidth_throttle_limit(old_percentage=60, new_percentage=60, network_speed=1048576000, rate=629145600)Log interpretation
Log information Description sys_bkgd_net_percentage_ The percentage of network bandwidth resources for background threads. The value is sourced from the system parameter sys_bkgd_net_percentage. By default, 60% of the bandwidth resources of the network interface card (NIC) are available for background threads.network_speed The NIC bandwidth, which is specified based on the related configuration file. rate The network bandwidth for background threads, which is the value of network_speedmultiplied bysys_bkgd_net_percentage.Log disk write failure
Search keyword:
submit_log failedLog description: information about log disk write failure.
[admin@OB /home/admin/oceanbase/log] $grep "submit_log failed" observer.logThe return result is as follows:
[2023-02-14 16:51:10.255801] ERROR [CLOG] submit_log (ob_partition_log_service.cpp:468) [893][0][Y0-0000000000000000-0-0] [lt=19] submit_log failed(ret=-4264, partition_key={tid:1099511627916, partition_id:0, part_cnt:0}, buff=0xfff80a4aef70, size=805, base_timestamp=1676065578191066, is_trans_related_log=true, role=1, state=4)Log interpretation
Log information Description ret The error code. partition_key The table ID and partition ID of the related object. Errors requiring attention of DBAs
Search keyword:
ERROR issue_dba_errorLog description: errors that require the attention of database administrators (DBAs).
[admin@OB /home/admin/oceanbase/log] $grep "ERROR issue_dba_error" observer.logThe return result is as follows:
2023-05-11 16:05:27.966611] ERROR issue_dba_error (ob_log.cpp:2322) [79165][0][Y0-0000000000000000-0-0] [lt=18] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=0, file="ob_log_file_group.cpp", line_no=412, info="clog disk is almost full") [2023-05-11 16:45:58.135783] ERROR issue_dba_error (ob_log.cpp:2322) [77332][0][Y0-0000000000000000-0-0] [lt=8] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4264, file="ob_partition_log_service.cpp", line_no=468, info="submit_log failed") [2023-05-11 16:45:59.448092] ERROR issue_dba_error (ob_log.cpp:2322) [77584][0][Y0-0000000000000000-0-0] [lt=10] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=0, file="ob_election.cpp", line_no=2077, info="leader_revoke, please attention!")Log interpretation
Log information Description internal errcode The error code. file The source code with the error. info The error information.
rootservice.log
| Log keyword | Description |
|---|---|
| Rebalance task manager | The statistics of load balancing tasks. |
| [DDL] execute ddl like stmt | Information about an executed DDL statement. |
| choose leader info | Information about leader switching. |
| partition can't replicate | Information about a partition replication failure. |
| check merge progress success | The major compaction progress. |
| replica not merged | Information about a failed major compaction for a partition. |
Load balancing task statistics
Search keyword:
Rebalance task manager statisticsLog description: statistics of load balancing tasks.
[admin@OB /home/admin/oceanbase/log] $grep "Rebalance task manager statistics" observer.logThe return result is as follows:
[2023-05-10 05:22:50.862554] INFO [RS] ob_rebalance_task_mgr.cpp:1887 [65048][0][XXXXXXXXXX-XXXXXXXXXXXXXX-X-X] [lt=14] [dc=0] Rebalance task manager statistics(waiting_replication_task_cnt=0, executing_replication_task_cnt=0, waiting_migration_task_cnt=0, executing_migration_task_cnt=0)Log interpretation
Log information Description waiting_replication_task_cnt The number of pending replication tasks. executing_replication_task_cnt The number of ongoing replication tasks. waiting_migration_task_cnt The number of pending migration tasks. executing_migration_task_cnt The number of ongoing replication tasks. Major compaction progress
Search keyword:
check merge progress successorsucc to finish zone mergeLog description: the major compaction progress.
[admin@OB /home/admin/oceanbase/log] $grep "check merge progress success" observer.logor
[admin@OB /home/admin/oceanbase/log] $grep "succ to finish zone merge" observer.logThe return result is as follows:
[2021-04-07 10:37:16.451302] INFO [RS] ob_daily_merge_scheduler.cpp:1840 [4532][XXXXXXXXXX-XXXXXXXXXXXXXX] [lt=34] succeed to add_finish_zone_merge_op(zone=bank.ea133_8, last_merged_version=10044, all_merged_version=10044) [2021-04-07 10:37:18.114716] INFO [RS] ob_daily_merge_scheduler.cpp:1986 [4532][XXXXXXXXXX-XXXXXXXXXXXXXX] [lt=14] succ to finish zone merge(zone=bank.ea133_8, last_merged_version=10044, all_merged_version=10044)or
[2023-05-16 20:04:48.270903] INFO [RS] ob_partition_table_util.cpp:317 [87321][0][XXXXXXXXXX-XXXXXXXXXXXXXX] [lt=10] [dc=0] check merge progress success(version=38, all_majority_merged=false, all_progress=[{zone:"zone1", unmerged_partition_cnt:2301, unmerged_data_size:126884638603, merged_partition_cnt:0, merged_data_size:0, smallest_data_version:37, merged_partition_percentage:0, merged_data_percentage:0}, {zone:"zone2", unmerged_partition_cnt:0, unmerged_data_size:0, merged_partition_cnt:2301, merged_data_size:126884950766, smallest_data_version:38, merged_partition_percentage:100, merged_data_percentage:100}, {zone:"zone3", unmerged_partition_cnt:2301, unmerged_data_size:126884638603, merged_partition_cnt:0, merged_data_size:0, smallest_data_version:37, merged_partition_percentage:0, merged_data_percentage:0}], checking_time_us=202635)Log interpretation
Log information Description version The current major compaction version. all_majority_merged Indicates whether the major compaction is completed for all zones. Valid values: trueandfalse.all_progress The major compaction progress of each zone.
DDL execution
Search keyword:
[DDL] execute ddl like stmtLog description: information about an executed DDL statement.
[admin@OB /home/admin/oceanbase/log] $grep "[DDL] execute ddl like stmt" observer.logThe return result is as follows:
[2023-05-16 10:50:10.834911] INFO [RS] ob_rs_rpc_processor.h:249 [87188][0][XXXXXXXXXX-XXXXXXXXXXXXXX] [lt=17] [dc=0] [DDL] execute ddl like stmt(ret=0, cost=44039, ddl_arg={ddl_stmt_str:"CREATE INDEX "IND_PA_ACCTSENIOR" on "T_PA_ACCTSENIOR" ( "FSENIOR_CODE" ) GLOBAL", exec_tenant_id:1005, ddl_id_str:"", is_replay_schema:false, based_schema_object_infos:[{schema_id:1105009185964905, schema_type:5, schema_version:1684205410682568}]})Log interpretation
Log information Description ret The return code of the DDL statement. cost The execution time. ddl_arg The DDL statement. schema_type The type of the object created. Leader switching
Search keyword:
choose leader infoLog description: information about leader switching performed for a partition.
[admin@OB /home/admin/oceanbase/log] $grep "choose leader info" observer.logThe return result is as follows:
[2023-05-10 17:41:28.833956] INFO [RS] ob_leader_coordinator.cpp:5004 [65037][0][XXXXXXXXXX-XXXXXXXXXXXXXX-X-X] [lt=27] [dc=0] choose leader info(tenant_id=1005, tablegroup_id=18446744073709551615, server="192.168.1.1:2882", info={server_addr:"192.168.1.1:2882", zone:"zone2", balance_group_score:9223372036854775807, region_score:{region_:"beijing", region_score_:0}, not_merging:true, start_service:true, zone_candidate_count:1, candidate_count:1, is_candidate:true, not_excluded:true, migrate_out_or_transform_count:0, zone_migrate_out_or_transform_count:0, in_normal_unit_count:1, zone_score:{zone_:"zone2", zone_score_:0}, original_leader_count:0, random_score:3, cur_leader_count:1, in_revoke_blacklist_count:0, partition_id:0}, is_stopped=false, is_zone_active=true)Log interpretation
Log information Description tablegroup_id The ID of the partition or table group for which leader switching is performed. server The OBServer node where the new leader resides. is_stopped The status of the OBServer node where the new leader resides. is_zone_active The status of the zone where the new leader resides. Major compaction failure of partitions
Search keyword:
replica not mergedLog description: information about a failed major compaction for a partition.
[admin@OB /home/admin/oceanbase/log] $grep "replica not merged" observer.logThe return result is as follows:
[2021-04-07 10:47:36.835217] INFO [RS] ob_partition_table_util.cpp:180 [4532][XXXXXXXXXX-XXXXXXXXXXXXXX] [lt=5] replica not merged to version(version=10044, replica={table_id:{value:1106108697592732, first:1006, second:50076}, partition_id:0, partition_cnt:1, zone:"bank.mub_2", server:"xxx.xxx.xxx.xxx:2882", sql_port:2881, unit_id:1017, role:1, member_list:[{server:"xxx.xxx.xxx.xxx:2882", timestamp:1557500921627412}, {server:"xxx.xxx.xxx.xxx:2882", timestamp:1560866621699092}, {server:"xxx.xxx.xxx.xxx:2882", timestamp:1592984233874976}, {server:"xxx.xxx.xxx.xxx:2882", timestamp:1614240480439532}, {server:"xxx.xxx.xxx.xxx:2882", timestamp:1614911835988424}], row_count:8635155074, data_size:2061390087129, data_version:10043, data_checksum:3763772143, row_checksum:row:[0], modify_time_us:1617652580558918, create_time_us:1560880959641988, member_time_us:1560866621699092, is_original_leader:false, in_member_list:true, rebuild:false, to_leader_time:1603047186742602, replica_status:"REPLICA_STATUS_NORMAL", replica_type:0, required_size:2295137828864, status:"REPLICA_STATUS_NORMAL", is_restore:0, partition_checksum:14438911093, quorum:5})Log interpretation
Log information Description version The global major compaction version. table_id:{value:v0, first:v1, second:v2} v0representstable_id,v1representstenant_id, andv2representspure table_id.data_version The major version of the current partition. replica_type The replica type. 0indicates a full-featured replica,5indicates a log-only replica, and16indicates a read-only replica.quorum The number of Paxos replicas.
election.log
| Log keyword | Description |
|---|---|
| leader lease is expired | Indicates that the leader lease has expired and a new leader needs to be elected. |
| leader takeover success | Indicates that a new leader is successfully elected. |
| leader revoke | Indicates that the leader is revoked. |
| current leader is invalid | A warning log indicating that the leader is invalid. |
| leader_revoke, please attention! | An error log indicating that the leader has been revoked. |
Search keyword: leader revoke or leader_revoke
Log description: information about leader revocation.
[admin@OB /home/admin/oceanbase/log] $grep "leader revoke|leader_revoke" observer.log
The return result is as follows:
[2022-05-08 22:41:38.827689] INFO [ELECT] ob_election.cpp:2072 [201777][0][Y0-0000000000000000-0-0] [lt=1526] take the initiative leader revoke(partition={tid:1100611139404082, partition_id:0, part_cnt:0}, revoke reason="clog disk full")
[2022-05-08 22:41:38.827723] ERROR [ELECT] leader_revoke (ob_election.cpp:2077) [201777][0][Y0-0000000000000000-0-0] [lt=8] leader_revoke, please attention!(revoke reason="clog disk full", election={partition:{tid:1100611139404082, partition_id:0, part_cnt:0}, is_running:true, is_changing_leader:false, self:"xxx.xxx.xxx.xxx:2882", proposal_leader:"0.0.0.0", cur_leader:"xxx.xxx.xxx.xxx:2882", curr_candidates:3{server:"xxx.xxx.xxx.xxx:2882", timestamp:1650642269999783, flag:0, need_encrypt:false}{server:"xxx.xxx.xxx.xxx:2882", timestamp:1651730035655503, flag:0, need_encrypt:false}{server:"xxx.xxx.xxx.xxx:2882", timestamp:1650642269999783, flag:0, need_encrypt:false}, curr_membership_version:1651730068793522, leader_lease:[1651816304570000, 1651816418400000], election_time_offset:5570000, active_timestamp:1651808137937449, T1_timestamp:1651816404000000, leader_epoch:1651816299000000, state:1, role:1, stage:-1, type:-1, replica_num:3, unconfirmed_leader:"xxx.xxx.xxx.xxx:2882", unconfirmed_leader_lease:[1651816304570000, 1651816419000000], takeover_t1_timestamp:1651816402600000, is_need_query:false, valid_candidates:0, change_leader_timestamp:0, ignore_log:false, leader_revoke_timestamp:1651816301770881, vote_period:4, lease_time:9800000, move_out_timestamp:1651816300405697, eg_part_array_idx:40, eg_id_hash:6130129687449640250}) BACKTRACE:0xf5c6965 0xf45eeca 0xf44866a 0x530ea4d 0x9bbea6b 0x9b2c627 0x9b2c0cc 0x9b77a60 0x987a46b 0xb472a62 0x93845b5 0x9384519 0x5260906 0x66daf5d 0xf33ee22 0xf33ec7e 0xf5da8ce
Log interpretation
| Log information | Description |
|---|---|
| revoke reason | The reason why the leader revocation is triggered. |
| is_running | Indicates whether an election is in progress. |
| is_changing_leader | Indicates whether the leader is being changed. |
| self | The current node. |
| proposal_leader | The proposed leader. |
| cur_leader | The current leader. |
| curr_candidates | The list of candidate nodes. |
| leader_lease | The start time and end time of the leader lease. |
| unconfirmed_leader | The unconfirmed leader node. |
| leader_revoke_timestamp | The time when the leader revocation occurs. |