This topic describes the log interpretation guide of OceanBase Database, including log interpretation, log location, log structure, and log interpretation in key scenarios.
observer.log
The log file for the main service process, which records the operations of all modules except for elections and RootService, including SQL parsing, transaction execution, and storage engine behavior.
Observer startup log
The Observer startup log records key events during the initialization and service readiness of the process, which is used for diagnosing issues during the startup phase.
Method to locate logs
Use the keyword
[NOTICE]to locate the logs.Log location command
[admin@OB /home/admin/oceanbase/log] $grep "NOTICE" observer.logTypical output example
[2023-05-11 14:19:09.703272] INFO [SERVER] ob_server.cpp:533 [95303][0][Y0-0000000000000000-0-0] [lt=9] [dc=0] [NOTICE] server instance start succeed [2023-05-11 14:19:25.555836] INFO [SERVER] ob_server.cpp:609 [95303][0][Y0-0000000000000000-0-0] [lt=16] [dc=0] [NOTICE] observer start service(start_service_time=1683785965555827)Key fields explanation
Log information Description server instance start succeed Indicates that the Observer process has started successfully. observer start service Indicates that the Observer has completed startup and is now providing services; start_service_timeis the timestamp when the Observer started providing services.
Statistics Log
Tenant disk usage statistics
The tenant disk usage is printed every 10 seconds.
Log location method
You can search for the keywords
ob_tenant_disk_usage_mgr.cppordump tenant disk usage update infoto locate the log.Log location command
[admin@OB /home/admin/oceanbase/log] $grep "ob_tenant_disk_usage_mgr.cpp" observer.logor
[admin@OB /home/admin/oceanbase/log] $grep "dump tenant disk usage update info" observer.logSample output
[2023-05-15 17:30:59.055858] INFO [STORAGE] ob_tenant_disk_usage_mgr.cpp:69 [87424][0][Y0-0000000000000000-0-0] [lt=11] [dc=0] dump tenant disk usage update info(disk_usage={tenant_id:1, max_disk_size:0, used_data_size:935329792, used_meta_size:56623104})Key fields
Field Description tenant_id The ID of the tenant. max_disk_size The maximum disk size of the tenant. In OceanBase Database V4.x and earlier, there is no limit on the disk usage of a tenant. used_data_size The disk space occupied by user data in the tenant. used_meta_size The disk space occupied by management data in the tenant.
Disk I/O statistics
Basic I/O throughput and performance bottleneck statistics, printed every 10 seconds.
Log location method
Locate the log by searching for the keyword
Current io stat(V2.x) orcurrent io status(V3.x).Log location command
Current io stat command
[admin@OB /home/admin/oceanbase/log] $grep "Current io stat" observer.logSample output
[2022-12-27 17:28:19.298670] INFO [COMMON] ob_io_disk.cpp:497 [47844][406][Y0-0000000000000000] [lt=59] [dc=0] Current io stat, (fd={fd:875, disk_id={disk_idx:0, install_seq:0}}, io_conf={sys_io_low_percent:90, sys_io_high_percent:90, user_iort_up_percent:2000, cpu_high_water_level:7680, write_failure_detect_interval:60000000, read_failure_black_list_interval:300000000, data_storage_warning_tolerance_time:10000000, data_storage_error_tolerance_time:300000000, disk_io_thread_count:8, callback_thread_count:8, large_query_io_percent:0, data_storage_io_timeout_ms:10000}, sys_io_percent=90, sys_iops_up_limit=266, user_max_rt=1.151363996887207031e+04, cpu_estimator={avg_usage:378}, user_read_io_stat={average_size:18245, average_rt_us:"4.766969696969696884e+02", old_stat:{io_cnt:5435154700, io_bytes:92266120699904, io_rt_us:2896942021380}, new_stat:{io_cnt:5435154733, io_bytes:92266121302016, io_rt_us:2896942037111}}, user_write_io_stat={average_size:0, average_rt_us:"0.000000000000000000e+00", old_stat:{io_cnt:790, io_bytes:1568509952, io_rt_us:1369408}, new_stat:{io_cnt:790, io_bytes:1568509952, io_rt_us:1369408}}, sys_read_io_stat={average_size:0, average_rt_us:"0.000000000000000000e+00", old_stat:{io_cnt:29673855, io_bytes:62197411557376, io_rt_us:62528978207}, new_stat:{io_cnt:29673855, io_bytes:62197411557376, io_rt_us:62528978207}}, sys_write_io_stat={average_size:0, average_rt_us:"0.000000000000000000e+00", old_stat:{io_cnt:24420687, io_bytes:51213892583424, io_rt_us:56376648298}, new_stat:{io_cnt:24420687, io_bytes:51213892583424, io_rt_us:56376648298}}, large_query_read_io_stat={average_size:0, average_rt_us:"0.000000000000000000e+00", old_stat:{io_cnt:0, io_bytes:0, io_rt_us:0}, new_stat:{io_cnt:0, io_bytes:0, io_rt_us:0}}, large_query_write_io_stat={average_size:0, average_rt_us:"0.000000000000000000e+00", old_stat:{io_cnt:0, io_bytes:0, io_rt_us:0}, new_stat:{io_cnt:0, io_bytes:0, io_rt_us:0}})Key fields
Log information Description fd and disk_id The file descriptor number fd, the disk index numberdisk_idx, and the installation sequence numberinstall_seq.io_config The I/O configuration parameters of the system: - sys_io_low_percent: the low percentage of system I/O
- sys_io_high_percent: the high percentage of system I/O
- user_iort_up_percent: the upper limit of user I/O
- cpu_high_water_level: the high water level of CPU
- disk_io_thread_count: the number of disk I/O threads
- callback_thread_count: the number of callback threads
- large_query_io_percent: the percentage of large query I/O
- data_storage_io_timeout_ms: the timeout in milliseconds for data storage I/O
sys_io_percent The upper limit of the percentage of system I/O usage. sys_iops_up_limit The upper limit of the 2 MB IOPS of the system, which should be consistent with the IO bench test results in the io_resource.conffile.user_max_rt The maximum response time of user I/O. cpu_estimator(avg_usage) The average CPU usage of user I/O. user_read_io_stat The read I/O statistics generated by normal user requests (such as SQL queries) - average_size: the average size
- average_rt_us: the average response time
- old_stat: the old statistics data
- io_cnt: the number of I/O operations
- io_bytes: the number of I/O bytes
- io_rt_us: the I/O response time
- new_stat: the new statistics data
- io_cnt: the number of I/O operations
- io_bytes: the number of I/O bytes
- io_rt_us: the I/O response time
user_write_io_stat The write I/O statistics generated by normal user requests (such as DML operations - average_size: the average size
- average_rt_us: the average response time
- old_stat: the old statistics data
- io_cnt: the number of I/O operations
- io_bytes: the number of I/O bytes
- io_rt_us: the I/O response time
- new_stat: the new statistics data
- io_cnt: the number of I/O operations
- io_bytes: the number of I/O bytes
- io_rt_us: the I/O response time
sys_read_io_stat The read I/O statistics of system background tasks (such as major compactions and data migration) - average_size: the average size
- average_rt_us: the average response time
- old_stat: the old statistics data
- io_cnt: the number of I/O operations
- io_bytes: the number of I/O bytes
- io_rt_us: the I/O response time
- new_stat: the new statistics data
- io_cnt: the number of I/O operations
- io_bytes: the number of I/O bytes
- io_rt_us: the I/O response time
sys_write_io_stat The write I/O statistics of system background tasks (such as data dumps and log synchronization - average_size: the average size
- average_rt_us: the average response time
- old_stat: the old statistics data
- io_cnt: the number of I/O operations
- io_bytes: the number of I/O bytes
- io_rt_us: the I/O response time
- new_stat: the new statistics data
- io_cnt: the number of I/O operations
- io_bytes: the number of I/O bytes
- io_rt_us: the I/O response time
large_query_read_io_stat The read I/O statistics generated by large queries (executed beyond the threshold) - average_size: the average size
- average_rt_us: the average response time
- old_stat: the old statistics data
- io_cnt: the number of I/O operations
- io_bytes: the number of I/O bytes
- io_rt_us: the I/O response time
- new_stat: the new statistics data
- io_cnt: the number of I/O operations
- io_bytes: the number of I/O bytes
- io_rt_us: the I/O response time
large_query_write_io_stat The write I/O statistics generated by large queries (such as batch writes) - average_size: the average size
- average_rt_us: the average response time
- old_stat: the old statistics data
- io_cnt: the number of I/O operations
- io_bytes: the number of I/O bytes
- io_rt_us: the I/O response time
- new_stat: the new statistics data
- io_cnt: the number of I/O operations
- io_bytes: the number of I/O bytes
- io_rt_us: the I/O response time
io_ability This section contains the I/O performance measurement results of the system, including the IOPS (input/output operations per second) and response time rt_usfor different modes (read/write) and different sizes of I/O operations.- log_io
- user_io sys_io
- prewarm_io
- large_query_io
These sections contain the status and device capabilities of different types of I/O, including whether they are initialized is_inited_, device capabilitiesdevice_ability_, the shared percentageshare_percentage_, and the last deadline timestamplast_deadline_ts_.- log_read
- log_write user_read
- user_write sys_read
- sys_write prewarm_read
- prewarm_write
- large_query_read
- large_query_write
This section contains the statistics of different types of I/O operations, including the starting IOPS start_iops, canceled IOPScancel_iops, failed IOPSfail_iops, successful IOPSsucc_iops, average bytesavg_byte, device response timedevice_rt_us, and queue response timequeue_rt_us.avg_cpu This section displays the average CPU usage of the system. The I/O speed limit for OceanBase Database is controlled by
sys_bkgd_io_low_percentageandsys_bkgd_io_high_percentage, which represent the lower and upper limits ofsys_io_percent, respectively.To determine whether I/O throttling has occurred, you can use the following methods based on the io stat information:
If
sys_io_high_percent,sys_io_low_percent,sys_bkgd_io_high_percentage, andsys_bkgd_io_low_percentageare inconsistent, I/O throttling is occurring.If
user_read_io_stathas an average_size of 0, I/O throttling is occurring.
current io status command
[admin@OB /home/admin/oceanbase/log] $grep "current io status" observer.logSample output
[2023-06-21 16:54:14.212314] INFO [COMMON] ob_io_struct.cpp:2129 [89617][0][Y0-0000000000000000-0-0] [lt=19] [dc=0] current io status(io_config={sys_io_low_percent_:70, sys_io_high_percent_:90, user_iort_up_percent_:100, cpu_high_water_level_:2560, write_failure_detect_interval_:60000000, read_failure_black_list_interval_:300000000, data_storage_warning_tolerance_time_:30000000, data_storage_error_tolerance_time_:300000000, disk_io_thread_count_:8, callback_thread_count_:8, large_query_io_percent_:0, data_storage_io_timeout_ms_:120000, io_memory_limit_:2147483648, enable_io_tracer_:false}, io_ability={measure_items:[[{mode:0, size:4096, iops:120191.00, rt_us:180.58}, {mode:0, size:8192, iops:155470.00, rt_us:198.67}, {mode:0, size:16384, iops:122040.00, rt_us:259.74}, {mode:0, size:32768, iops:76429.00, rt_us:416.30}, {mode:0, size:65536, iops:41826.00, rt_us:762.89}, {mode:0, size:131072, iops:21994.00, rt_us:1452.21}, {mode:0, size:262144, iops:11247.00, rt_us:2841.93}, {mode:0, size:524288, iops:5649.00, rt_us:5658.40}, {mode:0, size:1048576, iops:2828.00, rt_us:11302.68}], [{mode:1, size:4096, iops:127929.00, rt_us:247.71}, {mode:1, size:8192, iops:120343.00, rt_us:263.40}, {mode:1, size:16384, iops:115208.00, rt_us:275.45}, {mode:1, size:32768, iops:76882.00, rt_us:414.05}, {mode:1, size:65536, iops:39596.00, rt_us:805.95}, {mode:1, size:131072, iops:20989.00, rt_us:1522.00}, {mode:1, size:262144, iops:10634.00, rt_us:3004.64}, {mode:1, size:524288, iops:5355.00, rt_us:5970.53}, {mode:1, size:1048576, iops:2692.00, rt_us:11877.67}, {mode:1, size:2097152, iops:1309.00, rt_us:24402.57}]]}, log_io={is_inited_:true, device_ability_:null, share_percentage_:100, last_deadline_ts_:0}, user_io={is_inited_:true, device_ability_:null, share_percentage_:100, last_deadline_ts_:0}, sys_io={is_inited_:true, device_ability_:0x32bb7ba8, share_percentage_:90, last_deadline_ts_:1687334746495760}, prewarm_io={is_inited_:true, device_ability_:0x32bb7ba8, share_percentage_:100, last_deadline_ts_:0}, large_query_io={is_inited_:true, device_ability_:0x32bb7ba8, share_percentage_:100, last_deadline_ts_:0}, log_read={}, log_write={start_iops:29.93, cancel_iops:0.00, fail_iops:0.00, succ_iops:29.93, avg_byte:5461.33, device_rt_us:140.00, queue_rt_us:7.33}, user_read={}, user_write={}, sys_read={}, sys_write={}, prewarm_read={}, prewarm_write={}, large_query_read={}, large_query_write={}, avg_cpu=2.126034912718204453e+02)Key fields
Log information Description io_config This section contains the system's I/O configuration parameters, such as the low and high percentage thresholds for system I/O ( sys_io_low_percent_andsys_io_high_percent_), the upper limit for user I/O (user_iort_up_percent_), the CPU high watermark (cpu_high_water_level_), the number of disk I/O threads (disk_io_thread_count_), the number of callback threads (callback_thread_count_), the percentage for large query I/O (large_query_io_percent_), the timeout for data storage I/O in milliseconds (data_storage_io_timeout_ms_), and the I/O memory limit (io_memory_limit_).io_ability This section contains the system's I/O performance measurement results, including IOPS (Input/Output Operations Per Second) and response time ( rt_us) for different modes (read/write) and different sizes of I/O operations.- log_io
- user_io sys_io
- prewarm_io
- large_query_io
These sections contain the status and device capabilities for different types of I/O, including whether the device is initialized ( is_inited_), device capabilities (device_ability_), shared percentage (share_percentage_), and the last deadline timestamp (last_deadline_ts_).- log_read
- log_write user_read
- user_write sys_read
- sys_write prewarm_read
- prewarm_write
- large_query_read
- large_query_write
This section contains statistics for different types of I/O operations, including the starting IOPS ( start_iops), canceled IOPS (cancel_iops), failed IOPS (fail_iops), successful IOPS (succ_iops), average bytes (avg_byte), device response time (device_rt_us), and queue response time (queue_rt_us).avg_cpu This section displays the system's average CPU usage.
To determine if I/O throttling is occurring based on the io stat information provided:
* If `sys_io_high_percent` and `sys_io_low_percent` do not match the configuration parameters `sys_bkgd_io_high_percentage` and `sys_bkgd_io_low_percentage`, it indicates that I/O throttling is in effect.
* If `last_deadline_ts_` is greater than the time when the log was printed, it means that I/O has been throttled.
Tenant memory statistics
The system prints tenant memory usage statistics every 10 seconds.
Method for locating logs
You can search for the keywords
[MEMORY]orob_malloc_allocator.cppto locate the logs.Commands for locating logs
[admin@OB /home/admin/oceanbase/log] $grep "MEMORY" observer.logor
[admin@OB /home/admin/oceanbase/log] $grep "ob_malloc_allocator" observer.logSample output
[2023-06-05 12:09:21.701235] INFO [LIB] ob_malloc_allocator.cpp:495 [88518][0][Y0-0000000000000000-0-0] [lt=5] [dc=0] [MEMORY] tenant: 1, limit: 12,884,901,888 hold: 8,524,922,880 rpc_hold: 0 cache_hold: 6,773,800,960 cache_used: 6,773,800,960 cache_item_count: 3,230 [MEMORY] ctx_id= DEFAULT_CTX_ID hold_bytes= 658,505,728 [MEMORY] ctx_id= MEMSTORE_CTX_ID hold_bytes= 383,778,816 [MEMORY] ctx_id= EXECUTE_CTX_ID hold_bytes= 16,777,216 [MEMORY] ctx_id= SQL_EXEC_CTX_ID hold_bytes= 77,594,624 [MEMORY] ctx_id= TRANS_CTX_MGR_ID hold_bytes= 8,388,608 [MEMORY] ctx_id= REPLAY_STATUS_CTX_ID hold_bytes= 6,291,456 [MEMORY] ctx_id= PLAN_CACHE_CTX_ID hold_bytes= 553,648,128 [MEMORY] ctx_id= WORK_AREA hold_bytes= 27,262,976 [MEMORY] ctx_id= TX_CALLBACK_CTX_ID hold_bytes= 18,874,368 [2023-06-05 12:09:21.701307] INFO [LIB] ob_tenant_ctx_allocator.cpp:167 [88518][0][Y0-0000000000000000-0-0] [lt=16] [dc=0] [MEMORY] tenant_id= 1 ctx_id= DEFAULT_CTX_ID hold= 658,505,728 used= 553,301,808 [MEMORY] hold= 371,125,248 used= 368,276,480 count= 139 avg_used= 2,649,471 mod=MysqlRequesReco [MEMORY] hold= 52,416,000 used= 52,403,200 count= 25 avg_used= 2,096,128 mod=OB_KVSTORE_CACHE [MEMORY] hold= 49,029,760 used= 48,846,080 count= 24 avg_used= 2,035,253 mod=SqlPlanMon [MEMORY] hold= 35,543,040 used= 35,192,202 count= 534 avg_used= 65,903 mod=SqlDtl [MEMORY] hold= 14,696,192 used= 12,591,876 count= 14 avg_used= 899,419 mod=LOCALDEVICE [MEMORY] hold= 8,386,560 used= 8,384,512 count= 4 avg_used= 2,096,128 mod=Election [MEMORY] hold= 6,298,560 used= 6,281,216 count= 257 avg_used= 24,440 mod=BlockMap [MEMORY] hold= 6,289,920 used= 6,288,384 count= 3 avg_used= 2,096,128 mod=ElectionGroup [MEMORY] hold= 2,154,240 used= 408,600 count= 24,952 avg_used= 16 mod=NumberKey fields
Log information Description tenant The tenant ID. limit The memory upper limit of the tenant, in bytes. hold The total memory obtained by the tenant from the operating system, in bytes. rpc_hold The memory occupied by RPC threads, in bytes. cache_hold The memory occupied by KVCache, in bytes. cache_used The memory used by KVCache, in bytes. cache_item_count The number of KVCache items.
Process memory statistics
The memory usage statistics of the observer process are printed every 10 seconds.
Method for locating logs
Search for the keyword
[CHUNK_MGR]to locate the logs.Log location command
[admin@OB /home/admin/oceanbase/log] $grep "CHUNK_MGR" observer.logTypical output example
[2023-05-15 16:24:19.728623] INFO [COMMON] ob_tenant_mgr.cpp:643 [87101][0][Y0-0000000000000000-0-0] [lt=7] [dc=0] [CHUNK_MGR] free=40593 pushes=8989641 pops=8949048 limit=107,978,014,720 hold=106,384,326,656 used= 21,254,635,520 freelist_hold= 85,129,691,136 maps=602,804 unmaps=587,963 large_maps=599,637 large_unmaps=587,963 memalign=0 virtual_memory_used=111,853,719,552Key field description
Log information Description limit The total memory limit of the OBServer, in bytes. hold The total memory allocated by the OBServer from the operating system, in bytes. used The actual memory used by the OBServer (including cache), in bytes. freelist_hold The unused memory of the OBServer, in bytes. maps, unmaps The number of Mmap operations for each chunk_size monitored by the OBServer, in bytes. large_maps, large_unmaps The number of Maps and Unmaps monitored by the OBServer, in bytes.
Tenant thread queue statistics
The tenant thread queue statistics are printed every 10 seconds.
Method to locate the log
Search for the keywords
ob_multi_tenant.cppordump tenant info.Commands to locate the log
[admin@OB /home/admin/oceanbase/log] $grep "ob_multi_tenant.cpp" observer.logor
[admin@OB /home/admin/oceanbase/log] $grep "dump tenant info" observer.logSample output
[2023-05-15 17:31:43.710008] INFO [SERVER.OMT] ob_multi_tenant.cpp:812 [88299][0][Y0-0000000000000000-0-0] [lt=8] [dc=0] dump tenant info(tenant={id:1001, compat_mode:1, unit_min_cpu:"3.000000000000000000e+00", unit_max_cpu:"3.000000000000000000e+00", slice:"0.000000000000000000e+00", slice_remain:"0.000000000000000000e+00", token_cnt:12, ass_token_cnt:12, lq_tokens:1, used_lq_tokens:0, stopped:false, idle_us:9509657, recv_hp_rpc_cnt:0, recv_np_rpc_cnt:0, recv_lp_rpc_cnt:0, recv_mysql_cnt:0, recv_task_cnt:0, recv_large_req_cnt:0, tt_large_quries:0, actives:12, workers:12, nesting workers:7, lq waiting workers:0, req_queue:total_size=0 queue[0]=0 queue[1]=0 queue[2]=0 queue[3]=0 queue[4]=0 queue[5]=0 , large queued:0, multi_level_queue:total_size=0 queue[0]=0 queue[1]=0 queue[2]=0 queue[3]=0 queue[4]=0 queue[5]=0 queue[6]=0 queue[7]=0 })Key fields
Log information Description compat_mode The compatibility mode of the tenant - 0: MySQL-compatible mode
- 1: Oracle-compatible mode
unit_min_cpu The minimum number of CPU cores. unit_max_cpu The maximum number of CPU cores. slice The size of the tenant unit slice. slice_remain The remaining size of the tenant unit slice. token_cnt The number of tokens allocated by the scheduler. One token corresponds to one worker thread. ass_token_cnt The number of tokens currently confirmed by the tenant (generally equal to token_cnt). lq_tokens The number of large query tokens. used_lq_tokens The number of workers holding LQ tokens. stopped Indicates whether the tenant unit is being deleted. idle_us The total idle time of worker threads in one cycle (10s). Idle time is calculated as the time spent waiting in the queue. - recv_hp_rpc_cnt
- recv_np_rpc_cnt
- recv_lp_rpc_cnt
The number of RPC requests received by the tenant at different levels: HP (High), NP (Normal), and LP (Low). recv_mysql_cnt The number of MySQL requests received by the tenant. recv_task_cnt The number of internal tasks received by the tenant. recv_large_req_cnt The number of large requests predicted by the tenant, which only increases and never resets. It increases when retries occur. tt_large_quries The number of large requests processed by the tenant, which only increases and never resets. It increases when checkpoints are performed. actives The number of active worker threads. workers The number of worker threads of the tenant. nesting workers The number of worker threads dedicated to nested requests. Seven threads correspond to seven nested levels. lq waiting workers The number of worker threads waiting for scheduling. req_queue total_size indicates the total number of requests currently in the waiting queue. large queued The number of large query requests currently in the waiting queue. multi_level_queue The work queue for nested requests, corresponding to 7 nested levels (queue [0] is temporarily unused). The field indicates the number of queued requests at each level.
RPC thread statistics
The usage of RPC threads is printed every 1 second. Each log entry provides statistics for a specific RPC thread.
How to locate the logs
Search for the keywords
ob_net_easy.cppor[RPC EASY STAT].Commands to locate the logs
[admin@OB /home/admin/oceanbase/log] $grep "ob_net_easy.cpp" observer.logor
[admin@OB /home/admin/oceanbase/log] $grep "RPC EASY STAT" observer.logSample output
[2023-05-15 18:28:17.681483] INFO [RPC.FRAME] ob_net_easy.cpp:601 [88307][0][Y0-0000000000000000-0-0] [lt=18] [dc=0] [RPC EASY STAT](log_str="conn count=3/4, request done=7404635/5179373, request doing=0/0") [2023-05-15 18:28:17.682056] INFO [RPC.FRAME] ob_net_easy.cpp:601 [88310][0][Y0-0000000000000000-0-0] [lt=8] [dc=0] [RPC EASY STAT](log_str="conn count=3/3, request done=7402021/5162953, request doing=0/0") [2023-05-15 18:28:17.682108] INFO [RPC.FRAME] ob_net_easy.cpp:601 [88309][0][Y0-0000000000000000-0-0] [lt=11] [dc=0] [RPC EASY STAT](log_str="conn count=3/3, request done=7405199/5164208, request doing=0/0") [2023-05-15 18:28:17.682108] INFO [RPC.FRAME] ob_net_easy.cpp:601 [88308][0][Y0-0000000000000000-0-0] [lt=9] [dc=0] [RPC EASY STAT](log_str="conn count=3/3, request done=7402609/5163322, request doing=0/0")Key fields
Field Description conn count The number of connections handled by the RPC thread (in/out). request done The number of requests completed by the RPC thread (in/out). request doing The number of requests being processed by the RPC thread (in/out).
Thread statistics for archiving
Common thread statistics for archiving include statistics for log archiving status, clog split threads, and clog send threads. These statistics are printed every 10 seconds.
Log archiving status statistics
Log location method
Use the keyword
print_archive_statusto locate the log.Log location command
[admin@OB /home/admin/oceanbase/log] $grep "print_archive_status" observer.logSample output
[2023-08-24 13:59:53.565584] INFO [ARCHIVE] ob_archive_mgr.cpp:830 [11357][0][Y0-0000000000000000-0-0] [lt=13] [dc=0] print_archive_status(clog_split_task_num=0, send_task_num=2, pre_send_task_capacity=458752, pg_archive_task_count=903, archive_incarnation=1, archive_round=6, piece_id=0, piece_create_date=20230810, log_archive_status=2, has_encount_error=false)Key fields
Log information Description clog_split_task_num The number of clog split tasks currently being executed. send_task_num The number of clog send tasks currently being executed. pre_send_task_capacity The amount of data to be archived. pg_archive_task_count The number of archive tasks related to the partition group (PG). log_archive_status The archiving status, with the following values: - 0: Invalid status
- 1: Starting archiving status
- 2: Archiving in progress
- 3: Stopping archiving status
- 4: Archiving stopped
clog split thread statistics
- Log location method
You can search for the keyword archive_clog_split_engine statistics to locate the log.
Log location command
[admin@OB /home/admin/oceanbase/log] $grep "archive_clog_split_engine statistics" observer.logExample output
[2023-08-24 14:46:13.179165] INFO [ARCHIVE] ob_archive_clog_split_engine.cpp:1120 [11389][0][Y0-0000000000000000-0-0] [lt=11] [dc=0] archive_clog_split_engine statistics(avg_split_task_used=51, avg_read_log_used=40, avg_read_log_size=3058, avg_get_send_task_used=11, avg_send_task_per_task=1, split_task_count=17)Key fields
Field Description avg_split_task_used The average time consumed by each clog split task (in microseconds). avg_read_log_used The average time consumed by each log read (in microseconds). avg_read_log_size The average size of each log read. avg_get_send_task_used The average time consumed by each clog send task (in microseconds). avg_send_task_per_task The average number of clog send tasks corresponding to each clog split task. split_task_count The number of clog split tasks. Statistics of clog send threads
Log location method
You can search for the keyword
archive_sender statisticto locate the log.Log location command
[admin@OB /home/admin/oceanbase/log] $grep "archive_sender statistic" observer.logExample output
[2023-06-12 16:15:50.311863] INFO [ARCHIVE] ob_archive_sender.cpp:1589 [25879][0][Y0-0000000000000000-0-0] [lt=5] [dc=0] archive_sender statistic in 10s(total_send_buf_size=29384, total_send_log_count=157, avg_log_size=187, total_send_task_count=2, total_send_cost_ts=2809, avg_send_task_cost_ts=1404)Example output
Log information Description total_send_buf_size The size of logs sent by the archiving thread within 10 seconds. If log transmission compression is enabled, the sent volume is generally 1/3 of the original uncompressed logs. total_send_log_count The number of log records sent by the archiving thread within 10 seconds. total_send_task_count The total number of network I/O operations performed by the archiving thread within 10 seconds. total_send_cost_ts The total time (in microseconds) spent by the archiving thread within 10 seconds. If this value is close to or exceeds 10 seconds, it indicates that the I/O has reached its maximum capacity. avg_send_task_cost_ts The average time (in microseconds) spent on a single network I/O operation, typically ranging from a few milliseconds to tens of milliseconds.
DAG thread statistics
The DAG thread statistics are printed every 10 seconds.
Method to locate the log
Search for the keyword
dump_dag_status.Command to locate the log
[admin@OB /home/admin/oceanbase/log] $grep "dump_dag_status" observer.logExample of typical output
[2023-09-04 17:15:25.494000] INFO [COMMON] ob_dag_scheduler.cpp:1314 [9777][0][Y0-0000000000000000-0-0] [lt=5] [dc=0] dump_dag_status(priority="DAG_PRIO_SSTABLE_MINI_MERGE", low_limit=4, running_task=0) [2023-09-04 17:15:25.494065] INFO [COMMON] ob_dag_scheduler.cpp:1318 [9777][0][Y0-0000000000000000-0-0] [lt=3] [dc=0] dump_dag_status(up_limit_type="DAG_ULT_MINI_MERGE", up_limit=8, running_task_per_ult=0) [2023-09-04 17:15:25.494121] INFO [COMMON] ob_dag_scheduler.cpp:1322 [9777][0][Y0-0000000000000000-0-0] [lt=4] [dc=0] dump_dag_status(type="DAG_MINOR_MERGE", dag_count=0) [2023-09-04 17:15:25.494125] INFO [COMMON] ob_dag_scheduler.cpp:1322 [9777][0][Y0-0000000000000000-0-0] [lt=3] [dc=0] dump_dag_status(type="DAG_MAJOR_MERGE", dag_count=0) [2023-09-04 17:15:25.494210] INFO [COMMON] ob_dag_scheduler.cpp:1324 [9777][0][Y0-0000000000000000-0-0] [lt=3] [dc=0] dump_dag_status(total_worker_cnt=52, total_running_task_cnt=0, work_thread_num=113)Key fields
Log information Description priority The priority of the background task. OceanBase Database prioritizes scheduling tasks with higher priorities. low_limit The minimum number of threads for background tasks. up_limit The maximum number of threads for background tasks. type The type of the background task. dag_count The total number of requests that have entered the background task queue. total_worker_cnt The total number of active threads for all types of background tasks. total_running_task_cnt The number of background tasks currently being executed. work_thread_num The maximum number of threads for all types of background tasks.
Transaction context statistics
The number of transaction context allocations is printed every 5 seconds.
Method for locating logs
Search for the keyword
ObTransCtx statistics.Log location command
[admin@OB /home/admin/oceanbase/log] $grep "ObTransCtx statistics" observer.logExample output
[2023-05-15 17:29:07.955826] INFO [STORAGE.TRANS] ob_trans_factory.cpp:175 [88157][0][YB420BA6506F-0005FB92633E1702-0-0] [lt=11] [dc=0] ObTransCtx statistics(active_sche_ctx_count=0, active_coord_ctx_count=0, active_part_ctx_count=1, total_release_part_ctx_count=156)Key fields
Log information Description active_sche_ctx_count The number of active transaction contexts for the Scheduler role. active_coord_ctx_count The number of active transaction contexts for the Coordinator role. active_part_ctx_count The number of active transaction contexts for the Participant role. total_release_part_ctx_count The number of released transaction contexts. Note
The maximum number of Scheduler CTXs allowed on each OBServer node is 100,000, and the maximum number of Participant CTXs allowed is 700,000.
Diagnostic log
Memory allocation failed
This error occurs when a system or application dynamically allocates memory. It is commonly found in database or program operation logs.
Log location method
Search for keywords such as
[OOPS],alloc failed,-4013, and-4030.Log location command
[admin@OB /home/admin/oceanbase/log] $grep "OOPS|alloc failed|-4013|-4030" observer.logTypical output example
[2019-11-29 13:32:19.702557] WARN alloc_chunk (achunk_mgr.cpp:156) [116413][1492][XXXXXXXXXXXX-XXXXXXXXXXXXXXXX] [lt=10] [dc=0] oops, over total memory limit, hold=241078108160 limit=243230035968 [2019-11-29 13:32:19.703452] WARN alloc (ob_tenant_ctx_allocator.h:111) [116413][1492][XXXXXXXXXXXX-XXXXXXXXXXXXXXXX] [lt=15] [dc=0] oops, alloc failed, tenant_id=500 ctx_id=0 ctx_name=DEFAULT_CTX_ID ctx_hold=13245612032 ctx_limit=9223372036854775807 tenant_hold=26003955712 tenant_limit=9223372036854775807 [2019-11-29 13:32:19.710755] ERROR [LIB] ob_malloc (ob_malloc.h:49) [116413][1492][XXXXXXXXXXXX-XXXXXXXXXXXXXXXX] [lt=364] [dc=0] allocate memory fail(attr=tenant_id=500, mod_id=253, ctx_id=0, prio=0, nbyte=2097152)Key fields
Log information Description ctx_name The memory module (context) that exceeded the memory limit. ctx_hold The amount of memory currently occupied by the memory module. ctx_limit The memory limit of the memory module. alloc_size The size of the current memory allocation request.
Write failure
This error occurs when the system attempts to write data to a storage device, such as a hard disk or SSD, and typically includes an error message.
Method to locate logs
Search for the keyword
submit_log failed.Log location command
[admin@OB /home/admin/oceanbase/log] $grep "submit_log failed" observer.logSample output
[2023-02-14 16:51:10.255801] ERROR [CLOG] submit_log (ob_partition_log_service.cpp:468) [893][0][Y0-0000000000000000-0-0] [lt=19] submit_log failed(ret=-4264, partition_key={tid:1099511627916, partition_id:0, part_cnt:0}, buff=0xfff80a4aef70, size=805, base_timestamp=1676065578191066, is_trans_related_log=true, role=1, state=4)Key fields
Field Description ret The error code. partition_key The object (table_id + partition_id).
MemStore throttling
This log entry indicates that MemStore write throttling is triggered.
Method to locate the log
Search for the keyword
report write throttle info.Command to locate the log
[admin@OB /home/admin/oceanbase/log] $grep "report write throttle info" observer.logSample output
2022-12-29 04:46:44.841893] INFO [COMMON] ob_fifo_arena.cpp:333 [28948][0][XXXXXXXXXXXX-XXXXXXXXXXXXXXXXXX-X-X] [lt=16] [dc=0] report write throttle info(alloc_size=24, throttling_interval=1000, attr_=tenant_id=1005, label=15, ctx_id=1, prio=0, freed memory(MB):=0, last_base_ts=1672260404842042, cur_mem_hold=11970543616, trigger_mem_limit=21990232512, trigger_mem_remain=5497558128, tenant_memory_remain=5448400896, throttle_info_={decay_factor_:"3.002618459977128736e-04", alloc_duration_:3600000000, trigger_percentage_:80, memstore_threshold_:27487790640, period_throttled_count_:45, period_throttled_time_:315460, total_throttled_count_:181984128, total_throttled_time_:72058595126473244}, mem_overused=49157232)Key fields
Field Description cur_mem_hold The current memory usage of MemStore. trigger_mem_limit The memory threshold that triggers throttling. trigger_mem_remain The remaining memory of MemStore when throttling is triggered. tenant_memory_remain The current remaining memory of MemStore. mem_overused The memory usage exceeding the throttling threshold.
Lock conflict
This log record indicates a row lock conflict and a retry.
Log location method
Write-write conflict: Search for the keywords
on_wlock_retryandlock_for_write conflict.Read-write conflict: Search for the keywords
lock_for_read retryandlock_for_read fail.
Log location command
Write-write conflict
[admin@OB /home/admin/oceanbase/log] $grep "on_wlock_retry" observer.logRead-write conflict
[admin@OB /home/admin/oceanbase/log] $grep "lock_for_read retry" observer.log
Sample output
[2022-09-03 11:33:06.624063] WARN [STORAGE.TRANS] on_wlock_retry (ob_memtable_context.cpp:345) [4198][0][XXXXXXXXXXXX-XXXXXXXXXXXXXXXXXX-X-X] [lt=16] [dc=0] lock_for_write conflict(*this=alloc_type=0 ctx_descriptor=45479 … abs_expired_time=1662179586523672 stmt_timeout=3599900482 … pkey={tid:1100611139453777, partition_id:0, part_cnt:0} trans_id={hash:16499054864673064020, inc:284227, addr:"xxx.xxx.xxx.xxx:2882", t:1662175982313772} checksum_log_ts=0, key=table_id=1100611139453777 rowkey_object=[{"NUMBER":"1"}] , conflict_ctx="alloc_type=0 ctx_descriptor=45354 … stmt_start_time=1662175972255438 abs_expired_time=1662179572155922 stmt_timeout=3599900484 … pkey={tid:1100611139453777, partition_id:0, part_cnt:0} trans_id={hash:15878544715862172232, inc:283677, addr:"xxx.xxx.xxx.xxx:2882", t:1662175972255438} checksum_log_ts=0")Key field description
Log information Description *this The waiter (the transaction waiting for the lock). conflict_ctx The blocker (the transaction holding the lock, which could be the holder). pkey Information about the table_id and partition_id. rowkey_object The primary key. trans_id Information about the transaction.
Slow query
The slow query log contains detailed information about SQLTrace.
Method to locate the log
Search for the keyword
[slow query].Command to locate the log
[admin@OB /home/admin/oceanbase/log] $grep "slow query" observer.logSample output
[2022-02-18 11:49:32.557473] TRACE [TRACE]obmp_base.cpp:948 [1837144][3090][XXXXXXXXXX-XXXXXXXXXXXXXX] [lt=2] [dc=0] [slow query] (TRACE=begin_ts=1645156047499895 2022-02-18 03:47:27.499895 |[process_begin] u=0 in_queue_time:16, receive_ts:1645156047499878, enqueue_ts:1645156047499879 |[query_begin] u=1 trace_id:XXXXXXXXXX-XXXXXXXXXXXXXX |[session] u=4 sid:3221502589, tenant_id:1| ...... |[parse_begin] u=17 stmt:"select count(*) from TEST1 fetch first 1000 rows only", stmt_len:58 |[cache_get_plan_begin] u=3 |[cache_get_plan_end] u=18 ...... |[cg_begin] u=1 |[cg_end] u=137 |[plan_id] u=96 plan_id:91455 |[exec_begin] u=2 arg1:false, end_trans_cb:false |[do_open_plan_begin] u=2 plan_id:91455 ...... |[result_set_close] u=0 ret:-5066, arg1:-5066, arg2:0, arg3:-5066, async:false |[session] u=34 sid:3226669564, tenant_id:1034 |[exec_end] u=21 |[query_end] u=125 |[process_end] u=38 run_ts:1645156047499900 |total_timeu=125057508)Key fields
Log information Description [eventbegin] and [eventend] The start and end of internal events during SQL execution. u=xxx The time interval between two trace events. in_queue_time The time SQL spent waiting in the queue before execution. trace_id The trace ID for SQL execution. sid The session ID where SQL execution occurred. stmt The SQL statement. plan_id The plan ID for SQL execution. ret The return code (error code). total_timeu The total time spent processing SQL.
Slow transactions
This topic describes the call chains and execution time of slow transactions.
Method for locating logs
Search for the keyword
[slow trans].Command for locating logs
[admin@OB /home/admin/oceanbase/log] $grep "slow trans" observer.logExample output
observer.log:[2021-10-08 11:45:44.313295] TRACE [TRACE]:0 [9382][2334][XXXXXXXXXX-XXXXXXXXXXXXXXXXX] [lt=23] [dc=39] [slow trans] (TRACE=begin_ts=1633664744167732 2021-10-08 03:45:44.167732 [init] u=0 arg1:140542173395392, ctx_type:"scheduler", trans_type:2, trans_id:{hash:7103234888693285108, inc:532846, addr:{ip:"192.168.1.1", port:40000}, t:1633664744167658}, pkey:{tid:1000, partition_id:1000, part_cnt:1000}, arg1:false, uref:1073741824 [set_stc] u=1 stc:1633664744167733 [start_trans] u=2 ret:0, uref:1073741825 [start_stmt] u=7 ret:0, tenant_id:1003, sql_no:4294967297, phy_plan_type:3, stmt_type:2, sql_id:"AA2155E24549E97", trace_id:Y9C4064586A6B-0005CDCF253FBE70, start_time:1633664744166256, uref:1073741825 [end_stmt] u=30832 ret:0, is_rollback:false, rollback_to:4294967297, rollback_from:4294967297, uref:1073741825 [start_stmt] u=50000 ret:0, tenant_id:1003, sql_no:4294967298, phy_plan_type:3, stmt_type:2, sql_id:"AA2155E24549E97", trace_id:Y9C4064586A6B-0005CDCF253FBE70, start_time:1633664744166256, uref:1073741825 [end_stmt] u=10 ret:0, is_rollback:false, rollback_to:4294967298, rollback_from:4294967298, uref:1073741825 [submit_commit] u=52 ret:0, uref:1073741824 [end_trans] u=35 ret:0, is_rollback:false, uref:1073741825 [exiting] u=114618 uref:1073741824, arg1:3221531139, arg2:7230647234361492284 [handle_message] u=1 ret:0, msg_type:1, uref:1 [end_trans_cb] u=13 ret:0, param:0, param:true [destroy] u=1 uref:0 total_timeu=195572)Key fields
Log information Description ctx_type The type of the transaction context: scheduler, coordinator, or participant. trans_type The type of the transaction. - 0: SP_TRANS (single partition)
- 1: MINI_SP_TRANS (read-only single partition)
- 2: DIST_TRANS (distributed)
trans_id The ID of the transaction. pkey The object modified by the current transaction context (table_id + partition_id). [start_trans] /[end_trans] The start and end of the transaction context. [start_stmt] /[end_stmt] The start and end of SQL execution, including the SQL ID, trace ID, and other information. u=xxx The time interval between two trace events. total_timeu The total time spent on transaction processing.
Slow network transmission
The RPC thread takes too long to send the packet.
Method to locate the log
Search for the keyword
packet fly cost too much time.Command to locate the log
[admin@OB /home/admin/oceanbase/log] $grep "packet fly cost too much time" observer.logExample output
[2022-10-27 23:09:58.317505] WARN [RPC.OBRPC] decode (ob_rpc_net_handler.cpp:119) [31273] [Y00000000000000000] [lt=8] packet fly cost too much time(pcode=2311, fly_ts=769195, send_timestamp=1666883397548305)Key fields
Log information Description pcode The processing event code. You can query its meaning from __all_virtual_obrpc_stat.fly_ts The time taken for the packet to be sent and encoded from the source to the destination, and received and decoded at the destination. send_timestamp The timestamp when the remote end sends the packet. Note
Slow RPC transmission is likely due to network latency, clock skew, and high system load and busy RPC threads.
Bandwidth throttling
OBServer sets the network bandwidth throttling for system tasks.
Log search method
Search for the keywords
network speed changed,reload_bandwidth_throttle_limit, andinit_bandwidth_throttle.Log search command
[admin@OB /home/admin/oceanbase/log] $grep "network speed changed|reload_bandwidth_throttle_limit|init_bandwidth_throttle" observer.logSample output
[2021-06-10 11:10:49.397019] INFO [SERVER] ob_server.cpp:2018 [72580][0][Y0-0000000000000000-0-0] [lt=8] [dc=0] succeed to init_bandwidth_throttle(sys_bkgd_net_percentage_=60, network_speed=1310720000, rate=786432000): [2021-06-10 11:14:44.905396] INFO [SERVER] ob_server.cpp:2385 [72582][0][Y0-0000000000000000-0-0] [lt=11] [dc=0] network speed changed(from=1310720000, to=1048576000) [2021-06-10 11:14:44.905421] INFO [SERVER] ob_server.cpp:2055 [72582][0][Y0-0000000000000000-0-0] [lt=3] [dc=0] succeed to reload_bandwidth_throttle_limit(old_percentage=60, new_percentage=60, network_speed=1048576000, rate=629145600)Key fields
Field Description sys_bkgd_net_percentage_ The value of the sys_bkgd_net_percentage parameter, which specifies the percentage of the network bandwidth allocated to background threads. The default value is 60% of the network card bandwidth. network_speed The network card bandwidth, which is set based on the relevant configuration files. rate The network bandwidth allocated to background threads, calculated as network_speed * sys_bkgd_net_percentage.
High-risk failure
The system has encountered an error that requires the DBA to investigate and resolve.
Method for locating logs
Search for the keyword
ERROR issue_dba_error.Log location command
[admin@OB /home/admin/oceanbase/log] $grep "ERROR issue_dba_error" observer.logSample output
2023-05-11 16:05:27.966611] ERROR issue_dba_error (ob_log.cpp:2322) [79165][0][Y0-0000000000000000-0-0] [lt=18] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=0, file="ob_log_file_group.cpp", line_no=412, info="clog disk is almost full") [2023-05-11 16:45:58.135783] ERROR issue_dba_error (ob_log.cpp:2322) [77332][0][Y0-0000000000000000-0-0] [lt=8] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4264, file="ob_partition_log_service.cpp", line_no=468, info="submit_log failed") [2023-05-11 16:45:59.448092] ERROR issue_dba_error (ob_log.cpp:2322) [77584][0][Y0-0000000000000000-0-0] [lt=10] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=0, file="ob_election.cpp", line_no=2077, info="leader_revoke, please attention!")Key fields
Log information Description internal errcode The error code. file The source code where the error was detected. info The error message.
rootservice.log
The cluster management service log, specifically for recording control operations of the RootService (RS) module, including node scheduling, DDL execution, replica balancing, etc.
Statistics Log
Statistics of load balancing tasks
The statistics of load balancing tasks.
Log search method
Search for the keyword
Rebalance task manager statistics.Log search command
[admin@OB /home/admin/oceanbase/log] $grep "Rebalance task manager statistics" observer.logSample output
[2023-05-10 05:22:50.862554] INFO [RS] ob_rebalance_task_mgr.cpp:1887 [65048][0][XXXXXXXXXX-XXXXXXXXXXXXXX-X-X] [lt=14] [dc=0] Rebalance task manager statistics(waiting_replication_task_cnt=0, executing_replication_task_cnt=0, waiting_migration_task_cnt=0, executing_migration_task_cnt=0)Key fields
Field Description waiting_replication_task_cnt The number of replication tasks in the waiting state. executing_replication_task_cnt The number of replication tasks in the executing state. waiting_migration_task_cnt The number of migration tasks in the waiting state. executing_migration_task_cnt The number of migration tasks in the executing state.
Merge progress statistics
This topic describes the logs related to the merge progress.
Log search method
You can search for the logs by using the
check merge progress successorsucc to finish zone mergekeyword.Log search command
[admin@OB /home/admin/oceanbase/log] $grep "check merge progress success" observer.logor
[admin@OB /home/admin/oceanbase/log] $grep "succ to finish zone merge" observer.logSample output
[2021-04-07 10:37:16.451302] INFO [RS] ob_daily_merge_scheduler.cpp:1840 [4532][XXXXXXXXXX-XXXXXXXXXXXXXX] [lt=34] succeed to add_finish_zone_merge_op(zone=bank.ea133_8, last_merged_version=10044, all_merged_version=10044) [2021-04-07 10:37:18.114716] INFO [RS] ob_daily_merge_scheduler.cpp:1986 [4532][XXXXXXXXXX-XXXXXXXXXXXXXX] [lt=14] succ to finish zone merge(zone=bank.ea133_8, last_merged_version=10044, all_merged_version=10044)or
[2023-05-16 20:04:48.270903] INFO [RS] ob_partition_table_util.cpp:317 [87321][0][XXXXXXXXXX-XXXXXXXXXXXXXX] [lt=10] [dc=0] check merge progress success(version=38, all_majority_merged=false, all_progress=[{zone:"zone1", unmerged_partition_cnt:2301, unmerged_data_size:126884638603, merged_partition_cnt:0, merged_data_size:0, smallest_data_version:37, merged_partition_percentage:0, merged_data_percentage:0}, {zone:"zone2", unmerged_partition_cnt:0, unmerged_data_size:0, merged_partition_cnt:2301, merged_data_size:126884950766, smallest_data_version:38, merged_partition_percentage:100, merged_data_percentage:100}, {zone:"zone3", unmerged_partition_cnt:2301, unmerged_data_size:126884638603, merged_partition_cnt:0, merged_data_size:0, smallest_data_version:37, merged_partition_percentage:0, merged_data_percentage:0}], checking_time_us=202635)Key fields
Field Description version The major version of the current merge. all_majority_merged Indicates whether all zones have completed the merge: true/false. all_progress The merge progress of each zone.
Diagnostic logs
DDL execution
Log search method
Search for the keyword
[DDL] execute ddl like stmt.Log search command
[admin@OB /home/admin/oceanbase/log] $grep "[DDL] execute ddl like stmt" observer.logSample output
[2023-05-16 10:50:10.834911] INFO [RS] ob_rs_rpc_processor.h:249 [87188][0][XXXXXXXXXX-XXXXXXXXXXXXXX] [lt=17] [dc=0] [DDL] execute ddl like stmt(ret=0, cost=44039, ddl_arg={ddl_stmt_str:"CREATE INDEX "IND_PA_ACCTSENIOR" on "T_PA_ACCTSENIOR" ( "FSENIOR_CODE" ) GLOBAL", exec_tenant_id:1005, ddl_id_str:"", is_replay_schema:false, based_schema_object_infos:[{schema_id:1105009185964905, schema_type:5, schema_version:1684205410682568}]})Key fields
Log information Description ret The return code of the DDL statement. cost The execution time. ddl_arg The DDL statement. schema_type The type of the created object.
Partition leader change
Log information generated when the leader of a partition is changed.
How to locate the log
Search for the keyword
choose leader info.Command
[admin@OB /home/admin/oceanbase/log] $grep "choose leader info" observer.logSample output
[2023-05-10 17:41:28.833956] INFO [RS] ob_leader_coordinator.cpp:5004 [65037][0][XXXXXXXXXX-XXXXXXXXXXXXXX-X-X] [lt=27] [dc=0] choose leader info(tenant_id=1005, tablegroup_id=18446744073709551615, server="192.168.1.1:2882", info={server_addr:"192.168.1.1:2882", zone:"zone2", balance_group_score:9223372036854775807, region_score:{region_:"beijing", region_score_:0}, not_merging:true, start_service:true, zone_candidate_count:1, candidate_count:1, is_candidate:true, not_excluded:true, migrate_out_or_transform_count:0, zone_migrate_out_or_transform_count:0, in_normal_unit_count:1, zone_score:{zone_:"zone2", zone_score_:0}, original_leader_count:0, random_score:3, cur_leader_count:1, in_revoke_blacklist_count:0, partition_id:0}, is_stopped=false, is_zone_active=true)Key fields
Field Description tablegroup_id The ID of the partition or table group for which the leader is changed. server The location of the new leader. is_stopped The status of the observer where the new leader is located. is_zone_active The status of the zone where the new leader is located.
Partition merge
Log information for partition merge.
Log search method
Search for the keyword
replica not merged.Log search command
[admin@OB /home/admin/oceanbase/log] $grep "replica not merged" observer.logTypical output example
[2021-04-07 10:47:36.835217] INFO [RS] ob_partition_table_util.cpp:180 [4532][XXXXXXXXXX-XXXXXXXXXXXXXX] [lt=5] replica not merged to version(version=10044, replica={table_id:{value:1106108697592732, first:1006, second:50076}, partition_id:0, partition_cnt:1, zone:"bank.mub_2", server:"xxx.xxx.xxx.xxx:2882", sql_port:2881, unit_id:1017, role:1, member_list:[{server:"xxx.xxx.xxx.xxx:2882", timestamp:1557500921627412}, {server:"xxx.xxx.xxx.xxx:2882", timestamp:1560866621699092}, {server:"xxx.xxx.xxx.xxx:2882", timestamp:1592984233874976}, {server:"xxx.xxx.xxx.xxx:2882", timestamp:1614240480439532}, {server:"xxx.xxx.xxx.xxx:2882", timestamp:1614911835988424}], row_count:8635155074, data_size:2061390087129, data_version:10043, data_checksum:3763772143, row_checksum:row:[0], modify_time_us:1617652580558918, create_time_us:1560880959641988, member_time_us:1560866621699092, is_original_leader:false, in_member_list:true, rebuild:false, to_leader_time:1603047186742602, replica_status:"REPLICA_STATUS_NORMAL", replica_type:0, required_size:2295137828864, status:"REPLICA_STATUS_NORMAL", is_restore:0, partition_checksum:14438911093, quorum:5})Key fields
Log information Description version The global version number of the merge. table_id:{value:v0, first:v1, second:v2} v0 table_id; v1 tenant_id; v2 pure table_id data_version The major version number of the partition. replica_type The replica type: 0 - full; 5 logonly; 16 readonly quorum The number of replicas.
election.log
The distributed election transaction log tracks the execution status of the Paxos election protocol, including leader elections, heartbeat detection, and replica consistency negotiations.
Leader revoked
High-risk Leader revoked event (requires immediate intervention).
Log location method
Search for the keywords
leader_revokeandplease attention.Log location command
[admin@OB /home/admin/oceanbase/log] $grep "leader revoke|leader_revoke" observer.logTypical output example
[2022-05-08 22:41:38.827689] INFO [ELECT] ob_election.cpp:2072 [201777][0][Y0-0000000000000000-0-0] [lt=1526] take the initiative leader revoke(partition={tid:1100611139404082, partition_id:0, part_cnt:0}, revoke reason="clog disk full") [2022-05-08 22:41:38.827723] ERROR [ELECT] leader_revoke (ob_election.cpp:2077) [201777][0][Y0-0000000000000000-0-0] [lt=8] leader_revoke, please attention!(revoke reason="clog disk full", election={partition:{tid:1100611139404082, partition_id:0, part_cnt:0}, is_running:true, is_changing_leader:false, self:"xxx.xxx.xxx.xxx:2882", proposal_leader:"0.0.0.0", cur_leader:"xxx.xxx.xxx.xxx:2882", curr_candidates:3{server:"xxx.xxx.xxx.xxx:2882", timestamp:1650642269999783, flag:0, need_encrypt:false}{server:"xxx.xxx.xxx.xxx:2882", timestamp:1651730035655503, flag:0, need_encrypt:false}{server:"xxx.xxx.xxx.xxx:2882", timestamp:1650642269999783, flag:0, need_encrypt:false}, curr_membership_version:1651730068793522, leader_lease:[1651816304570000, 1651816418400000], election_time_offset:5570000, active_timestamp:1651808137937449, T1_timestamp:1651816404000000, leader_epoch:1651816299000000, state:1, role:1, stage:-1, type:-1, replica_num:3, unconfirmed_leader:"xxx.xxx.xxx.xxx:2882", unconfirmed_leader_lease:[1651816304570000, 1651816419000000], takeover_t1_timestamp:1651816402600000, is_need_query:false, valid_candidates:0, change_leader_timestamp:0, ignore_log:false, leader_revoke_timestamp:1651816301770881, vote_period:4, lease_time:9800000, move_out_timestamp:1651816300405697, eg_part_array_idx:40, eg_id_hash:6130129687449640250}) BACKTRACE:0xf5c6965 0xf45eeca 0xf44866a 0x530ea4d 0x9bbea6b 0x9b2c627 0x9b2c0cc 0x9b77a60 0x987a46b 0xb472a62 0x93845b5 0x9384519 0x5260906 0x66daf5d 0xf33ee22 0xf33ec7e 0xf5da8ceKey fields
Log information Description cur_leader The current leader. curr_candidates The list of current candidate nodes. is_changing_leader Whether a leader change is in progress. is_running Whether an election is in progress. leader_lease The lease of the leader (start and end time). leader_revoke_timestamp The time when the leader revoke occurred. proposal_leader The proposed leader. revoke reason The reason for the leader revoke. self The current node. unconfirmed_leader The current unconfirmed leader node.
Leader lease expired
When the lease of the leader expires, a leader re-election event is triggered.
How to locate the log
Search for the keywords
lease is expiredandleader lease is expired.Log location command
[admin@OB /home/admin/oceanbase/log] $grep "lease is expired|leader lease is expired" election.logSample output
[2022-05-31 15:32:19.849628] INFO [ELECT] ob_election.cpp:1890 [44778][T1010][Y0-000000-0-0] [lt=21] leader lease expired(partition={tid:1106108697592658, partition_id:12}, lease_end=1652078088200000, current_ts=1652078139849628, election={self:"10.xx.xx.xx:2882", cur_leader:"10.xx.xx.xx:2882", state:ACTIVE, role:LEADER, lease:[1652078078000000, 1652078088200000]}) [2022-05-31 15:32:19.849849] ERROR [ELECT] handle_lease_expired (ob_election.cpp:1905) [44778][T1010][Y0-000000-0-0] [lt=8] leader_revoke triggered by lease expire!(revoke_reason="lease expire", partition_idx=12, leader="10.xx.xx.xx:2882", next_election_time=1652078139849849)Key fields
Field Description lease_end The lease expiration timestamp, in microseconds. current_ts The timestamp when the lease was checked, in microseconds. revoke_reason The reason for revoking the leader (fixed as "lease expire"). next_election_time The time when the next election is triggered. partition_idx The partition index (related to system table queries).
Leader takeover success
The new leader is elected.
How to locate the log
Search for the keywords
leader takeoverandleader takeover success.Command to locate the log
[admin@OB /home/admin/oceanbase/log] $grep "leader takeover|leader takeover success" observer.logSample output
[2022-05-31 15:32:24.851006] INFO [ELECT] ob_election.cpp:2153 [44801][T1012][Y0-000000-0-0] [lt=11] leader takeover success!(partition={tid:1106108697592658, partition_id:12}, new_leader="12.xx.xx.xx:2882", takeover_type="HANDLE_DEVOTE_SUCC", election={prev_leader:"0.0.0.0", curr_candidates:3{server:"11.xx.xx.xx:2882"}{server:"12.xx.xx.xx:2882"}{server:"10.xx.xx.xx:2882"}, lease:[1652078144851006, 1652078250000000]})Key fields
Field Description new_leader The IP address of the new leader. takeover_type The type of the leader change (e.g., HANDLE_DEVOTE_SUCC). prev_leader The previous leader node. "0.0.0.0" indicates no leader. lease The lease period of the new leader.
Leader invalid status alert
An alert event for abnormal leader status.
How to locate the log
Search for the keyword
current leader is invalid.Command to locate the log
[admin@OB /home/admin/oceanbase/log] $grep "current leader is invalid" rootservice.logSample output
[2022-07-16 02:02:12.847602] WARN [RS] ob_leader_coordinator.cpp:2067 [8570][YB420AF4005E-4626EDFC] [lt=8] [dc=0] current leader is invalid!(tenant_id=1005, ls_id=1100611139404082, invalid_reason="location cache mismatch", current_leader="11.xx.xx.xx:2882", actual_leader="11.xx.xx.xx:2882")Key fields
Field Description invalid_reason The reason for the invalid leader (e.g., cache mismatch or heartbeat loss). current_leader The IP address of the current leader. actual_leader The IP address of the actual leader. tenant_id/ls_id The identifiers of the tenant and log stream.