OceanBase

A unified distributed database ready for your transactional, analytical, and AI workloads.


OceanBase Database

SQL - V4.1.0

    © OceanBase 2026. All rights reserved

    Failures of a minority of nodes

    Last Updated: 2023-08-01 06:02:28

    This topic describes how to troubleshoot failures of a minority of nodes.

    Background

    OceanBase Database is a distributed database that typically uses a multi-replica architecture. This topic covers node failures that leave a majority of replicas available in every log stream. For example, in a three-replica deployment, the failure of any single node leaves the majority intact. In a five-replica deployment, the failure of any two nodes leaves the majority intact. If some log streams already lack replicas, the failure of a minority of nodes can still cause those log streams to lose their majority. For more information about this scenario, see Failures of the majority of nodes.
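
The majority rule described above comes down to simple arithmetic. The following Python sketch is illustrative only: the function names are hypothetical and are not part of any OceanBase tool. It computes how many replica failures a log stream can tolerate and whether the remaining healthy replicas still form a majority.

```python
def max_tolerated_failures(replica_count: int) -> int:
    """Largest number of replicas that can fail while a majority survives."""
    if replica_count < 1:
        raise ValueError("replica_count must be at least 1")
    return (replica_count - 1) // 2


def has_majority(replica_count: int, healthy_replicas: int) -> bool:
    """True if the healthy replicas are a strict majority of the configured replicas."""
    if not 0 <= healthy_replicas <= replica_count:
        raise ValueError("healthy_replicas must be between 0 and replica_count")
    return healthy_replicas > replica_count // 2


if __name__ == "__main__":
    for n in (3, 5):
        print(f"{n} replicas tolerate {max_tolerated_failures(n)} failed node(s)")
    # A three-replica log stream that already lacks one replica
    # loses its majority when one more node fails.
    print(has_majority(3, 1))
```

The last call models the case in which a log stream already lacks a replica: with only one healthy replica out of three, the majority is lost even though just one more node failed.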

    Common node failure scenarios:

    • A node fails due to a hardware exception.

    • A node encounters a hardware exception but does not fail. However, the observer process on this node exits abnormally. For example, a memory failure on the host causes the observer process to crash.

    • A node encounters a hardware exception but does not fail. However, the status of the observer process on this node becomes abnormal. For example, an I/O exception causes the read/write requests of the observer process to time out.

    • A node encounters a hardware exception but does not fail. However, the observer process on this node writes erroneous data to an SSTable or redo log, and the errors are detected by subsequent checksum verification.

    • A bug in OceanBase Database causes an exception in the observer process, such as stuck threads, a memory overrun, or a process crash.

    Procedure

    If a bug in OceanBase Database causes the observer process to become abnormal, first isolate the affected node. For more information about how to isolate a node, see Isolate a node. If services are not affected, contact OceanBase Technical Support for help without taking any emergency action. If services are affected, restart the observer process and then contact OceanBase Technical Support for help.
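
The handling of a software bug described above can be summarized as a short decision sketch. The `Action` enum and the function name below are hypothetical and serve only to make the order of actions explicit. They do not correspond to any OceanBase API.

```python
from enum import Enum
from typing import List


class Action(Enum):
    ISOLATE_NODE = "isolate the node"
    RESTART_OBSERVER = "restart the observer process"
    CONTACT_SUPPORT = "contact OceanBase Technical Support"


def actions_for_software_bug(services_affected: bool) -> List[Action]:
    """Ordered actions when a bug makes the observer process abnormal."""
    actions = [Action.ISOLATE_NODE]  # isolation always comes first
    if services_affected:
        # An emergency restart is needed only when services are affected.
        actions.append(Action.RESTART_OBSERVER)
    actions.append(Action.CONTACT_SUPPORT)
    return actions


if __name__ == "__main__":
    for affected in (False, True):
        steps = [a.value for a in actions_for_software_bug(affected)]
        print(f"services affected={affected}: {steps}")
```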

    If a node becomes abnormal due to a hardware exception, or if a hardware exception is suspected, perform the following steps to isolate and replace the node.

    1. Log on to the sys tenant of the cluster as the root user.

      Run the following command to log on. Replace the connection details in the command with those of your database environment.

      obclient -h10.xx.xx.xx -P2883 -uroot@sys -p***** -A
      

      For more information about how to connect to OceanBase Database, see Overview of the Connect to OceanBase Database chapter in Develop Applications in MySQL Mode and Overview of the Connect to OceanBase Database chapter in Develop Applications in Oracle Mode.

    2. Perform the Stop Server or Force Stop Server operation on the abnormal node as soon as possible and make sure that the operation succeeds to avoid affecting business traffic.

      Statement of the Stop Server operation:

      obclient [(none)]> ALTER SYSTEM STOP SERVER 'svr_ip:svr_port';
      

      Sample code:

      obclient [(none)]> ALTER SYSTEM STOP SERVER '172.xx.xx.xx:2882';
      

      Statement of the Force Stop Server operation:

      obclient [(none)]> ALTER SYSTEM FORCE STOP SERVER 'svr_ip:svr_port';
      

      Sample code:

      obclient [(none)]> ALTER SYSTEM FORCE STOP SERVER '172.xx.xx.xx:2882';
      

      For more information about the internal execution procedures of the Stop Server and Force Stop Server operations, see Isolate a node.

    3. Add a node to the zone where the abnormal node is located.

      For more information about how to add a node, see Add a node.

    4. Replicate the data of replicas on the abnormal node to the new node.

      Three cases exist:

      • The observer process has exited.

        You can initiate a resource unit migration task manually or wait for Root Service to trigger one automatically. Root Service triggers resource unit migration after the observer process has been offline for the time specified by the server_permanent_offline_time parameter. Make sure that the enable_rereplication parameter is enabled.

        For more information about how to manually initiate a resource unit migration task, see Unit migration.

      • Data errors occur on the abnormal node, the resource unit migration task is stuck due to the abnormal status of the observer process, or the global status is abnormal (such as a stuck major compaction) due to the abnormal status of the observer process.

        You can kill the observer process and take actions described in the first case where the observer process has exited.

      • The node has potential hardware risks, or the exception will not cause the cluster to become globally abnormal.

        In this case, you do not need to kill the observer process. Killing the observer process may cause the risks to spread. You can initiate a resource unit migration task to migrate all resource units from the abnormal node to a new node.

        For more information about how to manually initiate a resource unit migration task, see Unit migration.

      Note

      We recommend that you initiate a resource unit migration task.

      • If you initiate a resource unit migration task, you can migrate all resource units from the abnormal node to a new node while retaining the original unit topology, thereby avoiding affecting the load balancing among the nodes.
      • Root Service automatically triggers a resource unit migration task only after the observer process reaches the permanent offline time. Waiting that long gives the risks of the node exception time to spread.
      • By initiating a resource unit migration task, you can check the resource usage of the migration traffic, thereby avoiding affecting the application traffic.
      • If the observer process on a node has exited due to an exception and reached the permanent offline time, and Root Service has triggered a resource unit migration task, you must check the topology of the migrated resource units and the resource usage by the migration traffic, and provide this information to business personnel.

    5. Query the oceanbase.DBA_OB_UNIT_JOBS view to check the data replication progress.

      obclient [(none)]> SELECT * FROM oceanbase.DBA_OB_UNIT_JOBS WHERE JOB_TYPE = 'MIGRATE_UNIT';
      +--------+--------------+------------+-------------+----------+----------------------------+----------------------------+-----------+---------+----------+------------+--------------+-------------+
      | JOB_ID | JOB_TYPE     | JOB_STATUS | RESULT_CODE | PROGRESS | START_TIME                 | MODIFY_TIME                | TENANT_ID | UNIT_ID | SQL_TEXT | EXTRA_INFO | RS_SVR_IP    | RS_SVR_PORT |
      +--------+--------------+------------+-------------+----------+----------------------------+----------------------------+-----------+---------+----------+------------+--------------+-------------+
      |      4 | MIGRATE_UNIT | INPROGRESS |        NULL |        0 | 2023-01-04 17:22:02.208219 | 2023-01-04 17:22:02.208219 |      1004 |    1006 | NULL     | NULL       | xx.xx.xx.238 |        2882 |
      +--------+--------------+------------+-------------+----------+----------------------------+----------------------------+-----------+---------+----------+------------+--------------+-------------+

      If the query returns no rows, the resource units have been migrated and data replication has succeeded.

    6. Delete the abnormal node.

      For more information about how to delete a node, see Delete a node.
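
When you script the procedure above, a malformed server address or a missing quotation mark in the statement is an easy mistake to make. The following Python sketch builds the Stop Server and Force Stop Server statements from a validated address and encodes the decision on whether to kill the observer process. All function names are hypothetical, the default port 2882 is an assumption taken from the sample statements, and 172.16.0.10 is a placeholder address. The sketch only produces statement text. It does not connect to a cluster.

```python
import ipaddress

# Query from the procedure above for checking the migration progress.
MIGRATION_PROGRESS_SQL = (
    "SELECT * FROM oceanbase.DBA_OB_UNIT_JOBS WHERE JOB_TYPE = 'MIGRATE_UNIT';"
)


def server_address(ip: str, port: int = 2882) -> str:
    """Validate and format an address as 'svr_ip:svr_port' (IPv4 only)."""
    ipaddress.IPv4Address(ip)  # raises ValueError for a malformed address
    if not 0 < port < 65536:
        raise ValueError(f"invalid port: {port}")
    return f"{ip}:{port}"


def stop_server_sql(ip: str, port: int = 2882, force: bool = False) -> str:
    """Build the Stop Server or Force Stop Server statement with correct quoting."""
    verb = "FORCE STOP" if force else "STOP"
    return f"ALTER SYSTEM {verb} SERVER '{server_address(ip, port)}';"


def should_kill_observer(process_exited: bool, data_errors: bool,
                         migration_stuck: bool, global_status_abnormal: bool) -> bool:
    """Decide whether to kill the observer process before migrating units."""
    if process_exited:
        return False  # first case: the process has already exited
    # Second case: kill the process. Otherwise (third case) leave it running.
    return data_errors or migration_stuck or global_status_abnormal


def migration_finished(rows: list) -> bool:
    """An empty result of MIGRATION_PROGRESS_SQL means the migration is done."""
    return len(rows) == 0


if __name__ == "__main__":
    print(stop_server_sql("172.16.0.10"))
    print(stop_server_sql("172.16.0.10", force=True))
```

Validating the address before formatting the statement means that a masked or mistyped value such as 172.xx.xx.xx is rejected before anything is sent to the cluster.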
