Replace a node

2025-01-02 01:58:39  Updated

You can replace a node that has a hardware fault, or replace a node with one of different specifications to adjust the capacity of the cluster. Replacing a node does not change the number of nodes in the cluster. To replace a node, add a new node, migrate the resource units from the old node to the new node, and then delete the old node. The process can take a long time because the data in each resource unit must be migrated to the new node.

Procedure

  1. Log in to the sys tenant of the cluster as the root user.

    In the following sample command, replace the parameter values with those of your own database environment.

    obclient -h10.xx.xx.xx -P2883 -uroot@sys#obdemo -p***** -A
    

    For more information about how to connect to a database, see Connection methods (MySQL mode) or Connection methods (Oracle mode).

  2. Add a new node to the zone to which the node to be replaced belongs.

    For more information about how to add a node, see Add a node.

  3. Migrate the resource units from the old node to the new node.

    1. Query the DBA_OB_SERVERS view for information about the node to be replaced.

      Here is an example:

      obclient [(none)]> SELECT * FROM oceanbase.DBA_OB_SERVERS;
      +-------------+----------+----+--------------+----------+-----------------+--------+----------------------------+-----------+-----------------------+----------------------------+----------------------------+-------------------------------------------------------------------------------------------+
      | SVR_IP      | SVR_PORT | ID | ZONE         | SQL_PORT | WITH_ROOTSERVER | STATUS | START_SERVICE_TIME         | STOP_TIME | BLOCK_MIGRATE_IN_TIME | CREATE_TIME                | MODIFY_TIME                | BUILD_VERSION                                                                             |
      +-------------+----------+----+--------------+----------+-----------------+--------+----------------------------+-----------+-----------------------+----------------------------+----------------------------+-------------------------------------------------------------------------------------------+
      | 172.xx.xx.xx|     2882 |  6 | sa128_obv4_2 |     2881 | NO              | ACTIVE | 2022-12-30 16:17:03.173519 | NULL      | NULL                  | 2022-12-30 16:08:04.749100 | 2023-01-04 11:48:36.589270 | 4.0.0.0_101000022022120716-0d7927892ad6d830e28437af099f018b0ad9a322(Dec  7 2022 16:22:15) |
      | 172.xx.xx.xx|     2882 |  4 | sa128_obv4_3 |     2881 | NO              | ACTIVE | 2022-12-30 16:36:35.567437 | NULL      | NULL                  | 2022-12-30 16:08:02.755200 | 2023-01-04 14:13:36.976548 | 4.0.0.0_101000022022120716-0d7927892ad6d830e28437af099f018b0ad9a322(Dec  7 2022 16:22:15) |
      | 172.xx.xx.xx|     2882 |  3 | sa128_obv4_3 |     2881 | NO              | ACTIVE | 2022-12-12 12:42:00.054759 | NULL      | NULL                  | 2022-11-03 15:37:09.530894 | 2022-12-22 14:43:26.717736 | 4.0.0.0_101000022022120716-0d7927892ad6d830e28437af099f018b0ad9a322(Dec  7 2022 16:22:15) |
      | 172.xx.xx.xx|     2882 |  1 | sa128_obv4_1 |     2881 | NO              | ACTIVE | 2022-12-12 12:25:17.555651 | NULL      | NULL                  | 2022-11-03 15:37:08.990683 | 2022-12-12 12:25:18.553763 | 4.0.0.0_101000022022120716-0d7927892ad6d830e28437af099f018b0ad9a322(Dec  7 2022 16:22:15) |
      | 172.xx.xx.xx|     2882 |  2 | sa128_obv4_2 |     2881 | YES             | ACTIVE | 2022-12-12 11:46:37.222980 | NULL      | NULL                  | 2022-11-03 15:37:09.490511 | 2022-12-12 11:47:31.075335 | 4.0.0.0_101000022022120716-0d7927892ad6d830e28437af099f018b0ad9a322(Dec  7 2022 16:22:15) |
      | 172.xx.xx.xx|     2882 |  5 | sa128_obv4_1 |     2881 | NO              | ACTIVE | 2022-12-30 16:25:45.420996 | NULL      | NULL                  | 2022-12-30 16:08:03.928478 | 2023-01-04 11:48:36.578231 | 4.0.0.0_101000022022120716-0d7927892ad6d830e28437af099f018b0ad9a322(Dec  7 2022 16:22:15) |
      +-------------+----------+----+--------------+----------+-----------------+--------+----------------------------+-----------+-----------------------+----------------------------+----------------------------+-------------------------------------------------------------------------------------------+
      6 rows in set
      
    2. Query the list of resource units on the node to be replaced based on its IP address.

      The statement is as follows:

      obclient [(none)]> SELECT unit_id FROM oceanbase.DBA_OB_UNITS WHERE SVR_IP = 'svr_ip';
      

      In the statement, replace svr_ip with the IP address of the node to be replaced.

      Here is an example:

      obclient [(none)]> SELECT unit_id FROM oceanbase.DBA_OB_UNITS WHERE SVR_IP = '172.xx.xx.xx';
      +---------+
      | unit_id |
      +---------+
      |       1 |
      |    1002 |
      |    1016 |
      |    1024 |
      +---------+
      4 rows in set
      
    3. Submit a resource unit migration task to migrate each resource unit from the old node to the new node in the same zone.

      The syntax is as follows:

      obclient [(none)]> ALTER SYSTEM MIGRATE UNIT unit_id DESTINATION 'svr_ip:svr_port';
      

      where:

      • unit_id specifies the ID of the resource unit to be migrated.

      • svr_ip:svr_port specifies the IP address and RPC port number of the new node to which the resource unit is to be migrated. The default port number is 2882.

      Only one resource unit can be migrated at a time. To migrate multiple resource units, execute the statement multiple times.

      Here is an example:

      obclient [(none)]> ALTER SYSTEM MIGRATE UNIT 1016 DESTINATION '172.xx.xx.xx:2882';
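
When the old node hosts several resource units, one statement is needed per unit. The following Python sketch only builds the statement text from a list of unit IDs; it does not connect to the cluster or execute anything. The function name build_migrate_statements and its parameters are hypothetical names chosen for illustration, and the IP address is the same placeholder used in the examples above.

```python
def build_migrate_statements(unit_ids, dest_ip, dest_port=2882):
    """Return one ALTER SYSTEM MIGRATE UNIT statement per resource unit.

    Hypothetical helper: it only formats text and never talks to a cluster.
    """
    statements = []
    for unit_id in unit_ids:
        # int() raises ValueError for non-numeric strings, which keeps
        # stray text out of the generated statement.
        statements.append(
            f"ALTER SYSTEM MIGRATE UNIT {int(unit_id)} "
            f"DESTINATION '{dest_ip}:{int(dest_port)}';"
        )
    return statements


if __name__ == "__main__":
    # Unit IDs taken from the sample DBA_OB_UNITS output above.
    for stmt in build_migrate_statements([1, 1002, 1016, 1024], "172.xx.xx.xx"):
        print(stmt)
```

Each generated statement still has to be executed separately in obclient, because only one resource unit can be migrated per statement.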
      
    4. Query the oceanbase.DBA_OB_UNIT_JOBS view for the migration progress of the resource unit.

      obclient [(none)]> SELECT * FROM oceanbase.DBA_OB_UNIT_JOBS WHERE JOB_TYPE = 'MIGRATE_UNIT';
      +--------+--------------+------------+-------------+----------+----------------------------+----------------------------+-----------+---------+----------+------------+------------+-------------+
      | JOB_ID | JOB_TYPE     | JOB_STATUS | RESULT_CODE | PROGRESS | START_TIME                 | MODIFY_TIME                | TENANT_ID | UNIT_ID | SQL_TEXT | EXTRA_INFO | RS_SVR_IP  | RS_SVR_PORT |
      +--------+--------------+------------+-------------+----------+----------------------------+----------------------------+-----------+---------+----------+------------+------------+-------------+
      |      4 | MIGRATE_UNIT | INPROGRESS |        NULL |        0 | 2023-01-04 17:22:02.208219 | 2023-01-04 17:22:02.208219 |      1004 |    1006 | NULL     | NULL       |xx.xx.xx.106|        2882 |
      +--------+--------------+------------+-------------+----------+----------------------------+----------------------------+-----------+---------+----------+------------+------------+-------------+
      

      If the query returns an empty result set, no migration task is in progress and the resource unit has been migrated.
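
The completion check can be expressed as a small function. This is a minimal sketch that assumes the rows of the DBA_OB_UNIT_JOBS query have already been fetched as dictionaries keyed by column name; how the rows are fetched is out of scope here, and the function names are hypothetical.

```python
def units_still_migrating(job_rows):
    """Return the sorted UNIT_ID values that still have a MIGRATE_UNIT job listed."""
    return sorted(
        {row["UNIT_ID"] for row in job_rows if row["JOB_TYPE"] == "MIGRATE_UNIT"}
    )


def migration_finished(job_rows):
    """An empty job list means that no unit migration is in progress."""
    return not units_still_migrating(job_rows)


if __name__ == "__main__":
    # Row modeled on the sample query output above (other columns omitted).
    sample = [{"JOB_ID": 4, "JOB_TYPE": "MIGRATE_UNIT",
               "JOB_STATUS": "INPROGRESS", "UNIT_ID": 1006}]
    print(units_still_migrating(sample))
    print(migration_finished([]))
```

A script that polls the view can call migration_finished on each result and proceed to the node deletion step only when it returns True.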

    For more information about how to migrate a resource unit, see Migrate units.

  4. After all resource units on the old node are migrated, delete the old node.

    To do so, perform the following steps:

    1. Execute the following statement to delete the old node:

      obclient [(none)]> ALTER SYSTEM DELETE SERVER 'svr_ip:svr_port' [,'svr_ip:svr_port'...] [ZONE [=] 'zone_name'];
      

      where:

      • svr_ip specifies the IP address of the old node to be deleted.

      • svr_port specifies the RPC port of the old node to be deleted. The default value is 2882.

      • zone_name specifies the name of the zone from which the old node is to be deleted.

      Here is an example:

      obclient [(none)]> ALTER SYSTEM DELETE SERVER '172.xx.xx.xx:2882' ZONE = 'zone1';
      
    2. Query the oceanbase.DBA_OB_SERVERS view to verify whether the old node is successfully deleted.

      obclient [(none)]> SELECT * FROM oceanbase.DBA_OB_SERVERS;
      

      If this node no longer appears in the list, it is successfully deleted. If it still appears in the list and its status is DELETING, this node is still being deleted.
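
The verification rule above can be sketched in Python. The sketch assumes the rows of DBA_OB_SERVERS have already been fetched as dictionaries keyed by column name; the function name deletion_state and its return values are hypothetical labels, not OceanBase terminology.

```python
def deletion_state(server_rows, svr_ip, svr_port=2882):
    """Classify the old node based on rows from DBA_OB_SERVERS.

    Returns "deleted" if the node is no longer listed, "deleting" if it is
    listed with the DELETING status, and "present" for any other status.
    """
    for row in server_rows:
        if row["SVR_IP"] == svr_ip and row["SVR_PORT"] == svr_port:
            return "deleting" if row["STATUS"] == "DELETING" else "present"
    return "deleted"


if __name__ == "__main__":
    # Placeholder addresses; real rows come from the query above.
    rows = [
        {"SVR_IP": "10.0.0.1", "SVR_PORT": 2882, "STATUS": "ACTIVE"},
        {"SVR_IP": "10.0.0.2", "SVR_PORT": 2882, "STATUS": "DELETING"},
    ]
    print(deletion_state(rows, "10.0.0.2"))
    print(deletion_state(rows, "10.0.0.3"))
```

A result of "present" suggests that the delete statement has not taken effect for that node, so it is worth rechecking the address and port that were passed to the statement.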

