Before we learn about the routing principles of ODP, let's first look at the factors affecting data routing, so as to better understand and evaluate the routing feature. This topic describes the following factors: functionality, performance, and high availability.
Functionality
This section takes the PreparedStatement feature as an example to illustrate how functionality affects routing. A PreparedStatement object is executed in two steps:
Perform the PREPARE operation, for example, sending the
select * from t1 where c1 = ?statement.Perform the EXECUTE operation to pass the data used by the SELECT statement and execute the SELECT statement.
Step 2 depends on step 1. Assume that the execution is as shown in the following figure. When OBServer2 receives an EXECUTE request in Step 2 and does not know the content of the PREPARE request in Step 1, OBServer2 gracefully reports an error or forcibly disconnects the connection.

In this case, two solutions are available:
Solution 1: Record the node OBServer1 to which the PREPARE request is routed, and also route the EXECUTE request to OBServer1. This method is simple to implement, but does not fully utilize the advantage of a distributed system.
Solution 2: Synchronize the status of the PREPARE request to OBServer2 before executing the EXECUTE request. For more information about status synchronization, see related description in Connection management. This method requires ODP to synchronize connection status, which is complex to implement, but fully leverages the advantage of a distributed system. ODP adopts this method.

Performance
High performance is an important characteristic of OceanBase Database. Routing affects performance mainly in terms of latency, namely network communication overheads. ODP reduces network communication overheads and improves overall performance by perceiving data distribution and geographic locations of servers.
Data distribution
Data distribution determines the number of hops on an execution link. In best cases, ODP directly hits the node where the data resides in routing. The section uses the
select c1 from teststatement as an example to illustrate the impact of data distribution on performance. As shown in the following figure, data in Table t1 is distributed on OBServer1. In Routing method 1, the statement is directly routed to OBServer1. This is the most efficient method. In Routing method 2, the SQL statement is sent to OBServer2, and OBServer2 then routes the statement to OBServer1. Compared with Routing method 1, this routing method is less efficient. To implement Routing method 1, ODP must be aware of the distribution of SQL statements and table data.
Geographic location of servers
The geographic location of a server affects the network latency. When a remote node is selected, SQL execution slows down. Sometimes, the network latency is much longer than the database execution time. Therefore, ODP selects servers in different geographic locations in the following priority: a server in the same IDC > a server in a different IDC in the same city > a server in a different city.
High availability
High availability means that OceanBase Database is tolerant of server faults, which makes the faults transparent and imperceptible to applications. High availability is implemented by using fault detection, the blacklist mechanism, and the retry logic. When ODP finds that an OBServer node is faulty, it excludes the faulty node and selects a healthy node during routing. ODP also allows failed SQL statements to be retried.
For more information about high availability, see High availability mechanism.