Integrate Monolithic with Distributed Architecture: Making OceanBase Work for Small & Medium Businesses



We’ve shared how OceanBase Database helped Alipay cope with the traffic flood of the Double 11 shopping festival. If you are interested, see 61M QPS Challenge in Alipay: How did we do it?, which is essentially the story of how OceanBase Database was born.

As OceanBase Database gains market share in industries beyond the financial sector, we have realized that not all users need to handle data volumes and concurrency on the scale of Alipay’s. In fact, a standalone database is enough to tick all the boxes for many users in the early days of their business, when data volumes are still small.

Therefore, it is a great help to offer a lightweight edition that lets users get started at very low cost. Then, with the great scalability of OceanBase Database, users can flexibly scale out their database systems later to handle growing data volumes and performance requirements.

OceanBase Database V4.0, with an integrated architecture that combines monolithic and distributed systems, is an attempt to meet the changing needs of growing companies.

In this article, we will share our thoughts on the integrated architecture that supports both standalone and distributed deployment from the following three perspectives:

  1. What kind of database can meet the requirements of a business over its whole lifecycle?
  2. How do we implement a distributed database with small specifications?
  3. How does OceanBase Database perform with small specifications?

From small to large: database requirements of a growing business

The latest OceanBase Database V4.0 supports a minimum deployment specification of 4C8G.

What does 4C8G mean?

It’s roughly the configuration of a decent laptop. In other words, OceanBase Database V4.0 can be deployed and run stably on a personal computer.
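Because OceanBase Database is compatible with the MySQL protocol, such a standalone deployment can be reached with any ordinary MySQL driver. Below is a minimal sketch in Python, assuming a local single-node deployment listening on the default MySQL-protocol port 2881 with the default root@sys administrative user; adjust the credentials to your own setup.

```python
# pip install pymysql
import pymysql

# Connect to a standalone OceanBase node over the MySQL protocol.
# 2881 is the default MySQL-protocol port of an OceanBase server;
# "root@sys" is the administrative user of the sys tenant.
conn = pymysql.connect(
    host="127.0.0.1",
    port=2881,
    user="root@sys",
    password="",
)

with conn.cursor() as cur:
    # Verify the server is up and check its version string.
    cur.execute("SELECT version()")
    print(cur.fetchone())

conn.close()
```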

As your business grows, OceanBase Database V4.0 can be scaled out to support changing needs over the entire lifecycle of your business.

  1. In its early days, your business handles a small amount of data and has modest disaster recovery requirements. You can deploy and run OceanBase Database V4.0 on a single server and perform regular cold backups to protect your data against possible disasters.
  2. As the business grows, you can vertically scale up the specifications of the existing server. To meet the requirements for disaster recovery, you can add another server to build a primary/standby architecture, which provides online disaster recovery capability. Manual intervention is still required during disaster recovery due to the limits of the primary/standby architecture.
  3. When the business expands to a certain size and data becomes more important, you can simply upgrade to the three-replica architecture, which ensures high availability across three servers and supports automatic disaster recovery. When a server fails, the three-replica architecture of OceanBase Database V4.0 guarantees business recovery within 8 seconds and with zero data loss. In other words, the recovery time objective (RTO) is less than 8 seconds and the recovery point objective (RPO) is zero.
  4. When the business grows further still and each server has already been upgraded to the highest configuration, you face the same challenge that Taobao and Alipay once did. In this case, the transparent distributed scalability of OceanBase Database allows you to scale the cluster out from 3 servers to 6, 9, or even thousands (see the sketch after this list).
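As a rough illustration of step 4, the statements below show what a scale-out might look like when issued from the sys tenant over a MySQL driver. The server addresses, zone names, and pool name are hypothetical, and the exact DDL syntax varies across OceanBase versions, so treat this as a sketch rather than a copy-paste recipe.

```python
import pymysql

# Administrative connection to the sys tenant (hypothetical credentials).
conn = pymysql.connect(host="127.0.0.1", port=2881, user="root@sys", password="")

# Hypothetical scale-out: register one extra server per zone, then ask for
# one more resource unit per zone so the tenant's data and log streams can
# be rebalanced onto the new servers. Statement shapes follow OceanBase's
# admin SQL, but check the documentation for your exact version.
scale_out_steps = [
    "ALTER SYSTEM ADD SERVER '10.0.0.4:2882' ZONE 'zone1'",
    "ALTER SYSTEM ADD SERVER '10.0.0.5:2882' ZONE 'zone2'",
    "ALTER SYSTEM ADD SERVER '10.0.0.6:2882' ZONE 'zone3'",
    "ALTER RESOURCE POOL my_pool UNIT_NUM = 2",
]

with conn.cursor() as cur:
    for stmt in scale_out_steps:
        cur.execute(stmt)
conn.close()
```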


Figure 1 Deployment evolution: OceanBase vs conventional databases

Since the very first day of developing OceanBase Database V4.0, we have been thinking about how to run a distributed database on small-specification hardware with higher performance, so as to meet user requirements for cost efficiency and high availability across different business scenarios. We designed OceanBase Database V4.0 so that it not only supports standalone deployment with all features but, more importantly, achieves better performance at the same hardware specifications.

Challenges of small-specification deployment: CPU and memory usage as examples

In the infrastructure sector, it is very hard to make a database system large, because the system becomes increasingly vulnerable to failures as more nodes are added. In our second Transaction Processing Performance Council Benchmark C (TPC-C) test, for example, we built an OceanBase cluster of 1,554 Elastic Compute Service (ECS) servers. In a cluster of that size, a single-server failure occurs roughly once every day or two. The point is that we have to make the product sufficiently stable and highly available to keep such a jumbo-sized cluster up and running. Click here to learn more about our second TPC-C benchmark test.

It is equally hard to make a database system small, because it requires getting down to every detail, much like using a microscope to budget every slice of resources. Moreover, designs or configurations that are perfectly sensible in a large system may be totally unacceptable in a smaller one.

What’s more challenging is that the system must suit both large and small hardware specifications. This requires us to weigh the trade-offs between the two when designing the database, so as to minimize the additional overhead of the distributed architecture while letting the database adapt to the hardware specifications in many scenarios.

Now, let’s walk through the technical solutions in OceanBase Database V4.0, taking the two major challenges, CPU consumption and memory usage, as examples.

Reducing CPU consumption through dynamic control of log streams

To build a small database, OceanBase Database V4.0 first needs to control CPU consumption. In versions earlier than V4.0, OceanBase Database generated a Paxos log stream for each partition of a data table to ensure data consistency among multiple replicas based on the Paxos protocol. This was a very flexible design: because Paxos groups were based on partitions, partitions could be migrated between servers. However, it put a heavy load on the CPU, because each Paxos log stream incurs overhead for leader election, heartbeats, and log synchronization. This additional overhead takes up only a moderate share of CPU resources when servers have large specifications or the number of partitions is small, but it becomes an unbearable burden on small-specification servers.

So how do we solve that issue in OceanBase Database V4.0?

Our approach is straightforward: reduce the number of Paxos log streams. If we can bring the number of Paxos log streams down to the number of servers, their overhead becomes roughly equal to that of the logs in a conventional primary/standby database.


Figure 2 Dynamic log streams of a cluster of OceanBase Database V4.0

OceanBase Database V4.0 generates one Paxos log stream for multiple data table partitions and controls the log streams dynamically. As shown in the figure above, the database cluster consists of three zones, each with two servers deployed. Assume that two resource units are configured for a tenant. In this case, two Paxos log streams are generated for the tenant: one containing partitions P1 through P4, and the other containing P5 and P6.

  1. When the load between the two servers is unbalanced, the load-balancing module of OceanBase Database migrates partitions between the Paxos log streams.
  2. To scale out the cluster, a user can split one Paxos log stream into multiple log streams and migrate them as a whole.
  3. To scale in the cluster, the user can migrate multiple Paxos log streams and merge them.

With dynamic log stream control, OceanBase Database V4.0 greatly reduces the CPU overhead of the distributed architecture and guarantees high availability and flexible scaling.
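To make the accounting concrete, here is a toy Python model, purely illustrative and not OceanBase internals, in which every Paxos log stream pays a fixed background cost for leader election, heartbeats, and log synchronization. Collapsing per-partition streams into per-unit streams then shrinks the fixed overhead roughly by the partition-to-unit ratio.

```python
# Toy model (illustrative only, not OceanBase internals): estimate the fixed
# background CPU cost that scales with the number of Paxos log streams.

PER_STREAM_COST = 1.0  # hypothetical cost units per stream for leader
                       # election, heartbeats, and log synchronization

def fixed_overhead(num_streams: int) -> float:
    """Background cost that grows linearly with the number of streams."""
    return num_streams * PER_STREAM_COST

partitions = 1000  # partitions hosted by one tenant
units = 2          # resource units configured for the tenant (as in Figure 2)

old_design = fixed_overhead(partitions)  # pre-4.0: one stream per partition
new_design = fixed_overhead(units)       # 4.0: one stream per resource unit

print(f"one stream per partition: {old_design:.0f} cost units")
print(f"one stream per unit:      {new_design:.0f} cost units")
print(f"reduction:                {old_design / new_design:.0f}x")

# Partitions are grouped into streams and can be migrated between them for
# load balancing, mirroring the layout in Figure 2.
streams = {"LS1": ["P1", "P2", "P3", "P4"], "LS2": ["P5", "P6"]}

def migrate(partition: str, src: str, dst: str) -> None:
    """Move one partition between log streams (used for rebalancing)."""
    streams[src].remove(partition)
    streams[dst].append(partition)

migrate("P4", "LS1", "LS2")
print(streams)  # {'LS1': ['P1', 'P2', 'P3'], 'LS2': ['P5', 'P6', 'P4']}
```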

Dynamic metadata loading: high concurrency with a small memory footprint

The second challenge that OceanBase Database V4.0 must tackle in building a small database is memory usage. For the sake of performance, versions of OceanBase Database earlier than V4.0 kept some metadata resident in memory. The memory this metadata consumed was insignificant when total memory was large, but unacceptable on a small-specification server. To deliver top performance at small specifications, OceanBase Database V4.0 loads all metadata dynamically.


Figure 3 SSTable hierarchical storage

As shown in the figure above, we store an SSTable in a hierarchical structure. Specifically, the microblocks of the SSTable are stored in partitions on disk, and only the handles of those partitions are kept in memory. The requested data is loaded dynamically through KVCache only when a partition actually needs to be accessed. In this way, OceanBase Database V4.0 can serve highly concurrent requests over massive amounts of data with a small memory footprint.
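The idea can be sketched as a handle-plus-cache pattern. The Python below is a simplified illustration rather than OceanBase’s actual implementation: memory holds only lightweight handles, and a bounded LRU cache materializes blocks on first access and evicts them under memory pressure.

```python
from collections import OrderedDict

class KVCache:
    """Bounded LRU cache standing in for OceanBase's KVCache (simplified)."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = OrderedDict()

    def get_or_load(self, key, loader):
        if key in self._data:
            self._data.move_to_end(key)      # cache hit: refresh LRU position
            return self._data[key]
        value = loader(key)                  # cache miss: load from disk
        self._data[key] = value
        if len(self._data) > self.capacity:  # evict least recently used
            self._data.popitem(last=False)
        return value

def load_microblock_from_disk(handle):
    # Stand-in for reading a microblock out of the on-disk SSTable.
    return f"<microblock at {handle}>"

# Memory holds only small handles; block contents live on disk until needed.
handles = [f"sstable_1/macro_3/micro_{i}" for i in range(10_000)]
cache = KVCache(capacity=128)  # tiny memory budget, hypothetical size

# Hot requests hit the cache; cold ones trigger a disk load and an eviction.
block = cache.get_or_load(handles[42], load_microblock_from_disk)
print(block)
```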

Performance comparison under small specifications

To test the actual performance of OceanBase Database with small specifications, we deployed OceanBase Database Community Edition V4.0 in 1:1:1 mode on three 4C16G servers and compared it with RDS for MySQL 8.0 deployed on servers of the same 4C16G specification. The comparison was performed with Sysbench, and the results show that OceanBase Database Community Edition V4.0 outperforms RDS for MySQL 8.0 in most data processing scenarios. In particular, at the same hardware specifications, OceanBase Database Community Edition V4.0 delivers 1.9 times the throughput of RDS for MySQL 8.0 in INSERT and UPDATE operations.
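For readers who want to reproduce such a comparison, the sketch below drives a standard Sysbench OLTP workload against a MySQL-protocol endpoint. It assumes Sysbench 1.0+ with the bundled oltp_read_write workload; the host, port, user (here the hypothetical root@test tenant user), and table sizing are placeholders to adjust for your environment.

```python
import subprocess

# Placeholder connection and workload settings; adjust to your environment.
common = [
    "--mysql-host=127.0.0.1", "--mysql-port=2881",
    "--mysql-user=root@test", "--mysql-db=sbtest",
    "--tables=30", "--table-size=1000000",
]

# Load data, run a 5-minute read/write workload on 64 threads, then clean up.
for phase, extra in [
    ("prepare", []),
    ("run", ["--threads=64", "--time=300", "--report-interval=10"]),
    ("cleanup", []),
]:
    subprocess.run(["sysbench", "oltp_read_write", *common, *extra, phase],
                   check=True)
```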


Figure 4 Throughput: OceanBase V4.0 vs. RDS for MySQL 8.0 on Sysbench (4C16G)

We also compared the two at 8C32G, 16C64G, and 32C128G, the specifications most popular among users. As server specifications increase, the performance gap between OceanBase Database Community Edition V4.0 and RDS for MySQL 8.0 widens. At 32C128G, OceanBase Database Community Edition V4.0 achieves 4.3 times the throughput of RDS for MySQL 8.0 with 75% lower response time.


Figure 5 Throughput: OceanBase V4.0 vs. RDS for MySQL 8.0 on Sysbench


Table 1 Performance: OceanBase V4.0 vs. RDS for MySQL 8.0 on Sysbench

OceanBase Database has achieved top-tier performance in the TPC-C benchmark test with a massive cluster of more than a thousand servers, as well as efficient resource usage in standalone performance tests at small specifications such as 4C16G. Nonetheless, we still have a lot to do to make it better. OceanBase Database Community Edition V4.0 is now available, and we look forward to working with all users to build a general-purpose database system that is easier to use.

Leave a comment below if you have anything to say!
