How Alipay Handles Traffic Surge during Double 11 with OceanBase

This article summarizes the presentation by Deng Rongwei, a senior database expert at Alipay, at OceanBase Annual Conference 2022.

Just like Black Friday, the Double 11 Shopping Festival attracts hundreds of millions of people who shop online on the same day, often at the same moment. Shoppers are happy because they can usually buy things at much lower prices during the campaign. For apps like Alipay, however, the festival is a huge challenge, because they have to handle a far larger volume of concurrent transactions than usual.

In 2019, the peak QPS during Double 11 reached 61,000,000, meaning that Alipay's database had to process that many queries per second.

How does Alipay upgrade its database architecture to deal with the challenge?

Alipay's Evolving Online Database Architecture

Alipay was born in the “IOE” age, when the conventional combination of IBM midrange computers, Oracle databases, and EMC storage devices was the stack preferred by most companies. Alipay's original database system used a centralized architecture without sharding, which unsurprisingly drowned in the traffic of the first Double 11 shopping festival.

Soon after that, Alipay turned to database and table sharding, which gave birth to the first-generation distributed architecture based on middleware.

Version 1.0: Sharding-based distributed database architecture

This architecture ran into performance bottlenecks and rising storage costs as Alipay’s user base kept growing and the annual Double 11 shopping festival drove ultra-high concurrency in the database system. After taking the last IBM midrange computer offline in May 2013, Alipay moved on to upgrading its database architecture.

Despite the daunting challenge of replacing the storage system, Alipay wanted a database that not only runs at low cost but also delivers high performance, excellent scalability, and high availability, hence the birth of OceanBase Database.

The figure below shows the evolution of Alipay’s database architecture till 2017.

[Figure: Evolution of Alipay’s database architecture up to 2017]

We first replaced the underlying storage devices with x86-based servers, and then upgraded from Oracle to OceanBase Database V0.5 and later to V1.0.

OceanBase Database V0.5 could not deliver ultimate performance because, although it read data in a distributed manner, all writes still went through a single node. OceanBase Database V1.0, in comparison, supported clusters of peer nodes that could each read and write data, and was therefore a genuinely distributed database system.

As the version that served Alipay for the longest period and was most widely deployed, OceanBase Database V1.0 boasts three major features:

1. Multitenancy.

Multitenancy is a great way to improve resource utilization when pooling resources for small and medium-sized databases.

For example, if we use a MySQL or Oracle database, we will have to do database and table sharding in response to traffic surges. OceanBase Database, by contrast, supports automatic scaling by increasing tenant resources.

2. High availability.

Alipay’s financial services require that the database be able to recover from Internet data center (IDC)-level and even city-level disasters. The distributed architecture of OceanBase Database is built on the Paxos protocol and inherently meets these requirements.

3. Autoscaling.

Alipay used to shift some traffic to its standby data center during the Double 11 shopping festival each year. When Alipay was using conventional relational databases, taking a production database out of service meant building a standby database to take over its traffic and then completing the switchover.

This method caused business interruptions that were noticeable to users because of the transient disconnection at the moment of switchover. After migrating to OceanBase Database, we were able to add or remove replicas and perform a smooth leader/follower switchover with zero business interruption.

In addition, OceanBase Database can release thousands of idle servers to support other businesses within 30 minutes after the traffic peak during a promotion campaign.

Version 2.0: A native distributed database without sharding

The data metrics of the Double 11 shopping festival keep hitting record highs year after year. In 2017, even 1% of the peak traffic would overwhelm a standalone database server.

Typically, the capacity of conventional databases is increased by sharding-based horizontal scaling, which allows multiple database shards to handle the workload in peak periods. This scaling solution, though feasible, has many drawbacks.

For example, it involves multiple data sources and database clusters, leading to higher costs in configuration and routine O&M management.

So we needed a distributed database solution that could handle surges of millions of payment requests per second, or even more, in a more graceful way, that is, without sharding.

In this solution, horizontal scaling should be completed within the database without interrupting applications. OceanBase Database V2.0 made this solution possible.

Partitioning in OceanBase Database V2.0 is similar to sharding because both are about breaking up a large data set into smaller subsets. For the payment database with 100 shards, we further divided each shard into 100 partitions by UID. Applications are not aware of such partitioning and can only see the shards. This way, data can be split and distributed to more servers, without affecting user experience, to break the performance bottleneck of a standalone database server and achieve load balancing.
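To make the two levels of splitting concrete, here is a minimal Python sketch. The 100-shard-by-100-partition layout comes from the example above, but the modulo rules and the locate function are illustrative assumptions only; the actual hashing rules are internal to Alipay.

# Minimal sketch of the two-level UID split described above.
# Assumption: the shard is derived from the last two digits of the UID and the
# partition from the next two; the real rules are Alipay-internal.

SHARD_COUNT = 100      # shards visible to the application
PARTITION_COUNT = 100  # partitions per shard, visible only to OceanBase Database

def locate(uid: int) -> tuple[int, int]:
    """Return (shard_id, partition_id) for a user ID."""
    shard_id = uid % SHARD_COUNT                            # application-level sharding
    partition_id = (uid // SHARD_COUNT) % PARTITION_COUNT   # transparent partitioning
    return shard_id, partition_id

if __name__ == "__main__":
    for uid in (1234567890, 1234567891):
        shard_id, partition_id = locate(uid)
        print(f"uid={uid} -> shard {shard_id:02d}, partition {partition_id:02d}")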

Some may ask: why not give up sharding altogether when you can partition? Well, Alipay is deployed in a zone-based architecture, and sharding is the foundation of that architecture. Therefore, we can never give up sharding; we can only partition the existing shards. OceanBase Database automatically distributes the data in each of the 100 shards to multiple servers based on load balancing, but to users, it looks like the data is still stored in 100 shards.

Can OceanBase Database make use of multiple servers without partitioning?

The answer is yes.

Many internal systems of Alipay distribute sharded data directly across several servers without further partitioning the shards. Specifying a partition ID is not mandatory for table partitioning in OceanBase Database, but if your business requires ultimate performance, you are advised to do so. You can partition data based on well-defined rules so that applications or middleware can calculate the partition ID. As long as OBProxy can identify the target partition, it routes SQL statements to the right server; otherwise, internal secondary routing is performed, which slightly degrades the performance of OceanBase Database.

To achieve ultimate performance, partition groups were introduced to OceanBase Database V2.0.

Partitions in the same partition group are hosted on the same server so that transactions involving these partitions can be processed on a single node.

For example, assume that sharding is not performed and a payment request writes data to three tables. Because these tables are partitioned by UID using the same rules, the data of a payment is guaranteed to land in partitions with the same partition ID in all three tables. In other words, the transaction touches three co-located partitions and needs no cross-server operations or distributed processing.
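The Python sketch below illustrates why such a transaction stays on one server. The table names and the modulo rule are hypothetical; the point is only that the same UID maps to the same partition ID in every table of the group.

# Sketch: a payment touches three tables, but all writes share one partition ID.
# Table names and the partitioning rule are illustrative assumptions.

PARTITION_COUNT = 100
TABLES = ("payment_order", "payment_detail", "account_log")  # hypothetical names

def partition_id(uid: int) -> int:
    # Every table in the partition group uses the same UID-based rule.
    return uid % PARTITION_COUNT

def plan_transaction(uid: int) -> dict:
    """List the table -> partition pairs one payment writes to."""
    pid = partition_id(uid)
    return {table: pid for table in TABLES}

if __name__ == "__main__":
    plan = plan_transaction(20221111)
    print(plan)
    # One partition ID for all three tables; with those partitions placed in the
    # same partition group, they sit on one server, so no distributed commit is needed.
    assert len(set(plan.values())) == 1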

OceanBase Database V2.0 has the following advantages and is used for many core business systems of Alipay:

1. High compatibility.

It supports MySQL and Oracle syntax, which allows Alipay to migrate its database at lower costs.

2. Improved performance at reduced costs.

Data can be distributed to multiple servers even if sharding is not performed. This way, the capacity of a standalone database is no longer limited by the hardware specifications of a single server. Compared to OceanBase Database V1.0, the overall database performance of V2.0 is improved by 50%, and 30% of the storage space is saved.

3. Lower O&M costs with better scalability.

Previously, Alipay had to coordinate with business teams to take a production database out of service. OceanBase Database V2.0, however, supports horizontal scaling, which can be done quite easily with zero service interruption.

Migration to OceanBase Database

In its early years, Alipay systems were hosted on Oracle and MySQL databases. Those conventional relational databases became unwieldy over time and failed to keep up with Alipay's growth because of their shortcomings in scalability and disaster recovery. Also, database administrators (DBAs) had to invest a great deal of time and effort to ensure data consistency and to maintain multiple O&M systems, disaster recovery systems, and eco-tools.

Now, the whole stack of Alipay is running on OceanBase Database.

Migrating from one database to another heterogeneous database is a complex process, mainly due to compatibility and data quality issues. OceanBase Migration Service (OMS), an all-in-one data migration tool, is provided to help users efficiently migrate their data to OceanBase Database from other heterogeneous sources. To deal with compatibility issues, OMS supports the assessment of static code and dynamic traffic replay. To guarantee data consistency in real time, OMS supports full verification, incremental verification, and offline verification.
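To show what such verification means in practice, here is a generic Python sketch that compares per-table checksums between the source and the target. It is not OMS's actual implementation; the helper names and the assumption that both sides return rows in the same primary-key order are for illustration only.

# Generic illustration of full data verification after a migration.
# NOT the OMS implementation: helper names and the ordering assumption are
# illustrative. Both sides are assumed to return rows in the same order.

import hashlib
from typing import Iterable, Tuple

Row = Tuple  # a row fetched from either database

def checksum(rows: Iterable[Row]) -> str:
    h = hashlib.sha256()
    for row in rows:
        h.update(repr(row).encode("utf-8"))
    return h.hexdigest()

def verify_table(source_rows: Iterable[Row], target_rows: Iterable[Row]) -> bool:
    """Full verification: both sides must produce identical checksums."""
    return checksum(source_rows) == checksum(target_rows)

if __name__ == "__main__":
    src = [(1, "alice", 100), (2, "bob", 250)]
    dst = [(1, "alice", 100), (2, "bob", 250)]
    print("consistent" if verify_table(src, dst) else "inconsistent")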

After migrating to OceanBase Database, efficiency improved. DBAs can use the O&M system to handle routine capacity scaling and disaster recovery automatically and efficiently. They can also complete scheduled capacity scaling and disaster recovery drills for major promotion campaigns with just a few clicks.

In addition, costs were reduced, not only because the multitenancy feature reduces resource fragmentation by consolidating our long-tail business previously hosted in Oracle and MySQL databases, but also because of the very high data compression ratio of OceanBase Database. I was very impressed that, when Alipay's last conventional centralized databases were taken offline, the legacy data that would otherwise have been squeezed onto hundreds of servers fit nicely on about 10 servers in OceanBase Database.

PB-scale Data Archiving

As more of Alipay's business lines migrate to OceanBase Database alongside Alipay's rapid growth, data archiving becomes imperative to address problems such as insufficient disk space in production databases, decreasing IOPS, and increasing backup times.

How can we archive our data efficiently and cost-effectively? We used to bring in high-end commercial storage arrays or a next-generation distributed storage system to meet these requirements. With OceanBase Database, we can easily store petabytes of data in a single database by simply adding x86-based servers with SATA disks.

Frankly speaking, building archive databases is all about two key points: load balancing and fault recovery.


Key point 1: Load balancing

The primary purpose of load balancing is to make the most use of storage and computing resources at the lowest possible costs.

Storage utilization is maximized in OceanBase Database. A database cluster grows as more servers are added to cope with more business lines. Model differences among servers purchased in different years lead to the so-called short-board (weakest-link) effect, where a small number of lower-capacity servers cap the disk usage of the whole cluster and cause storage waste. To address this issue, OceanBase Database supports dynamic load balancing at the partition level, which maximizes the disk usage of each server.
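As a rough illustration of partition-level balancing across servers of different sizes, the greedy Python sketch below places each partition on the server whose projected usage ratio stays lowest, so older, smaller servers no longer cap the cluster. The greedy rule is a simplification for illustration, not OceanBase Database's actual algorithm.

# Greedy sketch of disk-usage balancing across servers of mixed capacities.
# Illustrative only; OceanBase Database's real balancing logic is more involved.

def place_partitions(partition_sizes_gb, server_capacities_gb):
    """Return per-server usage ratios after greedily placing partitions."""
    used = [0.0] * len(server_capacities_gb)
    # Place large partitions first for a tighter packing.
    for size in sorted(partition_sizes_gb, reverse=True):
        # Pick the server whose usage *ratio* would remain lowest.
        target = min(range(len(server_capacities_gb)),
                     key=lambda i: (used[i] + size) / server_capacities_gb[i])
        used[target] += size
    return [u / c for u, c in zip(used, server_capacities_gb)]

if __name__ == "__main__":
    # Two older 4 TB servers and two newer 12 TB servers, 40 partitions of 500 GB each.
    ratios = place_partitions([500] * 40, [4000, 4000, 12000, 12000])
    print([f"{r:.0%}" for r in ratios])  # usage ratios come out roughly even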

Utilization of computing resources is maximized in OceanBase Database. Generally, tables in an archive database are partitioned by time, and partitions of the same month tend to cluster on a small number of servers, resulting in data write hotspots and unbalanced workloads. To prevent write hotspots and make full use of the computing resources of each server, OceanBase Database disperses partitions of the same month in different tables across servers and evenly distributes the leaders of the partitions to all the servers.
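The same idea applies to write traffic: the short Python sketch below spreads the leaders of the current month's partitions across all servers round-robin, so no small group of machines absorbs all the writes for that month. Table and server names are hypothetical, and round-robin stands in for the actual leader-balancing policy described above.

# Sketch: spread the leaders of this month's partitions across all servers.
# Table and server names are hypothetical; round-robin is a simplification.

def spread_leaders(partitions, servers):
    """Assign each (table, month) partition's leader to a server round-robin."""
    return {p: servers[i % len(servers)] for i, p in enumerate(partitions)}

if __name__ == "__main__":
    tables = ["trade_archive", "payment_archive", "refund_archive"]
    servers = ["server-1", "server-2", "server-3"]
    current_month = [(t, "2022-11") for t in tables]
    for partition, leader in spread_leaders(current_month, servers).items():
        print(partition, "-> leader on", leader)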

Key point 2: Fault recovery

Whether the faulty server is completely down or not, OceanBase Database migrates replicas to the substitute server. The migration performance depends on the write capacity of this substitute server. If the substitute server uses regular SATA disks that support data write at 200 MB/s, it will take almost a week to migrate 100 TB of data. Such a long migration period will be fraught with uncertainties, such as another server crash that leaves the system with only one replica.
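The "almost a week" figure follows from simple arithmetic, as the short calculation below shows (decimal units assumed).

# Back-of-the-envelope check of the single-server migration estimate
# (decimal units assumed: 1 TB = 10**12 bytes, 1 MB = 10**6 bytes).

data_bytes = 100 * 10**12          # 100 TB of replicas to migrate
write_rate = 200 * 10**6           # 200 MB/s sustained write on SATA disks

seconds = data_bytes / write_rate  # 500,000 seconds
print(f"{seconds / 86400:.1f} days")  # ~5.8 days, i.e. almost a week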

Given the risks due to a long migration period, we have figured out a two-step solution to accelerate the process by leveraging the native distributed architecture of OceanBase Database.

First, we bring the new server online and then permanently remove the faulty server. Once RootService, the master management service of OceanBase Database, detects the missing replicas, it launches replica synchronization in many-to-many mode. In a production database, the speed can reach 30 to 50 GB/s, which means that 100 TB of data can be migrated in two to three hours.

What’s Next?

Alipay requires diversified storage capabilities to take care of its various business scenarios. Our DBA team will keep polishing OceanBase Database by focusing on its capabilities of hybrid transaction/analytical processing (HTAP) and multi-model processing.

First, we will go further into HTAP.

Our current HTAP pipeline is complicated. As in a conventional solution, online logs are subscribed to and computed in a data warehouse, which is not only expensive to maintain but also relies heavily on third-party tools, leading to frequent availability issues. We will support online analytical processing (OLAP) on top of the online transaction processing (OLTP) capabilities of OceanBase Database and come up with a cost-effective solution that uses the same set of data in the same system for both transactional and real-time analytical processing. Replicas of the same data can be stored in both row and column formats for different workloads.

Second, we will work on multi-model processing.

To meet the storage requirements of each independent business line, we used to store different forms of data, such as documents, key-value pairs, and time series data, in databases with different models. As a result, we had to maintain multiple O&M systems and repeatedly build disaster recovery systems. In the days ahead, we will introduce multi-model processing capabilities to the unified storage engine and distributed engine of OceanBase Database. That way, we will be able to satisfy diversified storage needs with ease by providing multiple methods for accessing the same set of data.
