This topic describes how to assess the performance differences between a MySQL database and an OceanBase database.
Run start.sh with reference to the sample command in this topic. For more information about the parameters in the script, see Product form. In Microsoft Windows, replace sh bin/start.sh with start.bat.
Enable GeralLog, so that the collector can collect the traffic data in the source MySQL database and generate the corresponding General Log file.
Analyze the files.
Find the recorded file in the
/home/mysql/gldirectory, compress it, upload it to the server where it is to be analyzed, and execute the following statements:sh bin/start.sh \ # The task name, which can be set to any value. --name test_mysql_replay \ # A fixed value that indicates a database replay. --mode REPLAY \ # A fixed value that indicates the analysis phase of replay. --replay-phase COLLECT \ # The path of files to be replayed. --source-file "/home/mysql/gl" \ # Set this parameter to 1. Generally, only one General Log file exists. --parallel-count 1 \ # The type of the source database. Set it to MYSQL. --source-type MYSQL \ # Set this parameter to 1. Generally, only one General Log file exists. --parse-thread-count 1 \ # The General Log file does not contain schema information. It support only single-schema replay. Make sure that the schema name specified in this parameter is the same as the schema name in the target OceanBase Database. --source-db-user "schema_name" # The timestamp format in the General Log file. You need to specify it according to the actual situation in the logs. --log-time-regex "\\\d{4}-\\\d{2}-\\\d{2}T\\\d{2}:\\\d{2}:\\\d{2}\\\.\\\d{6}\\\+08:00"Replay the files.
bin/start.sh \ # The task name. --name send1 \ # The tenant type of OceanBase Database. --ob-mode MYSQL \ # A fixed value that indicates a database replay. --mode REPLAY \ # A fixed value that indicates the sending phase of replay. --replay-phase SEND \ # The replay mode. Valid values: READ, WRITE, READ_WRITE, PL, and ALL. Default value: READ. --replay-mode ALL \ # The path of files to be replayed. --source-file "./dump/test" \ # Specifies to replay in order. --replay-process-name "sort" \ # Specifies the period after which the replay will be restarted, in the unit of seconds. Default value: 0, which specifies to immediately start the replay. --delay-start-time 5 \ # The replay scale. The default value of this parameter is 1, which indicates that the replay will run at the original speed. --replay-scale 5 \ # The sampling ratio, which ranges from 0 to 1. The default value is 1, which indicates full sampling. --replay-sample 1 \ # The maximum number of threads, which is 400 by default and cannot exceed 2,000. --max-parallel 400 \ # Due to the cold start during a replay, a warm-up time must be specified. --warm-up 30 \ # The IP address of the target database. You can specify only one target database for replay. --target-db-host xxx.xxx.xxx.xxx \ # The port number of the target database. --target-db-port 2883 \ # The tenant name and cluster name, which must be separated with a number sign (#). The tenant name precedes the cluster name. # You can replay calls from multiple users, but the users must belong to the same tenant. --target-db-tenant-cluster "oms_oracle#ob_100****.admin" \ # The login username and password, which must be separated with a colon (:). Multiple usernames and passwords must be separated by commas (,). Please note that there is only one hyphen (-) before DuserAndPassword, and the first username must be ROOT, SYS, or a superuser with high privileges, which is used to obtain the SQL AUDIT information of OceanBase Database. If the first user account cannot access the Schema to be replayed, you need to specify another username and password with access to the corresponding Schema. -DuserAndPassword="root:******,username:******" \ # The custom configuration for the JDBC connection. Specify the JDBC parameters based on the actual situation. For example, you can set useUnicode to true and characterEncoding to utf8 as needed. This parameter is optional. -DjdbcConnectionConfigure=useUnicode:true,characterEncoding:utf8 \ # If you set this parameter to XX, the system will forcibly exit the main process when the main process has not sent any data for XX minutes. --monitor-processor-time 10 \ # The interval for scanning the SQL_AUDIT view, in seconds. Default value: 60. Set this parameter based on your requirements. --sql-audit-interval 300 \ # The mode of OceanBase Database. --ob-mode MYSQL \ # The Schema of OceanBase Database that needs to be replayed. --target-db-schemas "AAA,BBB" \ # The type of the target database. --target-db-type OBMYSQL \ # The version of OceanBase Database. If you use OceanBase V4.x, set the value to 4.2.0. --target-db-version 4.2.0After this command is executed, the real-time replay progress is displayed.
Destination SCHEMA :::SCHEMA : [ schema_name ] : -----================ [ schema_name ] Traffic histogram [ Step size: 12 seconds ]===============------- 4412.0┤ █ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ 4191.4┤ █ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ 3970.8┤ █ ─ ─ ─ ─ ─ ─ ─ █ █ ─ ─ ─ ─ █ ─ 3750.2┤ █ ─ ─ ─ ─ ─ ─ ─ █ █ ─ ─ ─ ─ █ ─ 3529.6┤ █ ─ ─ ─ ─ ─ ─ ─ █ █ ─ ─ ─ ─ █ ─ 3309.0┤ █ ─ ─ ─ ─ ─ ─ ─ █ █ █ ─ ─ █ █ ─ 3088.4┤ █ ─ ─ ─ ─ ─ ─ █ █ █ █ █ ─ █ █ █ 2867.8┤ █ ─ ─ █ ─ ─ █ █ █ █ █ █ █ █ █ █ 2647.2┤ █ ─ ─ █ ─ ─ █ █ █ █ █ █ █ █ █ █ 2426.6┤ █ ─ ─ █ █ ─ █ █ █ █ █ █ █ █ █ █ 2206.0┤ █ ─ ─ █ █ █ █ █ █ █ █ █ █ █ █ █ 1985.4┤ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ 1764.8┤ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ 1544.2┤ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ 1323.6┤ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ 1103.0┤ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ 882.4┤ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ 661.8┤ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ 441.2┤ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ 220.6┤ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ Start time: 2021-05-31 00:14:04 End time: 2021-05-31 00:17:15 Time spent (seconds): 190 Requests: 50774 ******************************************************************************************View the replay report.
- After the replay is completed, a folder containing CSV files is generated in the
reportdirectory. You can view the traffic replay comparison between the source and target databases by using the CSV report files in the folder. The SQL statements in the CSV files are aggregated based on the SQL IDs.
The folder name is in the format of "Time and date+string", and the CSV file name is in the format of "Schema name.csv". If multiple schemas exist, multiple corresponding CSV files are generated.
- After the replay is completed, a folder containing CSV files is generated in the