This topic describes how to assess the performance differences between versions of OceanBase Database. Replaying traffic collected from an OceanBase database of the source version against a database of the target version is mainly intended to verify that an upgrade of OceanBase Database performs as expected.
Procedure
Collect data from the source OceanBase database for analysis.

    sh bin/start.sh \
    # The task name.
    --name ob_collect \
    # A fixed value that indicates a database replay.
    --mode REPLAY \
    # A fixed value that indicates the analysis phase of replay.
    --replay-phase COLLECT \
    # The number of parallel threads, which affects the replay performance. We recommend that you set the value as large as the server performance permits.
    --parallel-count 20 \
    # The IP address of the source OceanBase database.
    --source-db-host xxx.xxx.xxx.xxx \
    # The port number of the source OceanBase database.
    --source-db-port 2883 \
    # The complete username of the source OceanBase database. We recommend that you use the SYS user.
    --source-db-user "SYS@oms_oracle#ob.admin" \
    # The password of the source OceanBase database.
    --source-db-password pass \
    # The schemas from which data is to be collected.
    --schemas "schema_name,TBCS" \
    # The tenant from which data is to be collected.
    --source-tenant tenant \
    # The mode of OceanBase Database. Valid values: ORACLE and MYSQL.
    --ob-mode MYSQL \
    # The duration of data collection, in minutes.
    --collect-during-time 30 \
    # The start time of data collection. The default value of this parameter is one hour ago.
    --collect-start-time "2023-01-01\ 00:00:00" \
    # The end time of data collection. The default value of this parameter is the current time.
    --collect-end-time "2023-01-01\ 00:00:00" \
    # The type of the source database.
    --source-type OBMYSQL \
    # The version of OceanBase Database.
    --source-db-version "4.2.1"

The comment lines are annotations only. Remove them before you run the command, because a comment that follows a line-continuation backslash ends the command at that point.
After the analysis is completed, a folder named after the task name is generated in the ./dump/ directory. Example: test-20210530_233520. The folder stores replayable files in JSON format.
Replay the files.
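Before starting the replay, it can be useful to confirm that the collection phase actually produced replayable files. The following is a minimal sketch and not part of the tool: the function name `count_dump_files` is hypothetical, and it assumes that the replayable files carry a `.json` extension (this topic states only that they are in JSON format).

```shell
#!/bin/sh
# Hypothetical helper: count the JSON files under a collection task folder.
# Usage: count_dump_files ./dump/<task-folder>
count_dump_files() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing folder: $dir" >&2
        return 1
    fi
    # Assumption: replayable files end in .json.
    find "$dir" -type f -name '*.json' | wc -l | tr -d ' '
}
```

A result of 0, or a non-zero exit status, suggests that the collection window or schema list should be checked before a replay is attempted.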
Run start.sh by referring to the following sample command. For more information about the parameters in the script, see Product form. In Microsoft Windows, replace sh bin/start.sh with start.bat. The comment lines are annotations; remove them before you run the command.

    sh bin/start.sh \
    # The task name.
    --name send1 \
    # A fixed value that indicates a database replay.
    --mode REPLAY \
    # A fixed value that indicates the sending phase of replay.
    --replay-phase SEND \
    # The replay mode. Valid values: READ, WRITE, READ_WRITE, PL, and ALL. Default value: READ.
    --replay-mode ALL \
    # The path of files to be replayed.
    --source-file "./dump/test" \
    # Specifies to replay in order.
    --replay-process-name "sort" \
    # The delay before the replay starts, in seconds. Default value: 0, which starts the replay immediately.
    --delay-start-time 5 \
    # The replay scale. The default value of this parameter is 1, which indicates that the replay runs at the original speed.
    --replay-scale 5 \
    # The sampling ratio, which ranges from 0 to 1. The default value is 1, which indicates full sampling.
    --replay-sample 1 \
    # The maximum number of threads, which is 400 by default and cannot exceed 2,000.
    --max-parallel 400 \
    # The system warm-up time, in seconds. Default value: 0.
    # Specify a warm-up time, because a replay begins with a cold start.
    --warm-up 30 \
    # The IP address of the target database. You can specify only one target database for replay.
    --target-db-host xxx.xxx.xxx.xxx \
    # The port number of the target database.
    --target-db-port 2883 \
    # The tenant name and cluster name, which must be separated with a number sign (#). The tenant name precedes the cluster name.
    # You can replay calls from multiple users, but the users must belong to the same tenant.
    --target-db-tenant-cluster "oms_oracle#ob_100811****.admin" \
    # The login username and password, which must be separated with a colon (:). Multiple username/password pairs must be separated with commas (,).
    # Note that this parameter begins with only one hyphen (-). The first username must be that of the SYS user, which is used to obtain information from the SQL_AUDIT view.
    -DuserAndPassword=<sys_name>:<sys_password>,schema_name:123456,schema_name2:123456 \
    # The custom configuration for the JDBC connection. Specify the JDBC parameters based on the actual situation. For example, you can set useUnicode to true and characterEncoding to utf8 as needed. This parameter is optional.
    -DjdbcConnectionConfigure=useUnicode:true,characterEncoding:utf8 \
    # The process self-monitoring parameter, in minutes. Default value: 10. We recommend that you set this parameter to 30.
    # If you set this parameter to XX, the system forcibly exits the main process when the main process has not sent any data for XX minutes.
    --monitor-processor-time 30 \
    # The interval for scanning the SQL_AUDIT view, in seconds. Default value: 60. Set this parameter based on your requirements.
    --sql-audit-interval 300 \
    # Specifies to replay transactions. If this parameter is absent, transactions are not replayed.
    --with-replay-transaction \
    # The mode of OceanBase Database.
    --ob-mode MYSQL \
    # The schemas of OceanBase Database to be replayed.
    --target-db-schemas "AAA,BBB"

After this command is executed, the real-time replay progress is displayed.

    Destination SCHEMA :::SCHEMA : [ schema_name ] :
    -----================ [ schema_name ] Traffic histogram [ Step size: 12 seconds ]===============-------
    4412.0┤ █ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
    4191.4┤ █ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
    3970.8┤ █ ─ ─ ─ ─ ─ ─ ─ █ █ ─ ─ ─ ─ █ ─
    3750.2┤ █ ─ ─ ─ ─ ─ ─ ─ █ █ ─ ─ ─ ─ █ ─
    3529.6┤ █ ─ ─ ─ ─ ─ ─ ─ █ █ ─ ─ ─ ─ █ ─
    3309.0┤ █ ─ ─ ─ ─ ─ ─ ─ █ █ █ ─ ─ █ █ ─
    3088.4┤ █ ─ ─ ─ ─ ─ ─ █ █ █ █ █ ─ █ █ █
    2867.8┤ █ ─ ─ █ ─ ─ █ █ █ █ █ █ █ █ █ █
    2647.2┤ █ ─ ─ █ ─ ─ █ █ █ █ █ █ █ █ █ █
    2426.6┤ █ ─ ─ █ █ ─ █ █ █ █ █ █ █ █ █ █
    2206.0┤ █ ─ ─ █ █ █ █ █ █ █ █ █ █ █ █ █
    1985.4┤ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █
    1764.8┤ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █
    1544.2┤ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █
    1323.6┤ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █
    1103.0┤ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █
     882.4┤ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █
     661.8┤ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █
     441.2┤ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █
     220.6┤ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █
    Start time: 2021-05-31 00:14:04 End time: 2021-05-31 00:17:15 Time spent (seconds): 190 Requests: 50774
    ******************************************************************************************

View the replay report.
- After the replay is completed, a folder containing CSV files is generated in the report directory. You can compare the replayed traffic on the source and target databases by using the CSV report files in the folder. The SQL statements in the CSV files are aggregated by SQL ID.
The folder name is in the format of "Time and date+string", and the CSV file name is in the format of "Schema name.csv". If multiple schemas exist, multiple corresponding CSV files are generated.
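To see which schemas produced a report, the CSV files in a report folder can be listed and their lines counted. This is a minimal sketch under stated assumptions: the function name `list_schema_reports` is hypothetical, the folder path is supplied by the caller, and the line count includes any header line, because the column layout of the CSV files is not described in this topic.

```shell
#!/bin/sh
# Hypothetical helper: print "<schema name> <line count>" for each CSV report.
# Usage: list_schema_reports ./report/<report-folder>
list_schema_reports() {
    dir="$1"
    for f in "$dir"/*.csv; do
        # Skip the unexpanded pattern when the folder has no CSV files.
        [ -e "$f" ] || continue
        # The file name is "<schema name>.csv", so strip the extension.
        schema=$(basename "$f" .csv)
        lines=$(wc -l < "$f" | tr -d ' ')
        echo "$schema $lines"
    done
}
```

A schema that was listed in the replay command but has no line in the output had no CSV file generated for it.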