Analyze and replay files (OceanBase Migration Assessment V3.3.2)

This topic describes how to analyze and replay files by using the distributed edition.

Limits

The distributed edition supports replay only from an Oracle database to an OceanBase database.

Analyze files

Run the following commands on the server where the client is located.

Notice:

The command for the distributed edition is similar to that for the standalone edition, except that the --distributed parameter and network parameters are added.

sh bin/start.sh \
# The task name.
--name test \
# A fixed value that specifies a database replay task.
--mode REPLAY \
# A fixed value that specifies the COLLECT phase of replay.
--replay-phase COLLECT \
# The source file.
--source-file "/root/oma/test" \
# The number of parallel threads, which affects the replay performance. We recommend that you set the value to be as large as possible if the server performance permits.
--parallel-count 20 \
# The mapping from UIDs to schema names, in UID:SCHEMA format. Schema names must be in uppercase. Separate multiple mappings with commas (,). Note that this parameter starts with a single hyphen (-).
-DuseUidToSchemaMap=103:UWBPS,20:UBXYZ \
# The number of parallel threads during parsing. Specify this parameter based on the server performance. Default value: 10.
--parse-thread-count 200 \
# The extended configuration file, which is a standard Java configuration file. You can leave this parameter unspecified if no special configuration is required.
# In the configuration file, specify one configuration item on each line and separate the name and value of each item with an equal sign (=).
--extend-configure "/your/path/file/name.conf" \
# The date format of OceanBase Database. You can use this parameter to convert the date format.
# By default, if this parameter is not specified, the date format is DD-MON-RR. You can run the show variables like '%nls%format%' command to view the specific NLS configuration.
--nls-format "YYYY-MM-DD" \

# The following parameters are specific to the distributed edition.
# Starts OMA in distributed mode.
--distributed \
# The IP addresses and port ranges of two workers.
--node-ip-addresses "192.168.0.20:48500..48520,192.168.0.30:48500..48520" \
# The IP address of the local host.
--local-host-address "192.168.0.10" \
# The IP address of the cluster. Specify the IP address of any worker, without a port range.
--cluster-jdbc "192.168.0.20"  

# The following parameters are required only when you want to tune SQL statements.
# Enables SQL tuning.
--tuning-mode \
# The IP address of the source Oracle database.
--source-db-host 10.10.10.1 \
# The port number of the source Oracle database.
--source-db-port 1521 \
# The username of the source Oracle database.
--source-db-user user \
# The password of the source Oracle database.
--source-db-password password \
# The service name of the source Oracle database.
--source-db-service-name orcl11g.us.oracle.com
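
In the annotated listing above, comment lines sit between the line continuations, so a shell would end the command at the first comment if the listing were pasted unchanged. The following sketch assembles the same analysis options in a form that keeps per-option comments and still runs under a POSIX shell. It uses only the options and sample values shown above, and omits the optional --extend-configure and tuning options. The variable names WORKERS, PORT_RANGE, NODES, and CMD are illustrative and are not part of OMA. The script prints the command for review instead of running it.

```shell
#!/bin/sh
# Illustrative values (not OMA settings): worker IP addresses and their shared port range.
WORKERS="192.168.0.20 192.168.0.30"
PORT_RANGE="48500..48520"

# Build the --node-ip-addresses value: ip:range entries joined with commas.
NODES=""
for ip in $WORKERS; do
  NODES="${NODES:+$NODES,}$ip:$PORT_RANGE"
done

# Collect the options one at a time so that each line can carry its own comment.
set --
set -- "$@" --name test                        # task name
set -- "$@" --mode REPLAY                      # fixed value
set -- "$@" --replay-phase COLLECT             # fixed value for the analysis phase
set -- "$@" --source-file "/root/oma/test"
set -- "$@" --parallel-count 20
set -- "$@" -DuseUidToSchemaMap=103:UWBPS,20:UBXYZ
set -- "$@" --parse-thread-count 200
set -- "$@" --nls-format "YYYY-MM-DD"
set -- "$@" --distributed
set -- "$@" --node-ip-addresses "$NODES"
set -- "$@" --local-host-address "192.168.0.10"
set -- "$@" --cluster-jdbc "${WORKERS%% *}"    # any worker; the first one is used here

# Dry run: show the assembled command as one line of text.
# To execute it instead, run: sh bin/start.sh "$@"
CMD="sh bin/start.sh $*"
echo "$CMD"
```

Printing first makes it easy to compare the assembled options with the annotated listing before starting a long-running analysis.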

After the analysis is completed, the client automatically distributes the analyzed files to each worker and displays the name of the result set of the analysis task. Record the name of the result set.

Replay files

Run the following commands on the server where the client is located.

Notice:

The command for the distributed edition is similar to that for the standalone edition, except that the --distributed parameter and network parameters are added based on the networking mode.

sh bin/start.sh \
# The task name.
--name send1 \
# A fixed value that specifies a database replay task.
--mode REPLAY \
# A fixed value that specifies the SEND phase of replay.
--replay-phase SEND \
# The replay mode. Valid values: READ, WRITE, READ_WRITE, PL, and ALL. Default value: READ.
--replay-mode ALL \
# The name of the analysis result set.
--analyze-table test_replay \
# Specifies that the statements are replayed in order.
--replay-process-name "sort" \
# The replay scale. The default value of this parameter is 1, which indicates that the replay will run at the original speed.
--replay-scale 5 \
# The sampling ratio, which ranges from 0 to 1. The default value is 1, which indicates full sampling.  
--replay-sample 1 \
# The maximum number of threads, which is 400 by default and cannot exceed 2,000.     
--max-parallel 400 \
# The IP address of the destination database. You can specify only one destination database for replay.
--target-db-host 11.11.11.11 \
# The port number of the destination database.
--target-db-port 2883 \
# The tenant name and cluster name, which must be separated with a number sign (#). The tenant name precedes the cluster name.
# You can replay calls from multiple users, but the users must belong to the same tenant.
--target-db-tenant-cluster "oms_oracle#rrr_admin" \
# The logon username and password, which must be separated with a colon (:). Multiple username/password pairs must be separated with commas (,).
# Note that only one hyphen (-) is added to the beginning of this parameter. The first username must be that of the SYS user, which is used to obtain information from the SQL_AUDIT view.
-DuserAndPassword=SYS:1***,schema_name:1***1,schema_name2:*** \
# The custom configuration for the JDBC connection. Specify this parameter as needed. This parameter is optional.
-DjdbcConnectionConfigure=AAA:BBB \
# The process self-monitoring parameter, in minutes. Default value: 10. We recommend that you set this parameter to 30.
# If you set this parameter to XX, the system will forcibly exit the main process when the main process has not sent any data for XX minutes.
--monitor-processor-time 30 \
# The interval for scanning the SQL_AUDIT view, in seconds. Default value: 60. Set this parameter based on your requirements.
--sql-audit-interval 300 \
# The type of the destination database.
--target-db-type OBORACLE \
# The version of the destination database. 2.2.50 indicates V2.2.5x, 2.2.70 indicates V2.2.7x, and 3.1.20 indicates V3.1.x.
--target-db-version 2.2.70 \

# Replays transactions.
# If this parameter is omitted, transactions are not replayed.
--with-replay-transaction \

# The following parameters are specific to the distributed edition.
# Starts OMA in distributed mode.
--distributed \
# The IP addresses and port ranges of two workers.
--node-ip-addresses "192.168.0.20:48500..48520,192.168.0.30:48500..48520" \
# The IP address of the local host.
--local-host-address "192.168.0.10" \
# The IP address of the cluster. Specify the IP address of any worker, without a port range.
--cluster-jdbc "192.168.0.20"
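
Two of the replay options above pack several values into one argument: --target-db-tenant-cluster joins the tenant name and the cluster name with a number sign (#), and -DuserAndPassword joins username:password pairs with commas, with the SYS user first. A minimal sketch of composing and checking these values in a POSIX shell follows. The function names and the sample passwords are hypothetical, the exact-case match on SYS is an assumption, and nothing here calls OMA itself.

```shell
#!/bin/sh
# Hypothetical helpers; OMA does not ship these functions.

# tenant_cluster TENANT CLUSTER -> "TENANT#CLUSTER" (the tenant name comes first).
tenant_cluster() {
  printf '%s#%s\n' "$1" "$2"
}

# user_password_list USER:PASS... -> "-DuserAndPassword=USER:PASS,USER:PASS"
# Returns 1 if the first pair does not belong to the SYS user
# (assumption: the username is written exactly as "SYS").
user_password_list() {
  case "$1" in
    SYS:*) ;;
    *) echo "the first username must be SYS" >&2; return 1 ;;
  esac
  list=""
  for pair in "$@"; do
    list="${list:+$list,}$pair"
  done
  printf '%s%s\n' '-DuserAndPassword=' "$list"
}

# Sample values; replace the placeholder passwords with real ones.
TENANT_CLUSTER=$(tenant_cluster oms_oracle rrr_admin)
USER_OPT=$(user_password_list "SYS:sys_pw" "schema_name:pw1" "schema_name2:pw2")

echo "--target-db-tenant-cluster \"$TENANT_CLUSTER\" $USER_OPT"
```

Checking the SYS-first rule before the replay starts avoids discovering the problem only when OMA tries to read the SQL_AUDIT view.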

During the replay, the client displays the replay progress, active sessions, and thread pool information of each worker in real time.
After the replay is completed, take the following steps to view the workload replay report:
1. Go to the replayReportTool directory and double-click the index.html file.
2. On the page that appears, click Authorize and View Report.
3. Select and open the oma.sqlite file.
  
  Generally, the oma.sqlite file is stored in the db folder under the root directory of OMA.
4. On the page that appears after you open the oma.sqlite file, click Select Report in the upper-right corner and select the workload replay report that you want to view from the drop-down list.
  
  The workload replay report shows the users in the destination database that support the replay calls, the replay success rate, a replay traffic comparison chart, and a list of replay information.
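
Step 3 expects the oma.sqlite file, which is usually in the db folder under the OMA root directory. Assuming a hypothetical OMA_HOME variable that points to that root directory, the following shell sketch checks the usual location first and otherwise searches the OMA directory tree for the file. The function name find_oma_sqlite is illustrative and is not part of OMA.

```shell
#!/bin/sh
# OMA_HOME is a hypothetical variable for the OMA root directory.
# It defaults to the current directory when unset.
OMA_HOME="${OMA_HOME:-.}"

# find_oma_sqlite ROOT: print the path of oma.sqlite under ROOT, or return 1.
find_oma_sqlite() {
  # Usual location: the db folder directly under the root directory.
  if [ -f "$1/db/oma.sqlite" ]; then
    printf '%s\n' "$1/db/oma.sqlite"
    return 0
  fi
  # Otherwise search the whole tree and keep the first match only.
  match=$(find "$1" -type f -name oma.sqlite 2>/dev/null | head -n 1)
  [ -n "$match" ] && printf '%s\n' "$match"
}

find_oma_sqlite "$OMA_HOME" || echo "oma.sqlite not found under $OMA_HOME" >&2
```

The printed path is the file to select in the report page.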
