Generate test data in batches

2025-06-24 11:54:40  Updated

This topic describes how to generate test data in batches by using a shell script, a stored procedure, and OceanBase Developer Center (ODC).

Prerequisites

  • You have deployed an OceanBase cluster and created a MySQL-compatible tenant.
  • You have the CREATE, INSERT, and SELECT privileges. If you do not have the required privileges, contact the administrator to obtain the privileges.

Use a shell script to generate test data in batches

You can compile a shell script to generate an SQL script that inserts test data in batches. This way, you do not need to manually write complex SQL statements. This method can generate a large amount of test data as required, improving efficiency and reducing manual work.

Procedure

  1. Create a test database and a test table.
  2. Create a shell script.
  3. Run the SQL script.
  4. View data.

Step 1: Create a test database and a test table

Use a database management tool, such as a command-line tool or a GUI-based tool, to create a database for storing test data and create a test table in the database.

  1. Connect to the prepared MySQL-compatible tenant.

    Here is an example:

    obclient -hxxx.xxx.xxx.xxx -P2881 -uroot@mysql001 -p****** -A
    
  2. Create a test database.

    Here is an example:

    Create a test database named test_sql_file_db.

    CREATE DATABASE test_sql_file_db;
    
  3. Create a test table.

    Here is an example:

    Create a test table named test_sql_file_db.test_sql_file_tbl1.

    CREATE TABLE test_sql_file_db.test_sql_file_tbl1 (id INT, name VARCHAR(50), email VARCHAR(50));
    

Step 2: Create a shell script

Use a text editor to create a shell script file with the .sh extension. In the shell script, use the output redirection symbol (> or >>) to write the generated test data into an SQL script file. During the loop or traversal, the generated data is written as SQL statements (INSERT) to the SQL script file.

  1. Open the terminal.

  2. Create a shell script file.

    Use the vi or vim editor to create a new shell script file.

    Here is an example:

    Write a shell script named generate_sql.sh.

    vi generate_sql.sh
    
  3. Enter the insert mode.

    Press the i or Insert key to enter the insert mode of vi or vim to edit the content of the file.

  4. Write the shell script logic.

    In insert mode, write the logic and commands for the shell script. These commands may be shell commands, conditional statements, loops, and functions.

    Here is an example:

    Content of the generate_sql.sh script is as follows:

    #!/bin/bash
    
    # Name the SQL file
    SQL_FILE="insert_test_sql_file_tbl1.sql"
    
    # Create the SQL file
    touch $SQL_FILE
    
    # Define the SQL statement
    INSERT_SQL="INSERT INTO test_sql_file_tbl1 (id, name, email) VALUES "
    
    # Generate 100,000 user records in loops
    for ((i=1; i<=100000; i++))
    do
        user_id=$i
        user_name="user_$i"
        user_email="user_$i@example.com"
        values="($user_id, '$user_name', '$user_email')"
        if (($i == 100000))
        then
            INSERT_SQL="$INSERT_SQL$values;"
        else
            INSERT_SQL="$INSERT_SQL$values, "
        fi
    done
    
    # Write SQL statements into the SQL file
    echo $INSERT_SQL >> $SQL_FILE
    

    Note

    • The script generates an SQL file named insert_test_sql_file_tbl1.sql and inserts 100,000 user records into it. You can modify the SQL statement and the number of user records generated as needed.
    • When you insert a large amount of data, pay attention to the resource usage of the relevant server in advance to avoid data insertion failures or performance degradation caused by insufficient resources.

  5. Save the file.

    Press the Esc key to exit the insert mode. Enter the :wq command to save the file and exit the vi or vim editor.

  6. Run the shell script file.

    Run the created shell script in the terminal to generate an SQL script.

    Here is an example:

    Run the created shell script. The following command generates an SQL script file named insert_test_sql_file_tbl1.sql in the current directory. The file contains 100,000 INSERT statements.

    sudo bash generate_sql.sh
    

Step 3: Run the SQL script

You can run the following command in the CLI to import data from the SQL script file.

obclient -h$host -u$user_name -P$port -p$password -D$database_name < $sql_file

The parameters are described as follows:

  • $host: the IP address for connecting to OceanBase Cloud, which is the IP address of an OceanBase Database Proxy (ODP) for connection through ODP, or the IP address of an OBServer node for direct connection.

  • $port: the port for connecting to OceanBase Cloud. For connection through ODP, the default value is 2883, which can be customized when ODP is deployed. For direct connection, the default value is 2881, which can be customized when OceanBase Cloud is deployed.

  • $database_name: the name of the database to be accessed.

    Notice

    The user for connecting to a tenant must have the CREATE, INSERT, and SELECT privileges on the database.

  • $user_name: the tenant account. Two formats are supported in the case of connection through ODP: username@tenant name#cluster name and cluster name:tenant name:username. In the case of direct connection, the format is username@tenant name.

  • $password: the password of the account.

  • $sql_file: the name of the SQL script file.

    Note

    To run the SQL script, use the absolute path of the SQL script file.

Here is an example:

Run the following command to connect to the specified OBServer node and import all INSERT statements in the SQL script file to the test_sql_file_db database to insert 100,000 data records into the insert_test_sql_file_tbl1 table.

obclient -hxxx.xxx.xxx.xxx -uroot@mysql001 -P2881 -p****** -Dtest_sql_file_db < /home/admin/test_data/insert_test_sql_file_tbl1.sql

Step 4: View data

Execute the following SQL statement to view the number of rows in the test_sql_file_db.test_sql_file_tbl1 table.

obclient [(none)]> SELECT count(*) FROM test_sql_file_db.test_sql_file_tbl1;

The return result is as follows:

+----------+
| count(*) |
+----------+
|   100000 |
+----------+
1 row in set

Use a stored procedure to generate test data in batches

You can use a stored procedure to automatically generate test data in batches in an effective manner.

Procedure

  1. Create a test database and a test table.
  2. Create a stored procedure.
  3. Call the stored procedure.
  4. View data.

Step 1: Create a test database and a test table

Use a database management tool, such as a command-line tool or a GUI-based tool, to create a database for storing test data and create a test table in the database.

  1. Connect to the prepared MySQL-compatible tenant.

    Here is an example:

    obclient -hxxx.xxx.xxx.xxx -P2881 -uroot@mysql001 -p****** -A
    
  2. Create a test database.

    Here is an example:

    Create a test database named test_db.

    obclient [(none)]> CREATE DATABASE test_db;
    
  3. Switch to the test_db database.

    obclient [(none)]> use test_db;
    

    The return result is as follows:

    Database changed
    obclient [test_db]>
    
  4. Create a test table.

    Here is an example:

    Create a table named test_pro_tbl1 with four fields:

    • id: an auto-increment integer field defined as the primary key.
    • create_time: a datetime field that indicates the time when data in a row was created. DEFAULT CURRENT_TIMESTAMP is used to set the default value of this field to the current time.
    • name: a character field that supports a maximum of 50 characters.
    • enrollment_date: a date field used to store date data.
    obclient [test_db]> CREATE TABLE test_pro_tbl1 (
      id INT NOT NULL AUTO_INCREMENT,
      create_time DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
      name VARCHAR(50),
      enrollment_date DATE,
      PRIMARY KEY(id));
    

Step 2: Create a stored procedure

  1. Specify a custom delimiter.

    Here is an example:

    Use DELIMITER to specify the custom delimiter //.

    DELIMITER //
    
  2. Create a stored procedure.

    Here is an example:

    Execute the following SQL statement to create a stored procedure named pro_generate_data. The input parameter is n, which specifies the number of data records to insert. Use loop statements and INSERT statements to generate and insert data. Here, test_pro_tbl1 is the name of the table into which the data is inserted, name and enrollment_date are the fields into which the data is inserted, and i is the loop counter. The CONCAT function is used to generate a name, and the DATE_ADD function is used to generate a date.

    CREATE PROCEDURE pro_generate_data(IN n INT)
    BEGIN
    DECLARE i INT DEFAULT 1;
    WHILE i <= n DO
        INSERT INTO test_pro_tbl1 (name, enrollment_date) VALUES (CONCAT('Name', i), DATE_ADD('2022-01-01', INTERVAL i DAY));
        SET i = i + 1;
    END WHILE;
    END;
    //
    
  3. Restore to the default delimiter, which is semicolon (;).

    DELIMITER ;
    

Step 3: Call the stored procedure

Use the CALL statement to call the stored procedure to execute the logic that generates the test data. You can pass a parameter to the stored procedure to specify the amount of data to be generated.

Here is an example:

Execute the following SQL statement to call the stored procedure pro_generate_data and pass the parameter value 100,000, specifying to insert 100,000 data records.

obclient [test_db]> CALL pro_generate_data(100000);

Note

You can increase or decrease the value of the input parameter to control the amount of test data. When adjusting the parameter value, take note of the database performance and storage space to avoid database crashes or storage space insufficiency.

Step 4: View data

Execute the following SQL statement to view the number of rows in the test_pro_tbl1 table.

obclient [test_db]> SELECT count(*) FROM test_pro_tbl1;

The return result is as follows:

+----------+
| count(*) |
+----------+
|   100000 |
+----------+
1 row in set

Use ODC to generate test data in batches

ODC is an enterprise-level database development platform tailored for OceanBase Cloud.

ODC provides the data mocking feature that can generate data based on field types in a table. This can meet your requirement for generating a large amount of data during database performance tests or feature verification.

Contact Us