If the imported data contains bad record or discard record errors, OBLOADER V4.2.4 or later allows you to control the impact of such dirty data on the exit status of the process. You can manually fix such errors based on the generated error file.
This topic describes the mechanism for handling errors that occur when you use OBLOADER to import data. It also describes the optional parameters.
Error type
Valid errors
Data errors
| Error type | Description |
|---|---|
| Data type mismatch | For example, VARCHAR data is inserted into an INT column. |
| Invalid NULL value | A NOT NULL constraint is set. |
| Data type overflow | A string or INT type value exceeds the limit. |
| Data preprocessing failure | A call to a preprocessing function defined in the control file failed. |
| Column count mismatch | The number of columns in the file does not match the number of columns in the database. |
| Duplicate primary key or unique key | The primary key or unique key in the table is duplicated. |
Note
- When a data error occurs during data import, the import task cannot be retried.
- You can view details of errors including data type mismatch, invalid NULL values, data type overflow, data preprocessing failure, and column count mismatch in the
ob-loader-dumper.badfile in the {ob-loader-dumper}/logs directory, and details of errors that involve a duplicate primary or unique key in theob-loader-dumper.discardfile in the {ob-loader-dumper}/logs directory.
Environment errors
| Error type | Description |
|---|---|
| Network errors |
|
| Database system errors |
|
Note
- When an environment error occurs during data import, a single import task can be retried at most five times. The number of retries cannot be modified. After an import task reaches the retry threshold, the import task is marked as failed and the process is terminated.
- If an environment error occurs when a table without a primary key or unique key is imported, the import task will not be retried and the current subtask is marked as failed to avoid reinserting the same batch of data.
Errors that cannot be handled
| Error type | Description |
|---|---|
| Original file inaccessible | For example, the file is damaged or the disk access privilege is not granted. |
| Grammar errors in the original file | For example, a CSV file has no delimiter or an SQL file has no line break. |
| Unable to connect to the database | For example, the network between the client and server is disconnected or a connection parameter is incorrect. |
| OS errors | For example, the JVM OOM error is returned or the process is unexpectedly terminated. |
Note
When an error that cannot be handled occurs during data import, the import task cannot be retried and is marked as failed, and the process is terminated.
Best practices
In this example, the imported data has a known error. To fix this error, perform the following steps:
Prepare data.
Sample code:
DROP TABLE IF EXISTS `example`; CREATE TABLE `example` (c1 TINYINT PRIMARY KEY, c2 VARCHAR(12) NOT NULL UNIQUE); INSERT INTO `example` (c1, c2) VALUES (0, NULL), -- The column cannot be empty. (1, 'one'), (2, 'two'), (40, 'forty'), (50, 'fifty-four'), (77, 'seventy-seven'), -- The string exceeds 12 characters in length. (600, 'six hundred'), -- The number is beyond the range supported for the TINYINT data type. (40, 'forty'), -- Conflicts with '40' (42, 'fifty-four'); -- Conflicts with `'fifty-four'`Use OBLOADER to import data and specify the
--strictoption to skip errors and continue.Sample code:
$./obloader -h xx.x.x.x -P 2883 -u test -p ****** --sys-user **u*** --sys-password ****** -c cluster_a -t mysql -D USERA --csv --table 'example' -f /output --strictThe default value of the
--strictoption istrue, which indicates that the tool exits at the failed state (System exit 1) when the imported data containsbad recordordiscard recorderrors. To prevent dirty data from affecting the exit state (System exit 0) of the tool, you must set the--strictoption tofalse. The--strictoption can be used in combination with the--max-discardsor--max-errorsoption to specify that when the amount of duplicate data or the number of errors is within the specified range, the tool will skip the error and continue.Run OBLOADER, delete the error data, and skip four errors.
Sample result:
... 2023-07-11 06:25:22 [WARN] Bad records were found in file: "/home/admin/obloaderobdumper/output/data/USERA/TABLE/example.1.0.csv". Check "ob-loader-dumper.bad" for details 2023-07-11 06:25:22 [WARN] Skipped 4 problematic records for table `USERA`.`example`. Check "logs" folder for details 2023-07-11 06:25:24 [INFO] Drain and halt the worker group finished 2023-07-11 06:25:24 [INFO] Close connections: 26 of the BIZ DataSource. 2023-07-11 06:25:24 [INFO] Shutdown task context finished 2023-07-11 06:25:24 [ERROR] Error: Skipped 4 problematic records for table `USERA`.`example`. Check "logs" folder for details 2023-07-11 06:25:24 [INFO] ---------- Finished Tasks: 1 Running Tasks: 0 Progress: 100.00% ---------- 2023-07-11 06:25:24 [INFO] All Load Tasks Finished: ---------------------------------------------------------- | No.# | Type | Name | Count | Status | ---------------------------------------------------------- | 1 | TABLE | example | 9 -> 5 | FAILURE | ---------------------------------------------------------- Total Count: 5 End Time: 2023-07-11 06:25:24 2023-07-11 06:25:24 [INFO] Load record finished. Total Elapsed: 3.281 s 2023-07-11 06:25:24 [ERROR] System exit 1 ...Example of data to be imported:
... SELECT * FROM `example`; +----+------------+ | c1 | c2 | +----+------------+ | 0 | NULL | | 1 | one | | 2 | two | | 40 | forty | | 50 | fifty-four | +----+------------+ 5 rows in set (0.05 sec) ...View error details.
Note
- View details of data type mismatch and data type overflow errors in the
{ob-loader-dumper}/logs/ob-loader-dumper.badfile. - View details of duplicate primary key or unique key errors in the
{ob-loader-dumper}/logs/ob-loader-dumper.discardfile.
... cd /output/logs cat ob-loader-dumper.bad INSERT INTO example (`c1`,`c2`) VALUES ('77','seventy-seven'); Cause: Data too long for column 'c2' at row 1 INSERT INTO example (`c1`,`c2`) VALUES ('600','six hundred'); Cause: Out of range value for column cat ob-loader-dumper.discard INSERT INTO example (`c1`,`c2`) VALUES ('40','forty'); INSERT INTO example (`c1`,`c2`) VALUES ('42','fifty-four'); ...- View details of data type mismatch and data type overflow errors in the
Manually fix such errors based on the error messages.