OBDUMPER can preprocess the exported data by using a predefined control file.
File template
The template for defining a control file is as follows:
lang=java
(
${Column name} ${Byte offset} (optional) "${Preprocessing function}",
${Column name} ${Byte offset} (optional) "${Preprocessing function}",
${Column name} ${Byte offset} (optional) "${Preprocessing function}"
);
Column name: the name of a field in the database table structure. OBDUMPER treats column names as case-insensitive. To make a column name case-sensitive, enclose it in square brackets ([ ]) or backticks (` `). For example, [c1] indicates the c1 column, and [C1] indicates the C1 column.
Byte offset: This option is supported only for data in the --pos format.
Relative offset: position(length), where length indicates the length of the field in bytes. You can use the relative offset declaration of the POSITION keyword to specify the column length and export a specific segment of bytes from the database table. Example: id position(2), gender position(7). Here, id position(2) indicates that bytes 1 to 2 are exported, and gender position(7) indicates that bytes 3 to 9 are exported.
Preprocessing function: a function configured in the control file for a specified column to preprocess the exported data.
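The relative offset arithmetic can be checked with a short sketch. This is an illustrative Python helper, not part of OBDUMPER; the function name byte_ranges and the parsing rule are assumptions made for the example. Each position(length) declaration starts where the previous one ended, so the byte ranges follow from a running sum of the lengths.

```python
import re

def byte_ranges(declarations):
    """Map 'name position(length)' declarations to 1-based inclusive byte ranges.

    Hypothetical helper for illustration; it only mirrors the arithmetic
    described for relative offsets and does not call OBDUMPER.
    """
    ranges = {}
    start = 1  # byte positions are counted from 1
    for decl in declarations.split(","):
        match = re.fullmatch(r"\s*(\w+)\s+position\((\d+)\)\s*", decl, flags=re.IGNORECASE)
        if match is None:
            raise ValueError(f"unrecognized declaration: {decl!r}")
        name, length = match.group(1), int(match.group(2))
        ranges[name] = (start, start + length - 1)
        start += length  # the next column begins right after this one
    return ranges

print(byte_ranges("id position(2), gender position(7)"))
# {'id': (1, 2), 'gender': (3, 9)}
```

The output matches the example: id covers bytes 1 to 2, and gender covers bytes 3 to 9.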
Notice
The naming convention for a control file is <table name>.ctrl.
Considerations
If a column name in the table is a database keyword, you must enclose the column name in backticks (`). Example:
lang=java
(
`WHEN` "lower(`WHEN`)",
c2 "nanvl(c2,'0')"
);
A control file cannot take effect together with the --exclude-column-names option. The functionality of the --exclude-column-names option is already included in the control file.
The control file must list the field names in the target database table. Otherwise, the field values in the data file cannot be mapped to the correct fields.
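As a rough sketch of how these rules combine, the following Python snippet generates control file text and a file name that follows the <table name>.ctrl convention. Everything here is hypothetical scaffolding: the KEYWORDS set is a tiny stand-in for the real keyword list of your database, and the function names and the table name t1 are invented for the example.

```python
# Hypothetical keyword set for illustration only; the real list depends on the database.
KEYWORDS = {"WHEN", "SELECT", "ORDER"}

def quote_column(name):
    """Wrap the column name in backticks if it is a (hypothetical) keyword."""
    return f"`{name}`" if name.upper() in KEYWORDS else name

def render_control_file(columns):
    """columns: list of (column name, function template).

    '{col}' in the template is replaced by the (possibly quoted) column name.
    """
    lines = []
    for name, template in columns:
        col = quote_column(name)
        lines.append(f'{col} "{template.format(col=col)}"')
    return "(\n" + ",\n".join(lines) + "\n);\n"

def control_file_name(table):
    """Apply the <table name>.ctrl naming convention."""
    return f"{table}.ctrl"

text = render_control_file([("WHEN", "lower({col})"), ("c2", "nanvl({col},'0')")])
print(control_file_name("t1"))  # t1.ctrl
print(text)
```

The rendered text reproduces the keyword example above, with `WHEN` quoted in both the column position and the function argument.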
Examples
Here is an example of a control file for exporting data in the CUT format:
lang=java
(
c1 "lower(c1)", -- Convert the characters in the c1 column to lowercase.
c2 "ltrim(c2)", -- Truncate leading spaces from the values in the c2 column.
c3 "rtrim(c3)", -- Truncate trailing spaces from the values in the c3 column.
c4 "substr(c4,0,5)", -- Extract a string of 5 characters from the first position in the values of the c4 column.
c5 "trim(c5)", -- Truncate leading and trailing spaces from the values in the c5 column.
c6 "upper(c6)", -- Convert the characters in the c6 column to uppercase.
c7 "nanvl(c7,'0')", -- Verify if the values in the c7 column are numeric. If not, return 0.
c8 "replace(c8,'a','A')", -- Replace 'a' with 'A' in the values of the c8 column.
c9 "nvl(c9,'nill')", -- Check if the values in the c9 column are null. If so, return the string 'nill'.
c10 "to_timestamp(c10,'yyyyMMddHHmmssSSS')", -- Format the values in the c10 column. If the formatting fails, return null. Otherwise, return a timestamp in the format of yyyy-MM-dd HH:mm:ss.SSS.
c11 "length(c11)", -- Calculate the length of the values in the c11 column.
c12 "lpad(c12,5,'x')", -- Add a 5-character string 'x' to the left of the values in the c12 column.
c13 "rpad(c13,5,'x')", -- Add a 5-character string 'x' to the right of the values in the c13 column.
c14 "convert(c14,'utf-8','gbk')", -- Convert the character encoding of the values in the c14 column from gbk to utf-8.
c15 "concat(c15, '_suffix')", -- Concatenate the values in the c15 column with the constant '_suffix'.
c16 "none", -- Return the values in the c16 column without any processing.
c17 "systimestamp", -- Return the current timestamp of the cluster for the values in the c17 column.
c18 "constant('1')", -- Return the constant value 1 for the values in the c18 column.
c19 "tmsfmt(c19,'yyyyMMddHHmmssSSS','20210310000000000','yyyyMMddHHmmssSSS')", -- Verify the date in the values of the c19 column. If the verification fails, return the default value.
c20 "lpadb(c20,5,'x')", -- Add five single-byte 'x' characters to the left side of the values in the c20 column.
c21 "rpadb(c21,5,'x')", -- Add five single-byte 'x' characters to the right side of the values in the c21 column.
c22 "case when length(trim(c22))<18 then 'Y' else 'N' end", -- Check if the length of the values in the c22 column (after trimming) is less than 18. If so, return 'Y'; otherwise, return 'N'.
c23 "case length(trim(c23)) when '1' then 'one' when '2' then 'two' else 'unknown' end", -- Check if the length of the values in the c23 column (after trimming) is equal to 1 or 2. If so, return 'one' or 'two' respectively; otherwise, return 'unknown'.
c24 "sysdate", -- Return the current date for the values in the c24 column.
c25 "sequence(100,1)", -- Generate an incremental sequence value for the c25 column. The initial value is 100 and the increment is 1.
c26 "reverse(c26)", -- Reverse the values in the c26 column.
c27 "mask(c27,'A','a','b')", -- Convert uppercase letters in the c27 column to 'A', lowercase letters to 'a', and digits to 'b'.
c28 "mask_first_n(c28,'A','a','b',5)", -- Convert the first 5 characters of the c28 column: uppercase letters to 'A', lowercase letters to 'a', and digits to 'b'.
c29 "mask_last_n(c29,'A','a','b',5)", -- Convert the last 5 characters of the c29 column: uppercase letters to 'A', lowercase letters to 'a', and digits to 'b'.
c30 "mask_show_first_n(c30,'A','a','b',5)", -- Convert all characters except the first 5 of the c30 column: uppercase letters to 'A', lowercase letters to 'a', and digits to 'b'.
c31 "mask_show_last_n(c31,'A','a','b',5)" -- Convert all characters except the last 5 of the c31 column: uppercase letters to 'A', lowercase letters to 'a', and digits to 'b'.
);
Note
For more information, see the list of preprocessing functions.
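To make the behavior of the mask family concrete, here is a minimal Python emulation based only on the comments in the example above. It is a sketch, not OBDUMPER's implementation: the Python function bodies are assumptions, and leaving characters that are neither letters nor digits unchanged is also an assumption.

```python
def mask(value, upper, lower, digit):
    """Replace uppercase letters, lowercase letters, and digits with the given characters.

    Assumption: any other character is left unchanged.
    """
    out = []
    for ch in value:
        if ch.isupper():
            out.append(upper)
        elif ch.islower():
            out.append(lower)
        elif ch.isdigit():
            out.append(digit)
        else:
            out.append(ch)
    return "".join(out)

def mask_first_n(value, upper, lower, digit, n):
    """Mask only the first n characters."""
    return mask(value[:n], upper, lower, digit) + value[n:]

def mask_last_n(value, upper, lower, digit, n):
    """Mask only the last n characters."""
    split = max(len(value) - n, 0)
    return value[:split] + mask(value[split:], upper, lower, digit)

def mask_show_first_n(value, upper, lower, digit, n):
    """Mask everything except the first n characters."""
    return value[:n] + mask(value[n:], upper, lower, digit)

def mask_show_last_n(value, upper, lower, digit, n):
    """Mask everything except the last n characters."""
    split = max(len(value) - n, 0)
    return mask(value[:split], upper, lower, digit) + value[split:]

sample = "Abc123xyZ"  # made-up input value
print(mask(sample, "A", "a", "b"))                  # AaabbbaaA
print(mask_first_n(sample, "A", "a", "b", 5))       # Aaabb3xyZ
print(mask_last_n(sample, "A", "a", "b", 5))        # Abc1bbaaA
print(mask_show_first_n(sample, "A", "a", "b", 5))  # Abc12baaA
print(mask_show_last_n(sample, "A", "a", "b", 5))   # Aaab23xyZ
```

Consult the list of preprocessing functions for the authoritative semantics, especially for multi-byte characters and NULL values, which this sketch does not cover.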