OBDUMPER can preprocess data to be exported based on predefined control files.
File template
Here is a template for defining a control file:
lang=java
(
Column name Offset position in bytes (optional) "Preprocessing function",
Column name Offset position in bytes (optional) "Preprocessing function",
Column name Offset position in bytes (optional) "Preprocessing function",
);
Column name: the name of a field in the schema of a database table. In OBDUMPER, column names are case-insensitive. To enable case sensitivity, enclose the column names in square brackets ([ ]) or backticks (`). For example,
[c1]indicates Column c1, and[C1]indicates Column C1.Offset position in bytes: Only data in the
POSformat supports defining offset positions in bytes.Relative offset: It is defined in the format of
position(length), wherelengthspecifies the field length in bytes. You can declare a relative offset by using thePOSITIONkeyword to specify a column length, thereby exporting a segment of bytes from the database table. Here is an example:id position(2), gender position(7)In the example,
id position(2)specifies to export the first and second bytes, andgender position(7)specifies to export the third to the ninth bytes.Preprocessing function: You can configure a preprocessing function for a specified column in the control file to preprocess data before the data is exported.
Notice
A control file is named in the format of <table name>.ctrl.
Considerations
If the column name defined in the table contains database keywords, you must add a backtick (`) both before and after the column name to escape the column name. Here is an example:
lang=java
(
`WHEN` "lower(`WHEN`)",
c2 "nanvl(c2,'0')"
);
Examples
Export data in the CUT format. Here is an example:
lang=java
(
c1 "lower(c1)", -- Convert letters in values of Column c1 to lowercase.
c2 "ltrim(c2)", -- Truncate leading spaces in values of Column c2.
c3 "rtrim(c3)", -- Truncate trailing spaces in values of Column c3.
c4 "substr(c4,0,5)", -- Extract a substring of five characters from values of Column c4. The extraction starts from the first byte of each value.
c5 "trim(c5)", -- Truncate leading and trailing spaces in values of Column c5.
c6 "upper(c6)", -- Convert letters in values of Column c6 to uppercase.
c7 "nanvl(c7,'0')", -- Verify values of Column c7 and return 0 for non-numeric values.
c8 "replace(c8,'a','A')", -- Replace Letter 'a' in values of Column c8 with Letter 'A'.
c9 "nvl(c9,'nill')", -- Verify whether values of Column c9 are null and return nill for null values.
c10 "to_timestamp(c10,'yyyyMMddHHmmssSSS')", -- Convert values of Column c10 to the yyyy-MM-dd HH:mm:ss.SSS format. Return null if formatting fails.
c11 "length(c11)", -- Calculate the length of values of Column c11.
c12 "lpad(c12,5,'x')", -- Append a string of five 'x' to the left of values of Column c12.
c13 "rpad(c13,5,'x')", -- Append a string of five 'x' to the right of values of Column c13.
c14 "convert(c14,'utf-8','gbk')", -- Convert the character set of values of Column c14 from GBK to UTF-8.
c15 "concat(c15, '_suffix')", -- Concatenate values of Column c15 with a specific constant.
c16 "none", -- Specify not to process values of Column c16. Return the values in the corresponding column directly.
c17 "systimestamp", -- Specify not to process values of Column c17. Return the timestamp of the current cluster directly.
c18 "constant('1')", -- Specify not to process values of Column c18. Return a constant 1.
c19 "tmsfmt(c19,'yyyyMMddHHmmssSSS','20210310000000000','yyyyMMddHHmmssSSS')", -- Verify the dates of values of Column c19. If the verification fails, return the default value.
c20 "lpadb(c20,5,'x')", -- Append five single-byte 'x' to the left of values of Column c20.
c21 "rpadb(c21,5,'x')", -- Append five single-byte 'x' to the right of values of Column c21.
c22 "case when length(trim(c22))<18 then 'Y' else 'N' end", - Check whether the values of Column c22 meet the specified condition. If yes, return the values in the corresponding column.
c23 "case length(trim(c23)) when '1' then 'one' when '2' then 'two' else 'unknown' end", -- Check whether the values of Column c23 are equal to the specified value. If yes, return the values in the corresponding column.
c24 "sysdate", -- Set values of Column c24 to the current date.
c25 "sequence(100,1)" -- Generate an incremental sequence for Column c25, with 100 as the start value and 1 as the increment step.
c26 "reverse(c26)", -- Reverse characters in the values of Column c26.
c27 "mask("c27", 'A','a','b')", -- Mask the values of Column c27 by replacing uppercase letters with A, lowercase letters with a, and numbers with b.
c28 "mask_first_n(c28, 'A','a','b', 5)", -- Mask the first five characters in the values of Column c28 by replacing uppercase letters with A, lowercase letters with a, and numbers with b.
c29 "mask_last_n(c29, 'A','a','b', 5)", -- Mask the last five characters in the values of Column c29 by replacing uppercase letters with A, lowercase letters with a, and numbers with b.
c30 "mask_show_first_n(c30, 'A','a','b', 5)", -- Mask all characters except the first five characters in the values of Column c30 by replacing uppercase letters with A, lowercase letters with a, and numbers with b.
c31 "mask_show_last_n(c31, 'A','a','b', 5)" -- Mask all characters except the last five characters in the values of Column c31 by replacing uppercase letters with A, lowercase letters with a, and numbers with b.
)
Note
For more information about the functions, see the function list in Preprocessing functions.