You can define rules in a control file by which OBLOADER can preprocess data before the data is imported.
File template
Control file template (field_position indicates the position in the data file):
lang=java
(
Column name Position of byte offsets (optional) "Preprocessing function" (optional) Mapping definition (optional),
Column name Position of byte offsets (optional) "Preprocessing function" (optional) Mapping definition (optional),
Column name Position of byte offsets (optional) "Preprocessing function" (optional) Mapping definition (optional),
Column name Position of byte offsets (optional) "Preprocessing function" (optional) Mapping definition (optional)
);
Sample control file (field_position indicates the position in the data file):
lang=java server=mysql|oracle (
c1 "nvl(c1, 'not null')" map(field_position),
c2 "none" map(field_position)
);
The name of the control file is table name. ctrl . The order of columns defined in the control file must be consistent with that defined in the table. If you import data in the POS format, you must define the positions of byte offsets. If import data in the CUT format, you do not need to define the positions of byte offsets. For more information about preprocessing functions, see Preprocessing functions.
Usage notes
If the column name defined in the table contains database keywords, you must add a grave accent (`````) both before and after the column name to escape the column name.
Example:
lang=java ( `WHEN` "lower(`WHEN`)", c2 "nanvl(c2,'0')" );In OBLOADER V3.x, the field mapping feature of map files is integrated into control files. Map files are no longer supported.
Example:
-- The table name is test, and the control file name is test.ctrl. Configurations: lang=java( c1 map(1), c2 map(2) );
Examples
Sample control file in the
--posformat:lang=java ( c1 position(1:5) "trim(c1)", -- Extract the first to fifth bytes of characters from values in Column c1 and truncate the leading and trailing spaces of the result. c2 position(7:25) "replace(trim(c2),'',' ')" -- Extract the 7th to 25th bytes of characters from values in Column c2, truncate the leading and trailing spaces of the result, and replace empty characters with spaces. );Sample control file in the
--cutformat:lang=java ( c1 "lower(c1)", -- Convert letters of values in Column c1 to lowercase. c2 "ltrim(c2)", -- Truncate leading spaces of values in Column c2. c3 "rtrim(c3)", -- Truncate trailing spaces of values in Column c3. c4 "substr(c4,0,5)", -- Extract a substring of five characters from values in Column c4. The extraction starts from the first byte of each value. c5 "trim(c5)", -- Truncate leading and trailing spaces of values in Column c5. c6 "upper(c6)", -- Convert letters of values in Column c6 to uppercase. c7 "nanvl(c7,'0')", -- Verify values in Column c7 and return 0 for non-numeric values. c8 "replace(c8,'a','A')", -- Replace Letter 'a' of values in Column c8 with Letter 'A'. c9 "nvl(c9,'nill')", -- Verify whether values in Column c9 are null and return nill for null values. c10 "to_timestamp(c10,'yyyyMMddHHmmssSSS')", -- Convert values in Column c10 to the yyyy-MM-dd HH:mm:ss.SSS format. Return null if formatting fails. c11 "length(c11)", -- Calculate the length of values in Column c11. c12 "lpad(c12,5,'x')", -- Append a string of five 'x' to the left of values in Column c12. c13 "rpad(c13,5,'x')", -- Append a string of five 'x' to the right of values in Column c13. c14 "convert(c14,'utf-8','gbk')", -- Convert the encoding of values in Column c14 from GBK to UTF-8. c15 "concat(c15, '_suffix')", -- Concatenate values in Column c15 with a specific constant. c16 "none", -- Do not process values in Column c16. c17 "systimestamp", -- Do not process values in Column c17. c18 "constant('1')", -- Do not process values in Column c18. Return a constant 1. c19 "groovy('if(value==null){return "nill";}else{return value.toUpperCase();}')", -- Pass the values in Column c19 to a dynamic function for processing. c20 "tmsfmt(c20,'yyyyMMddHHmmssSSS','20210310000000000','yyyyMMddHHmmssSSS')", -- Verify the dates of values in Column c20. If the verification fails, return the default value. c21 "lpadb(c21,5,'x')", -- Append five single-byte 'x' to the left of values in Column c21. c22 "rpadb(c22,5,'x')", -- Append five single-byte 'x' to the right of values in Column c22. c23 "case when length(trim(c23))<18 then 'Y' else 'N' end", -- Verify whether values in Column c23 match the specified condition. If yes, return 'Y'. c24 "case length(trim(c24)) when '1' then 'one' when '2' then 'two' else 'unknown' end", -- Verify whether values in Column c24 are equal to the specified value. If yes, return the corresponding value. )Sample control file for data cleansing:
-- The table name is test, and the control file name is test.ctrl. Configurations: lang=java ( c1 "nvl(c1, 'replacement')", -- This is equivalent to --null-replacer 'replacement'. c2 "replace(c2,'','replacement')", -- This is equivalent to --empty-replacer 'replacement'. c3 "tmsfmt(c3,'yyyyMMddHHmmsssSSSS','19700101000000000')" -- This is equivalent to --date-format 'yyyyMMddHHmmsssSSSS' --default-date '19700101000000000'. );Notice
The following parameters are not supported by OBLOADER V3.0.0 and later:
--null-pointer,--empty-replacer,--default-date, and--date-format.