You can define rules in a control file by which OBLOADER can preprocess data before the data is imported.
File template
Template for defining a control file:
lang=java
(
Column name Position of byte offsets (optional) "Preprocessing function" (optional) Mapping definition (optional),
Column name Position of byte offsets (optional) "Preprocessing function" (optional) Mapping definition (optional),
Column name Position of byte offsets (optional) "Preprocessing function" (optional) Mapping definition (optional),
Column name Position of byte offsets (optional) "Preprocessing function" (optional) Mapping definition (optional)
);
Currently, the control file is designed as follows:
lang=java
server=mysql|oracle (
c1 "nvl(c1, 'not null')" map(field_position),
c2 "none" map(field_position)
);
Notice
field_positionspecifies the column position of preprocessed data in the imported data file.- A control file is named in the format of
- The order of columns defined in the control file must be consistent with that defined in the table. Column names defined in the control file must be in the same letter case as those in the table.
- If you import data in the
POSformat, you must define the positions of byte offsets.- If you import data in the
CUT,CSV, orSQLformat, you do not need to define the positions of byte offsets.- For more information about preprocessing functions, see Preprocessing functions.
Usage notes
If a column name defined in the table contains database keywords, you must add a grave accent (`) both before and after the column name to escape the column name. Example:
lang=java ( `WHEN` "lower(`WHEN`)", c2 "nanvl(c2,'0')" );In OBLOADER V3.x, the field mapping feature of map files is integrated into control files. Map files are no longer supported. Example:
lang=java( C1 "NONE" map(1), C2 "NONE" map(1), // Multi-column mapping C3 "CONSTANT('xxx')", // xxx is a specified constant. C4 "SEQUENCE(100, 1)" // 100 is the first sequence number, and 1 is the increment. );Note
OBLOADER versions earlier than V3.0.0 support only single-column mapping and do not support generated column mapping. OBLOADER V3.0.0 and later versions allow you to map multiple columns and generated columns in control files. Generated columns are columns that do not exist in a data file.Examples
.ctrl.
Import data in the
POSformat. Example:lang=java ( c1 position(1:5) "trim(c1)", -- Extract the first to fifth bytes of characters from the values in Column c1 and truncate the leading and trailing spaces of the result. c2 position(7:25) "replace(trim(c2),'',' ')" -- Extract the seventh to twenty-fifth bytes of characters from the values in Column c2, truncate the leading and trailing spaces of the result, and replace the empty characters with spaces. );Import data in the
CUT,CSV, andSQLformats. Example:lang=java ( c1 "lower(c1)", -- Convert the letters of the values in Column c1 to lowercase. c2 "ltrim(c2)", -- Truncate the leading spaces of the values in Column c2. c3 "rtrim(c3)", -- Truncate the trailing spaces of the values in Column c3. c4 "substr(c4,0,5)", -- Extract a substring of five characters from the values in Column c4. The extraction starts from the first byte of each value. c5 "trim(c5)", -- Truncate the leading and trailing spaces of the values in Column c5. c6 "upper(c6)", -- Convert the letters of the values in Column c6 to uppercase. c7 "nanvl(c7,'0')", -- Verify the values in Column c7 and return 0 for non-numeric values. c8 "replace(c8,'a','A')", -- Replace Letter 'a' of values in Column c8 with Letter 'A'. c9 "nvl(c9,'nill')", -- Verify whether the values in Column c9 are null and return nill for null values. c10 "to_timestamp(c10,'yyyyMMddHHmmssSSS')", -- Convert the values in Column c10 to the yyyy-MM-dd HH:mm:ss.SSS format, and return null if formatting fails. c11 "length(c11)", -- Calculate the length of the values in Column c11. c12 "lpad(c12,5,'x')", -- Append a string of five 'x' to the left of the values in Column c12. c13 "rpad(c13,5,'x')", -- Append a string of five 'x' to the right of the values in Column c13. c14 "convert(c14,'utf-8','gbk')", -- Convert the character set of the values in Column c14 from GBK to UTF-8. c15 "concat(c15, '_suffix')", -- Concatenate the values in Column c15 with a specific constant. c16 "none", -- Do not process the values in Column c16. Return the values in the corresponding column directly. c17 "systimestamp", -- Do not process the values in Column c17. Return the timestamp of the current cluster directly. c18 "constant('1')", -- Do not process the values in Column c18. Return a constant 1. c19 "tmsfmt(c19,'yyyyMMddHHmmssSSS','20210310000000000','yyyyMMddHHmmssSSS')", -- Verify the dates of the values in Column c19. If the verification fails, return the default value. c20 "lpadb(c20,5,'x')", -- Append five single-byte 'x' to the left of the values in Column c20. c21 "rpadb(c21,5,'x')", -- Append five single-byte 'x' to the right of the values in Column c21. c22 "case when length(trim(c22))<18 then 'Y' else 'N' end", -- Check whether the values in Column c22 meet the specified condition. If yes, return the values in the corresponding column. c23 "case length(trim(c23)) when '1' then 'one' when '2' then 'two' else 'unknown' end", -- Check whether the values in Column c23 are equal to the specified value. If yes, return the values in the corresponding column. C24 "SYSDATE", -- The values in Column c24 are the current date. C25 "MASK(C25)", -- Anonymize the values in Column c25 by replacing the letters and numbers in the values with default anonymous characters (X for uppercase letters, x for lowercase letters, and n for numbers). You can specify only column names in MASK(). C26 "MASK_FIRST_N(C26,'A','a','b',3)", -- Anonymize the specified letters and numbers (counted forward from the position specified by N) for the values in Column 26. N is 0 by default, indicating counting from the first character in each value. C27 "MASK_LAST_N(C27,'A','a','b',3)", -- Anonymize the specified letters and numbers (counted backward from the position specified by N) for the values in Column 27. N is 0 by default, indicating counting from the last character in each value. C28 "MASK_SHOW_FIRST_N(C28,'A','a','b',3)", -- Anonymize the values in Column 28 except the specified letters and numbers (counted forward from the position specified by N) in the values. N is 0 by default, indicating counting from the first character in each value. C29 "MASK_SHOW_LAST_N(C29,'A','a','b',3)", -- Anonymize the values in Column 29 except the specified letters and numbers (counted backward from the position specified by N) in the values. N is 0 by default, indicating counting from the last character in each value. C30 "REVERSE(C30)", -- Reverse characters for the values in Column c30. )Notice
OBLOADER V3.0.0 and later versions do not support the--null-replacer,--empty-replacer,--default-date, or--date-formatoption.