You can configure a preprocessing function for each column when you define a control file.
Considerations
For a preprocessing function configured for a column, only the values of the current column can be used as formal parameters. For example,
c14 "concat(c15, '_suffix')"is not supported, andc15 "concat(c15, '_suffix')"is supported.LPADB()andRPADB()are suitable for processing characters such as letters and numbers. It may be truncated when you use it to process emojis.REPLACE()is used differently in Oracle and MySQL modes. It has the syntax that is similar to an Oracle function but is implemented like a MySQL function.NVL()refers to the corresponding built-in function in MySQL. It distinguishes an empty character fromNULLin implementation.For preprocessing functions related to date and time, such as
SYSTIMESTAMP,TMSFMT(), andTO_TIMESTAMP(), the value is accurate only to milliseconds. In addition, the server clock must be the same as that of the database.LPAD(char,length[,pad_string])andRPAD(char,length[,pad_string])refer to the corresponding built-in functions of MySQL. The length parameter indicates the total length of the return value that is displayed on the client. In most cases, it displays the actual number of characters in the return value. However, for some multibyte character sets, the displayed length of a string may differ from the actual length. Therefore, we recommend that you do not use this function to process multibyte values.
Functions
The following table describes the preprocessing functions supported in the current version.
| Function signature | Return type | Description |
|---|---|---|
| LOWER(char) | String | Converts letters in a parameter value into lowercase letters. The parameter value can be a column name, constant, or function expression. If the parameter value is a constant, quote it with single quotation marks (' '). Example: LOWER('A'). |
| LTRIM(char[,set]) | String | Matches and truncates a parameter value from left to right. The value of the char parameter can be a column name, a constant, or a function expression. The value of the set parameter is a constant. By default, if the set parameter is omitted, its value is replaced with a space. Example: LTRIM(' abc '). |
| RTRIM(char[,set]) | String | Matches and truncates a parameter value from right to left. The value of the char parameter can be a column name, constant, or function expression. The value of the set parameter is a constant. By default, if the set parameter is omitted, its value is replaced with a space. Example: RTRIM(' abc '). |
| SUBSTR(char,position[,length]) | String | Truncates a parameter value based on the specified start position and length. The value of the char parameter can be a column name, constant, or function expression. The values of the position and length parameters are numeric constants. Example: SUBSTR('abc',0,3). |
| TRIM(char) | String | Truncates leading and trailing spaces of a parameter value. The value of the char parameter can be a column name, a constant, or a function expression. Example: TRIM(' abc '). |
| REVERSE(char) | String | Reverses parameter values. For example, REVERSE("C1") reverses characters in each value of the C1 column. |
| UPPER(char) | String | Converts letters in a parameter value into uppercase letters. The value of the char parameter can be a column name, constant, or function expression. If the parameter value is a constant, quote it with single quotation marks (' '). Example: UPPER('a'). |
| NANVL(char, default) | String | Verifies whether a parameter value is a number. If not, the default value is returned. The value of the char parameter can be a column name, a constant, or a function expression. The value of the default parameter is a constant. Example: NANVL('1','1'). |
| REPLACE(char,search[,replacement]) | String | Replaces a parameter value based on the specified search condition. The value of the char parameter can be a column name, constant, or function expression. The values of the search and replacement parameters are constants. Example: REPLACE('abc','a','A'). |
| NVL(char, default) | String | Verifies whether a parameter value is left empty. If yes, the default value is returned. The value of the char parameter can be a column name, a constant, or a function expression. The value of the default parameter is a constant. Example: NVL('a','--'). |
| TO_TIMESTAMP(char,fmt1[,fmt2]) | String | Converts a parameter value into a date. If the conversion fails, NULL is returned. The value of the char parameter can be a column name, constant, or function expression. The value of the fmt1 parameter is a date parsing template. The value of the optional fmt2 parameter is a date formatting template. Default format of the return value of fmt2: yyyy-MM-dd HH:mm:ss.SSS. |
| LENGTH(char) | String | Calculates the length of a parameter value. The value of the char parameter can be a column name, a constant, or a function expression. |
| LPAD(char,length[,pad_string]) | String | Appends a character string of the specified length to the left of a parameter value. If the specified length is smaller than the length of the parameter value, the parameter value is truncated. The value of the char parameter can be a column name, a constant, or a function expression. The values of the length and pad_string parameters are constants. In addition, the value of the pad_string parameter must be a single-byte character. Example: LPAD('a',1,'x'). |
| RPAD(char,length[,pad_string]) | String | Appends a character string of the specified length to the right of a parameter value. If the specified length is smaller than the length of the parameter value, the parameter value is truncated. The value of the char parameter can be a column name, a constant, or a function expression. The values of the length and pad_string parameters are constants. In addition, the value of the pad_string parameter must be a single-byte character. Example: RPAD('a',1,'x'). |
| CONVERT(char,charset1[,charset2]) | String | Converts the character set of a parameter value. The value of the char parameter can be a column name, constant, or function expression. The values of the charset1 and charset2 parameters are constants. Example: CONVERT('a','utf-8','gbk'). |
| CONCAT(char1,char2) | String | Concatenates parameter values. The values of the char1 and char2 parameters can be a column name, constant, or function expression. Example: CONCAT('a','b'). |
| NONE | String | Returns a parameter value without processing it. |
| SYSTIMESTAMP | String | Returns the timestamp of the current server without processing the parameter value. Default format: yyyy-MM-dd HH:mm:ss.SSS. |
| SYSDATE | String | Returns the date on the current server without processing the parameter value. Default format: yyyy-MM-dd HH:mm:ss. |
| CONSTANT(char) | String | Returns a defined constant value without processing it. The value of the char parameter is a constant. Example: CONSTANT('1'). |
| TMSFMT(char,fmt1,default,fmt2) | String | Validates the date format of a parameter. If the validation fails, the default date format is returned. Example: TMSFMT(c1,'yyyyMMddHHmmssSSS','2021-03-10 00:00:00.000','yyyy-MM-dd HH:mm:ss.SSS'). |
| LPADB(char,byte_size[,pad_char]) | String | Appends a byte string of the specified length to the left of a parameter value. If the specified length is smaller than the length of the parameter value, the parameter value is truncated. The value of the char parameter can be a column name, a constant, or a function expression. The values of the byte_size and pad_char parameters are constants. Example: LPADB('a',1,'x'). |
| RPADB(char,byte_size[,pad_char]) | String | Appends a byte string of the specified length to the right of a parameter value. If the specified length is smaller than the length of the parameter value, the parameter value is truncated. The value of the char parameter can be a column name, a constant, or a function expression. The values of the byte_size and pad_char parameters are constants. Example: RPADB('a',1,'x'). |
| MASK(string str[, string upper[, string lower[, string number]]]) | String | Masks uppercase letters, lowercase letters, or numbers in a field value. For example, MASK("C1",A,a,b) masks the values of Column C1 by replacing uppercase letters with A, lowercase letters with a, and numbers with b. |
| MASK_FIRST_N(string str[, int n]) | String | Masks the first n characters in a field value. For example, MASK_FIRST_N("C1",'A','a','b',5) masks the first five characters in the values of Column C1 by replacing uppercase letters with A, lowercase letters with a, and numbers with b. |
| MASK_LAST_N(string str[, int n]) | String | Masks the last n characters in a field value. For example, MASK_LAST_N("C1",'A','a','b',5) masks the last five characters in the values of Column C1 by replacing uppercase letters with A, lowercase letters with a, and numbers with b. |
| MASK_SHOW_FIRST_N(string str[, int n]) | String | Masks all characters except the first n characters in a field value. For example, MASK_SHOW_FIRST_N("C1",'A','a','b',5) masks all characters except the first five characters in the values of Column C1 by replacing uppercase letters with A, lowercase letters with a, and numbers with b. |
| MASK_SHOW_LAST_N(string str[, int n]) | String | Masks all characters except the last n characters in a field value. For example, MASK_SHOW_LAST_N("C1",'A','a','b',5) masks all characters except the last five characters in the values of Column C1 by replacing uppercase letters with A, lowercase letters with a, and numbers with b. |
| SEQUENCE(int m,int n) | String | Generates an incremental sequence for a specified column. The parameter m specifies the start value and n specifies the increment step. For example, SEQUENCE(100,1) generates a sequence with a start value of 100 and an increment step of 1. |
| DB_SEQUENCE(string sequence) | String | Assigns sequence values to rows to be imported so that the rows can be imported into the database based on the sequence values. For example, DB_SEQUENCE(seq01) specifies the sequence value seq01.
Note |
| CHR(string) | String | Converts a number represented by a string into an ASCII value.
|
| NCHR(string) | String | Converts a number represented by a string into an NVARCHAR2 value.
|
| ASCII(char) | String | Converts a character into a decimal ASCII value.
|
| RAWTOHEX(string) | String | Converts a raw string into a hexadecimal string. The parameter value can be a column name or a constant. If the parameter value is a constant, quote it with single quotation marks (' '). Examples: RAWTOHEX('ABC') and RAWTOHEX(c1). |
| HEXTORAW(string) | String | Restores a hexadecimal string to a raw string. The parameter value can be a column name or a constant. If the parameter value is a constant, quote it with single quotation marks (' '). Examples: HEXTORAW('414243') and HEXTORAW(c1).
NoteOBLOADER can process hexadecimal data that does not contain |
| REPLACE(char,search[,replacement]) | String | Replaces a parameter value based on the specified search condition. The value of the char parameter can be a column name, a constant, or a function expression. The value of the search parameter can be a constant or a function expression. The value of the replacement parameter must be a constant. Examples: REPLACE('abc','a','A') and REPLACE('abc',CHR('13'),'!'). |
| TRUNC(number[, decimal]) | String | Truncates a number.
NoteThe parameters of the function support any numeric data types and non-numeric data types that can be implicitly converted into numeric data types. |
| sm4_encrypt (plaintext, key [,encoding]) | String | Encrypts the given data by using the SM4 algorithm and returns the encrypted data. The plaintext parameter indicates the original text, key indicates the Base64-encoded ciphertext, and encoding indicates the encoding type of the ciphertext. Supported encoding types are Base64 and HEX. The default encoding type is Base64. Example: c1 "sm4_encrypt(c1, 'myKey')". |
| sm4_decrypt(cipherText, key [,encoding]) | String | Decrypts the encrypted data by using the SM4 algorithm and returns the original data. The ciphertext parameter indicates the ciphertext, key indicates the Base64-encoded ciphertext, and encoding indicates the encoding type. Supported encoding types are Base64 and HEX. The default encoding type is Base64. Example: c1 "sm4_decrypt(c1, 'myKey')". |
| TIMESTAMP_ADD (dateAndTime, addInt, datetimeFieldString,pattern: yyyy-MM-dd HH:mm:ss) | String | Increases or decreases values of the TIMESTAMP type.
TIMESTAMPADD(c1, '1', 'MONTH') specifies to increase values of Column c1 by one month. TIMESTAMPADD(c1, '-2', 'WEEK') specifies to decrease values of Column c1 by two weeks. |
| DATE_ADD (date, addInt, datetimeFieldString,pattern: yyyy-MM-dd) | String | Increases or decreases values of the DATE type.
DATE_ADD(c1, '-1', 'YEAR') specifies to decrease values of Column c1 by one year. |
| DATE_TRUNC (date, datetimeFieldStringpattern: yyyy-MM-dd) | String | Truncates values of the DATE type to the specified field.
DATE_TRUNC(c1, 'MONTH') specifies to truncate values of Column c1 to the month. |
| TIMESTAMP_TRUNC (dateAndTime, datetimeFieldString,pattern: yyyy-MM-dd HH:mm:ss) | String | Truncates values of the TIMESTAMP type to the specified field.
TIMESTAMP_TRUNC(c1, 'HOUR') specifies to truncate values of Column c1 to the hour. |
| DAY_OF_MONTH (dateAndTime,pattern: yyyy-MM-dd HH:mm:ss) | String | Returns the day (1 to 31) in the month for values of the TIMESTAMP type. For example: DAY_OF_MONTH (c1) specifies to return the day in the month for values of Column c1. |
| DAY_OF_WEEK (dateAndTime,pattern: yyyy-MM-dd HH:mm:ss) | String | Returns the day (1 to 7) in the week for values of the TIMESTAMP type. For example, DAY_OF_WEEK (c1) specifies to return the day in the week for values of Column c1. |
| DAY_OF_YEAR(dateAndTime,pattern: yyyy-MM-dd HH:mm:ss) | String | Returns the day (1 to 366) in the year for values of the TIMESTAMP type. For example, DAY_OF_YEAR (c1) specifies to return the day in the year for values of Column c1. |
| HOUR(dateAndTime,pattern: yyyy-MM-dd HH:mm:ss) | String | Returns the hour (0 to 23) for values of the TIMESTAMP type. For example, HOUR (c1) specifies to return the hour for values of Column c1. |
| MNUTE (dateAndTime,pattern: yyyy-MM-dd HH:mm:ss) | String | Returns the minute (1-59) for values of the TIMESTAMP type. For example, MNUTE (c1) specifies to return the minute for values of Column c1. |
| MONTH (dateAndTime,pattern: yyyy-MM-dd HH:mm:ss) | String | Returns the month (1 to 12) for values of the TIMESTAMP type. For example, MONTH (c1) specifies to return the month for values of Column c1. |
| WEEK (dateAndTime,pattern: yyyy-MM-dd HH:mm:ss) | String | Returns the week for values of the TIMESTAMP type, namely, the week (1 to 53) in the year. For example, WEEK (c1) specifies to return the week in the year for values of Column c1. |
| YEAR (dateAndTime,pattern: yyyy-MM-dd HH:mm:ss) | String | Returns the year for values of the TIMESTAMP type. For example, YEAR (c1) specifies to return the year for values of Column c1. |
Notice
The groovy dynamic function is no longer supported in OBLOADER V3.x.