You can configure a preprocess function for each column defined in a control file.
Notes
For a preprocess function defined for a column, only the values of the current column can be used as formal parameters. For example,
c14 "concat(c15, '_suffix')"is not supported. Onlyc15 "concat(c15, '_suffix')"is supported.GROOVY()dynamically generates its signaturepublic String handleVirtualMethod$GroovyUUID(String value), so you can only use the fixed single input parametervalueto define the function.LPADB()andRPADB()are suitable for processing characters such as letters and numbers. It may be truncated when you use it to process Emoji.REPLACE()is used differently in Oracle and MySQL modes. It has the syntax that is similar to an Oracle function but is implemented like a MySQL function.NVL()refers to the corresponding built-in function in MySQL. It distinguishes an empty character fromNULLin implementation.For preprocess functions related to date and time, such as
SYSTIMESTAMP(),TMSFMT(), andTO_TIMESTAMP(), the time is accurate to milliseconds. In addition, the server clock must be the same as that of the database.LPAD(char,length[,pad_string])andRPAD(char,length[,pad_string])refer to the corresponding built-in functions of MySQL. The parameterlengthindicates the total length of the return value that is displayed on the client. In most cases, it displays the actual number of characters in the return value. For some multibyte character sets, however, the displayed length of a string may differ from the actual length. Therefore, we do not recommend that you use this function to process multibyte values.
Functions
The following table shows the preprocess functions supported by the current version.
| Function signature | Return type | Description |
|---|---|---|
| LOWER(char) | String | Converts letters in the parameter value to lowercase. The parameter value can be a column name, constant, or function expression. When the parameter value is a constant, quote it with single quotation marks (' '). Example: LOWER('A'). |
| LTRIM(char[,set]) | String | Matches and trims the parameter value from left to right. The value of char can be a column name, constant, or function expression. The value of set is a constant. By default, if the set parameter is omitted, it is matched as a space. Example: LTRIM(' abc '). |
| RTRIM(char[,set]) | String | Matches and trims the parameter value from right to left. The value of char can be a column name, constant, or function expression. The value of set is a constant. By default, if the set parameter is omitted, it is matched as a space. Example: RTRIM(' abc '). |
| SUBSTR(char,position[,length ]) | String | Truncates the parameter value based on the specified start position and length. The value of char can be a column name, constant, or function expression. The values of position and length are a numeric constant. Example: SUBSTR('abc',0,3). |
| TRIM(char) | String | Trims leading and trailing spaces of the parameter value. The value of char can be a column name, constant, or function expression. Example: TRIM(' abc '). |
| UPPER(char) | String | Converts letters in the parameter value to uppercase. The value of char can be a column name, constant, or function expression. When the parameter value is a constant, quote it with single quotation marks (' '). Example: UPPER('a'). |
| NANVL(char, default) | String | Verifies whether the parameter value is a number. If not, it returns the default value. The value of char can be a column name, constant, or function expression. The value of default is a constant. Example: NANVL('1','1'). |
| REPLACE(char,search[,replacement ]) | String | Specifies to replace the parameter value based on the specified search condition. The value of char can be a column name, constant, or function expression. The values of search and replacement are constants. Example: REPLACE('abc','a','A'). |
| NVL(char, default) | String | Verifies whether the parameter value is null. If yes, it returns the default value. The value of char can be a column name, constant, or function expression. The value of default is a constant. Example: NVL('a','--'). |
| TO_TIMESTAMP(char,fmt1[,fmt2]) | String | Converts the parameter value into a date. If the conversion fails, it returns NULL. The value of char can be a column name, constant, or function expression. The value of fmt1 is a date parsing template. The value of fmt2 is a date formatting template (optional). By default, the date is returned in the yyyy-MM-dd HH:mm:ss.SSS format. |
| LENGTH(char) | String | Calculates the length of the parameter value. The value of char can be a column name, constant, or function expression. |
| LPAD(char,length[,pad_string]) | String | Appends a character string of the specified length to the left of the parameter value. If the specified length is smaller than the length of the parameter value, it truncates the parameter value. The value of char can be a column name, constant, or function expression. The values of length and pad_string are constants. In addition, the value of pad_stringmust be a single-byte character. Example: LPAD('a',1,'x'). |
| RPAD(char,length[,pad_string]) | String | Appends a character string of the specified length to the right of the parameter value. If the specified length is smaller than the length of the parameter value, it truncates the parameter value. The value of char can be a column name, constant, or function expression. The values of length and pad_string are constants. In addition, the value of pad_stringmust be a single-byte character. Example: RPAD('a',1,'x'). |
| CONVERT(char, charset1 [,charset2] ) | String | Converts the encoding format of the parameter value. The value of char can be a column name, constant, or function expression. The values of charset1and charset2 are constants. Example: CONVERT('a','utf-8','gbk'). |
| CONCAT(char1,char2) | String | Concatenates the parameter values. The values of char1 and char2 can be a column name, constant, or function expression. Example: CONCAT('a','b'). |
| NONE() | String | Returns the parameter value without processing it. |
| SYSTIMESTAMP() | String | Returns the timestamp of the current server without processing the parameter value. The format is yyyy-MM-dd HH:mm:ss.SSS. |
| CONSTANT(char) | String | Returns the defined constant value without processing it. The value of char is a constant. Example: CONSTANT('1'). |
| GROOVY(char) | String | Passes the parameter into a GROOVY function for dynamic processing. The generated function parameter is a String value. Example: GROOVY('if(value == null){return 'not null';}') |
| TMSFMT(char,fmt1,default,fmt2) | String | Validates the date format of the parameter. If the validation fails, the default date format is returned. Example: TMSFMT(c1,'yyyyMMddHHmmssSSS','2021-03-10 00:00:00.000','yyyy-MM-dd HH:mm:ss.SSS'). |
| LPADB(char,byte_size[,pad_char]) | String | Appends a byte string of the specified length to the left of the parameter value. If the specified length is smaller than the length of the parameter value, it truncates the parameter value. The value of char can be a column name, constant, or function expression. The values of byte_size and pad_char are constants. Example: LPADB('a',1,'x'). |
| RPADB(char,byte_size[,pad_char]) | String | Appends a byte string of the specified length to the right of the parameter value. If the specified length is smaller than the length of the parameter value, it truncates the parameter value. The value of char can be a column name, constant, or function expression. The values of byte_size and pad_char are constants. Example: RPADB('a',1,'x'). |
