Preprocessing functions

2025-11-17 03:00:24  Updated

You can configure a preprocessing function for each column when you define a control file.

Considerations

  • For a preprocessing function configured for a column, only the values of the current column can be used as formal parameters. For example, c14 "concat(c15, '_suffix')" is not supported, and c15 "concat(c15, '_suffix')" is supported.

  • LPADB() and RPADB() are suitable for processing characters such as letters and digits. When you use either function to process an emoji, the emoji may be truncated.

  • REPLACE() in OBLOADER & OBDUMPER differs from that in Oracle and MySQL. It has the syntax that is similar to an Oracle function but is implemented like a MySQL function.

  • NVL() references the corresponding built-in function in MySQL. It distinguishes an empty character from NULL in implementation.

  • For preprocessing functions related to date and time, such as SYSTIMESTAMP, TMSFMT(), and TO_TIMESTAMP(), the value is accurate only to milliseconds. In addition, the server clock must be the same as that of the database.

  • LPAD(char,length[,pad_string]) and RPAD(char,length[,pad_string]) reference the corresponding built-in functions in MySQL. The length parameter specifies the total length of the return value that is displayed on the client. For most character sets, it is the actual number of characters in the return value. However, for some multibyte character sets, the displayed length of a string may differ from the actual length. Therefore, we recommend that you do not use this function to process multibyte values.

Functions

The following table describes the preprocessing functions supported in the current version.

Function signature Return type Description
LOWER(char) String Converts letters in a parameter value into lowercase. The parameter value can be a column name, constant, or function expression. If the parameter value is a constant, enclose it with single quotation marks (' ').
Example: LOWER('A').
LTRIM(char[,set]) String Matches and truncates a parameter value from left to right. The value of the char parameter can be a column name, constant, or function expression. The value of the set parameter is a constant. If the set parameter is omitted, its value is replaced with a space by default.
Example: LTRIM(' abc ').
RTRIM(char[,set]) String Matches and truncates a parameter value from right to left. The value of the char parameter can be a column name, constant, or function expression. The value of the set parameter is a constant. If the set parameter is omitted, its value is replaced with a space by default.
Example: RTRIM(' abc ').
SUBSTR(char,position[,length ]) String Truncates a parameter value based on the specified start position and length. The value of the char parameter can be a column name, constant, or function expression. The values of the position and length parameters are numeric constants.
Example: SUBSTR('abc',0,3).
TRIM(char) String Truncates leading and trailing spaces of a parameter value. The value of the char parameter can be a column name, constant, or function expression.
Example: TRIM(' abc ').
REVERSE(char) String Reverses parameter values.
For example, REVERSE("C1") reverses characters in each value of the C1 column.
UPPER(char) String Converts letters in a parameter value into uppercase. The value of char can be a column name, constant, or function expression. If the parameter value is a constant, enclose it with single quotation marks (' ').
Example: UPPER('a').
NANVL(char, default) String Verifies whether a parameter value is a number. If not, the default value is returned. The value of the char parameter can be a column name, constant, or function expression. The value of the default parameter is a constant.
Example: NANVL('1','1').
REPLACE(char,search[,replacement ]) String Replaces a parameter value based on the specified search condition. The value of the char parameter can be a column name, constant, or function expression. The values of the search and replacement parameters are constants.
Example: REPLACE('abc','a','A').
NVL(char, default) String Verifies whether a parameter value is left empty. If yes, the default value is returned. The value of the char parameter can be a column name, constant, or function expression. The value of the default parameter is a constant.
Example: NVL('a','--').
TO_TIMESTAMP(char,fmt1[,fmt2]) String Converts a parameter value into a date. If the conversion fails, NULL is returned. The value of the char parameter can be a column name, constant, or function expression. The value of the fmt1 parameter is a date parsing template. The value of the optional fmt2 parameter is a date formatting template.
Default format of the return value of fmt2: yyyy-MM-dd HH:mm:ss.SSS.
LENGTH(char) String Calculates the length of a parameter value. The value of the char parameter can be a column name, constant, or function expression.
LPAD(char,length[,pad_string]) String Appends a character string of the specified length to the left of a parameter value. If the specified length is smaller than the length of the parameter value, the parameter value is truncated. The value of the char parameter can be a column name, constant, or function expression. The values of the length and pad_string parameters are constants. In addition, the value of the pad_string parameter must be a single-byte character.
Example: LPAD('a',1,'x').
RPAD(char,length[,pad_string]) String Appends a character string of the specified length to the right of a parameter value. If the specified length is smaller than the length of the parameter value, the parameter value is truncated. The value of the char parameter can be a column name, constant, or function expression. The values of the length and pad_string parameters are constants. In addition, the value of the pad_string parameter must be a single-byte character.
Example: RPAD('a',1,'x').
CONVERT(char,charset1[,charset2]) String Converts the character set of a parameter value. The value of the char parameter can be a column name, constant, or function expression. The values of the charset1 and charset2 parameters are constants.
Example: CONVERT('a','utf-8','gbk').
CONCAT(char1,char2) String Concatenates parameter values. The values of the char1 and char2 parameters can be column names, constants, or function expressions.
Example: CONCAT('a','b').
NONE String Returns a parameter value without processing it.
SYSTIMESTAMP String Returns the timestamp of the current server without processing the parameter value.
Default format: yyyy-MM-dd HH:mm:ss.SSS.
SYSDATE String Returns the date on the current server without processing the parameter value.
Default format: yyyy-MM-dd HH:mm:ss.
CONSTANT(char) String Returns a defined constant value without processing it. The value of the char parameter is a constant.
Example: CONSTANT('1').
TMSFMT(char,fmt1,default,fmt2) String Validates the date format of a parameter. If the validation fails, the default date format is returned.
Example: TMSFMT(c1,'yyyyMMddHHmmssSSS','2021-03-10 00:00:00.000','yyyy-MM-dd HH:mm:ss.SSS').
LPADB(char,byte_size[,pad_char]) String Appends a byte string of the specified length to the left of a parameter value. If the specified length is smaller than the length of the parameter value, the parameter value is truncated. The value of the char parameter can be a column name, constant, or function expression. The values of the byte_size and pad_char parameters are constants.
Example: LPADB('a',1,'x').
RPADB(char,byte_size[,pad_char]) String Appends a byte string of the specified length to the right of a parameter value. If the specified length is smaller than the length of the parameter value, the parameter value is truncated. The value of the char parameter can be a column name, constant, or function expression. The values of the byte_size and pad_char parameters are constants.
Example: RPADB('a',1,'x').
MASK(string str[, string upper[, string lower[, string number]]]) String Masks uppercase letters, lowercase letters, and digits in the values of the specified column.
For example, MASK("C1",A,a,b) masks the values of Column C1 by replacing uppercase letters with A, lowercase letters with a, and digits with b.
MASK_FIRST_N(string str[, int n]) String Masks the first n characters in the values of the specified column.
For example, MASK_FIRST_N("C1",'A','a','b',5) masks the first five characters in the values of Column C1 by replacing uppercase letters with A, lowercase letters with a, and digits with b.
MASK_LAST_N(string str[, int n]) String Masks the last n characters in the values of the specified column.
For example, MASK_LAST_N("C1",'A','a','b',5) masks the last five characters in the values of Column C1 by replacing uppercase letters with A, lowercase letters with a, and digits with b.
MASK_SHOW_FIRST_N(string str[, int n]) String Masks all characters except the first n characters in the values of the specified column.
For example, MASK_SHOW_FIRST_N("C1",'A','a','b',5) masks all characters except the first five characters in the values of Column C1 by replacing uppercase letters with A, lowercase letters with a, and digits with b.
MASK_SHOW_LAST_N(string str[, int n]) String Masks all characters except the last n characters in the values of the specified column.
For example, MASK_SHOW_LAST_N("C1",'A','a','b',5) masks all characters except the last five characters in the values of Column C1 by replacing uppercase letters with A, lowercase letters with a, and digits with b.
SEQUENCE(int m,int n) String Generates an incremental sequence for a specified column. The parameter m specifies the start value and n specifies the increment step.
For example, SEQUENCE(100,1) generates a sequence with a start value of 100 and an increment step of 1.
DB_SEQUENCE(string sequence) String Assigns sequence values to rows to be imported so that the rows can be imported into the database based on the sequence values.
For example, DB_SEQUENCE(seq01) specifies the sequence value seq01.
Note
This function is supported in OBLOADER V4.1.0 and later.
CHR(string) String Converts a number represented by a string into an ASCII value.
  • If the value of string is greater than 256, the function is equivalent to 'string' mod 256.
    For example, the results of both CHR ('10') and CHR('266') are '\n'.
  • If the value of any parameter is NULL, the CHR() function returns NULL.
  • The parameter value can be a column name or constant. If the parameter value is a constant, enclose it with single quotation marks (' ').
    Examples: CHR('13') and CHR(c1).
NCHR(string) String Converts a number represented by a string into an NVARCHAR2 value.
  • Unlike the CHR() function, the NCHR() function can handle parameter values greater than 256.
    For example, NCHR('257') returns ā.
  • If the value of any parameter is NULL, the NCHR() function returns NULL.
  • The parameter value can be a column name or constant. If the parameter value is a constant, enclose it with single quotation marks (' ').
    Examples: NCHR('13') and NCHR(c1).
ASCII(char) String Converts a character into a decimal ASCII value.
  • If multiple characters are passed in, only the first character is converted.
    For example, both ASCII('A') and ASCII('abc') return '65'.
  • The parameter value can be a column name or constant. If the parameter value is a constant, enclose it with single quotation marks (' ').
    Examples: ASCII('A') and ASCII(c1).
RAWTOHEX(string) String Converts a raw string into a hexadecimal string.
The parameter value can be a column name or constant. If the parameter value is a constant, enclose it with single quotation marks (' ').
Examples: RAWTOHEX('ABC') and RAWTOHEX(c1).
HEXTORAW(string) String Restores a hexadecimal string to a raw string.
The parameter value can be a column name or constant. If the parameter value is a constant, enclose it with single quotation marks (' ').
Examples: HEXTORAW('414243') and HEXTORAW(c1).

Note

OBLOADER can process only hexadecimal data that does not contain 0x or that is prefixed with 0x. If you want to use OBLOADER to process hexadecimal data, you must use the nested replace function to remove the data that contains 0x but is not prefixed with 0x. Example: lang=java (c1 "hextoraw(replace(c1, '0x', ''))");.

REPLACE(char,search[,replacement ]) String Replaces a parameter value based on the specified search condition.
The value of the char parameter can be a column name, constant, or function expression. The value of the search parameter can be a constant or function expression. The value of the replacement parameter is a constant.
Examples: REPLACE('abc','a','A') and REPLACE('abc',CHR('13'),'!')
TRUNC(number[, decimal]) String Truncates a number.
  • number specifies the number to be truncated, and decimal specifies the number of decimal places to be retained after truncation. For example, TRUNC(22.29, 1) returns 22.2.
  • If decimal is not specified, the number specified by number is truncated to zero decimal places.
    For example, TRUNC(22.29) returns 22.
  • The value of decimal can be negative, which specifies to truncate the number specified by number to the left of the decimal point and set the digit on the left of the decimal point to zero.
    For example, TRUNC(22.29, -1) returns 20.

Note

The parameters of the function support any numeric data types and non-numeric data types that can be implicitly converted into numeric data types.

sm4_encrypt (plaintext, key [,encoding]) String Encrypts the given data by using the SM4 algorithm and returns the encrypted data.
The plaintext parameter specifies the original text, key specifies the Base64-encoded ciphertext, and encoding specifies the encoding type of the ciphertext. Supported encoding types are Base64 and HEX. The default encoding type is Base64.
Example: c1 "sm4_encrypt(c1, 'myKey')".
sm4_decrypt(cipherText, key [,encoding]) String Decrypts the encrypted data by using the SM4 algorithm and returns the raw data. The ciphertext parameter specifies the ciphertext, key specifies the Base64-encoded ciphertext, and encoding specifies the encoding type. Supported encoding types are Base64 and HEX. The default encoding type is Base64.
Example: c1 "sm4_decrypt(c1, 'myKey')".
TIMESTAMP_ADD (dateAndTime, addInt, datetimeFieldString, pattern: yyyy-MM-dd HH:mm:ss) String Increases or decreases values of the TIMESTAMP type.
  • datetimeFieldString specifies the unit. Valid values are YEAR, MONTH, WEEK, DAY, HOUR, MINUTE, and SECOND.
  • addInt specifies the value to be added or deducted. A minus sign specifies to deduct the specified value.
  • pattern is optional and specifies the original time format. The default value is yyyy-MM-dd HH:mm:ss.
    If the value of pattern is the same as its default value, this parameter is optional. If the value of pattern is different from its default value, this parameter is required.
Example with pattern specified: TIMESTAMP_ADD(c1, '1', 'MONTH', 'yyyy-MM-dd HH:mm:ss').
Examples with pattern unspecified: TIMESTAMP_ADD(c1, '1', 'MONTH'), which specifies to increase the values of Column c1 by one month.
TIMESTAMP_ADD(c1, '-2', 'WEEK') specifies to decrease the values of Column c1 by two weeks.
DATE_ADD (date, addInt, datetimeFieldString, pattern: yyyy-MM-dd) String Increases or decreases values of the DATE type.
  • datetimeFieldString specifies the unit. Valid values are YEAR, MONTH, WEEK, and DAY.
  • addInt specifies the value to be added or deducted. A minus sign specifies to deduct the specified value.
  • pattern is optional and specifies the original time format. The default value is yyyy-MM-dd.
    If the value of pattern is the same as its default value, this parameter is optional. If the value of pattern is different from its default value, this parameter is required.
Example with pattern specified: DATE_ADD(c1, '-1', 'YEAR', 'yyyy-MM-dd').
Example with pattern unspecified: DATE_ADD(c1, '-1', 'YEAR'), which specifies to decrease the values of Column c1 by one year.
DATE_TRUNC (date, datetimeFieldString, pattern: yyyy-MM-dd) String Truncates values of the DATE type to the specified field.
  • datetimeFieldString specifies the unit. Valid values are YEAR, MONTH, WEEK, and DAY.
  • patternis optional and specifies the original date format. The default value is yyyy-MM-dd.
    If the value of pattern is the same as its default value, this parameter is optional. If the value of pattern is different from its default value, this parameter is required.
Example with pattern specified: DATE_TRUNC(c1, 'MONTH', 'yyyy-MM-dd').
Example with pattern unspecified: DATE_TRUNC(c1, 'MONTH'), which specifies to truncate the values of Column c1 to the month.
TIMESTAMP_TRUNC (dateAndTime, datetimeFieldString,pattern: yyyy-MM-dd HH:mm:ss) String Truncates values of the TIMESTAMP type to the specified field.
  • datetimeFieldString specifies the unit. Valid values are YEAR, MONTH, WEEK, DAY, HOUR, MINUTE, and SECOND.
  • patternis optional and specifies the original date format. The default value is yyyy-MM-dd HH:mm:ss.
    If the value of pattern is the same as its default value, this parameter is optional. If the value of pattern is different from its default value, this parameter is required.
Example with pattern specified: TIMESTAMP_TRUNC(c1, 'HOUR', 'yyyy-MM-dd HH:mm:ss').
Example with pattern unspecified: TIMESTAMP_TRUNC(c1, 'HOUR'), which specifies to truncate the values of Column c1 to the hour.
DAY_OF_MONTH (dateAndTime,pattern: yyyy-MM-dd HH:mm:ss) String Returns the day (1 to 31) in the month for values of the TIMESTAMP type.
For example: DAY_OF_MONTH (c1) specifies to return the day in the month for values of Column c1.
DAY_OF_WEEK (dateAndTime,pattern: yyyy-MM-dd HH:mm:ss) String Returns the day (1 to 7) in the week for values of the TIMESTAMP type.
For example, DAY_OF_WEEK (c1) specifies to return the day in the week for values of Column c1.
DAY_OF_YEAR(dateAndTime,pattern: yyyy-MM-dd HH:mm:ss) String Returns the day (1 to 366) in the year for values of the TIMESTAMP type.
For example, DAY_OF_YEAR (c1) specifies to return the day in the year for values of Column c1.
HOUR(dateAndTime,pattern: yyyy-MM-dd HH:mm:ss) String Returns the hour (0 to 23) for values of the TIMESTAMP type.
For example, HOUR (c1) specifies to return the hour for values of Column c1.
MNUTE (dateAndTime,pattern: yyyy-MM-dd HH:mm:ss) String Returns the minute (1-59) for values of the TIMESTAMP type.
For example, MNUTE (c1) specifies to return the minute for values of Column c1.
MONTH (dateAndTime,pattern: yyyy-MM-dd HH:mm:ss) String Returns the month (1 to 12) for values of the TIMESTAMP type.
For example, MONTH (c1) specifies to return the month for values of Column c1.
WEEK (dateAndTime,pattern: yyyy-MM-dd HH:mm:ss) String Returns the week for values of the TIMESTAMP type, namely, the week (1 to 53) in the year.
For example, WEEK (c1) specifies to return the week in the year for values of Column c1.
YEAR (dateAndTime,pattern: yyyy-MM-dd HH:mm:ss) String Returns the year for values of the TIMESTAMP type.
For example, YEAR (c1) specifies to return the month for values of Column c1.
sm3_digest (plaintext [,encoding]) String Hashes data by using the SM3 algorithm and returns the extracted digest. The parameters are described as follows:
  • plaintext: the original text.
  • encoding: the encoding type for the generated digest. Supported encoding types are Base64 and HEX. The default value is HEX.
Examples: c1 "sm3_digest(c1)"
c2 "sm3_digest(c2, 'BASE64')".

Notice

Groovy dynamic functions are no longer supported in obloader V3.x and later.

Contact Us