This topic describes the application scope and limits of extract, transform, and load (ETL) supported by OceanBase Migration Service (OMS).
Applicable versions
OMS V3.2.2 and later
Implementation principle
OMS uses the MySQL syntax as the input syntax for ETL. You can only filter data. You can simply enter the data filtering conditions to filter the full and incremental data.
The filter conditions entered are parsed into a syntax tree composed of operators. OMS uses the operators in the syntax tree to compute each record and then determines whether to filter the record.
OMS supports the following operators:
expr indicates an operator, and [|] indicates the selection of one in regex matching.
Unary operator:
For integers
+expr, -erpr, ~expr, !expr
For the Boolean type
expr NOT TRUE, expr IS TRUE
For comparison against null values
expr IS NULL, expr IS NOT NULL
Binary operators:
For the Boolean type
expr [AND | OR | XOR] expr
For arithmetic operations of integers
expr [ & | >> | << | + | - | * | / | % | ^] expr
expr | expr
For operations of value comparison
expr [> | >= | < | <= | = | != | <>] expr
For judgement based on Boolean values:
BETWEEN expr AND expr
IN (expr, ...., expr)
For functions (custom functions):
CONCAT (expr1, expr2)
IF expr1 == NULL, RETURN expr2
ELSE IF expr2 == null, RETURN expr1
ELSE RETURN expr1 + expr2
If both expr1 and expr2 are NULL, NULL is returned.
Otherwise, the CONCAT value of one or both non-NULL parties is returned.
SUBSTR (expr, exprS (,exprC)?)
SUBSTR (expr, exprS (,exprC)?) is equivalent to the SUBSTRING function of a database.
The index subscript of the string starts from 1. exprS is the position where the string copying starts, and exprC is the number of copied strings.
If exprC <= 0 or exprS == 0, an empty string is returned.
str = '1234567';
If exprS > 0 and exprS <= str.length, the string returned is as follows:
SUBSTRING(str, 1, 2) = '12'
SUBSTRING(str, 1) = '1234567'
SUBSTRING(str, 6, 3) = '67'
SUBSTRING(str, 7, 2) = '7'
If exprS > str.length, an empty string is returned.
SUBSTRING(str, 8, 2) = ''
If exprS < 0 and-exprS <= str.length, the start position is str.length + exprS.
SUBSTRING(str, -1, 2) = '7'
SUBSTRING(str, -4, 2) = '45'
SUBSTRING(str, -4) = '4567'
If exprS < 0 and -exprS > str.length, an empty string is returned.
REPLACE(exprO, expr1, expr2)
If exprO is null, null is returned.
If expr1 is null, exprO is returned.
If expr2 is null, it is equivalent to REPLACE(exprO, expr1, "") or REMOVE(exprO, expr1).
Usage notes
Unsupported types
Calculation and comparison of DOUBLE/FLOAT numbers are not supported. Otherwise, errors may occur in the results.
Comparison of NULL values
FALSE is returned for any comparison against NULL values, including =, >=, <=, ! =, >, and <, and FALSE is returned for NULL = NULL as well.
If you want to compare NULL values, use IS NULL or IS NOT NULL.
Comparison of CHAR strings
A trailing space is ignored in comparison, that is, 'abc' = 'abc '.
A leading space is not ignored, that is, ' abc' != 'abc'.
Case is not ignored, that is, 'abc' != 'ABC'.
Comparison of VARCHAR strings
A trailing space is ignored for the comparison of MySQL and DB2-related VARCHAR strings, that is, 'abc' = 'abc '.
A trailing space is ignored for the comparison of ORACLE VARCHAR strings, that is, 'abc' != 'abc '.
Precedence of AND and OR
For AND and OR operators not enclosed in parentheses, AND takes precedence over OR. Use parentheses to specify the precedence.
Column name conversion
ETL does not perform any case conversion on column names that you pass in. Use the escape character
\to escape column names.Implicit conversions
ETL uses CAST(expr AS type) for explicit conversions. The following type conversions are supported:
type = CHARACTER/CHAR to VARCHAR_STRING
type = DEC/NUMERIC/DECIMAL to DECIMAL
type = DOUBLE to DOUBLE
type = SIGNED/UNSIGNED to INT64
type = DATE to DATE
type = TIME to TIME
type = DATETIME to DATETIME
The following implicit conversions are supported:
INT64 and DECIMAL to DECIMAL for comparison
INT64 and STRING to INT64 for comparison
DECIMAL and STRING to DECIMAL for comparison
You can use a Linux timestamp or 'YYYY-MM-DD HH:MM:SS ZoneID' for the TIMSTAMP type of MySQL. If ZoneID is not specified, +08:00 is used to convert the time zone.
For ORACLE TIMESTAMP WITH LOCAL TIMEZONE, you can use 'YYYY-MM-DD HH:MM:SS ZoneID' for comparison. If ZoneID is not specified, +08:00 will be used to convert the time zone.
Sample code
Filter condition: 'id' > 100 AND 'ts' < '2022-07-11 00:00:00+08:00'.
INSERT/DELETE id = 1000, ts = '2022-07-10 00:00:00 +08:00' passes.
INSERT/DELETE id = 99, ts = '2022-07-10 00:00:00 +08:00' filtered out.
UPDATE Before update, id = 1000, ts = '2022-07-10 00:00:00 +08:00'.
After update, id = 1000, ts = '2022-07-10 00:00:00 +08:00' passes.
UPDATE Before update, id = 99, ts = '2022-07-10 00:00:00 +08:00'.
After update, id = 99, ts = '2022-07-10 00:00:00 +08:00' filtered out.
UPDATE Before update, id = 99, ts = '2022-07-10 00:00:00 +08:00'.
After update, id = 1000, ts = '2022-07-10 00:00:00 +08:00'.
Split into INSERT id = 1000, ts = '2022-07-10 00:00:00 +08:00'.
UPDATE Before update, id = 1000, ts = '2022-07-10 00:00:00 +08:00'.
After update, id = 99, ts = '2022-07-10 00:00:00 +08:00'.
Split into DELETE id = 1000, ts = '2022-07-10 00:00:00 +08:00'.