Type conversion during expression evaluation
When an operator in an expression is used with operands of different types, data type conversion occurs to make the operands compatible.
Conversion methods
OceanBase Database supports both explicit and implicit data type conversion.
Implicit data type conversion occurs when an operation requires a parameter of a specified data type, but the actual parameter value provided in the statement is not of that specified data type. In such cases, OceanBase Database automatically converts the actual parameter value to the specified data type before proceeding with the operation.
For example, OceanBase Database automatically converts strings to numbers and vice versa as needed.
obclient> SELECT 1+'1';
+-------+
| 1+'1' |
+-------+
| 2 |
+-------+
1 row in set
Explicit data type conversion in OceanBase Database is achieved using the CAST function. For example:
obclient> SELECT 31.4, CAST(31.4 AS TIME);
+------+--------------------+
| 31.4 | CAST(31.4 AS TIME) |
+------+--------------------+
| 31.4 | 00:00:31 |
+------+--------------------+
1 row in set
For more information about the CAST function, see CAST.
Conversion rules
Conversion rules for comparison operations
If one or both parameters are
NULL, the comparison result isNULL, except for the<=>comparison operator. ForNULL <=> NULL, the result is true, and no conversion is needed.If both parameters are strings, they are compared as strings.
If both parameters are integers, they are compared as integers.
If not compared with a number, hexadecimal values are treated as binary strings.
A single-row subquery from one or more tables is not considered a constant. For example, if an integer returned by a subquery is compared with a
DATETIMEvalue, both operands are treated as integers. Since integers cannot be converted to time values, to compare the operands asDATETIMEvalues, useCAST()to explicitly convert the subquery value toDATETIME.If one parameter is a
TIMESTAMPorDATETIMEcolumn and the other is a constant, the constant is converted to a timestamp before the comparison. For safety, always use complete date-time, date, or time strings when performing comparisons. For example, when usingBETWEENwith date or time values, to get the best results, useCAST()to explicitly convert the values to the required data type.If one parameter is a decimal value, the comparison type depends on the other parameter. If the other parameter is a decimal or integer value, the parameters are compared as decimal values. If the other parameter is a floating-point value, the parameters are compared as floating-point values.
In all other cases, the parameters are compared as floating-point (real) numbers. For example, string and numeric operands are compared as floating-point numbers.
Comparisons between floating-point numbers and large integers are approximate because integers are converted to double-precision floating-point numbers before the comparison, and not all 64-bit integers can be accurately represented.
Type downgrading
To achieve efficient comparisons, constants are aligned with column types as much as possible while maintaining comparison semantics. Currently, most comparisons between columns and constants are supported under standard semantics. Remaining comparisons require casting, which are listed separately:
- For
intoruintcolumns, constants of character type,decimaltype, and floating-point (double/float) type can be downgraded to integer columns. - For
decimalcolumns, floating-point (double/float) values can be downgraded todecimalcolumns. - For
yearcolumns, numeric types (including integers, floating-point, anddecimal), character types, and time types (includingdate,datetime,timestamp, andtime) can be downgraded toyearcolumns. - For
datecolumns, character types,datetime,timestamp, andtimetypes can be downgraded todatecolumns. - For
timestampcolumns,datetimevalues can be downgraded totimestampcolumns.
The following scenarios involve casting (excluding complex types like arrays, XML, and JSON):
- For
floatcolumns, comparisons always usedouble, so casting is required when comparing with any constant. - For
fixed doublecolumns, defined with a precision specification likedouble(10, 2), casting is required when the constant has more decimal places than the column. - For character columns, comparisons with numeric types are performed as numeric comparisons, requiring casting the column to a numeric type. Character set comparison rules may also require casting.
Non-standard comparisons
The logic for non-standard comparisons is similar to type downgrading, handling only comparisons between constants and columns. Comparisons between columns are not processed. Currently, only integer constants and character columns are supported. Note that setting it to range may result in sorting that does not align with standard semantics, so choose based on data characteristics.
Here are some examples:
Create a table named
t1with abiz_daycolumn of typevarchar(10).obclient> CREATE TABLE t1 (biz_day varchar(10)); Query OK, 0 rows affected (0.216 sec)View the execution plan under default settings.
obclient> EXPLAIN SELECT * FROM t1 WHERE biz_day BETWEEN 20200101 AND 20200201; +-------------------------------------------------------------------------------------------------------------------------------------------------+ | Query Plan | +-------------------------------------------------------------------------------------------------------------------------------------------------+ | ====================================================== | | |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| | | ------------------------------------------------------ | | |0 |COLUMN TABLE FULL SCAN|T1 |2 |3 | | | ====================================================== | | Outputs & filters: | | ------------------------------------- | | 0 - output([T1.BIZ_DAY]), filter([cast(T1.BIZ_DAY, NUMBER(-1, -85)) >= 20200101], [cast(T1.BIZ_DAY, NUMBER(-1, -85)) <= 20200201]), rowset=16 | | access([T1.BIZ_DAY]), partitions(p0) | | is_index_back=false, is_global_index=false, filter_before_indexback[false,false], | | range_key([T1.__pk_increment]), range(MIN ; MAX)always true | +-------------------------------------------------------------------------------------------------------------------------------------------------+ 11 rows in set (0.004 sec)- Query plan: By default, the query optimizer converts the
biz_daycolumn from string to number type for comparison. - Filter condition:
filter([cast(T1.BIZ_DAY, NUMBER(-1, -85)) >= 20200101], [cast(T1.BIZ_DAY, NUMBER(-1, -85)) <= 20200201])indicates thatbiz_dayis converted to a number for range comparison.
- Query plan: By default, the query optimizer converts the
Set
non_standard_comparison_leveltoequaland view the execution plan.obclient> EXPLAIN SELECT /*+opt_param('non_standard_comparison_level', 'equal')*/ * FROM t1 WHERE biz_day BETWEEN 20200101 AND 20200201; +----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Query Plan | +----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | ====================================================== | | |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| | | ------------------------------------------------------ | | |0 |COLUMN TABLE FULL SCAN|t1 |1 |3 | | | ====================================================== | | Outputs & filters: | | ------------------------------------- | | 0 - output([t1.biz_day]), filter([cast(t1.biz_day, DECIMAL(-1, -1)) >= cast(20200101, DECIMAL(20, 0))], [cast(t1.biz_day, DECIMAL(-1, -1)) <= cast(20200201, | | DECIMAL(20, 0))]), rowset=16 | | access([t1.biz_day]), partitions(p0) | | is_index_back=false, is_global_index=false, filter_before_indexback[false,false], | | range_key([t1.__pk_increment]), range(MIN ; MAX)always true | +----------------------------------------------------------------------------------------------------------------------------------------------------------------+ 12 rows in set (0.003 sec)- Query plan: After setting
non_standard_comparison_leveltoequal, the query optimizer converts thebiz_daycolumn from string to decimal type for comparison. - Filter condition:
filter([cast(t1.biz_day, DECIMAL(-1, -1)) >= cast(20200101, DECIMAL(20, 0))], [cast(t1.biz_day, DECIMAL(-1, -1)) <= cast(20200201, DECIMAL(20, 0))])indicates thatbiz_dayis converted to a decimal for range comparison. Sincebetweenis a range query, the equal comparison rule forequaldoes not convert the constant type.
- Query plan: After setting
Set
non_standard_comparison_leveltorangeand view the execution plan.obclient> EXPLAIN SELECT /*+opt_param('non_standard_comparison_level', 'range')*/ * FROM t1 WHERE biz_day BETWEEN 20200101 AND 20200201; +-------------------------------------------------------------------------------------------------------------------------------------------------------+ | Query Plan | +-------------------------------------------------------------------------------------------------------------------------------------------------------+ | ====================================================== | | |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| | | ------------------------------------------------------ | | |0 |COLUMN TABLE FULL SCAN|t1 |1 |3 | | | ====================================================== | | Outputs & filters: | | ------------------------------------- | | 0 - output([t1.biz_day]), filter([t1.biz_day >= demote_cast(20200101, VARCHAR(10))], [t1.biz_day <= demote_cast(20200201, VARCHAR(10))]), rowset=16 | | access([t1.biz_day]), partitions(p0) | | is_index_back=false, is_global_index=false, filter_before_indexback[false,false], | | range_key([t1.__pk_increment]), range(MIN ; MAX)always true | +-------------------------------------------------------------------------------------------------------------------------------------------------------+ 11 rows in set (0.004 sec)- Query plan: After setting
non_standard_comparison_leveltorange, the query optimizer converts integer constants to string type for comparison. - Filter condition:
filter([t1.biz_day >= demote_cast(20200101, VARCHAR(10))], [t1.biz_day <= demote_cast(20200201, VARCHAR(10))])indicates that integer constants are converted to strings for range comparison. After non-standard comparison, comparisons are performed based on character order. For example,'202002010'is in the character order['20200101', '20200201']but not in the numeric order[20200101, 20200201].
- Query plan: After setting
Other related conversion rules
The conversion methods from string to float and from integer to float may not be the same. The results may be affected by factors such as the compiler version. One way to avoid such issues is to use
CAST()to prevent the value from being implicitly converted to a float.Implicit conversion of numeric or time values to strings produces a value with the character set and collation determined by the
character_set_connectionandcollation_connectionsystem variables.This means that such conversions generate non-binary strings (
CHAR,VARCHAR, orLONGTEXTvalues). If the connection character set is set to binary, the conversion result is a binary string (BINARY,VARBINARY, orLONGBLOBvalue).Type conversion during integer expression evaluation is different. For example, when using an integer expression as part of a CREATE TABLE statement, the new table will contain a column of type
INTorBIGINTbased on the result type of the expression:CREATE TABLE t SELECT integer_expr;If the expression's maximum length is not suitable for
INT,BIGINTis used. However, you can use a sufficiently long expression to force the use ofBIGINTinstead ofINT:obclient> CREATE TABLE t SELECT 000000000000000000000; Query OK, 1 row affected
Implicit data type conversion rules
OceanBase Database automatically converts a value from one data type to another when the conversion makes sense.
The following table shows the implicit conversion matrix for all data types. The conversion direction and context are not considered.
| Data type | BOOL | INT | SMALLINT | MEDIUMINT | BIGINT | SERIAL | DECIMAL | NUMERIC | FLOAT | DOUBLE | BIT | DATETIME | TIMESTAMP | DATE | TIME |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| YEAR | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| CHAR | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| VARCHAR | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| BINARY | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| VARBINARY | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| TEXT | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| TINYBLOB | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| MEDIUMBLOB | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| LONGBLOB | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| TINYTEXT | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| MEDIUMTEXT | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| LONGTEXT | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| ENUM | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| SET | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| BLOB | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Implicit data type conversion examples
obclient> SELECT CAST(BOOL_COLUMN AS YEAR) FROM YOUR_TABLE;
+---------------------------+
| CAST(BOOL_COLUMN AS YEAR) |
+---------------------------+
| 2001 |
+---------------------------+
2 rows in set (0.001 sec)
