Type conversion during expression evaluation
When operators in an expression are used with operands of different types, type conversion occurs to make the operands compatible.
Conversion methods
OceanBase Database supports both explicit and implicit data type conversion.
Implicit data type conversion occurs when an operation requires a parameter of a specific data type, but the actual parameter value provided is not of that data type. In such cases, OceanBase Database automatically converts the actual parameter value to the specified data type before proceeding with the operation.
For example, OceanBase Database automatically converts strings to numbers and vice versa as needed.
obclient> SELECT 1+'1';
+-------+
| 1+'1' |
+-------+
| 2 |
+-------+
1 row in set
Explicit data type conversion in OceanBase Database is performed using the CAST function. Here's an example:
obclient> SELECT 31.4, CAST(31.4 AS TIME);
+------+--------------------+
| 31.4 | CAST(31.4 AS TIME) |
+------+--------------------+
| 31.4 | 00:00:31 |
+------+--------------------+
1 row in set
For more information about the CAST function, see CAST.
Conversion rules
Conversion rules for comparison operations
If one or both parameters are
NULL, the comparison result isNULL, except for the<=>comparison operator. ForNULL <=> NULL, the result is true, and no conversion is required.If both parameters are strings, they are compared as strings.
If both parameters are integers, they are compared as integers.
If not compared with a number, hexadecimal values are treated as binary strings.
Single-row subqueries from one or more tables are not considered constants. For example, if a subquery returns an integer to be compared with a
DATETIMEvalue, both operands are treated as integers. Since integers cannot be converted to time values, to compare the operands asDATETIMEvalues, useCAST()to explicitly convert the subquery value toDATETIME.If one parameter is a
TIMESTAMPorDATETIMEcolumn and the other is a constant, the constant is converted to a timestamp before the comparison. For safety, always use complete date-time, date, or time strings when performing comparisons. For example, when usingBETWEENwith date or time values, to get the best results, useCAST()to explicitly convert the values to the required data type.If one parameter is a decimal value, the comparison type depends on the other parameter. If the other parameter is a decimal or integer value, the parameters are compared as decimal values. If the other parameter is a float value, the parameters are compared as float values.
In all other cases, the parameters are compared as floating-point (real) numbers. For example, string and numeric operands are compared as floating-point numbers.
Comparisons between floating-point numbers and large values of the
INTEGERtype are approximate because integers are converted to double-precision floating-point numbers before the comparison, and not all 64-bit integers can be accurately represented.
Type downgrading
To achieve efficient comparisons, constants are aligned with column types as much as possible, provided that the comparison semantics are satisfied. Currently, most comparisons between columns and constants are supported under standard semantics. Remaining comparisons involve casting columns, which are listed separately:
- For
intoruintcolumns, constants of character type,decimaltype, and floating-point types (double/float) can be downgraded to integer columns. - For
decimalcolumns, floating-point types (double/float) can be downgraded todecimalcolumns. - For
yearcolumns, numeric types (including integers, floating-point numbers, anddecimaltype), character types, and time types (includingdate,datetime,timestamp, andtime) can be downgraded toyearcolumns. - For
datecolumns, character types,datetime,timestamp, andtimetypes can be downgraded todatecolumns. - For
timestampcolumns,datetimecan be downgraded totimestamp.
The following scenarios involve casting columns (excluding complex types such as arrays, XML, and JSON):
- For
floatcolumns, comparisons always usedouble, sofloatcolumns are cast when compared with any constant. - For
fixed doublecolumns, which are specified with precision and scale (e.g.,double(10, 2)), if the constant has more decimal places, the precision is aligned during comparison, and the column is cast. - For character columns, comparisons between characters and numeric types are performed as numeric comparisons, and the column is cast to a numeric type. Additionally, character set comparison rules may also involve casting the column.
Non-standard comparisons
The logic and type downgrading for non-standard comparisons are similar, as they only handle comparisons between constants and columns. Comparisons between columns are not processed. Currently, only comparisons between integer constants and character columns are supported. Note that if set to range, the sorting results may not conform to standard semantics, so the setting should be chosen based on the data characteristics.
Here are some examples:
Create a table named
t1with abiz_daycolumn of typevarchar(10).obclient> CREATE TABLE t1 (biz_day varchar(10)); Query OK, 0 rows affected (0.216 sec)View the execution plan under default settings.
obclient> EXPLAIN SELECT * FROM t1 WHERE biz_day BETWEEN 20200101 AND 20200201; +-------------------------------------------------------------------------------------------------------------------------------------------------+ | Query Plan | +-------------------------------------------------------------------------------------------------------------------------------------------------+ | ====================================================== | | |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| | | ------------------------------------------------------ | | |0 |COLUMN TABLE FULL SCAN|T1 |2 |3 | | | ====================================================== | | Outputs & filters: | | ------------------------------------- | | 0 - output([T1.BIZ_DAY]), filter([cast(T1.BIZ_DAY, NUMBER(-1, -85)) >= 20200101], [cast(T1.BIZ_DAY, NUMBER(-1, -85)) <= 20200201]), rowset=16 | | access([T1.BIZ_DAY]), partitions(p0) | | is_index_back=false, is_global_index=false, filter_before_indexback[false,false], | | range_key([T1.__pk_increment]), range(MIN ; MAX)always true | +-------------------------------------------------------------------------------------------------------------------------------------------------+ 11 rows in set (0.004 sec)- Query plan: By default, the query optimizer converts the
biz_daycolumn from string type to numeric type and then performs the comparison. - Filter condition:
filter([cast(T1.BIZ_DAY, NUMBER(-1, -85)) >= 20200101], [cast(T1.BIZ_DAY, NUMBER(-1, -85)) <= 20200201])indicates that thebiz_dayis converted to a numeric type for range comparison.
- Query plan: By default, the query optimizer converts the
Set
non_standard_comparison_leveltoequaland view the execution plan.obclient> EXPLAIN SELECT /*+opt_param('non_standard_comparison_level', 'equal')*/ * FROM t1 WHERE biz_day BETWEEN 20200101 AND 20200201; +----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Query Plan | +----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | ====================================================== | | |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| | | ------------------------------------------------------ | | |0 |COLUMN TABLE FULL SCAN|t1 |1 |3 | | | ====================================================== | | Outputs & filters: | | ------------------------------------- | | 0 - output([t1.biz_day]), filter([cast(t1.biz_day, DECIMAL(-1, -1)) >= cast(20200101, DECIMAL(20, 0))], [cast(t1.biz_day, DECIMAL(-1, -1)) <= cast(20200201, | | DECIMAL(20, 0))]), rowset=16 | | access([t1.biz_day]), partitions(p0) | | is_index_back=false, is_global_index=false, filter_before_indexback[false,false], | | range_key([t1.__pk_increment]), range(MIN ; MAX)always true | +----------------------------------------------------------------------------------------------------------------------------------------------------------------+ 12 rows in set (0.003 sec)- Query plan: After setting
non_standard_comparison_leveltoequal, the query optimizer converts thebiz_daycolumn from string type to decimal type and then performs the comparison. - Filter condition:
filter([cast(t1.biz_day, DECIMAL(-1, -1)) >= cast(20200101, DECIMAL(20, 0))], [cast(t1.biz_day, DECIMAL(-1, -1)) <= cast(20200201, DECIMAL(20, 0))])indicates that thebiz_dayis converted to a decimal type for range comparison. Sincebetweenis a range query, the equal comparison rule forequaldoes not convert the constant type.
- Query plan: After setting
Set
non_standard_comparison_leveltorangeand view the execution plan.obclient> EXPLAIN SELECT /*+opt_param('non_standard_comparison_level', 'range')*/ * FROM t1 WHERE biz_day BETWEEN 20200101 AND 20200201; +-------------------------------------------------------------------------------------------------------------------------------------------------------+ | Query Plan | +-------------------------------------------------------------------------------------------------------------------------------------------------------+ | ====================================================== | | |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| | | ------------------------------------------------------ | | |0 |COLUMN TABLE FULL SCAN|t1 |1 |3 | | | ====================================================== | | Outputs & filters: | | ------------------------------------- | | 0 - output([t1.biz_day]), filter([t1.biz_day >= demote_cast(20200101, VARCHAR(10))], [t1.biz_day <= demote_cast(20200201, VARCHAR(10))]), rowset=16 | | access([t1.biz_day]), partitions(p0) | | is_index_back=false, is_global_index=false, filter_before_indexback[false,false], | | range_key([t1.__pk_increment]), range(MIN ; MAX)always true | +-------------------------------------------------------------------------------------------------------------------------------------------------------+ 11 rows in set (0.004 sec)- Query plan: After setting
non_standard_comparison_leveltorange, the query optimizer converts integer constants to string type and then performs the comparison. - Filter condition:
filter([t1.biz_day >= demote_cast(20200101, VARCHAR(10))], [t1.biz_day <= demote_cast(20200201, VARCHAR(10))])indicates that integer constants are converted to string type for range comparison. After non-standard comparison, comparisons are performed in character order. For example,'202002010'is in the character order['20200101', '20200201'], but not in the numeric order[20200101, 20200201].
- Query plan: After setting
Other relevant conversion rules
The conversion methods from strings to floating-point numbers and from integers to floating-point numbers are not necessarily the same. The results may be influenced by factors such as the compiler version. One way to avoid such issues is to use
CAST()to prevent the value from being implicitly converted to a floating-point number.Implicit conversion of numeric or time values to strings generates a value with a character set and collation determined by the
character_set_connectionandcollation_connectionsystem variables.This means that such conversions generate non-binary strings (
CHAR,VARCHAR, orLONGTEXTvalues). If the connection character set is set to binary, the conversion result is a binary string (BINARY,VARBINARY, orLONGBLOBvalue).Type conversion during integer expression evaluation is slightly different. For example, when using an integer expression as part of a CREATE TABLE statement, the new table created will contain a column of type
INTorBIGINTbased on the result type of the expression:CREATE TABLE t SELECT integer_expr;If the expression's maximum length is not suitable for
INT,BIGINTis used. However, you can force the use ofBIGINTinstead ofINTby using an expression with sufficient length:obclient> CREATE TABLE t SELECT 000000000000000000000; Query OK, 1 row affected
Implicit data type conversion rules
OceanBase Database automatically converts a value from one data type to another when the conversion is meaningful.
The following table shows the implicit conversion matrix for all data types, without considering the direction or context of the conversion.
| Data type | BOOL | INT | SMALLINT | MEDIUMINT | BIGINT | SERIAL | DECIMAL | NUMERIC | FLOAT | DOUBLE | BIT | DATETIME | TIMESTAMP | DATE | TIME |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| YEAR | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| CHAR | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| VARCHAR | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| BINARY | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| VARBINARY | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| TEXT | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| TINYBLOB | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| MEDIUMBLOB | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| LONGBLOB | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| TINYTEXT | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| MEDIUMTEXT | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| LONGTEXT | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| ENUM | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| SET | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| BLOB | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Implicit data type conversion examples
obclient> SELECT CAST(BOOL_COLUMN AS YEAR) FROM YOUR_TABLE;
+---------------------------+
| CAST(BOOL_COLUMN AS YEAR) |
+---------------------------+
| 2001 |
+---------------------------+
2 rows in set (0.001 sec)