Data type mapping ensures that each value is converted accurately from its source data type to the target data type when data is exported from OceanBase Database in Parquet or ORC format. OceanBase Database V4.3.5 provides mapping tables that map the data types of MySQL and Oracle databases to the data types supported by Parquet and ORC, so that data is not lost, does not overflow, and keeps its semantics during export.

| Parquet physical type | Parquet logical type | Hive data type | Data type under MySQL tenant | Remarks |
|---|---|---|---|---|
| INT32 | INT(8,TRUE) | TINYINT | TINYINT | |
| INT32 | INT(16,TRUE) | SMALLINT | SMALLINT | |
| INT32 | INT(32,TRUE) | INT | INT | |
| INT64 | INT(64,TRUE) | BIGINT | BIGINT | |
| INT32 | INT(8,FALSE) | TINYINT (overflow value is NULL) | TINYINT UNSIGNED | |
| INT32 | INT(16,FALSE) | SMALLINT (overflow value is NULL) | SMALLINT UNSIGNED | |
| INT32 | INT(32,FALSE) | INT (overflow value is NULL) | INT UNSIGNED | |
| INT64 | INT(64,FALSE) | BIGINT (overflow value is NULL) | BIGINT UNSIGNED | |
| FLOAT | NONE | FLOAT | FLOAT | |
| DOUBLE | NONE | DOUBLE | DOUBLE | |
| FIXED_LEN_BYTE_ARRAY | DECIMAL | DECIMAL | DECIMAL, DECIMAL UNSIGNED | You must specify the precision and scale. |
| BYTE_ARRAY | STRING | CHAR | CHAR, BINARY | The string type in Parquet is encoded in UTF-8. |
| BYTE_ARRAY | STRING | VARCHAR | VARCHAR, VARBINARY | |
| BYTE_ARRAY | STRING | STRING | TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT, TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB | |
| INT64 | TIMESTAMP(is_adjusted_to_utc=true, parquet::LogicalType::TimeUnit::MICROS) | TIMESTAMP | TIMESTAMP | |
| INT64 | TIMESTAMP(is_adjusted_to_utc=false, parquet::LogicalType::TimeUnit::MICROS) | TIMESTAMP | DATETIME | |
| INT32 | DATE | DATE | DATE | |
| INT64 | TIME | \ | TIME | |
| INT32 | INT(8,FALSE) | \ | YEAR | |

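
The unsigned rows in the Parquet table can be illustrated with a minimal Python sketch. It assumes that "overflow value is NULL" means an unsigned value larger than the signed maximum of the corresponding Hive type reads back as NULL. The helper `read_as_hive` and both dictionaries are hypothetical names introduced for illustration; they are not part of OceanBase Database or Hive.

```python
# Signed value ranges of the Hive integer types named in the table.
HIVE_SIGNED_RANGE = {
    "TINYINT": (-2**7, 2**7 - 1),
    "SMALLINT": (-2**15, 2**15 - 1),
    "INT": (-2**31, 2**31 - 1),
    "BIGINT": (-2**63, 2**63 - 1),
}

# MySQL-tenant unsigned type -> Hive type, copied from the table rows.
UNSIGNED_TO_HIVE = {
    "TINYINT UNSIGNED": "TINYINT",
    "SMALLINT UNSIGNED": "SMALLINT",
    "INT UNSIGNED": "INT",
    "BIGINT UNSIGNED": "BIGINT",
}


def read_as_hive(mysql_type, value):
    """Return the value as the signed Hive column would hold it.

    Assumption: a value outside the signed range of the Hive type
    becomes NULL, modeled here as None.
    """
    hive_type = UNSIGNED_TO_HIVE[mysql_type]
    low, high = HIVE_SIGNED_RANGE[hive_type]
    return value if low <= value <= high else None


if __name__ == "__main__":
    # 127 fits in a signed TINYINT; 200 is valid for TINYINT UNSIGNED but does not.
    print(read_as_hive("TINYINT UNSIGNED", 127))
    print(read_as_hive("TINYINT UNSIGNED", 200))
```

Under this assumption, only the upper half of each unsigned range is affected, so a check like this can flag columns whose exported values would not survive a read through the Hive type.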

| ORC type | Hive data type | Data type under MySQL tenant |
|---|---|---|
| BYTE | TINYINT | TINYINT |
| SHORT | SMALLINT | SMALLINT |
| INT | INT | INT |
| LONG | BIGINT | BIGINT |
| FLOAT | FLOAT | FLOAT |
| DOUBLE | DOUBLE | DOUBLE |
| DECIMAL | DECIMAL | DECIMAL |
| CHAR | CHAR | CHAR |
| VARCHAR | VARCHAR | VARCHAR |
| STRING | STRING | TINYTEXT/TEXT/MEDIUMTEXT/LONGTEXT |
| BINARY | BINARY | TINYBLOB/BLOB/MEDIUMBLOB/LONGBLOB/BINARY/VARBINARY |
| DATE | DATE | DATE |
| TIMESTAMP | TIMESTAMP | DATETIME/TIMESTAMP |

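
As a rough illustration, the ORC table above can be expressed as a lookup keyed by the MySQL-tenant type. The sketch below only restates the table rows; the names `ORC_TYPE_BY_MYSQL_TYPE` and `orc_type_for` are hypothetical and do not correspond to any OceanBase Database API.

```python
# MySQL-tenant type -> ORC type, restating the rows of the ORC table.
ORC_TYPE_BY_MYSQL_TYPE = {
    "TINYINT": "BYTE",
    "SMALLINT": "SHORT",
    "INT": "INT",
    "BIGINT": "LONG",
    "FLOAT": "FLOAT",
    "DOUBLE": "DOUBLE",
    "DECIMAL": "DECIMAL",
    "CHAR": "CHAR",
    "VARCHAR": "VARCHAR",
    "DATE": "DATE",
}

# Several MySQL types collapse into a single ORC type.
for text_type in ("TINYTEXT", "TEXT", "MEDIUMTEXT", "LONGTEXT"):
    ORC_TYPE_BY_MYSQL_TYPE[text_type] = "STRING"
for binary_type in ("TINYBLOB", "BLOB", "MEDIUMBLOB", "LONGBLOB",
                    "BINARY", "VARBINARY"):
    ORC_TYPE_BY_MYSQL_TYPE[binary_type] = "BINARY"
for time_type in ("DATETIME", "TIMESTAMP"):
    ORC_TYPE_BY_MYSQL_TYPE[time_type] = "TIMESTAMP"


def orc_type_for(mysql_type):
    """Look up the ORC type for a MySQL-tenant type name (case-insensitive)."""
    key = mysql_type.strip().upper()
    if key not in ORC_TYPE_BY_MYSQL_TYPE:
        raise ValueError(f"no ORC mapping listed for {mysql_type!r}")
    return ORC_TYPE_BY_MYSQL_TYPE[key]


if __name__ == "__main__":
    print(orc_type_for("datetime"))
    print(orc_type_for("VARBINARY"))
```

Note that the lookup is not reversible: DATETIME and TIMESTAMP both land on ORC TIMESTAMP, and BINARY and VARBINARY land on ORC BINARY, whereas the Parquet table maps them to CHAR and VARCHAR.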
References
SELECT INTO