Field design

2025-01-02 01:58:39  Updated

You need to consider the field data types and lengths during field design. This topic provides recommended data types for fields in several scenarios, to help you standardize field design.

MySQL mode

  • Numeric fields

    We recommend that you use the BIGINT data type, rather than integer data types such as INT and SMALLINT, to prevent the value range from being exceeded in the future.

  • Character fields

    • We recommend that you use the VARCHAR(N) data type for all dynamic strings.

    • Use CHAR(1) only for single-character fields. For yes-or-no fields, we recommend that you use the CHAR(1) data type to save space, with 1 indicating TRUE and 0 indicating FALSE. The field values must be consistent across all applications. For example, for the is_deleted field that specifies whether to perform logical deletion, 1 specifies to perform logical deletion, and 0 specifies not to.

    Notice

    NUMBER(1) can also express the yes-or-no concept but takes up more space.

    • Do not use NVARCHAR or NCLOB as a column data type.

    Notice

    A column of the character data type can store all alphanumeric values. However, a column of the NUMBER data type can store only numeric values.

  • Datetime fields

    • For fields requiring time accuracy, you can use the DATETIME(6) data type.

    • For fields without requiring time accuracy, you can simply use the DATETIME data type.

    • If a field may involve internationalization in the future, we recommend that you use the TIMESTAMP data type.

    • We recommend that you do not use characters as the data types of time fields, which may cause implicit type conversion.

  • Recommendation on data field selection

    We recommend that you use the following methods, especially for large tables with millions of rows:

    • The time fields of all tables in a business must be unified. We recommend that you use the DATE data type. For businesses with higher requirements on precision, you can use the TIMESTAMP data type.

    • We recommend that you use the NUMBER data type for numeric data storage to save storage space. Implement conversion at the frontend. Keep value ranges consistent with network segment data in size.

    • You can store IPv4 and IPv6 data in different fields based on service needs by using the VARCHAR(N) data type.

Oracle mode

  • Numeric fields

    • We recommend that you use the NUMBER data type for storage. When the NUMBER data type is used to store fixed-point numbers with a variable length and decimal precision, write the values in the format of NUMBER(p,s). When the NUMBER data type is used to store floating-point numbers, write the values in the format of NUMBER.

    • For decimal fields, we recommend that you use the DECIMAL data type and do not use the BINARY_FLOAT or BINARY_DOUBLE data type. If you use the BINARY_FLOAT and BINARY_DOUBLE data types, the issue of precision loss may occur in data storage, which may cause incorrect results in value comparison.

    • Priorities in implicit data type conversion

      BINARY_DOUBLE has the highest priority and is followed by BINARY_FLOAT, which precedes NUMBER.

    • Value range of a numeric field

      Type Value range Length (in bytes)
      NUMBER 1.0E-130F to 1.0E+126F (1.0E+126F excluded) 4 to 40
      BINARY_FLOAT 1.17549E-38F to 3.40282E+38F 4
      BINARY_DOUBLE 2.22507485850720E-308 to 1.79769313486231E+308 8
  • Character fields

    • We recommend that you use VARCHAR2.

    • OceanBase Database compares VARCHAR2 values without padding spaces and compares CHAR values with spaces padded.

    • Datetime fields

      • The TIMESTAMP WITH TIME ZONE and TIMESTAMP WITH LOCAL TIME ZONE data types store time zone information. Pay attention to the time zone difference.

      • We recommend that you do not use characters as the data types of time fields, which may cause implicit type conversion.

Others

  • You must use the BIGINT data type for auto-increment columns. It is prohibited to use the INT data type. This avoids storage overflow.

  • It is prohibited to update constraint definitions of table columns through foreign key self-referencing or cascade deletion. This avoids repeated deletion.

  • Avoid using the ENUM data type as far as possible, for example, enum('x','y','z'),. You can use string types instead.

  • We recommend that you update the field comment in time after you modify the definition of a field or append a status value to a field.

  • When you design fields, maintain some redundancy to improve performance. However, you also need to take into consideration the synchronization of data. A redundancy field cannot be:

    • A field subject to frequent modification

    • An excessively long field

  • During implicit type conversion, numeric types have a lower priority than time types but a higher priority than characters and all other data types.

  • An appropriate character storage length saves database tablespace and index storage space, and more importantly, improves the retrieval speed.

    Unsigned values avoid the storage of negative numbers by mistake and increase the expression range. We recommend that you use different data types for different value ranges. Here are some examples:

    Object Age range Type Expression range
    Humankind Within 150 years old Unsigned TINYINT Unsigned values: 0 to 255
    Turtle Hundreds of years old Unsigned SMALLINT Unsigned values: 0 to 65535
    Dinosaur fossil Tens of millions of years old Unsigned INT Unsigned values: 0 to 4.3 billion
    Sun 5 billion years old Unsigned BIGINT Unsigned values: 0 to 1e+19

Contact Us