Table structure design|V4.1.0| docs|Distributed Database

Table structure design

Last Updated：2023-07-28 02:55:43 Updated

This topic aims to help you standardize table structure design when you create tables in database operations.

Three paradigms

Before we look into the table structure design, let's first look at the concept of database and the three paradigms of database design. To build a database with smaller redundancy and a more reasonable structure, we must follow certain rules when designing a database. In a relational database, these rules are called paradigms. Here is a brief introduction to the three paradigms.

First paradigm (ensuring that each column remains atomic)

The first paradigm is the most basic paradigm. If all the field values in a database table are non-exploded atomic values, the database table satisfies the first paradigm.

Example: The fields in a student information table are defined as follows.

sno	sname	Contact information
01	Zhao Qian	1******1@qq.com
02	Sun Li	138****1234
03	Zhou Wu	135****1234

The contact information in the table includes both email addresses and mobile phone numbers, and therefore is not atomic, failing the first paradigm. We can split the contact information into email addresses and mobile phone numbers to keep each column atomic, thus satisfying the first paradigm. The table is updated as follows.

sno	sname	Email	Mobile Number
01	Zhao Qian	1******1@qq.com
02	Sun Li		138****1234
03	Zhou Wu		135****1234

The second paradigm is based on the first paradigm in a more advanced layer. The second paradigm needs to ensure that each column in a database table is related to the primary key, not just one part of the primary key (primarily for the federated primary key). In other words, in a database table, only one type of data can be saved in a table, and multiple types of data cannot be saved in the same database table.

Example: The fields in a student and teacher information table are defined as follows.

(sno	tno)~pk~	sname	tname
01	01	Zhao Qian	Zheng Wang
01	02	Zhao Qian	Feng Chen
02	01	Sun Li	Zheng Wang
02	03	Sun Li	Chu Wei
03	02	Zhou Wu	Feng Chen
03	03	Zhou Wu	Chu Wei

The primary key in this table is (sno,tno). Each field is an atomic value unsplittable, which meets the first paradigm. However, sname depends on sno, while tname depends on tno. Therefore, the table does not meet the second paradigm and needs to be split. The table is split into the following three tables.

Student table:

sno~pk~	sname
01	Zhao Qian
02	Sun Li
03	Zhou Wu

Teacher table:

tno~pk~	tname
01	Zheng Wang
02	Feng Chen
03	Chu Wei

Student-teacher relationship table:

id~pk~	sno~fk~	tno~fk~
1	01	01
2	01	02
3	02	01
4	02	03
5	03	02
6	03	03

The preceding three tables contain only atomic columns with no partial dependency, satisfying the second paradigm.

The third paradigm needs to ensure that each column of data in a data table is directly related to the primary key, not indirectly.

Example: The fields in a student and class table are defined as follows.

tno~pk~	tname	cno	CNAME
01	Zhao Qian	01	Tang Dynasty class
02	Sun Li	01	Tang Dynasty class
03	Zhou Wu	02	Zhou Dynasty class

All fields in the table are atomic values and have no partial dependency, which meets the first and second paradigms. However, cname depends on cno, while cno depends on tno. Therefore, the table does not meet the third paradigm. Split this table into two tables as follows:

Class table:

cno~pk~	CNAME
01	Tang Dynasty class
02	Zhou Dynasty class

Student-class relationship table:

tno~pk~	tname	cno~fk~
01	Zhao Qian	01
02	Sun Li	01
03	Zhou Wu	02

After splitting, the tables no longer have transitive dependencies, which satisfy the third paradigm.

Specifications for common table structure design

You should take into consideration the business performance in table structure design, rather than simply following the aforesaid three paradigms. Design data redundancy in storage to reduce table associations and improve business performance. A redundancy field cannot be:
- A field subject to frequent modification
- An overlong varchar field
Set a primary key when you create a table.
- We recommend that you use a business column rather than an auto-increment column as the primary key or federated primary key.
- Tables in OceanBase Database are index-organized tables (IOTs). If you do not specify a primary key for a table, the system automatically generates a hidden primary key for the table.

Notice

Do not add a primary key after the table is created.

gmt_create and gmt_modified are required fields of a table.

Note

Set the data type of gmt_create and gmt_modified to DATE (accurate to seconds) or TIMESTAMP WITH TIME ZONE (accurate to microseconds with current time zone information). You can use the SYSDATE or SYSTIMESTAMP function.

Tables and fields must have the COMMENT attribute.
We recommend that you set the NOT NULL attribute for all fields in a table and define the DEFAULT value as needed.
The same column in multiple tables must have the same column definition.
Fields to be joined must be of the same data type to avoid implicit conversion.
Complex data types, such as BLOB and JSON, are not recommended.
We recommend that you set the data type of a yes-or-no field to unsigned tinyint, where the value 1 indicates yes and the value 0 indicates no. For example, for the is_deleted field that indicates whether logical deletion is performed, 1 indicates deleted, and 0 indicates not deleted.
We recommend that you update the field comment in time after you modify the definition of a field or append a status value to a field.