This topic aims to help you standardize table structure design when you create tables in database operations.
Three paradigms
Before we look into the table structure design, let's first look at the concept of database and the three paradigms of database design. To build a database with smaller redundancy and a more reasonable structure, we must follow certain rules when designing a database. In a relational database, these rules are called paradigms. Here is a brief introduction to the three paradigms.
First paradigm (ensuring that each column remains atomic)
The first paradigm is the most basic paradigm. If all the field values in a database table are non-exploded atomic values, the database table satisfies the first paradigm.
For example, the fields in a student information table are defined as follows.
| sno | sname | Contact information |
|---|---|---|
| 01 | Alice | 1******1@gmail.com |
| 02 | Bob | +1-5**-1**-45** |
| 03 | David | +1-6**-2**-78** |
The contact information in the table includes both email addresses and mobile phone numbers, and therefore is not atomic, failing the first paradigm. We can split the contact information into email addresses and mobile phone numbers to keep each column atomic, thus satisfying the first paradigm. The table is updated as follows.
| sno | sname | Mobile Number | |
|---|---|---|---|
| 01 | Alice | 1******1@gmail.com | |
| 02 | Bob | +1-5**-1**-45** | |
| 03 | David | +1-6**-2**-78** |
Second paradigm (ensuring that each column in the table is related to the primary key)
The second paradigm is based on the first paradigm in a more advanced layer. The second paradigm needs to ensure that each column in a database table is related to the primary key, not just one part of the primary key (primarily for the federated primary key). In other words, in a database table, only one type of data can be saved in a table, and multiple types of data cannot be saved in the same database table.
For example, the fields in a student and teacher information table are defined as follows.
| (sno | tno)~pk~ | sname | tname |
|---|---|---|---|
| 01 | 01 | Alice | Daniel |
| 01 | 02 | Alice | Tom |
| 02 | 01 | Bob | Daniel |
| 02 | 03 | Bob | John |
| 03 | 02 | David | Tom |
| 03 | 03 | David | John |
The primary key in this table is (sno,tno). Each field is an atomic value unsplittable, which meets the first paradigm. However, sname depends on sno, while tname depends on tno. Therefore, the table does not meet the second paradigm and needs to be split. The table is split into the following three tables.
Student table:
| sno~pk~ | sname |
|---|---|
| 01 | Alice |
| 02 | Bob |
| 03 | David |
Teacher table:
| tno~pk~ | tname |
|---|---|
| 01 | Daniel |
| 02 | Tom |
| 03 | John |
Student-teacher relationship table:
| id~pk~ | sno~fk~ | tno~fk~ |
|---|---|---|
| 1 | 01 | 01 |
| 2 | 01 | 02 |
| 3 | 02 | 01 |
| 4 | 02 | 03 |
| 5 | 03 | 02 |
| 6 | 03 | 03 |
The preceding three tables contain only atomic columns with no partial dependency, satisfying the second paradigm.
Third paradigm (ensuring that each column is directly related to the primary key column)
The third paradigm needs to ensure that each column of data in a data table is directly related to the primary key, not indirectly.
For example, the fields in a student and class table are defined as follows.
| tno~pk~ | tname | cno | cname |
|---|---|---|---|
| 01 | Alice | 01 | Class 1 |
| 02 | Bob | 01 | Class 1 |
| 03 | David | 02 | Class 2 |
All fields in the table are atomic values and have no partial dependency, which meets the first and second paradigms. However, cname depends on cno, while cno depends on tno. Therefore, the table does not meet the third paradigm. Split this table into two tables as follows:
Class table:
| cno~pk~ | cname |
|---|---|
| 01 | Class 1 |
| 02 | Class 2 |
Student-class relationship table:
| tno~pk~ | tname | cno~fk~ |
|---|---|---|
| 01 | Alice | 01 |
| 02 | Bob | 01 |
| 03 | David | 02 |
After splitting, the tables no longer have transitive dependencies, which satisfy the third paradigm.
Specifications for common table structure design
You should take into consideration the business performance in table structure design, rather than simply following the aforesaid three paradigms. Design data redundancy in storage to reduce table associations and improve business performance. A redundancy field cannot be:
A field subject to frequent modification
An overlong varchar field
Set a primary key when you create a table.
We recommend that you use a business column rather than an auto-increment column as the primary key or federated primary key.
Tables in OceanBase Database are index-organized tables (
IOTs). If you do not specify a primary key for a table, the system automatically generates a hidden primary key for the table.
Notice
Do not add a primary key after the table is created.
The
gmt_createandgmt_modifiedfields are required for a table.Note
Set the data type of
gmt_createandgmt_modifiedtoDATE(accurate to seconds) orTIMESTAMP WITH TIME ZONE(accurate to microseconds with current time zone information). You can use theSYSDATEorSYSTIMESTAMPfunction.Tables and fields must have the
COMMENTattribute.We recommend that you set the
NOT NULLattribute for all fields in a table and define theDEFAULTvalue as needed.The same column in multiple tables must have the same column definition.
Fields to be joined must be of the same data type to avoid implicit conversion.
Complex data types, such as
BLOBandJSON, are not recommended.We recommend that you set the data type of a yes-or-no field to
unsigned tinyint, where the value 1 indicates yes and the value 0 indicates no. For example, for theis_deletedfield that indicates whether a record is logically deleted, 1 indicates deleted, and 0 indicates not deleted.We recommend that you update the field comment in time after you modify the definition of a field or append a status value to a field.