Design rules for tables¶
Before you start creating database tables, you should carefully consider what data should be included in them and how the relationships between them should look. You may also want to consider general principles such as:
- goal definition
- avoiding duplicate data
- preservation of information atomicity
- avoiding empty fields
- unique identification of records.
Goal definition¶
The structure of the designed database should be selected according to the specificity of a particular system. The system in which personal data is its important point will be modeled differently (preparation of separate tables for persons and addresses along with the relation), and the system in which personal data will perform a secondary function and not a significant point of the system will be visualized (a common table connecting data personal and addresses).
Avoiding duplicate data¶
It is absolutely necessary to avoid duplication of the same data in the following lines.
For the example above, some information, such as the manufacturer's name and the manufacturer's address, are repeated many times in the database. This is first a waste of memory, and in addition any mistake will result in a lack of data consistency.
In this case, it is much more efficient to break the table into two and enter a specific type of relationship, e.g..:
In the example above, the relation one to many was used.
Information atomicity¶
Each database field should contain single information (i.e. atomic). Atomicity depends on the specific system, but in short, it consists in minimizing the amount of information stored in a given field.
For the example above, the manufacturer's address contains a large amount of information, and e.g. searching for a specific product by address can be very difficult. In this case, depending on your real needs, you can enter an additional table and tie it relation with the table of Manufacturer.
Another important aspect is to multiply references to another table. For example, if the customer rents a lot of computer equipment, he can implement these assumptions using the table below:
In the example above, the ProductId
field contains information about many products. However, this form is incorrect due to the great difficulty of obtaining all necessary statistical data, e.g. extracting the number of times a given product has been loaned. The correct table structure should look like this:
Avoiding empty fields¶
Avoid leaving empty fields (not containing data) in the tables. However, in certain specific situations, empty fields may be unavoidable.
In the above case, 75% of the records have an empty content of the Comments
field. For the above case, it is definitely better to introduce an additional table that will be responsible for storing additional information and reducing empty fields.
Unique record identifiers¶
Each table record should be uniquely identified. Otherwise you will not be able to distinguish them. The primary key must be well identified. Sometimes its abstract form can be replaced with a real identifier, e.g. ISBN for books, number od ID card for people. Everything depends primarily on the context of the modeled system.