08 – Database Design

7.1 Introduction

We have looked at the individual aspects of each element of the DBMS. We looked at the concept of table, columns and rows. Then we looked at the concept of normalization. Now we will look at the process of Database design.

7.2 Designing the database

In order to have a healthy database you need to first design the database. In order to design the database you need to understand the customers requirement. Databases are not designed alone. They are part of a larger software solution. So the database and the software needs to work hand in hand.

The first step in understanding the requirements – as far as the database is concerned – is to understand the data flow. What are the pieces of information and how does it flow through he business. Let’s understand this though an example:

In automobile spare parts shop the customer will first as for the part by giving a description of the part, the sales person will check the inventory to see whether that part is available and get the unique part no, makes a requisition from stores. While that is happening he gets the customer’s name and address and the quantity he wants. This information is passed onto the cashier who then produces the invoice and gets the payment from the customer. The customer’s payment details such as credit card or cash etc. are entered by the cashier.

In this scenario the data flow is as follows

From whom

To Whom



Sales Person

Part description

Sales Person


Part No


Sales Person

Quantity Available


Sales Person

Name and Address

Sales Person


Customer Name and Address and quantity required



Payment info


So an analysis like the above will an idea as to what data is used in the system. And how the data flows from one department to the other.

An analysis like this also gives you an idea as to what entities to include in the system. From the above scenario we see:







As the knowledge of the system grows you will come up with more entities. You can take what is in the right most column of the above table as the fields in a table. After you have identified the tables you can put the related information in those tables. For example in the customer table:

     Customer (Name, Address, part bought, date, amount paid, qty bought, pmt mode)

This is the first step. This is not yet in a real table/relation form. We have just gathered every possible information about the entity customer in the same place. You also need to do the same for all the tables as there could be duplicate information – the same information can be a candidate for two tables – so you need to select the best table for that information.

Part (no, description, qty at hand)

Payment (customer, invoice no, date, amt, mode)

Invoice (part no, qty, date, customer name)

Inventory (part no, desc, qty at hand)

Cashier (name)

Now you need to normalize these tables and come up with the optimal table set.

Customer (No, Name, Address, date)

Part (no, description, qty at hand)

Payment (invoice no, date, amt, mode)

Invoice Hdr (Invoice no, date, customer no)

Invoice Line (Invoice no, Invoice Line No, part no, qty)

Inventory (part no, qty at hand, location)

Cashier (no, name, date, amt collected)


As the requirements and the knowledge of the system grows this could even be fine-tuned. For example there could be a requirement for the customers to come and return the goods they have bought. So the system needs to cater to returns. And if the parts are returned then the value of these goods must be returned. So the invoice tables will have to be modified as follows:

Invoice Line (Invoice no, Invoice Line No, part no, qty, date, customer no, qty bought, qty returned, date returned)

However if the return process is complicated – multiple returns are possible – i.e. he can return one good today and another one tomorrow then it may be better to keep a separate table.


Invoice Line (Invoice no, Invoice Line No, part no, qty)

Returns (return no, Invoice no, Invoice Line No, part no, date, qty returned, unit price, amt returned, date returned, reason code, comments)

Return reason (code, desc)

Payment (invoice no, date, amt, mode, amt returned, return no)


As you can see in the above set up a customer can return multiple times for different or same product. For example if he buys 4 tires today he can return one tire tomorrow he can return another. And the reason for the return can also be entered.

Why should the reason be kept in a separate table? For one thing it will save space. Rather than entering the same reason again and again in the returns table it is entered and kept in the return reason table. For another you can have some kind of control – each data entry operator cannot enter whatever she wants into the system – she has to select the standard reasons which has been approved by the management. As you can see this has to do with organizational controls and policies. So these two play a part in database design.

Putting the return no in the Invoice line table would create duplication – as for the same invoice line there could be several returns.

And for easy identification you may want to keep a flag in the header table to identify the invoices that has had a return.

Invoice Hdr (Invoice no, date, customer no, returns flag)

But there is one inefficiency in the above design. In order to find out how much qty was initially bought and how much has been returned you need to join the following tables:

Invoice Line


So to eliminate this inefficiency here is an alternate design:

Invoice Line (Invoice no, Invoice Line No, part no, qty, qty returned)

There is still one confusion – the qty is it the original quantity bought or the quantity after return? i.e. if the customer originally bought 4 tires and returned 1 tire how much would the qty field show? Will it show 3 or 4? If it shows the original quantity bought and there is a return you always need to remember to make the calculation as current quantity = qty – qty returned. If a new comer or a user takes only the qty field and creates a management report which shows the sales figures for a particular month and if there had been several returns that month the figures could be way off. So it is best to design the table as follows:

Invoice Line (Invoice no, Invoice Line No, part no, qty, qty returned, original qty)

The qty field will always have the current quantity i.e. the quantity after the returns have been subtracted. This way you don’t need to remember to subtract qty returned from qty. And just be looking at the invoice line table itself you can find out how much was initially bought and how much was returned.

The return process has to update several tables:



Invoice Hdr

Invoice Line

So several factors influence the design of the tables/database:



Ease and Efficiency

Like us on Facebook