Good practice for tables?

05me39

Registered User.
Local time
Today, 11:15
Joined
Feb 7, 2014
Messages
13
Hello,

I rrad somewhere that it is a good practice to make a new table whenever some entry gets repeated. for example if you have to write the department of employees then it is better to have a separate table having all the departments in it and relating it to the main employees table.

how good is that practice? does that not make a large number of tables in the DB?

comments are welcome.
 
The best practice is to normalize your data.
Here's a link to more info.
 
The process is called normalisation. It can be taken to the nth degree but for the situation you describe it would be good practice to have more than one table.

The number of tables in a db is irrelevant and should not be a limitation on what you are doing.

In your example, if you only have one table, so what happens if the department changes its name? - you will have multiple records to update. If the name is stored in a different table, linked by an ID, you only have to change it once - so less for the db to do.

Other considerations are volumes of data. If a department name is 25 characters long, then you will be storing it a number of times, rather than once, leading to your db being bigger than it needs to be.

From a performance perspective that is more data volume that needs to be brought through in queries resulting in slower performance - which will be noticable on large volumes of data, particularly if working across a slow network.

Suggest you google 'normalisation' and read up on the subject
 
Hi,

Thank you all for the comments. all of the above comments are really helpful.
 
Hello,

I rrad somewhere that it is a good practice to make a new table whenever some entry gets repeated. for example if you have to write the department of employees then it is better to have a separate table having all the departments in it and relating it to the main employees table.

how good is that practice? does that not make a large number of tables in the DB?

comments are welcome.

You shouldn't create a table just because some values are repeated on multiple rows. That would probably be a bad idea, or the very least an entirely unhelpful rule of thumb.

A sensible basis for helping you decide what tables to create are the principles of Boyce-Codd Normal Form and 5th Normal Form. Normal Forms are concerned with ensuring that dependencies are adequately enforced by the keys of a database table.

For the avoidance of doubt, normalization does not require that new tables should be created when values are repeated in your tables. In any case, it should be clear that removing an attribute with "repeating" values and replacing it with a foreign key reference to a new table does not reduce repetition of values at all - you have simply substituted one value for another, which certainly has nothing to do with normalization.
 
hi.

its been a wonderful discussion. everybody seems to be talking about the normalization. I dont have an academic background of software or database. I dont seem to grab the meanings (and I have been trying for a while now) of normalization. I have read the definetions but they dont clear my mind. I shall be very thankful if someone could tell me the lay man meaning of normalization.
no technical terms would be great.
 
No need for an academic background. I don't. It's just a common database term. Once you grasp it, it becomes obvious.

First step is to consider First Normal Form. This asks that your data be atomic. Basically means not to store multiple items in a cell. So rather than a record having a field with 'bananas, apples, grapes' you would have three separate records, one each for 'bananas', 'apples', and 'grapes'. Common sense, really, as you can count or otherwise work with things if they are not separate.

Second stop: 2nd Normal Form. Each field should be dependent on the key. This is where we start to think of tables as being individual entities in our database. So every field in the table must relate to the entity. You would not have a table for shops, for example, with a field called Products. This is because there may be a need for shops to stock many kinds of products. Therefore the Product field is not dependent on the shop entity. In fact, we would model it with a table for Products. We could have put in that Product field the value 'shoes, newspapers, and toys' but that would violate 1st Normal Form. Thus we see that for a table to be in 2NF, it must already be in 1NF.

The same then applies to 3NF, which requires that no field be dependent on other fields. In other words: don't store calculated values. The reasons for this are that a) these can be calculated when needed in queries, forms, and reports; and b) this example: should a field called Price = £2 and a field called Quantity = 5, another field called Total would be calculated (and stored) as £10. Change a Price or Quantity and you're going to have problems because the Total field won't change. Hence why you calculate when needed.
 
For the avoidance of doubt, normalization does not require that new tables should be created when values are repeated in your tables. In any case, it should be clear that removing an attribute with "repeating" values and replacing it with a foreign key reference to a new table does not reduce repetition of values at all - you have simply substituted one value for another, which certainly has nothing to do with normalization.

Exactly! Many developers misunderstand normalization. This is about optimisation.

There are advantages substituting repeating values with numbers.

1. The displayed value is only stored once. As such it can easily be changed by updating one record in the lookup table. It is extremely useful where the application needs to be internationalised.

2. Numeric values take up far less storage space than long strings and are faster to index.
 
I agree. An extreme ridiculous situation is a table of Personnel where there are many Smiths and Johns.
 
No need for an academic background. I don't. It's just a common database term. Once you grasp it, it becomes obvious.

First step is to consider First Normal Form. This asks that your data be atomic. Basically means not to store multiple items in a cell. So rather than a record having a field with 'bananas, apples, grapes' you would have three separate records, one each for 'bananas', 'apples', and 'grapes'. Common sense, really, as you can count or otherwise work with things if they are not separate.

Second stop: 2nd Normal Form. Each field should be dependent on the key. This is where we start to think of tables as being individual entities in our database. So every field in the table must relate to the entity. You would not have a table for shops, for example, with a field called Products. This is because there may be a need for shops to stock many kinds of products. Therefore the Product field is not dependent on the shop entity. In fact, we would model it with a table for Products. We could have put in that Product field the value 'shoes, newspapers, and toys' but that would violate 1st Normal Form. Thus we see that for a table to be in 2NF, it must already be in 1NF.

The same then applies to 3NF, which requires that no field be dependent on other fields. In other words: don't store calculated values. The reasons for this are that a) these can be calculated when needed in queries, forms, and reports; and b) this example: should a field called Price = £2 and a field called Quantity = 5, another field called Total would be calculated (and stored) as £10. Change a Price or Quantity and you're going to have problems because the Total field won't change. Hence why you calculate when needed.

Thank you! it is really helpful.
 

Users who are viewing this thread

Back
Top Bottom