Normalising a database

George Bowyer

Registered User.
Local time
Today, 20:50
Joined
May 17, 2004
Messages
50
Hi.

A few questions about normalising (and yes I did search first, but it's rather a general search term...)


1) If normalising is so important, why aren't Northwind and wizard-built dbs normalised?

2) Can anyone direct me towards a good idiot's guide to normalisation?

3) Can anyone point me toards a good example database that is normalised? What would be really useful would be one that contains both a normalised and non-normalised structure for comparison.

4) In my case, my amateur built db started life some 6 years ago and has grown into a huge amorphous blob that does pretty much exactly what I want it to, but is no doubt about as far from normalised as it would be possible to get (its essentially a name-and-address book, and my main table has 37 fields in it! Loads of duplicated infoprmation all over the place...).

(i) Is it possible to "retrospectively" normalise a db that has been going for years, or is normalisation something that has to be done from inception?
(ii) If it can be done, how would one go about it?
(iii) As my db is pretty small-beer: only dealing with a couple of thousand contact records for one small club; operating soleley on one computer; is it really worth the hassle?

Thanks

George
 
Great questions:)
1) If normalising is so important, why aren't Northwind and wizard-built dbs normalised?
We had this exact discussion at the recent Access Insiders conference. The concensus of the Insiders was that ALL examples presented by Microsoft should show proper technique and good practice since they are the first things that most people see and should set them on the correct path. I don't think we convinced the development team managers though. They are convinced that they need to make the examples as simple as possible so they will err on the side of poor practice if it makes the example easier. So, I'm afraid you will continue to see poor table schemas and poor object names unless we also convinced some of the developers who sat in on the conference sessions and they convince the team managers.

2) Can anyone direct me towards a good idiot's guide to normalisation?
I haven't seen anything I would consider an Idiot's guide, mostly because Access can't be mastered by Idiots. I attached a file with a bunch of bookmarks of sites that talk about design.
3) Can anyone point me toards a good example database that is normalised? What would be really useful would be one that contains both a normalised and non-normalised structure for comparison.
I attached the sample db that I used for a private class. I can't attach the PPT show because it has too much propritary stuff in it. If you look at the relationships window and the Final Schema group you will see the end result. The original table is tblPerson1 and each subsequent number shows what happens at each step toward normalization.
4) In my case, my amateur built db started life some 6 years ago and has grown into a huge amorphous blob that does pretty much exactly what I want it to, but is no doubt about as far from normalised as it would be possible to get (its essentially a name-and-address book, and my main table has 37 fields in it! Loads of duplicated infoprmation all over the place...).

(i) Is it possible to "retrospectively" normalise a db that has been going for years, or is normalisation something that has to be done from inception?
You can do it later but it can be tedious. It is far better to avoid the problem.
(ii) If it can be done, how would one go about it?
That depends on what's wrong with it. There is no product that can do it for you. Basically, you need to go back to ground zero and collect data items and let them settle into entities and attributes. Then once you see what the structure should be, you can work out a plan to get there.
(iii) As my db is pretty small-beer: only dealing with a couple of thousand contact records for one small club; operating soleley on one computer; is it really worth the hassle?
The number of records in a database is not really relevant. The relevant issue is maintenance. How much time does it take you to make modifications? Do you find yourself having to change objects on a frequent basis - especially in the same manner? Do you have to create unwieldy macros to automate tasks? Do you have to create multiple queries to get the same information for different columns?

I just noticed that the bookmark file didn't upload. No error. The site just won't upload an .htm file so I zipped it.
 

Attachments

Last edited:
Pat Hartman said:
Great questions:)
I'm afraid you will continue to see poor table schemas and poor object names unless we also convinced some of the developers who sat in on the conference sessions and they convince the team managers.

How hard, really, would it be for them to update Northwind databases with a hyperlink to their support site on the splash screen? On this they could explain theory in simple terms rather than each bad habits and practice by example. That designing a database document they have on the knowledge base is so hidden away that it would benefit being stuck in a .zip file with Northwind, and containing a detailed glossary.

Any word on those DoMenuItem wizards being updated? :D
 
George,

I know just how you feel! Our company db's are all built on samples supplied with Access 97. Over the years they have become really fragmented with dozens of meanless and repeated fields/information.

Trouble is we have approx 6 db's, all doing different things. I am now in the process of redesigning one db that can do it all.

Can anyone recommend a good book available in the UK to help me learn Access and VBA? (macro's just can't do the jobs i want them to!)

Pat, Is there any chance you can post a A97 version of the sample db? It would be most useful. Many thanks.
 
Thanks for the db. I shall have a peek at it this evening.

I definitely think that MS have got the wrong attitude. I started off with a virgin copy of Access 95, a teach-yourself-access book and Northwind. No mention of normalisation.

I then bought a VBA for beginners book, and then a more advanced VBA book. Still no mention of normalisation.

As a result, there is no hint of normalising in what I have done. Also, because I have evolved my knowledge of access without normalisation, my understanding of how to link tables and build things with linked tables is vestigial. Had Northwind and the basic books started me on normalisation from word dot, I would have learned that stuff from scratch. Pah! Curse your lack of foresight, MS!

As it is, I am having to learn that everything I thought was good is bad, and how to do things in a completely different way. Ick.


The relevant issue is maintenance. How much time does it take you to make modifications? Do you find yourself having to change objects on a frequent basis - especially in the same manner? Do you have to create unwieldy macros to automate tasks? Do you have to create multiple queries to get the same information for different columns?

Lordy, yes. I have 53 different queries in my db (a few of which are, admittedly, redundant) plus nearly a matching set of reports and over 40 different forms and sub-forms.

As I say, my db does what I want it to (and it causes amazement and deep envy amongst my luddite peers) but I am sadly beginning to learn that far from being the model of technological sophistication that I thought it was, it looks instead like it was designed by Heath-Robinson.


Do you have to create unwieldy macros to automate tasks?

Hah! Macros are the work of the devil and I shall never let them defile my db!

Or, put more realistically, none of my books covered macros and as a result I don't know how to do them. I have precisely 4 macros in my db, one of which I snarfed from Northwind and the other three simply turn off warnings, run a different delete query and turn the warnings back on. Computer God, or what!

I also haven't got round to reading an SQL book yet, and I therefore have absolutely no idea why or how queries do what they do. I don't inquire too closely, I am just grateful that it works. I think it has something to do with magic and the Dark Arts...

Fortunately, however, the little man who lives in my computer seems to be able to manage without my help on the query front, but no doubt I could make things a lot easier for him if I designed them properly; but as I do not know the correct spells and incantations - and virgin's blood is pretty hard to come by in these parts - I am a bit stuck.

On occasions, when I am getting really frustrated because I cant get a query to do what I want it to, I have considered sacrificing a goat in front of my computer, but I am told that this won't really help. :(
 
Mark-BES said:
Can anyone recommend a good book available in the UK to help me learn Access and VBA? (macro's just can't do the jobs i want them to!)

My bible was/is "Beginning Access 95 VBA Programming" by Smith and Sussman, published by Wrox Press (www.wrox.com). I thought it was excellent, and I think they have published updated versions.
 
I used Wrox Press' Beginning Access 97 VBA. Never did finish it. But it helped to let me see the similarities between VBA and the BASIC I'd used on my Amstrad and Amiga years before...
 
All:
Great thread for beginners like myself. Thanks for the info on the books. Will order one this week.

SJ McAbney:
Thank you for the A97 ver. If only MS could of provided such a db!
 
Leave it to me to be the maverick.

One of the questions relates to the worth of normalization.

Sometimes - and I strongly emphasize the SOME - it makes no sense to fully normalize a database. You make a "don't normalize" choice because you have such a complex situation that it is better to duplicate a small amount of data used for special sort requirements or special report fields. In particular, with the JET engine, it is rare for you to need to go all the way to 5th normal form. Usually, by the time you get to 3rd normal, you've done so much that what is left is RARELY (but not "never") a problem. In other words, as you learn how to do this kind of thing, it becomes a judgement call. The most common cases of refusing to fully normalize have to do with conflicting sort orders on separate reports based on the same table, when a query cannot easily perform the ordering (perhaps because of inconsistent policy within your company.)

Most of the time, normalization is a good idea, though. Normalization allows you to retain data on entities not currently active in your scheme of things because the entity data is in the table for that entity whether it is referenced or not.

Normalization allows you to reduce the size of your tables when the data being normalized is part of a one-to-many relationship (or many-to-one, if you had to turn it around). The savings occurs when you can eliminate the duplicates in tables you can split out to become LOOKUP tables. Also, parent/child tables where the parent can have many children are good sources. An example of LOOKUP tables: A table with state abbreviations and long names, where the prime record stores the abbreviations but sometimes you need the long name. An example of parent/child tables: A purchase order with many line items. The PO is the parent, the line items are the children.

In this case, the higher the cardinality of the ONE-side table, the more space you can save by normalizing it. This "cardinality" is best described as the inverse or opposite of uniqueness. Low cardinality means less savings when eliminating records from the ONE-side table. High cardinality means that you can remove lots of duplicated records from the non-normalized table.

Strictly speaking, normalization can be imposed after-the-fact by a tedious process. But sometimes it is just too darned difficult.

Mechanically, after-the-fact normalization involves an analysis phase to identify desired fields to split apart from the denormalized table. So you write a make-table query to pull out the fields to a new table. If your new table doesn't have anything else to provide it a good prime key, include an autonumber. Now make the table. Run a query to remove duplicates from the new table. The query wizard can help you here. (It doesn't matter that you delete some of the autonumbers when you do this, autonumbers are there for uniqueness only.)

Go back to your original table. Add a LONG field that will become the foreign key that points to a record in the new table. Now write an update query to find the fields in the new table that match specific records in the old table. Update the new FK field in the old table with the PK from the matching record in the new table.

OK, verify that ALL records in the original table now have a value for the FK pointing to the new table. If you successfully do this, you can now remove the fields from the old table because they are in the new table and you have pointers to them. Finally, you can establish the relationship between the old and new tables with the MANY side in the old table being the FK and the ONE side being the PK in the new table. (You cannot properly do this last step until all records have something in the FK field of the old table. Otherwise you get integrity violations.)

OK, you just partly normalized the old table. Go back and do the whole process again for the next possible grouping of fields you can extract. When you have run out of fields to be extracted this way, you are done. <Whew!>
 
George Bowyer said:
- and virgin's blood is pretty hard to come by in these parts - I am a bit stuck.
HaHa!
I used to live in Huntingdon, and if one of our party was too ugly or just too stupid to pull there, it was off to Peterborough! (We reserved Nottingham for extreme cases.)
 
Is there a reason where if you are storing unique values, such as, say, county names, that will always be unique and you will never need in more than one format, that you should not have a one-field table?
 
No. There is no reason. Over time the text values in even "unchangable" lists can change so typically I use the autonumber primary key and text field lookup value so that I never have to worrry about the problem.

I also find that I don't like the proliferation of "lookup" tables so that rather than creating a new table for each separate attribute, I have a standard lookup table along with associated forms and reports that I add to every database. This lookup table is used only for simple lookup data. If there is anything unusual or complex, I create a unique table. My common lookup table looks like:

tblLookup
ItemID (autonumber primary key)
TableID (foreign key to ItemID in this table) - this is a self referencing relationship and allows multiple lookup "tables" to be stored within a single physical table
ShortName
LongName
ActiveFlg
LastUpdateBy
LastUpdateDt

The ShortName and LongName gives me the flexibility of spelling out things where I have the room and using abbreviations where I don't. Similar in concept to CT and Connecticut.
 

Users who are viewing this thread

Back
Top Bottom