Normalization Denormalization of tables. Keep it denormalized when it is possible.

Mike375 said:
As time moves along x% of names go to the reject table and y% stay in the table mentioned in this thread but become a client/policy holder, although that can be a process that takes several months. Thus many of the names spend a fair bit of time at the "half way" point.

Mike

Is this a DATA WAREHOUSE ! :eek:
 
Rich said:
err no, it's a Data Whorehouse :rolleyes:


Similar smart ar**d comment that I cannot think of at the moment but sums up the thread subject

L
 
Mike375 said:
That is why Kevin gave up because he knows his talk was all hobby bullshit.
Mike

Mike -

First of all - your one hell of a person to start calling people out when this WAS NOT EVEN A THREAD YOU STARTED. I gave up because YOU CONTINUE TO POST TO THIS FORUM AND RAILROAD OTHER PEOPLE'S THREADS!!!

After I "gave up'" I reposted a scaled down sample for monkeytunes to get an idea on the relationships for members and contracts... this had nothing to do with you but, as always, you start posting in someones thread, tell all the experts here their methods dont work, at least not in your cases, then continuing to argue for days and days about how your right and their wrong...

As for the "talk is hobby bullshit" speak - I guess if building relational database systems in SQL Server and DB2 with front ends ranging for Access 97-2002, VB6, VB.NET, and ASP for the past 6 years as a full time job and having an 2 degrees in the field make me a "hobbyist" then I guess your right... especially someone with your "qualifications"... I have tried REPEATEDLY to help you in your many posts but you are so far removed you do not see the forest through the trees.... Good Day and Good Luck as I will not be responding to your posts anymore...

Furthermore - some of us have jobs that require us to work during the day and we jump on this forum to try and help others when we have a free minute or two to try and return the help to others as we have received ourselves so just because I don't spam the board with posts like yourself does not mean I'm "not a real Access Person"

Heres a tip: Go to the Start menu -- click Programs -- Click Microsoft Office -- OPEN EXCELL AND HAVE A BALL!!!

Good Luck,
Kevin
 
Len Boorman said:
Similar smart ar**d comment that I cannot think of at the moment but sums up the thread subject

L
the thread subject's fine, it's the idiot that keeps telling everyone else that they're wrong, who defiled the thread :mad:
 
In a desperate effort to get this thread back on track (remember when it was a philosophical discussion of normalization? Way back when? Somewhere around post 50 and before?), I want to thank some people, and then I'm out of this thread, because this discussion has made for some strange bedfellows, and if I don't back out the door, before I know it I'll be trying to wrangle a herd of 300 macros.

Pat, your comments here opened my eyes as wide as dinner plates. Believe it or not, I HAD considered what would/will happen if my sups ask for, say, all components sent in a particular month. "Holy smokes" I thought, "I'll have to write 25 queries per month!" I had no idea, however, to the extent of just how out control it would all get. Thanks for the (extreme) wake up call. Come to think of it, every time you answer one of my silly posts, it's a wake up call. Who else around here has that feeling?

Ken, it will be a couple of days before I can try out your suggestions, but I've bookmarked it and it makes sense. Kevin_S, your comments and pointers are appreciated as well. I'm not looking forward to putting it into action (users get spooked when you tell them "the database will be down for a little while" and/or "there are going to be some changes"), but I'm looking forward even less to any ramifications that would crop up from keeping the databases this way. Thanks to you.



And Len Boorman:
To Normalise when designing a Relational Database Application is not a discussion point.

It is a fact.

No offense taken, but sometimes us gunslingers have to do unpretty things. Sometimes we have to rob trains, rustle cattle, build insurance databases with hundreds of 20 records tables and 300+ macros, and sometimes our supervisors at our REEELLY BEEEG companies tell us "You WILL make it look like THIS" and the only way to do it is "flatten" a table, and then we do the unpretty things we have to do. THAT, friend, is a fact. The trick, I'm finding out a little more every year (and I'm young yet), is knowing just what the hell you're getting into.

Ostracize me if you must, but I have a confession - I build at least two unnormalized databases EVERY WEEK. My excuse: the vice-president gets spreadsheets, needs to see "the big picture" quickly, like, within an hour, and so I just import the spreadsheets to seperate tables, relate common fields, do some spot maintenance, run a query or three, use some template forms I have laying around, and send it over to him for perusal/analysis. On one screen, he can see a project and who's involved, or one person and all their projects, or totals and everything else that you can't get from spreadsheets. I know the database won't be used for more than a couple days, at most. It is an unpretty thing I have to do, it bugs me knowing that I'm breaking rules (not to the point of losing sleep, but it bugs me), but in the interests of time and knowing that we'll be getting a brand new set of spreadsheets in a day or two that I'll have to reimport, taking the time to normalize just doesn't take precedence.

The databases I maintain that are in general continued use, however, I want done right. So, thanks everyone for the help. (And with that, I'll leave Mile, Rich and The Stoat to their haranguing of Mike.)

(And, who else thinks Aleb deserves some dirty looks for airing this dirty laundry in the first place?)
 
Last edited:
Cool. I'd like to see how it turns out...

Remember this invaluable golden rule when doing db: Make sure your "Tables have just the right number of attributes. (Not too many, not too few)"!

ken
 
monkeytunes said:
No offense taken, but sometimes us gunslingers have to do unpretty things. Sometimes we have to rob trains, rustle cattle, build insurance databases with hundreds of 20 records tables and 300+ macros, and sometimes our supervisors at our REEELLY BEEEG companies tell us "You WILL make it look like THIS" and the only way to do it is "flatten" a table, and then we do the unpretty things we have to do. THAT, friend, is a fact. The trick, I'm finding out a little more every year (and I'm young yet), is knowing just what the hell you're getting into.

Even us old codgers get to do a bit of rustling when he who pays the wages say "It will be done this way"

So no problems there.

But given the choice I think you and most would prefer to produce the best possible solution on the basis that we are ALL still learning and therefore what we produce will not be perfect.

Or is there somebody out there who knows it all ?. About Access that is. :cool: Nobody KNOWS what Rich looks like

:D :D :D
 
My suggested method still takes this into account. As the SpouseID field of the extension table would relate back to the primary key of the people table you can therefore pull over all her details and his details in a query.

From what you are saying you are duplicating data with respect to the spouse's date of birth (if both partners are covered) where you have the date of birth in the person's own record and the field of their spouse.


Mile,

Yes, that is correct. It would be like saying that the insured person's "card" has 100% of their details and 5% of the spouses details. In practice the amount of business done on the spouse is quite small and in probably 99% of cases the business is written at a later date.

If a policy is later written on the spouse then a macro makes another record and copies her details to the new record and then the bits relevant to her are added. The reason we add some basics such as date of birth of the spouse in the first place is for when we are talking to the insured on the phone. I have a rate calculator made which runs off the data on the form record that is open and two labels start it. One of the labels is geared for female rates and the spouse date of birth.

If the spouse has policy benefits then we also duplicate them with the second set being given the same ID as the husband. This is so her policy details will be seen in a list with his on a subform. In 99.9% of cases the husband owns the policies benefits that are written on his wife's life. Also, if the spouse has policy benefits written on her then although a "card" exists for her it is a secondary record and in the very large majority of cases it is excluded from mail lists etc.

Mike
 
Hello all, yes it's me - incendiary ;)
I have been watching the thread for the last few days without actually posting anything in it and I guess everyone will agree to the conclusions that :
1. Relational dbases are to be designed with the normalization rules as a prerequisite.
2. What level of normalization you will achieve depends on what is "enough" for developer.
3. Business needs/boss's vision and etc may overrule some or even all of the normalization rules, and access becomes nothing else but "automated spreadsheet" with some good features on top. Here I absolutely agree with Monkeytunes and his last post.
4. Access is just a tool which can make life easier. It is not worse of offending others for way they use access ... the other way can be suggested, reviewed, accepted or rejected.


So :):):) I am going to change the thread's name to "keep it normalized when it is possible"
Thank you all who posted their opinions.
 
Pat Hartman said:
That missing piece is Role. This identifies how this address is used in its relationship with this person.

Can't you just rely on Pat.

Points out the failings precisely

Len
 
I’ve just spent half an hour reading this thread, because in another thread Rich was rude about another member. Someone challenged him on his rudeness, and he replied that the member he was being rude about was basically a disruptive influence, who wouldn’t listen to, or take on board any comments from other members.

Seeing this, I decided I’d better look through the offending person’s posts and makeup my own mind, that’s why I’m here reading this thread.

I now have a mental picture of this person (Name removed), he is sat at his desk, his boss comes along and says hi ****, I’ve got a bit of a problem with the data. I was wondering if you could produce a report for me with X. Y. Z.

Ah, well says ****, the thing is boss I can’t really do that report for you because you can’t do this sort of thing in Access, it’s not the tool for the job, you really need an SQL database, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah, Blah.

After about ten minutes his boss walks off, totally confused, and absolutely convinced that the only way he’s going to get his X. Y. Z. report is by spending thousands of pounds on new software. Obviously he hasn’t got the money available so **** is quite safe, he can carry on with his hobby of Access, without any need to learn anything new just operating it like a glorified macro driven spreadsheet.

Now, if you think I’m being rude, don’t forget I’ve just wasted half an hour reading through all this twaddle, just because one person won’t listen.
 
He's his own boss by the way and has argued with just about every one of the long term consistent posters on this forum. He's not the slightest interest in changing his db because he's flogging it off to some poor unsuspecting unfortunate in a couple of years anyway.
 
Uncle Gizmo

You appear to be the type of person that needs the gov't to protect you from yourself.

Mike
 
I posted early in this free-for-all and then got tapped to actually do some work here at the office. So I've been standing back a bit.

OK, here's the way I see it, and this is the old perfesser puttin' on his flat-head cap with the mildewed gold tassel...

To do this right REGARDLESS of the final form your design takes, you DO NOT START WITH THE DESIGN. Analyze several things. First, what data do you need? Write it down. Second, what kind of reports will you / your boss want to see? Write it down. Third, what kind of form will you / your data entry clerks want to see? Write it down. Fourth, what will your adjunct departments want to see? (By now, you know the drill.) Fifth, are there any laws governing what you keep and how your records must look based on these laws (usually related either to fair reporting or fiduciary responsibility issues)? Sixth, are there layout issues where you can take advantage of overlap, even at a low layer? Seventh, are there data elements that are truly optional at some predictable level of presence/absence? Eighth, are there issues in your published company policy that have any influence on how things are stored/visible? Ninth, does your company have a written style guide for offical company products / publications?

OK, you've looked at requirements. Now go back to your list. Each possible way of laying out your data for each different data user (or provider) might indicate different internal layouts for efficiency. But remember, Access imposes record-size limits. It also imposes other sizing limits. So now look at whether any record you want to build, whether in a table or part of a query, would exceed those limits.

The whole point of all the analysis is to see how you are going to use the data. This analysis tells you whether any of your requirements (and you surely will have more than one) indicate normalization.

OK, you've got several proposed layouts, perhaps because of forms and reports, perhaps also because of legal requirements, perhaps also because of company policy. EVENTUALLY, you'll reduce this to a set of tables (rarely one - but not NEVER - just hardly ever.... do I hear a chorus from H.M.S. Pinafore in the background???) The tables will be based on one or two central items that are at the heart and soul of your application. The other items will depend on details in the primary tables.

Now, you lay out your tables. You might have THOUGHT that it should all have been flat - but what about cases where a person has no dependents 'cause that person ain't married? So what you do is create SPARSE 1-to-1 tables for cases where something doesn't always exist. Then, you store the extended but optional data in the SPARSE 1-TO-1 table linked to the master table using PK/FK relationships.

Another thought: If you have information to be kept private while other parts are more "public" (perhaps a privacy law issue), split the tables in a 1-TO-1 manner and make the private data table have different security settings from the public table. Then if someone tries to get to the private data they can't, even though they can see the public data.

Now, the argument about "the boss wants it to look this way..." or "I can't place my fields on the form if I do it this way..." or other hoopla. That's all a cop-out for an infertile imagination, combined with ignorance of the REAL value of OUTER JOIN queries and/or INNER JOIN queries.

Guys-n-Gals, if you want to export a spreadsheet for the boss, build a query that outer-joins to the sparse table. Use the NZ function to "cleanse" the nulls that occur after an outer join. You can export queries - or directly display them - JUST LIKE TABLES. AND they look JUST LIKE SPREADSHEETS when you do!

You want to move fields around on a form? Build the form from a query. You can move the fields around JUST LIKE TABLE FIELDS if you wish. But the underlying tables can be very trim and compact. You can even do things with JOIN in the underlying tables. It really works! (Here, NZ would not always work, so this might be trickier. The form might have some extra data massaging to do...)

A QUERY is the way to give the appearance of denormalization without actually violating any rules. And who gives a tinker's DAM whether the boss wants a spreadsheet? (Yes, that's correct, not "DAMN" - look up the origin of the phrase.) In my personal experience, there is a learning curve in which the boss is happy - until the first day someone says, "But boss, we need to look at this a different way... can we sort it by X with breaks for every change in Y?" And on that day, a denormalized layout gets in the way of that different view of things.

Now, the business about Address1, Address2, Address3 etc. - as Pat pointed out - is NOT a repeating group 'cause these are actually components of a larger address field. Subdivision is not an excuse for separation. The fact that postal codes, city/state/county,etc can often be used for other purposes (routing or identification of local field offices or something similar) argues for field separation, yet the separated fields still depend on the same prime key. And you would not split out part of the address when the whole address is being used for its original intent.

Perhaps, to some viewpoints, this is a case in point for a LIMITED denormalization - where instead of having a separate table of address components, you fold all the components back into the table - but you would still keep them separate. And as it is a matter of interpretation, I leave that determination to the reader.

Finally, the whole thing, whatever you do, has to make sense in the context of your business. If your db doesn't reflect your business model to at least a fairly close match, you are engaging in hobbyist programming on company time. Oh well, I'll relent a little bit. That could also be exploratory research on design alternatives, maybe... But if your bad design lasts longer than necessary to realize that the model and the business have diverged, you are stealing time and money from your boss. If you are your own boss, fire yourself. If not, start reading the Help Wanted section fast.

If that sounds harsh, so be it. But that is the benefit of wisdom gained through almost 29 years of commercial and government programming.
 
Well put Doc... great points on all accounts...

thanks for the post!
Kev
 
The_Doc_Man

I meet some but of course not all your criteria. The main part I meet is it was laid out on paper first. In fact I basically put on the computer what we use to do with cards in boxes. That probably carries through to today with fields such as spouse etc being blank when they are not married.

Mike
 
In fact I basically put on the computer what we use to do with cards in boxes

A good first step.

Now, the next step is to look for sparse-field possibilities - a case in point being spousal data for an unmarried person. Split the table to separate the "always populated" from the "sometimes populated" and save some space. A JOIN query can be a recordsource just as easily as a table, so you lose nothing - except a shorter table that, when you scan it, scans quicker as long as you don't need to sparse data to be included. See, the shorter the record, the more you fit in a working buffer. The more you fit in a working buffer, the more things you can scan in a single disk read, even when doing a non-indexed search. And there is your performance pay-off, the reason why you look at this type of normalization.

Before you say ... "But I always need that data!" - no, you don't. You only need that data when reporting on spousal issues, searching for persons having spouses, or viewing detailed records. But if you are sorting based on postal codes, you do that sort in one query (thus faster 'cause shorter records) and join the spousal data to the query, not the original table.

THIS is what normalization gives you - the ability to isolate on what is important at the moment; the ability to do as Julius Caesar did so long ago - divide and conquer. Paraphrasing, "Omna data in multa parta divisum est."

If I've forgotten the proper declension in a couple of cases, forgive me - my Latin is high-school variety and that's been ... well, a few years, let's say.
 
The_Doc_Man

I could divide this data base into two broad categories, one being related to clients and the other being related to prospects and telemarketing and it is the latter that is most critical for displays. You need to see as much as you can as quickly as you can and the form layout is very critical to results. This section of the data base is also the largest part of it by far.

Currently I have a table that holds clients, prospects and personal, that is, personal could be friends, my accountant, contacts in insurance companies and so on. Each record type has some entries which allows separartion and determines which form they display on.

When a cold call is made there are 14 possible outcomes such as wrong number, not available, make appointment etc. In the main table there is a field for each of these outcomes and the entry made in the field is the "attempt number" So if we phoned you and on the first attempt we got a recorded message then a 1 would go into the field. If the second attempt was also a recorded message then the 1 would be replaced with a 2. If the third attempt was "not available" then a 3 would go in that field.

In another table a new record is created for each call attempt and it is from this table that all our call statictics and so on come from.

The form from which the calling is done is based on a queryon the main table that selects the category of prospect to be called and it also deselects records based on previous call results or other data that might have been entered. For example, it delsects any records where a call attempt has been made in the last 5 days. Or a doctor's secretary might say "only call on Wednesdays and Fridays and only between 3pm and 4.40 pm. Thus is record will only appear on those days and between those times. All entries are being made on the persons record in the main table.

The form layout is basically the name, address, phone etc acros the top of the form. There are 16 fields in 4 colums of 4.

Below these fields are a set of labels all of which have a macro attached. There are 7 sets of them with 3 in each column. These do things such a open the diary, file notes etc as well as opening other stuff for statistal data and so on.

Dead centre in the form are the outcomes. There are 9 to the left of centre for outcomes where we did not actually speak to the prospect himslef and there are 5 to the right of centre for the possible outcomes after speaking to the prospect. This set up so that if there have been 5 attempts and no contact made then the record no longer appears and is appended to another "holding table"

On the far left of the form and level with the outcomes are the dates/times of the first 5 attempts and to the far right of the form are the 4 drop down lists to select days and time such as only call on Wednesdays between 3pm and 4 pm.

The lower section of the form shows various call results and it updates with each call. This data comes from the table that holds a record for each call and also the table that holds the various objectives. It is not a subform but unbound text boxes and a form opens when "next prospect" is clicked and SetValue macro fills the unbound text boxes.

So that is basically what I have. The main objection I have to subforms is the restrictions on field positioning on the form and also I have a lot of Visible/Invisible that runs from an OnCurrect macro. As an example, some of the fields that appear if the prospect is a medical specialist are different to a GP, solicitor and so on and these are mainly the fields that appear at the top of the form.

Mike
 

Users who are viewing this thread

Back
Top Bottom