Help making Normalization seem normal?
I'm really not trying to ask noob questions, I promise I'm trying, but there are just some things I can't get my head around.
I've finally got an opportunity to make a database the way I want. The boss who demanded messy database structure is gone, the new bosses are sort of looking to me to clean house, and I actually have some time to learn now.
Our company hosts 4 different kinds of meetings. Large Forums, Regional, Beta, and In-House. We host surgeons, sales reps, and in-house personnel at various locations.
I've currently got a compiled Master table. It is rife with duplication. If doctor A went to meeting 1 and meeting 3 and meeting 4, then doctor A is in that database 3 times, often with a different address, or different rep information (we list which rep services a given doctor).
So I'm trying to normalize all of this. I figure the tables I'll need (and the fields, not including the ID field) are:
People (name, degree, address, phone, email)
Events (city, state, start date, end date)
Properties (i.e. hotels. name, city, state, address, maybe phone)
Reps (name, rank, address, phone, email)
Territory (for the Reps - Territory number, territory name, organization name)
Hospitals (name, address, city, state, phone - this is for affiliations, not for actual practice)
I think I got all that nailed down.
People to Events would be many to many, so I'd need a junction table there.
People to Hospitals would be many to many (a doc may be affiliated with several hospitals, and of course a hospital can have billions of docs)
Properties to Events would be many to many (in case we needed to use more than one property for an event, which we've never done, but I'd rather have it and not need it) so another junction table.
People to Reps would be one to many (a doctor may be covered by more than one rep)
Reps to territory would be one to one.
I'm pretty sure I got all that down. But I start running into questions.
For People, some of the doctors work at (not just 'are affiliated with') a hospital, some just out of an address. If I understand Normal forms, I can't be 1NF if I have a blank practice name, but I can't see that slapping 'Private Practice' is going to do me any sort of good, especially when it comes to merging documents. Also, it's impossible to be sure the information we're getting is not just a home address (despite our asking for a practice address). I start thinking I should change the last table from 'Hospitals' to just 'Addresses', but I'm still left with the null field possibility.
And I have a possibility of duplication. A rep may cover a given doctor, AND may be attending a meeting. So I could conceivably have a rep listed under People for having attended a meeting, and under Reps for simply being a rep (and in that case, I'd have nothing to link to under reps, unless I link it to themselves. Would that look hokey?).
I'm still weeding out thousands of duplicates to create my 'Peoples' table, but trying to smooth the way for the rest of the database development. I'm learning lots, but I can't find anything to explain the above. Am I on the right track? Do I need to structure this differently?
I'm really not trying to ask noob questions, I promise I'm trying, but there are just some things I can't get my head around.
I've finally got an opportunity to make a database the way I want. The boss who demanded messy database structure is gone, the new bosses are sort of looking to me to clean house, and I actually have some time to learn now.
Our company hosts 4 different kinds of meetings. Large Forums, Regional, Beta, and In-House. We host surgeons, sales reps, and in-house personnel at various locations.
I've currently got a compiled Master table. It is rife with duplication. If doctor A went to meeting 1 and meeting 3 and meeting 4, then doctor A is in that database 3 times, often with a different address, or different rep information (we list which rep services a given doctor).
So I'm trying to normalize all of this. I figure the tables I'll need (and the fields, not including the ID field) are:
People (name, degree, address, phone, email)
Events (city, state, start date, end date)
Properties (i.e. hotels. name, city, state, address, maybe phone)
Reps (name, rank, address, phone, email)
Territory (for the Reps - Territory number, territory name, organization name)
Hospitals (name, address, city, state, phone - this is for affiliations, not for actual practice)
I think I got all that nailed down.
People to Events would be many to many, so I'd need a junction table there.
People to Hospitals would be many to many (a doc may be affiliated with several hospitals, and of course a hospital can have billions of docs)
Properties to Events would be many to many (in case we needed to use more than one property for an event, which we've never done, but I'd rather have it and not need it) so another junction table.
People to Reps would be one to many (a doctor may be covered by more than one rep)
Reps to territory would be one to one.
I'm pretty sure I got all that down. But I start running into questions.
For People, some of the doctors work at (not just 'are affiliated with') a hospital, some just out of an address. If I understand Normal forms, I can't be 1NF if I have a blank practice name, but I can't see that slapping 'Private Practice' is going to do me any sort of good, especially when it comes to merging documents. Also, it's impossible to be sure the information we're getting is not just a home address (despite our asking for a practice address). I start thinking I should change the last table from 'Hospitals' to just 'Addresses', but I'm still left with the null field possibility.
And I have a possibility of duplication. A rep may cover a given doctor, AND may be attending a meeting. So I could conceivably have a rep listed under People for having attended a meeting, and under Reps for simply being a rep (and in that case, I'd have nothing to link to under reps, unless I link it to themselves. Would that look hokey?).
I'm still weeding out thousands of duplicates to create my 'Peoples' table, but trying to smooth the way for the rest of the database development. I'm learning lots, but I can't find anything to explain the above. Am I on the right track? Do I need to structure this differently?
Last edited: