Am I normalizing too far?

cnstarz

Registered User.
Local time
Today, 06:09
Joined
Mar 7, 2013
Messages
89
So I was reading Roger's "Entity-Relationship Diagramming" document (http://www.rogersaccesslibrary.com/Tutorials/Entity_Relationship.zip), and he mentions that with this method, the fields of the record should depend on the primary key. For example: first name, last name, address, phone number, birthdate... all of those should change when the primary key changes since (theoretically) those are very unique attributes for those records and aren't shared between others.

My database is required to keep track of Missions. These are the attributes for each mission:

  • Mission Day (format: YYAA[ie: 14AA, 14AB, 14AC, 14AD, so on for each day)
  • Date (format: mm/dd/yyyy)
  • Scheduled Start Time
  • Actual Start Time
  • Scheduled End Time
  • Actual End Time
  • Mission # (the # mission for that day. There could be 20 missions in a day, so the mission numbers will range from 1-20)
  • Mission Type (Each mission is a certain type)
  • Unit (who's tasked to accomplish the mission)
  • Priority
  • Location (where mission is being performed)
  • ISO ("In Support Of"... mission could be in support of a bigger objective)
  • Terrain (On *what* the mission is being performed... Examples are Domain Controllers, DMZs, Public Facing Webservers, DNSs, Routers, etc)
  • IP Range (IP range of the Terrain)
  • Comments (anything pertaining to the mission that at this point in time isn't common enough to make a standard field)
  • T-30 Minutes (Yes/No: Units are to notify us 30 minutes before start of the mission that they are ready/preparing to start)
  • Status (current status [calculated]: has it completed, was it cancelled, etc
Right now, we're keeping all that data in one table but now I'm wondering if I should break all that up further. If we're splitting up information based on what attributes can be shared among records, then that would mean shifting from this...


uZsXXEv.png



to this...


hWzRpY7.png


Multiple missions can be performed by the same Unit; multiple missions can take place at the same location, multiple missions can be ISO the same objective; multiple missions can be performed on the same terrain; multiple missions can occur on the same Mission Day; multiple missions can be the same type. All these attributes are shared between missions and therefore not particularly unique to each mission.

I'd like your input and thoughts on changing the design to this. Would the effort be worth it? Would there be any performance differences? Is this overkill or even necessary at all? To give you an idea the size of our database right now, we're approaching record number 1000 in our current mission table, and there ranges from 10-30 missions per day.
 
What exactly is a "mission"? As per the link to RogersAccessLibrary, do you have a clear statement of your "business" in plain English?

What do
Unit_ID (Fk)
Location_ID (FK)
ISO_ID (FK)
Terrain_ID (FK)
Day_ID(YYAA) (FK) and
TYPE_ID (FK) represent? Can you give us some examples?

Have you created a data model?
Do you have some test data? If so, use your test data to vet you model (tables, fields and relationships).

Can a Mission take place in more than 1 Location?
Can a Mission be in support of multiple Objectives?
Can a Mission occur on more than 1 Terrain?

I recommend you adopt a naming convention that does NOT allow spaces and/or special characters in field and object names. Following this advice will save you many syntax errors. Limit your names to alphanumeric characters and the underscore "_".
 
Last edited:
*cracks knuckles and stretches fingers...*

What exactly is a "mission"? As per the link to RogersAccessLibrary, do you have a clear statement of your "business" in plain English?

Yep, we need to task and track missions that occur every day. Each day has a name that follows this naming convention: YYAA. So all 2014 days start with 14, and each day has an alpha-alpha representation. Today is 14TO, tomorrow will be 14TP, the next day will be 14TQ, the day after that will be 14TR, and so on and so forth through the alphabet. Each mission is comprised of a Squdron/Company (aka Unit) performing a specific task (Mission Type), during a specific window of time (Scheduled/Actual Start/End times), on a specific set of networks/devices (aka Terrain) that fall within a specific IP Range, at a specific Air Force base (location). Each mission may or may not be in support of of a bigger operation or objective. We need our ops controllers to be able to easily identify each mission's status (is it complete or cancelled) throughout the day and be able to annotate any pertinent information (comments) so that our other internal divisions can assess the mission info later and make any adjustments for future missions that may be similar.

What do
Unit_ID (Fk)
Location_ID (FK)
ISO_ID (FK)
Terrain_ID (FK)
Day_ID(YYAA) (FK) and
TYPE_ID (FK) represent? Can you give us some examples?

Some quick history... this database wasn't created by me, and was quickly thrown together because our leadership tasked to immediately start keeping track of these missions. When I came across this database, almost everything was kept in one table (like an excel spreadsheet) and they had no forms whatsoever. O_O Having some prior Access experience, I normalized their database, for the most part and created forms for them to make their lives much easier.

SO! I already have separate tables for Units, Locations, ISO, Terrain, Day_ID, and Mission Types so that the forms can have dropdown boxes for those items. I didn't want them being able to type in a Unit/Squadron name... I wanted them to choose it to eliminate errors. Each Field_ID listed above is the primary key of it's respective table, most of them are auto-numbers except for Day_ID. Day_ID consists of the alpha-numeric representation of each day (14TO, 14TP, 14TQ, 14TR, 14TS, etc etc). These tables that I posted earlier...

hWzRpY7.png


... are potential new junction tables to link each mission to the location it's performed at, the unit it's performed by, the type of mission it is, etc etc.

Have you created a data model?
Do you have some test data? If so, use your test data to vet you model (tables, fields and relationships).

I don't have a data model other than using the relationships view in access to establish relationships.

Right now, this database is already in operation. We were given a task by leadership to immediately start doing something (:banghead:) so we had to quickly pull a turd out of our asses and make something out of it. At first this database was more of a proof-of-concept but I interjected myself into it and quickly turned it into something that is sustainable and can be used for the long-haul.

Can a Mission take place in more than 1 Location?
Can a Mission be in support of multiple Objectives?
Can a Mission occur on more than 1 Terrain?

Nope, possibly, and nope. Each mission is very specific. It's possible that a mission may indirectly support multiple objectives or operations, but for the most part, each mission exists to support a specific objective or operation.

I recommend you adopt a naming convention that does NOT allow spaces and/or special characters in field and object names. Following this advice will save you many syntax errors. Limit your names to alphanumeric characters and the underscore "_".

Yeah, the table names and fields I used in my post aren't the real table names and fields. None of my names have spaces or special characters other than underscores.
 
Table structure represents a reality in your data, not some arbitrary rule. So if you have a Mission table, and a Unit table, and many units might participate in a single mission, then you need a MissionUnit table, which stores the reality of what unit participated in what mission. One Mission might have many Units in it, one Unit might participate in many Missions. That is a Many-To-Many relationship, and that is what is meant by the term "relational database." Your table structure models the relationships between the "entities," or things, in your data.

Now, look at your Mission table's scheduled start field. Imagine your system in fact needs to accommodate the fact that different units are scheduled to start at different times in the same mission. So scheduled start no longer belongs to the mission, it belongs to the MissionUnit, the definition of one unit's relationship to the mission, so you move that field over to the MissionUnit table. Does that make sense?

And MissionType. Can one mission have more that one type? Presumably not, and in that case MissionType is a lookup table only, and it should be agnostic of the Mission, so it should not contain any MissionID. It should have MissionTypeID, and MissionTypeName, and Mission should have a MissionTypeID field. A Mission knows about a MissionType, but a MissionType doesn't know about a Mission. Makes sense?

And each thing is like that in it's own case, as dictated by the relationships of the reality that you want to model. Can a Mission have many Terrains? If so, you need a Terrain table and a MissionTerrain table, (as well as your Mission table, which we assume) . . .
tTerrain
'completely Mission agnostic
TerrainID (PK)
Terrain (Data)

tMissionTerrain
'knows about Missions and Terrain, and in fact, joins them
MissionTerrainID (PK)
MissionID (FK)
TerrainID (FK)
Percent (Data) 'if you tracked what percent of mission occured on what terrain, that would go here, and some process would have to check you don't exceed 100

tMission
'completely terrain agnostic, check related table for details

. . . and this is vastly simplified if one mission can only have one Terrain type. In that case we have . . .
tTerrain
'completely Mission agnostic
TerrainID (PK)
Terrain (Data)

tMission
'knows about the one and only terrain
MissionID (PK)
TerrainID (FK)
Hope this helps,
 
So I was reading Roger's "Entity-Relationship Diagramming" document (http://www.rogersaccesslibrary.com/Tutorials/Entity_Relationship.zip), and he mentions that with this method, the fields of the record should depend on the primary key. For example: first name, last name, address, phone number, birthdate... all of those should change when the primary key changes since (theoretically) those are very unique attributes for those records and aren't shared between others.

You should be more careful about your choice of learning materials and much more sceptical about material you find on some random website.

Normalization isn't about whether some item of data changes or repeats for different rows. It's about what business rules (called functional, multi-valued and join dependencies) you want to apply to your data. You cannot and should not base normalization on the notion of a primary key, you need to consider all the keys that ought to apply.
 
I think what the foregoing means is that you ARE on the right track, but the sub-entities are not necessarily dependent on the main entity

eg, a mission terrain might be dependent on the mission loc, rather than on the mission. That's the point
 
Table structure represents a reality in your data, not some arbitrary rule. So if you have a Mission table, and a Unit table, and many units might participate in a single mission, then you need a MissionUnit table, which stores the reality of what unit participated in what mission. One Mission might have many Units in it, one Unit might participate in many Missions. That is a Many-To-Many relationship, and that is what is meant by the term "relational database." Your table structure models the relationships between the "entities," or things, in your data.

Now, look at your Mission table's scheduled start field. Imagine your system in fact needs to accommodate the fact that different units are scheduled to start at different times in the same mission. So scheduled start no longer belongs to the mission, it belongs to the MissionUnit, the definition of one unit's relationship to the mission, so you move that field over to the MissionUnit table. Does that make sense?

And MissionType. Can one mission have more that one type? Presumably not, and in that case MissionType is a lookup table only, and it should be agnostic of the Mission, so it should not contain any MissionID. It should have MissionTypeID, and MissionTypeName, and Mission should have a MissionTypeID field. A Mission knows about a MissionType, but a MissionType doesn't know about a Mission. Makes sense?

And each thing is like that in it's own case, as dictated by the relationships of the reality that you want to model. Can a Mission have many Terrains? If so, you need a Terrain table and a MissionTerrain table, (as well as your Mission table, which we assume) . . .


. . . and this is vastly simplified if one mission can only have one Terrain type. In that case we have . . .

Hope this helps,

Makes perfect sense. Each mission has only one terrain, and each mission is performed by only a single Unit, and occurs at only a single location, so those aren't many-to-many relationships and therefore don't need junction tables.

With that said, by having the MissionType_ID (FK) in the Missions table, is it necessary for a relationship to be established in Access between those fields if the MissionType table is just a lookup table for a combobox?
 
Button Moon

How does this help?

Normalization isn't about whether some item of data changes or repeats for different rows. It's about what business rules (called functional, multi-valued and join dependencies) you want to apply to your data. You cannot and should not base normalization on the notion of a primary key, you need to consider all the keys that ought to apply.

The OP is attempting to Normalise and is having a problem or two. These words appear to have no meaning for the novis.
 
The OP is attempting to Normalise and is having a problem or two. These words appear to have no meaning for the novis.

Which is exactly why I'm suggesting he learn the basics from a more reliable source before going further. This forum is not the place for a tutorial. Any attempt here to solve a specific design problem without the opportunity to analyse the details of the situation is always going involve some highly subjective judgements/guesswork. That's not going to help the OP understand what he is doing. If someone asks "How do I calculate the square root of 676?" the best answer is not "26"; the best answer is "Here's a textbook on mathematical algorithms...".

cnstarz, if you want some alternative books and learning material then please tell us what your area of interest is (data modelling / database design theory / data management generally) and I will be happy to make some recommendations if I can.
 
is it necessary for a relationship to be established in Access between those fields if the MissionType table is just a lookup table for a combobox?
No, it's not necessary, but if you do create a relation and "enforce referential integrity," then Access won't let you edit the MissionTypeID in the Mission table to a value that doesn't exist in MissionType. That is a nice integrity check to have, but it's also not hard to enforce that in other ways.

And mainly then, looking at your tables, you want to get the MissionID out of all the related tables. Location should be mission agnostic, so I wouldn't even call it MissionLocation. It's just a location, and it only becomes a "MissionLocation" when a mission links to it. And then each related table, if you are going to bother to link it, should have data in it, and none of them do. Terrain should have a field called TerrainName, MissionType needs a MissionTypeName, like, the link is wasted if you don't have some data in the linked table to retrieve.

And if you have a bunch of lookups like that, you can also put them all in one table with a type field, so . . .
tblLookups
LookupID (PK)
LookupType (data like "Location", "Terrain", "Type")
Name (data)
. . . and then you don't have 17 lookup tables with one ID and one Text field, you just have one table that you can filter.

Also, never link to a date. A date is a scalar value that measures a single discrete quantity of time, so a date it always a field in a table never a row. And since it's never a row, you'll never link to a date, you'll always just enter it directly into the row it describes.
 
Just scanning through this it seems like this is a little over kill...
 
Right on, Ken!


Buttonmoon,
If someone asks "How do I calculate the square root of 676?" the best answer is not "26"; the best answer is "Here's a textbook on mathematical algorithms...".
I think this is perfect for the academic or philosopher; but not for the guys in the trenches, nor for the guys up to their waists in alligators who are trying to drain the swamp.

Your assuming an understanding of some basic arithmetic and concepts. Some people do not learn by reading. Some need examples; some need to be led by the hand; some need videos, some need real teachers.... There are a lot of very competent people in jobs (including database) who have never had formal training. They will never know Codd or Chen or Date, but they can still do the job. May be not perfect (whatever that is) but certainly adequate and to the satisfaction of their bosses.

Suffice it to say one size does NOT fit all. And to poo-poo anything and everything that isn't full of theory isn't totally constructive. For those who have an interest to understand or at least become aware of the theory then go for it, but let's not suggest you must be able to quote the database masters from intimate and complete understanding in order to work with databases.

Don't get me wrong, I'm sure from your background and experience you mean well. I'm suggesting that everything is not black or white. There are many shades of gray (some blacker than others) that can all play a part in learning.
 
Last edited:
I'm curious to know where the user is in this process now -
-edit- i like normalization threads :)
 
Your assuming an understanding of some basic arithmetic and concepts.

No I'm not at all. I'm assuming based on what the OP said that he wants to learn those concepts.

Some people do not learn by reading. Some need examples; some need to be led by the hand; some need videos, some need real teachers....
Sure, but the material the OP had been reading set a bad example. It was just wrong and as a result he had learnt things wrong. I fail to see why it would be a problem to suggest he learn from an alternative source.

Suffice it to say one size does NOT fit all. And to poo-poo anything and everything that isn't full of theory isn't totally constructive. For those who have an interest to understand or at least become aware of the theory then go for it, but let's not suggest you must be able to quote the database masters from intimate and complete understanding in order to work with databases.

Theory was what the OP was asking about. Theory is what allows knowledgeable professionals to solve real problems time and time again every day. That's because theoretical knowledge equips them with tools and techniques that really work every time - so they don't have to rely on dubious examples on forums and websites because they understand what they are doing rather than simply following a recipe someone told them. A person who doesn't value theory would probably be well advised to choose a profession other than data management.
 
...Theory was what the OP was asking about. Theory is what allows knowledgeable professionals to solve real problems time and time again every day. That's because theoretical knowledge equips them with tools and techniques that really work every time - so they don't have to rely on dubious examples on forums and websites because they understand what they are doing rather than simply following a recipe someone told them. A person who doesn't value theory would probably be well advised to choose a profession other than data management.

Seems like theory is more of a starting point rather than an absolute... Kind of like, let me read up on what somebody else thinks so I don't have to re-invent the wheel :)

edit - My theory is if you ask if you are normalizing to much then you probably are :)
 
Last edited:
Hey guys, still here. I abandoned the idea of splitting up my table like that after thinking about it more and reading everyone's responses. I am interested in further reading about database theory, so I think I'll find some books soon. Most everything I know about database design and practice is piece-mailed advice and responses given from these forums, accessforums.net, and utteraccess.net. I know that's not really a good model of learning, but I haven't had the time lately for any formal training, so I have to rely on quickly finding the answers and making due with what I have access to (no pun intended). I've seen Roger's Access Blog quoted a few times on these forums so I figured it was a trusted resource. Thanks for all of your responses!
 
cnstarz

You have been given a lot of advise to chew on. I would think that a good move now would be to redesign your tables and post the result here for added comment.
 
You should start with a simple list of things the business is trying to accomplish. And oddly enough it should not mention anything about saving data, computers, etc.

Like:

1. When we go on a mission we want to make sure we are not sending the same group out on every trip
2. After we go on a mission we need to know the contact person in case we need to do follow up work in the future
3. We need to know how missions we do in each region so we know which region is growing so we can allocate resources there
4. Etc...


Then pull out the must have’s, etc and design the tables to these specs. If you have any that says something about saving a pc of data in a database you need to drill down further and say why the data is needed and use that as the high level specs.
 

Users who are viewing this thread

Back
Top Bottom