Should I normalize this data?

94Sport5sp

Registered User.
Local time
Today, 12:28
Joined
May 23, 2012
Messages
115
Hi:

In the database I am developing there will exist a table for storing imported data. The data will be imported daily and will come from multiple locations. This will cause the import table to contain duplicate data in several fields. I am thinking of leaving this duplicate data in the import table. The designer in me says that is bad normalization but the user in me says, yes, but I will not be changing it.

Here is and example:


GroupNo GroupCount GroupAmount Location ImprtDAte
Class1 35 1486.95 Area1 41030.36
Class2 15 637.65 Area1 41030.36
Class3 4 128.4 Area1 41030.36
Class4 42 941.95 Area1 41030.36
Class5 102 2359.65 Area1 41030.36
Class6 11 127.45 Area1 41030.36
Class7 584 -163.3 Area1 41030.36
Class8 4346 -527 Area1 41030.36

GroupNo GroupCount GroupAmount Location ImprtDAte
Class1 11 127.45 Area2 41030.34
Class2 274 3309.9 Area2 41030.36
Class3 78 209.75 Area2 41030.36
Class4 12 94.95 Area2 41030.36
Class5 134 474.65 Area2 41030.36
Class6 584 -163.3 Area2 41030.36
Class7 4 128.4 Area2 41030.36
Class8 42 941.95 Area2 41030.36

This data is not generated from within the database and the database will not be changing the information. Hopefully you can see that the location name Area1 and the numerical value for ImprtDate will remain for all the information for that day for that file. The same for location name Area2 for the second file and so on for each imported file. Also note GroupNo will repeat for each file imported.

Any reason to not let it repeat?

Thanks for your thoughts
 
Sometimes you have to use the cards you are dealt.

An unnormalised table can often have it's advantages. I have taken a properly normalised set of tables and created the one unnormalised table purely for speed. A lot of smarter developers do the same thing but don't admit it here.

What you need to look at is the affect these imported tables will have on other tables, queries etc. The answer to this should help you decide. Don't make a decision because someone here quotes someone else who has no idea what you are trying to do.

The ball is in your court.
 
if you want to use this data in the Db, then i would take the data, and write a series of queries to move it into your real tables

you will need a mixture of append and update queries

what you can then do is clear out the temporary file before each import, and then bring in and sort out the new data each time
 
It depends on how and what for you use the data. To blindly go into normalization, and then just denormalise it again on display is a pointless exercise.

Dave is right in the sense that normalized data leaves you with most options , and added requirements tend to turn up during the life cycle of a system. But if you are at the end with no future changes in sight, then again, depends on what you use the data for.
 
Last edited:
if you do not normalise, and someone changes the area designations say - then all of a sudden "area1" becomes "area 1" say, and your data extracts become confusing. if you normalize then you pick up "area 1" as a new loaction and can deal with it

you must be using the data for something, after all?

I prefer to control and validate stuff if I can.
 
If they change to Area 51 then one can always claim the data got gobbled up by aliens :D
 
As the others have said, you know the data and your business environment. You are the only one who can judge whether or not the set up you have will be/ is affected by letting
GroupNo repeat.

You haven't said much about the application and use of the data, so we would only be guessing to offer more details/suggestions.
 
Hi:

Thanks for all the responses.

The basic purpose of the DB is to reconcile cash and stock to sales (from sales terminals) reports and then print reconciliation reports. At the end of the month produce summary reports for the month and output a summary file.

The names of the location and the Group items are stored in the sales terminals and if they change then they are treated as new location or new group and added to the reports above. The sales terminals are the driving force for data input and my DB will not change or create names for locations or Group items.

Many years ago, when I did this for a living, I would push hard for normalization. It is hard to let go of old habits. However, for this DB, I think I will survive if I do not normalize.

Thanks again for your thoughts.
 
I'm a little confused by what you are calling normalization. Are you saying you think you should have lookup tables to convert GroupNo and Location to numeric values? That's not what normalization is all about. This is data that comes to you from an external source and I presume you are not going to allow changes to it. Given that, you take what they gave you. Since you are not allowing updates, you don't need to worry about data anomolies. If you were going to allow updating, I would recommend creating tables with the valid unique values for the two fields and using those to populate combos. That would restrict the updates to valid values which is what you want from normalization anyway.

BTW - I have a little sub-system I add to every application that manages simple lookup tables. It gives me a standard user interface I can allow my users to use to update lookup tables that do not control programming logic. Because of this, my lookups are all numeric values. However, if I were to go with separate tables for each lookup, I would almost certainly stick with text values. The only time I would make an exception would be if I expected the actual values to be changeable because that would require me to specify cascade update to propagate key value changes. An example of this is my State table. I'm going to take my chances that no state will change its name any time soon and so I go with the two digit postal abbreviation for state. The table is there to prevent data entry errors so they don't type NU for NY and to give me the full spelling should I need it for anything.
 
Hi Pat:

First off, thanks for the reply. I understand how difficult it can be to answer questions without full information, so thanks for the reply.

I will use the Northwind sample data base to elaborate. The Location in my data is like customer name and the Group is like inventory items. The imported data then becomes the invoice detail lines. I plan to import the data into a sales detail (it is here that I could normalize the data) table which will then be used to balance imported saled data to cash received. The sales detail table will have additions columns to record adjusments to apply to the group sales data in order to balance cash. The imported data in sales table will not be changed. I plan to create a Location table as a list of what import files I should look for. I can expect one file for each location each day and there will be a varied number records imported from each location daily. On a form I will create a combo box for the user to select a location and balance that location. Then, print a report and move on to the next location. Repeat for all data to process. Weekly or monthly reports , as needed, will be printedas needed.

I am not sure I fully understand you sub-system. My location and group data is much like state names. Will not change but the user does need to select which location to balance and print report for.

Thanks for your thoughts
 
I uploaded a picture of the two forms. The top form is a list of all the "tables". The bottom form shows the contents of the selected "table". When I want to create a query for a combo's RowSource, I select based on tableID so the query would be:
Select * From tblCodes Where TableID = 138;
For the Business Unit table.

Many of these lookup tables should be managed by the users and rather than create individual forms for each table, I use this method. So for each application, I import two tables, two forms, and two reports and I have a complete lookup table management mini-app. The AuthCode indicates whether the table can be maintained by the user or must be maintained by the programmer.
 

Attachments

  • TableMaint.jpg
    TableMaint.jpg
    56.7 KB · Views: 195
Hi Pat:

It is hard for me to believe how much access I have forgotten. Your picture makes sense after I thought about it for a long while. I think later, after I get the database functioning, I could use that idea to allow somebody other than me create a new import table to add to my system.

For now, even though I understand the other opinions I will leave the data repeat and simply add a field to link to the location table where the location name only occurs one time. Later when I add reports I can use the link to get the location name and location on disk.

Thanks
 
This isn't Access specific. I developed this for the first time in 1979 in COBOL - IMS DB/DC:) and have used something similar in every major application I have developed since then.

The users can't add new tables. There would be no way for them to tie them to forms and reports. They can only add elements to existing tables.
 

Users who are viewing this thread

Back
Top Bottom