Importing Data from a Downloaded Excel File

patriot76

Hi

My question is: how do I set up the table to minimize redundant data? I have several fields that match the column headings in Excel, so the data can be imported, but fields like user name, pick slot, and batch # get duplicated every time I import, and my file is getting unnecessarily large. I know splitting the one table into many is the right thing to do, but I don't have a clue how to import the data then. Any help?

Thanks
 
Typically, for your example, I would set up the table structure properly first.
Then set up an import table, and a bunch of queries, VBA code, or both to move the data from the import table to its final resting place.
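
For example (just a sketch with placeholder names; tblImport and tblTarget are not from your database), an append query that moves staged rows on could look like the one below. The lines starting with -- are only annotations; Access's SQL view does not accept comments, so drop them if you paste this in.

-- Copy staged rows from the staging table into a final table:
INSERT INTO tblTarget (FieldA, FieldB)
SELECT FieldA, FieldB
FROM tblImport;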
 
Hello

I think that is over my head. Help :confused:

Thanks
 
You create an IMPORT table; it would only hold the imported data.
Now what do you want to do with it?
Do you want to update records that already exist?
Add records that do not exist?
Delete records that are no longer in the imported data?
What?
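
Each of those choices is basically one query against the IMPORT table. A rough sketch, assuming a made-up key field called KeyField and a made-up final table called tblTarget:

-- Update records that already exist:
UPDATE tblTarget AS t INNER JOIN tblImport AS i ON t.KeyField = i.KeyField
SET t.SomeField = i.SomeField;

-- Add records that do not exist yet:
INSERT INTO tblTarget (KeyField, SomeField)
SELECT i.KeyField, i.SomeField
FROM tblImport AS i LEFT JOIN tblTarget AS t ON i.KeyField = t.KeyField
WHERE t.KeyField IS NULL;

-- Delete records that are no longer in the imported data:
DELETE FROM tblTarget
WHERE KeyField NOT IN (SELECT KeyField FROM tblImport);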
 
Importing

FoFa

The following are the fields that are imported from Excel: "Batch #", "Pallet ID", "Pick Slot", "Product Code", "Description", "Reserve Slot", "Qty Requested", "Qty replenished", "Return Qty", "Return slot", "Reason if returned", "user name", and "Date/Time".

The redundant data are user name, batch #, pick slot, and product code.

Every day I import to update the database and add to the existing data.

Thanks
 
Basically, you would wipe out the IMPORT table and import the new data.
Once you have the new data, it is more a matter of running some queries to move the data from the IMPORT table to where it needs to go. You can massage the data at that time, and update or append the data as needed from this IMPORT table. Think of the IMPORT table as a staging table that just holds the raw data for processing.
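
A sketch of that daily cycle, assuming the staging table is named tblImport (use whatever name you like):

-- 1. Empty the staging table before each day's load:
DELETE FROM tblImport;

-- 2. Import the day's spreadsheet into tblImport
--    (the Import Wizard, or DoCmd.TransferSpreadsheet from VBA).

-- 3. Run the append/update queries that move the staged rows
--    into the real tables.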
 
Stuck on a table

I made several different tables, but I can't get the data from my import table to append to these other tables.
The tables are as follows: tblEmpl(EmplNum, username, palletID); tblPickSlot(PickSlot, PalletID, ProductCode); tblReason(RRID, ReasonifReturned); tblProduct(ProductCode, Description); tblPalletID(EmplNum, ReserveSlot, QtyRequested, QtyReplenished, ReturnQty, ReturnSlot, RRID, Date/Time, Batch #).

I get null value errors and key errors.

Help
 
How do I set up the table to minimize redundant data?

First, you must thoroughly understand the concept of NORMALIZATION. The "minimize redundant data" issue is exactly what led folks to research ways to normalize tables. You can look up normalization in the Access Help files or on the Internet. If you do a Google search on "Database Normalization" you will probably get a thousand-plus references that are scholarly in nature and about a bazillion that are part of advertising for a given product, part of a college course, or perhaps appear in a library of articles taken from the Journal of the American Computing Society. JACS is a WONDERFUL reference source but you need to be near a university to find the older articles.

Here is your problem: If you are importing from Excel, that dataset is NOT normalized. It is what we call "FLAT" - i.e. neither normalized nor relieved of its redundancies. When you import flat files using the Import Wizards, you get flat tables. Always. Never any exceptions (that I recall). So the technically correct answer is that in a single step, you CANNOT reasonably eliminate redundant data.

But what you CAN do is import the data to a temporary table as a framework for subsequent queries that would eliminate this redundancy in several (and I DO mean several) steps.

Your earlier response suggests that you are so new to Access that the answer FoFa gave you merely confused you. This makes it difficult to help you without writing something for you. I don't have that much time. What I can do is suggest that you start reading about normalization.

You will need to import your data to a holding table. You must build your DB with normalized target (final) tables. Then you can write a series of queries to extract PARTS of the imported data and append the parts to the target tables. Once you have identified the final tables and the columns of your spreadsheet that contribute to each, you can write the queries. The catch will be building the relationships.

I'll summarize something you will learn when you start reading up on normalization: Tables that are normalized often have parent-child relationships (a.k.a. one-to-many) in which one record in a parent table corresponds to many records in a child table. Or, if you prefer other nomenclature: What I call a "parent" table, you might call an "independent" table. What I call a "child" table, you might call a "dependent" table. Using my family nomenclature, it is not unheard of to have multi-generational tables in a hierarchy 3-5 levels deep. In that case, you have parent, child, grandchild, great-grandchild, and great-great-grandchild tables.

The series of queries has to populate the highest level of the hierarchy first - in my model, the parent. You can eliminate all duplicates from the parent table. Once the parent has been populated and all duplicates are gone, you can use a special field known as a key. It has to be unique for the parent (or for the independent side of the relationship) but can be duplicated in the child table. You store this key in the child tables to POINT to a parent record. Look up Autonumber fields for suggestions on how to define this key, or you can compute your own key, but in the latter case, KEEP IT SHORT. The shorter the better.
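
A sketch of that step using the tables posted above (treat the exact names and field types as assumptions to adjust): tblReason is a natural parent here, with RRID as the Autonumber key, so the long "Reason if returned" text is stored only once.

-- Create the parent table with an Autonumber key
-- (skip this if you already built it in the table designer):
CREATE TABLE tblReason (
    RRID AUTOINCREMENT CONSTRAINT PK_tblReason PRIMARY KEY,
    ReasonifReturned TEXT(100)
);

-- Populate it with the distinct values from the staging table.
-- (On later imports you would also filter out reasons already in
-- tblReason, as in the "add records that do not exist" sketch above.)
INSERT INTO tblReason (ReasonifReturned)
SELECT DISTINCT i.[Reason if returned]
FROM tblImport AS i
WHERE i.[Reason if returned] IS NOT NULL;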

OK, once you populate the parent and have the keys, you can populate the child tables. Where you would have duplicated a long field that is now defined in the parent, you simply place the record key for the corresponding parent entry. You haven't actually ELIMINATED the duplication but you have radically shortened the key size that will be duplicated.
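
Continuing the sketch with the tables posted above (only a few of the columns are shown, and the names come from the earlier post): the child append looks up RRID through a join instead of carrying the full reason text.

-- Append to the child table, storing the short RRID key
-- in place of the long reason text. The LEFT JOIN keeps rows
-- that have no return reason; they simply get a Null RRID.
INSERT INTO tblPalletID ([Batch #], ReserveSlot, ReturnQty, RRID)
SELECT i.[Batch #], i.[Reserve Slot], i.[Return Qty], r.RRID
FROM tblImport AS i LEFT JOIN tblReason AS r
    ON i.[Reason if returned] = r.ReasonifReturned;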

The same concept applies when you have a grandchild table. Once the parent is populated, populate the child tables. They will ALSO have the potential for a unique key to identify records. Once these keys are defined, you can use ANOTHER query to populate the grandchild tables using the child keys as pointers to the corresponding records in the child tables. This can be carried on to as many levels (within reason) as you might need.

Once you have the keys for each level, you can use JOIN queries to rebuild the records to LOOK LIKE the original input, but they will take up much less space in the final table.
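
For example, a JOIN query over the tables posted earlier (a sketch; the exact join fields depend on your final design) can present the data flat again without storing it that way:

SELECT t.[Batch #], e.username, r.ReasonifReturned, t.[Date/Time]
FROM (tblPalletID AS t
    INNER JOIN tblEmpl AS e ON t.EmplNum = e.EmplNum)
    LEFT JOIN tblReason AS r ON t.RRID = r.RRID;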

One final thought. If you are doing this for maybe 10-15 Excel records, don't bother converting to Access. You get no significant benefit (except the learning experience) until you start getting hundreds of overall records with dozens of fields for each. A cost-benefit analysis for Excel vs. Access ALWAYS picks Excel first for really small datasets.
 
The Doc Man

Useful information. I've always struggled with which fields go where in the normalization process. I'll work on it some more. I import several hundred records per day from Excel, so it is not a small DB.

Thank You
 
I've always struggled with which fields go where in the normalization process.

Actually, that one is (relatively) easy.

Take a table. (Any table...) It has a primary key. It doesn't matter whether it is an Autonumber or generated from something relevant to the data you store, like an SKU number in a store that uses bar-coded merchandise tags.

OK, here's the test for inclusion of field Y into table X. Does field Y depend exactly and ONLY on the key of table X?

If the answer is yes, you can include the field. If the answer is NO, you have a candidate for placement into another table. It doesn't matter that the field depends on key X; it must not depend on ANYTHING ELSE as well. And further, if key X happens to be a COMPOUND key (i.e. two or more fields acting together as the key), field Y has to depend on the ENTIRE key, not just part of it. The latter condition isn't a problem for single-field keys but can be a big "gotcha" for multi-field keys.

The CATCH to this seemingly simple test is to watch out for implied or hidden or assumed dependencies.

For instance, in an SKU table, you might have a stock item description. No problem. You might have an aisle number where this item is kept in the store. No problem. You might have a department code used for assigning managerial responsibility. No problem. You might have some category code that defines the type of item it is. No problem, really. Like a light bulb would be category ELECTRICAL but a faucet would be category PLUMBING. (Probably would use that for some report groupings...)

What about unit price? Here is where you MIGHT say, "Sure, it goes into that table." But if the price changes with time, there is an implied/hidden dependency on DATE. And DATE is not part of the key. See how that test works now? It is for this same reason that most "real" inventory systems do not store "amount on hand" in their stock tables. That number is not only DATE dependent, but TIME dependent.
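
To make that concrete (a sketch with made-up names; tblItemPrice is not from this thread), a price that changes over time goes in its own table whose key includes the date:

CREATE TABLE tblItemPrice (
    SKU TEXT(20) NOT NULL,
    EffectiveDate DATETIME NOT NULL,
    UnitPrice CURRENCY,
    CONSTRAINT PK_tblItemPrice PRIMARY KEY (SKU, EffectiveDate)
);

-- The price in effect today is then a query, not a stored column:
SELECT TOP 1 UnitPrice
FROM tblItemPrice
WHERE SKU = [Enter SKU] AND EffectiveDate <= Date()
ORDER BY EffectiveDate DESC;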

Anyway, if you ask the first question above and realize you have some doubt (already an indication that the answer MIGHT be NO), ask a second question just to be sure.

Does the value of the field change based on some other action, event, or condition? Even if the cause of the change is dependent on the key, you STILL have a dependency on whether (or not) a causative event has occurred. And the indicator of that event is therefore a contributory key.

Hope this helps you understand the normalization decision.
 
I've always struggled with which fields go where in the normalization process.

The Doc Man

Thank you for the insight. I'll try using it.
 
