Looking for some advice about structure and importing data

noaccessidea

New member
Local time
Today, 15:45
Joined
Nov 29, 2019
Messages
14
I need to import a large file of student data in to a database so I have started to set up the structure but things aren't working as I hoped. I have split most fields in to different tables and added an ID field to link them. However, I'm really confused about how I import my data in to this structure. The ID fields are numbers but the spreadsheet I need to import obviously isn't.



I mistakenly thought I could use an update query and send the data to the relevant table values but I quickly notices that this was silly and just created tables with lots of duplicated values.



How do I got about this? I would really appreciate any suggestions as I don't want to go through 5000 records typing in ID numbers for each.



This is the table:
wzQBiNW.png
 
I don't want to go through 5000 records typing in ID numbers for each.
r u aware that excel has an autofill handle control whereby you can type 2 numbers into 2 cells, one after another, and then drag that handle down 5000 more records and it will fill in the autonumbers for you? or IS this an excel file that you're importing? i would guess that a lot of other data management software might have something similar to the fill handle. do u know if ur program does? or, maybe you can import ur data set in excel FROM ur source program? if you get it in excel and it's not all over the place, the autofill handle would work fine.

if not, can you upload the source dataset that uve got?
 
First thing is to import your data into a table in your database. Set the Data Type for all of the fields Short Text in that import table. I've found that often there will be data issues where say a date in Excel (or whatever) is not in the right format and records like that will not be imported into a date field.

Call the imported data table ImportedStudents or whatever. Fix any data issues that arise.

Add an autonumber field to the imported data table. Call it something like ImportedStudentID. Add the field to your tbl_Students. This is a temporary field to facilitate the filling in of the various foreign keys.

Use an append query to import the fields you want in tbl_Students as well as ImportStudentID. But ignore the foreign keys for the time being. Fix any data issues that arise from any text data in ImportedStudents not being proper format for date or numeric fields in tbl_Students.

If you haven't already done so, populate your lookup tables.

To set the RecordType_ID in tbl_Students, use an update query to insert the foreign keys in tbl_Students by joining the tables ImportedStudents, tbl_StudentRecordType on RecordType (the text field, not the ID) and by joining the table ImportedSudents, tbl_Students on ImportStudentID. The update query is to insert the RecordType_ID for each student.

Repeat for each of the other lookup tables.
 
There are ways to approach this. They usually involve something similar to the process I am about to describe.

First, let's address the names. I'm going to make your life easier but you don't see it yet. Trust me...

The name Reapply Every Year? cannot stay that way. Special characters are just not good and can lead to confusion. Further, having spaces in names means you MUST enclose the names in brackets "[]" every time you type them. But if you run the names together and drop the space, that is no longer required. Less typing, and once you get used to it, no loss of readability. Take my word on this fine point.

Now, lets get down to work. You are going to get the best results, I think, if you do this in a "divide and conquer" approach. So in summary, I'm going to get your data into Access, then work on breaking apart the stuff you need to break apart, then get translations for each field you are defining as a lookup, then merge everything back together for the final disposition. Hold on to your hat, it is a bumpy ride.

Start by importing the raw Excel file to Access in a big honkin' table to act as a staging table. No keys, just raw data. Name the fields according to what they are, taking my advice into account regarding special characters and spaces.

Now build a two-field table for each thing you wanted to look up. We sometimes call this a lookup table (which should NOT be confused with a lookup field, something we treat as a big no-no.) Each table should be empty at the moment. In each such table, define an autonumber key for each table to act as your xxxID field and have a corresponding xxxText for your description (or xxxName, suit your fancy). I will use xxxText in my description of what to do.

Now one definition-field at a time, use something like this (and the <<>> elements are directions, so omit them and substitute the indicated values):

Code:
INSERT INTO <<put definition name here>> ([COLOR="Red"]xxxText[/COLOR]) SELECT DISTINCT [COLOR="red"]xxxText[/COLOR] FROM <<put temporary table name here>> ;

Seems like a lot of typing, right? Build one query and save it as a named query. Now before you start to build the next one, COPY the saved query and paste it to a new name, then edit it to drop in the names appropriate to the operation. I have highlighted the parts that would be unique to each field.

As an option, you COULD impose an "ORDER BY xxxText" if you wanted the translations to be alphabetically ordered. If so, that would follow the name of the temporary table. If you don't do this, you will not be able to predict the order of the numbers that will be the key to this table. This step builds your translation tables one at a time.

BEFORE YOU PROCEED!!!!!

Look over each table to be sure that you don't have spelling anomalies that caused you to have near-duplicates that SHOULD have been the same value but weren't. Because using the SELECT DISTINCT syntax rules, if two fields are spelled differently, they are distinct whether they should have been or not. It will be up to you to decide how to reset the bad spelling cases - leave them or correct them in the temporary table.

Now go back to that temporary table. As long as you have only the number of fields you showed, you should not run into Access field-count limits. Add one LONG (integer) numeric field (xxxID) next to each description field (xxxText) but leave all such fields in that temporary table =zero. You don't have to set them to zero; it is automatic. You must save the table when done so that the records will be updated in the proper format, but again, that should be automatic just by doing the save.

Next, we associate the translation or lookup ID back in the main table. For each ID field, do something like this:

Code:
UPDATE <<temporary table name goes here>> INNER JOIN <<definition table name goes here>> ON <<temporary name>>.[COLOR="red"]xxxText [/COLOR]= <<definition name>>.[COLOR="red"]xxxText[/COLOR]
SET <<temporary name>>.[COLOR="red"]xxxID[/COLOR] = <<definition name>>.[COLOR="red"]xxxID[/COLOR] ;

Remember what I said about building one query, then just do a copy/paste to a new one and only change the indicated fields?

Now build the final table with every xxxID field (but no text fields where a definitional field occurred). Anything that isn't in a definition gets imported directly; anything in a definition text is not imported but the corresponding ID will be used. For snorts and giggles, put an autonumber on the final table but remember that you won't actually name that field for the INSERT. That is because Access will fill it in automatically.

Code:
INSERT INTO <<final table goes here>> (Student, StartYear, EndYear, OSSID, ReapplyEveryYear, Identification, RecordTypes_ID, ...., CourseLevel_ID, MentoringTypes_ID) SELECT Student, StartYear, ..., RecordTypes_ID, ..., MentoringTypes_ID) FROM <<temporary table goes here>> ;

Almost done. Look it over, and ... if you got no import errors; if you got no translation errors; if all looks well, then ... delete the temp table and those queries you used to populate the translations. Now perform a backup and then do a Compact & Repair.

Your relationship diagram looks quite reasonable, and remember that if you want to build queries that exercise the lookup relationships, having the field lookup relationships defined like that means that the query builder will automagically build the right kind of JOIN with the right kind of reference for you.

EDIT: Cronk gave you similar advice, but briefer. Mine took a while to type.
 
Last edited:
You've over normalized your data. Tables with only 1 real field of data (autonumbers aren't real data) do not need to exist.

Since all those 1 field tables are on the 1 side of the relationship, those tables do not need to exist. Simply keep the text data in tbl_Students.
 
While I generally agree with plog, I'll add this disclaimer: If that one big table is the ONLY place where you need those values, he's absolutely correct. However, if they appear in multiple tables with the same meaning and translations, then they need to be separate for normalization purposes. You didn't address that, so I'll bring it up as something to consider.
 
Wow, thank you for all the amazing responses, I wasn't expecting so much help and insights. I will have a try tomorrow when I'm back at work and we'll see where we are.

I think linked tables is the key thing I was looking for, it solves a lot of problems I was facing when trying to create forms and queries as well.

I'm glad someone mentioned the normalisation, this something I have been thinking about this weekend. I think some of the fields are not really necessary (just there because it helped in Excel to do some stats when asked). Some of the fields will definitely be used in other tables but a lot will always have a value and won't be needed for anywhere else so I will review this to make it simpler.

I saw something about fields which may not always have a value should be split into their own table, is this correct?
 
I think linked tables is the key thing I was looking for, it solves a lot of problems I was facing when trying to create forms and queries as well.
what exactly r u referring to when you say "when trying to create forms and queries"?? what possible problems could there be? as far as I know, linked tables have nothing to do with the ability to query the data and display the data through forms. Other than slower speed perhaps.
I saw something about fields which may not always have a value should be split into their own table, is this correct?
i seriously doubt it. there's data all the time in records that have blank fields. if fields were be put in a seperate table everytime they were blank or had the chance of being blank, most peoples' dbs would end up getting very large very fast I would think. not to mention the fact that a violation of normalization would probably result.
 
what exactly r u referring to when you say "when trying to create forms and queries"?? what possible problems could there be? as far as I know, linked tables have nothing to do with the ability to query the data and display the data through forms.


It was probably just me not knowing what I was doing and not having the data structure correct. I found that I was getting IDs instead of values and having to create subqueries for both fields and hiding the ID which seemed a bit long winded. I suppose linking tables is doing the same thing (showing the value rather than ID but writing to the ID field when entering a value) at an earlier stage so when I come to make a form I can include the ID field but the user can put values rather than IDs in.



Or am I misunderstanding again?
 
I suppose linking tables is doing the same thing (showing the value rather than ID but writing to the ID field when entering a value) at an earlier stage so when I come to make a form I can include the ID field but the user can put values rather than IDs in.

Or am I misunderstanding again?
in ur original screenshot of all ur relationships, every single table has an autonumber field in it with the string 'ID" somewhere it the field name. when ur users are entering records thru forms, that field can't be manipulated so those number will fill automatically. then obviously ur users will have to fill the rest, which prolly includes the field "value" that ur referring to in ur quoted text from me. the terms 'value' and 'ID' are used all the time, but just make sure ur not confusing the 2 when it comes to identifying ur fiends.
 
I just wanted to say thank you for all the responses to this, it has been an enormous help. I have had to scale down the scope of this project for now but it the tips here have given me a much better idea of how to do it when I revisit it.
 

Users who are viewing this thread

Back
Top Bottom