Looking for some advice about structure and importing data (1 Viewer)

noaccessidea

New member
Local time
Today, 15:58
Joined
Nov 29, 2019
Messages
14
I need to import a large file of student data in to a database so I have started to set up the structure but things aren't working as I hoped. I have split most fields in to different tables and added an ID field to link them. However, I'm really confused about how I import my data in to this structure. The ID fields are numbers but the spreadsheet I need to import obviously isn't.



I mistakenly thought I could use an update query and send the data to the relevant table values but I quickly notices that this was silly and just created tables with lots of duplicated values.



How do I got about this? I would really appreciate any suggestions as I don't want to go through 5000 records typing in ID numbers for each.



This is the table:
 

vba_php

Forum Troll
Local time
Today, 10:58
Joined
Oct 6, 2019
Messages
2,884
I don't want to go through 5000 records typing in ID numbers for each.
r u aware that excel has an autofill handle control whereby you can type 2 numbers into 2 cells, one after another, and then drag that handle down 5000 more records and it will fill in the autonumbers for you? or IS this an excel file that you're importing? i would guess that a lot of other data management software might have something similar to the fill handle. do u know if ur program does? or, maybe you can import ur data set in excel FROM ur source program? if you get it in excel and it's not all over the place, the autofill handle would work fine.

if not, can you upload the source dataset that uve got?
 

Cronk

Registered User.
Local time
Tomorrow, 02:58
Joined
Jul 4, 2013
Messages
2,770
First thing is to import your data into a table in your database. Set the Data Type for all of the fields Short Text in that import table. I've found that often there will be data issues where say a date in Excel (or whatever) is not in the right format and records like that will not be imported into a date field.

Call the imported data table ImportedStudents or whatever. Fix any data issues that arise.

Add an autonumber field to the imported data table. Call it something like ImportedStudentID. Add the field to your tbl_Students. This is a temporary field to facilitate the filling in of the various foreign keys.

Use an append query to import the fields you want in tbl_Students as well as ImportStudentID. But ignore the foreign keys for the time being. Fix any data issues that arise from any text data in ImportedStudents not being proper format for date or numeric fields in tbl_Students.

If you haven't already done so, populate your lookup tables.

To set the RecordType_ID in tbl_Students, use an update query to insert the foreign keys in tbl_Students by joining the tables ImportedStudents, tbl_StudentRecordType on RecordType (the text field, not the ID) and by joining the table ImportedSudents, tbl_Students on ImportStudentID. The update query is to insert the RecordType_ID for each student.

Repeat for each of the other lookup tables.
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 10:58
Joined
Feb 28, 2001
Messages
26,999
There are ways to approach this. They usually involve something similar to the process I am about to describe.

First, let's address the names. I'm going to make your life easier but you don't see it yet. Trust me...

The name Reapply Every Year? cannot stay that way. Special characters are just not good and can lead to confusion. Further, having spaces in names means you MUST enclose the names in brackets "[]" every time you type them. But if you run the names together and drop the space, that is no longer required. Less typing, and once you get used to it, no loss of readability. Take my word on this fine point.

Now, lets get down to work. You are going to get the best results, I think, if you do this in a "divide and conquer" approach. So in summary, I'm going to get your data into Access, then work on breaking apart the stuff you need to break apart, then get translations for each field you are defining as a lookup, then merge everything back together for the final disposition. Hold on to your hat, it is a bumpy ride.

Start by importing the raw Excel file to Access in a big honkin' table to act as a staging table. No keys, just raw data. Name the fields according to what they are, taking my advice into account regarding special characters and spaces.

Now build a two-field table for each thing you wanted to look up. We sometimes call this a lookup table (which should NOT be confused with a lookup field, something we treat as a big no-no.) Each table should be empty at the moment. In each such table, define an autonumber key for each table to act as your xxxID field and have a corresponding xxxText for your description (or xxxName, suit your fancy). I will use xxxText in my description of what to do.

Now one definition-field at a time, use something like this (and the <<>> elements are directions, so omit them and substitute the indicated values):

Code:
INSERT INTO <<put definition name here>> ([COLOR="Red"]xxxText[/COLOR]) SELECT DISTINCT [COLOR="red"]xxxText[/COLOR] FROM <<put temporary table name here>> ;

Seems like a lot of typing, right? Build one query and save it as a named query. Now before you start to build the next one, COPY the saved query and paste it to a new name, then edit it to drop in the names appropriate to the operation. I have highlighted the parts that would be unique to each field.

As an option, you COULD impose an "ORDER BY xxxText" if you wanted the translations to be alphabetically ordered. If so, that would follow the name of the temporary table. If you don't do this, you will not be able to predict the order of the numbers that will be the key to this table. This step builds your translation tables one at a time.

BEFORE YOU PROCEED!!!!!

Look over each table to be sure that you don't have spelling anomalies that caused you to have near-duplicates that SHOULD have been the same value but weren't. Because using the SELECT DISTINCT syntax rules, if two fields are spelled differently, they are distinct whether they should have been or not. It will be up to you to decide how to reset the bad spelling cases - leave them or correct them in the temporary table.

Now go back to that temporary table. As long as you have only the number of fields you showed, you should not run into Access field-count limits. Add one LONG (integer) numeric field (xxxID) next to each description field (xxxText) but leave all such fields in that temporary table =zero. You don't have to set them to zero; it is automatic. You must save the table when done so that the records will be updated in the proper format, but again, that should be automatic just by doing the save.

Next, we associate the translation or lookup ID back in the main table. For each ID field, do something like this:

Code:
UPDATE <<temporary table name goes here>> INNER JOIN <<definition table name goes here>> ON <<temporary name>>.[COLOR="red"]xxxText [/COLOR]= <<definition name>>.[COLOR="red"]xxxText[/COLOR]
SET <<temporary name>>.[COLOR="red"]xxxID[/COLOR] = <<definition name>>.[COLOR="red"]xxxID[/COLOR] ;

Remember what I said about building one query, then just do a copy/paste to a new one and only change the indicated fields?

Now build the final table with every xxxID field (but no text fields where a definitional field occurred). Anything that isn't in a definition gets imported directly; anything in a definition text is not imported but the corresponding ID will be used. For snorts and giggles, put an autonumber on the final table but remember that you won't actually name that field for the INSERT. That is because Access will fill it in automatically.

Code:
INSERT INTO <<final table goes here>> (Student, StartYear, EndYear, OSSID, ReapplyEveryYear, Identification, RecordTypes_ID, ...., CourseLevel_ID, MentoringTypes_ID) SELECT Student, StartYear, ..., RecordTypes_ID, ..., MentoringTypes_ID) FROM <<temporary table goes here>> ;

Almost done. Look it over, and ... if you got no import errors; if you got no translation errors; if all looks well, then ... delete the temp table and those queries you used to populate the translations. Now perform a backup and then do a Compact & Repair.

Your relationship diagram looks quite reasonable, and remember that if you want to build queries that exercise the lookup relationships, having the field lookup relationships defined like that means that the query builder will automagically build the right kind of JOIN with the right kind of reference for you.

EDIT: Cronk gave you similar advice, but briefer. Mine took a while to type.
 
Last edited:

Pat Hartman

Super Moderator
Staff member
Local time
Today, 11:58
Joined
Feb 19, 2002
Messages
42,970
After you have built all the lookup tables as Doc described, you need to use those tables as part of your import process. Create a query that joins the linked spreadsheet to each of the lookup tables (using left joins) on the TEXT field. For the append part, pull the ID field from the join rather than the text field from the spreadsheet.

PS - some schema observations.
1. Always use meaningful names as the PK. That means IDentification in tbl_Students should more properly be Student_ID
2. Unless you need two references to the same lookup table in the same child table, always use IDENTICAL names. i.e. Advisor_ID --> Advisor_ID. Don't make one of them plural. It just leads to typos. Remember, you never need both the FK and its related PK in the same query so you won't have a duplicate name problem. The queries should only include the FK field because that is the one that is updateable.
3. some of your relationships are backwards. A student probably has more than one email address so the email table should have a FK of Student_ID and you would remove the Emails_ID from the students table. A student has more than one course, etc. Some of these relationships will be 1-many (email) but others (courses) will be m-m and will therefore require a junction table.
4. Mentoring type might need to relate to studentCourses (the junction table) if the student can be mentored for multiple subjects and the mentoring type might be different.

You need to rethink all your relationships before continuing.
 

plog

Banishment Pending
Local time
Today, 10:58
Joined
May 11, 2011
Messages
11,611
You've over normalized your data. Tables with only 1 real field of data (autonumbers aren't real data) do not need to exist.

Since all those 1 field tables are on the 1 side of the relationship, those tables do not need to exist. Simply keep the text data in tbl_Students.
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 10:58
Joined
Feb 28, 2001
Messages
26,999
While I generally agree with plog, I'll add this disclaimer: If that one big table is the ONLY place where you need those values, he's absolutely correct. However, if they appear in multiple tables with the same meaning and translations, then they need to be separate for normalization purposes. You didn't address that, so I'll bring it up as something to consider.
 

noaccessidea

New member
Local time
Today, 15:58
Joined
Nov 29, 2019
Messages
14
Wow, thank you for all the amazing responses, I wasn't expecting so much help and insights. I will have a try tomorrow when I'm back at work and we'll see where we are.

I think linked tables is the key thing I was looking for, it solves a lot of problems I was facing when trying to create forms and queries as well.

I'm glad someone mentioned the normalisation, this something I have been thinking about this weekend. I think some of the fields are not really necessary (just there because it helped in Excel to do some stats when asked). Some of the fields will definitely be used in other tables but a lot will always have a value and won't be needed for anywhere else so I will review this to make it simpler.

I saw something about fields which may not always have a value should be split into their own table, is this correct?
 

vba_php

Forum Troll
Local time
Today, 10:58
Joined
Oct 6, 2019
Messages
2,884
I think linked tables is the key thing I was looking for, it solves a lot of problems I was facing when trying to create forms and queries as well.
what exactly r u referring to when you say "when trying to create forms and queries"?? what possible problems could there be? as far as I know, linked tables have nothing to do with the ability to query the data and display the data through forms. Other than slower speed perhaps.
I saw something about fields which may not always have a value should be split into their own table, is this correct?
i seriously doubt it. there's data all the time in records that have blank fields. if fields were be put in a seperate table everytime they were blank or had the chance of being blank, most peoples' dbs would end up getting very large very fast I would think. not to mention the fact that a violation of normalization would probably result.
 

noaccessidea

New member
Local time
Today, 15:58
Joined
Nov 29, 2019
Messages
14
what exactly r u referring to when you say "when trying to create forms and queries"?? what possible problems could there be? as far as I know, linked tables have nothing to do with the ability to query the data and display the data through forms.


It was probably just me not knowing what I was doing and not having the data structure correct. I found that I was getting IDs instead of values and having to create subqueries for both fields and hiding the ID which seemed a bit long winded. I suppose linking tables is doing the same thing (showing the value rather than ID but writing to the ID field when entering a value) at an earlier stage so when I come to make a form I can include the ID field but the user can put values rather than IDs in.



Or am I misunderstanding again?
 

vba_php

Forum Troll
Local time
Today, 10:58
Joined
Oct 6, 2019
Messages
2,884
I suppose linking tables is doing the same thing (showing the value rather than ID but writing to the ID field when entering a value) at an earlier stage so when I come to make a form I can include the ID field but the user can put values rather than IDs in.

Or am I misunderstanding again?
in ur original screenshot of all ur relationships, every single table has an autonumber field in it with the string 'ID" somewhere it the field name. when ur users are entering records thru forms, that field can't be manipulated so those number will fill automatically. then obviously ur users will have to fill the rest, which prolly includes the field "value" that ur referring to in ur quoted text from me. the terms 'value' and 'ID' are used all the time, but just make sure ur not confusing the 2 when it comes to identifying ur fiends.
 

Pat Hartman

Super Moderator
Staff member
Local time
Today, 11:58
Joined
Feb 19, 2002
Messages
42,970
I disagree with plog regarding the 1-data field tables. If you use combos and you want to restrict the values to what has already been defined, then the tables make sense. You could use value lists but that makes you responsible for maintenance every time a user wants to add a new value to the combo list. The better solution is to use tables. That way you can create a form to allow the user to add an item or change the text for an existing item without having to involve you.

Some applications end up with dozens of these little lookup tables and creating user interfaces to manage all of them can get tedious so more than 40 years ago, I created a table of tables mini-app in COBOL with an IMS database as the BE. Today, the mini-app is Access with either Jet/ACE or ODBC tables depending on your choice. This allows me to use these "little" lookup tables and not be burdened with creating maintenance for them. As long as a list has only a long name and a short name, it fits my definition of a list and my little mini-app includes settings so you can enforce security and allow only certain users to update certain lists and you can tag an item as inactive without removing it from the list. This will just prevent the user from being able to assign the value to a new record but will leave the value in existing records.
 

Attachments

  • TableMaintExample190820.zip
    643.6 KB · Views: 103
  • Code Table MaintenancePPT.zip
    340 KB · Views: 113
Last edited:

Pat Hartman

Super Moderator
Staff member
Local time
Today, 11:58
Joined
Feb 19, 2002
Messages
42,970
I saw something about fields which may not always have a value should be split into their own table, is this correct?
This is an extreme view of normalization. In the real world, many pieces of data are optional. For example, even after all these years, many people still don't know the +4 for their zip code. to uniquely identify their address. Would you put that field in a separate table? NO. Also, Access records are variable length so fields with null values take up only the space required for their "length" code. So even if a field could hold 255 characters, it takes up only a single byte if it is null. Of course if you fill it with spaces, it may look empty but it will now take the full 255 +1 for the length so do be conscious of that as you load data.

If you were creating an Entity table and the entity could be students, teachers, guardians, vendors, etc, you might consider having a 1-1 table for each entity type to hold the attributes specific to the Entity since if the entity is a Guardian, it will have a home address but not a campus address whereas a student or teacher might have both.
 

noaccessidea

New member
Local time
Today, 15:58
Joined
Nov 29, 2019
Messages
14
I just wanted to say thank you for all the responses to this, it has been an enormous help. I have had to scale down the scope of this project for now but it the tips here have given me a much better idea of how to do it when I revisit it.
 

Users who are viewing this thread

Top Bottom