Need some guidance with normalizing data from excel import

Chaga

Member
Local time
Today, 21:01
Joined
Aug 19, 2022
Messages
34
Hello everyone,
I am an ex access developer, but haven't done any work in the past 15+ years so am quite rusty.
Would love it if you could give me some insight as to how to proceed with my issue(s).
I've decided to get out of retirement and help a friend of mine who's in HR, and migrate her data from excel to access. The excel sheet stores mostly CV info about people, and there are around 3,000 records. She just typed in the information the way it was in the resumes.
I need to normalize that data to be able to store it in a db, but am not quite sure as to which work is better done in excel before importing, and which must be done after.
I'll list down all the columns I have in the sheet with my comments next to them.

ID : I added this one to have a record ID
Name : This field contains composite names, always starts with first name, but might have up to 5 middle/last names mostly for Spanish/Portuguese/Arabic/Asian people. The problem I am facing is that for Spanish/Portuguese, the format could be FMML (AYRTON BRENO FARTURA TIAGO), while Arabic it could be FFL(Sheikh Omar Sharif) or FLL(Hassam Abdul Rasheed). What would be the best approach into dealing with that? Use first as F and last as L, with everything in the middle as M?
Phone & Alternate Phone : Here the format varies, but shouldn't be hard to fix. Ive got 1111111,+11111111, (1)111 111 etc...
Location : No Issue
Previous Positions : These cells contain all previous positions in bullet points. I have managed to separate them using the text to column feature in excel, so now I have a sheet with ID, POS1,POS2,etc...
Interview1 Time : NI
Interview1 Date : NI
Interview1 Interviewer : NI
Vaccinated : Has the vaccine name and number of shots taken or simply a yes
Last Salary : Has some that are different currencies
Expected Salary : NI
Contracted : Has Y/N or current company name with date till end of contract.
Visa : Y/N
Applying Position : Has separate positions separated by "/" (Stage Manager / Cast Manager / Creative assistant director / Choreographer)
Feedback : NI
Rate : 1 to 5 with .25 increments
Shortlisted : Y/N
Interview2 Time : Need to be added
Interview2 Date : Need to be added
Interview2 Interviewer : NI
Manager Feedback : NI
Manager Recommended Position : has bullet points and "/" (•Cast Manager / Coordinator •Stage Manager / Coordinator)
Manager rating : Same as Rate
Shortlisted for final interview : Y/N
Past Experience : Bullet points with a list of the previous companies

I also need to add a final interview time and date

I have attached an excel sheet that has the basic table structures I am planning on creating, any input/corrections are more than welcome.
I am using office 2016, but last one I worked on was probably 2006, so have no idea about any new features.
I wish the data didn't have personal information, or else I would've attached a sample.

Thank you in advance.
 

Attachments

The first thing I notice is that you mocked up a database in Excel. For 10% more work you could have actually had something to start with. You would have been able to use the Relationship Tool, been able to throw some sample data in there to test a few things and actually have something you could use.

That aside, it looks good for the most part--other than not being able to see the relationships among tables.. The only thing I see is I'm not a fan of tables with only 1 real field in them (autonumbers are not real pieces of data). tblCountries, tblGenders, tblDivisions etc don't need to exist. Instead of using their ID as a foreign key into whatever table you should just store the actual value.

Make this in Access, throw some sample data at it and then see if it serves your needs. Then import everything into it if it does.
 
Breaking up composite names is not as simple as taking first word as first name and last word as last name. A first name could be Betty Jo. A last name could be Van Buren.

Also, do not use spaces nor punctuation/special characters in object naming.
 
Breaking up composite names is not as simple as taking first word as first name and last word as last name. A first name could be Betty Jo. A last name could be Van Buren.

Also, do not use spaces nor punctuation/special characters in object naming.
I am aware of that, but other than manually entering names, what would be the best approach?
 
The first thing I notice is that you mocked up a database in Excel. For 10% more work you could have actually had something to start with. You would have been able to use the Relationship Tool, been able to throw some sample data in there to test a few things and actually have something you could use.

That aside, it looks good for the most part--other than not being able to see the relationships among tables.. The only thing I see is I'm not a fan of tables with only 1 real field in them (autonumbers are not real pieces of data). tblCountries, tblGenders, tblDivisions etc don't need to exist. Instead of using their ID as a foreign key into whatever table you should just store the actual value.

Make this in Access, throw some sample data at it and then see if it serves your needs. Then import everything into it if it does.
I am an old school access user where we'd make tables for everything to enforce referential integrity.
It's not gonna be a big db, so no harm done here.
I've created a basic db and also added an excel with some data. As I said, I mostly need to figure out what work should be done in excel, prior to importing data, and what can be done after the import.
Thanks
 

Attachments

I am aware of that, but other than manually entering names, what would be the best approach?
However you automate break up of names, really need to do a review of each and every name to make sure the splitting is correct for each. There is no easy way to ensure this data integrity.
 
Last edited:
I am an old school access user where we'd make tables for everything to enforce referential integrity.
It's not gonna be a big db, so no harm done here.
I've created a basic db and also added an excel with some data. As I said, I mostly need to figure out what work should be done in excel, prior to importing data, and what can be done after the import.
Thanks
Using Data - Text to Columns in Excel gives you the following output
 

Attachments

  • Names.PNG
    Names.PNG
    17.9 KB · Views: 223
Or are you saying that it should be stored at the table level in a combo with a value list?

This one. Usually. More often than not, you don't even need a value list.

You don't need a Gender table nor is it hard to administer it with my method. Also, Chaga has tblRatings with an autonumber primary key that is used to look up a numeric field. So instead of storing the numeric value needed he's storing a numeric value related to the actual number needed. Unnecessary.

I've seen this play out time and again--either using a table to relate to just a number or used to hold a virtually immutable list. People read up on normaliziation then over do it. Not everything needs its own table.
 
Well I used to have a Value list for a combo consisting of Male and Female just because that was all that was needed at that time.
That no longer flies these days. :) One would be best having a table for all the new permutations? :)
 
Yikes, I really touched a polygamy/islam/gender nerve with my last response. I apologize. I will try to avoid those topics as well as Trump, racial equality, homosexuality, abortion and the moon landing in this response. But no guarantees.

Your insurance issue is not applicable to this discussion. Codes don't live in a vacuum. By their nature they are references to other data and are data themselves = 2 pieces of real data.

Also, no, I didn't read the link you didn't post in this thread.
 
I am also confused by Plog's statement. Why would you not have a table of countries even if there is only ID and CountryName fields?
 

Users who are viewing this thread

Back
Top Bottom