Your Esteemed Help, Please?

GomerFackworth

New member
Local time
Today, 02:33
Joined
Sep 18, 2006
Messages
5
I have many varied lists of citizens from our town from 1700s to present day. Some have come in Excel, or Lotus format, most are docs. At this point I have been able to parse everything into Excel, however I now have about 60 Fields because of the variety of data garnered from each list. Over the past year I have used formulae to parse the data, usually from a "Notes" field that includes just about anything from genealogy, military, occupational, historical articles and artifacts, to the condition of the gravestone, etc. Obviously, aside from names, birth and death dates, parents names, most of the other data is sparse, sometimes only 50 individuals out of 3000+. At this point, my ignorance about what one can do in Access has kept me from making the changeover, but I need to do something with what I have because it is now far too unwieldy in Excel. Worse yet, I was just informed that another batch of a couple thousand records has been found, and I will probably have to repeat the process with all the new records later.

I have a meager understanding of Access, but am trying to read about it in my spare time. So I have some generalized questions about what Access can do, as it will affect how I next manipulate the data in Excel. I appreciate that your time is valuable, so I’m not asking you how to do this as much as asking you if it is possible to do it in Access. I would appreciate it if you only say “No that is not possible/the right approach/etc” or if you suggested a method of attack by saying “look up ‘how to add a primary key’ or ‘use the ** wizard’”, and I will have a hint of how to attack the problem. Thanks muchly for your help and your patience.

As I have many redundancies in individuals (a parent may have had six children for example), I have sorted all names alphabetically, and then assigned a number to each row. I then repeated the number column and deleted the number from the repeats (i.e. Lee 1, Lee 2, Leed 3 becomes 1,3) and can then sort for unique individuals. I then imported the names into Access and got a primary key for each unique individual, then put that key back into Excel and copied the key to all the Lee 2s that had no key so that all listings have a primary key. Now what is the best way to get all the rest of the data into Excel? (These questions are not necessarily in any order of execution.)

1. I understand that I can have Access take the Excel file in its entirety and make a db out of it.
· What will happen to the Primary Key I have already assigned?
· Can I then import another spreadsheet to the same db from the new files I will have?
· Or should I make a second db from the new files and merge 1st and 2nd together?
· And finally, must all Excel files have all 60 field columns (in the case of the docs, there is generally only names and one or two other items, such as registered voter [yes/no] and nothing else)?

2. As there are a number of repeats of names and so many relationships, I have assigned a Personal ID (not the Access primary key) in Excel to each person so that I could list the parent’s PID with the children and vice-versa, as well as husband/wife, etc. (There are 6 John Smiths with a total of 17 children.) And finally, there are many names of people from other places who will not need to be in the db for the town (the names of parents of a woman who married a man from our town, for example).
· Can Access search for those extra names if they are only noted in the Father/Mother fields, or should they be listed as Lastname, Firstname and given a primary key of their own?
· Though I think it might end up being redundant in the db once I have a relationship table set up, should I keep the PID with each individual? (The problem is that unlike a primary key, my key has numbers and letters [PID123], and I’m not sure if it is possible to format an auto-incrementing number.)

3. I noticed that in the forms samples in Access, one can use a number of tabs, or macros to switch to different forms. As I am hoping that this db will outlast me, and as most of the data entry will be done by those with little or no experience in db construction, I want to keep this db as simple as possible. Since tabs seems very simple to use, is there any advantage of using the macro buttons vs tabs?

4. There are a number of dates (birth, death, marriage, enlistment, discharge, etc) but they are not all “related” in the sense that enlistment and discharge could be in a military table and birth and death on a personal info table and marriage on a relationships’ table, so would these dates be better normalized or just left as items in different tables? (The issue is a memory one in that few have enlistment/discharge dates, and some have only death date, etc.)

I think that is enough for this forum thread. Once again, thank you for your time and patience.
 
A couple of suggestions to help you do your homework.

It sounds like you are compilating a list of individuals who has lived in a particular area for a given time? Go to Microsoft.com and look at samples of Contacts database. They should make a good base for you, and you can modify the tables or add for your own needs and give you a good idea of what are the relationships.

Since you will want to relate childrens to parents and so forth, and both are same entities (e.g. they have same attributes), the best approach is to self-join the relationship. Take a look at this excellent article by Allen Browne that will help you understand how you can relate children to parents and other things.

When you've looked at those, go ahead and modify one of template until you have all datas you know you will need, then post a picture of the relationship then we can then help you better and work through the normalization.

I hope that helps.
 
Hi Gomer

Lots of information in your post.

I've helped literally hundreds of businesses to make the big jump from Excel spreadsheet to Access so you are in a very big club. Sometimes it is hard to make the jump from flat file mindset to relational mindset but once you've done it you'll wonder why you stayed with Excel for so long.

You ask a lot about primary keys. My firm advice is to simply make your PK's meaningless autonumbers. You can police uniqueness with indexes if you need to (there's a thread here on meaningless/meaningful if anybody wants to take up this issue).

To get the data OUT of your Excel sheets and INTO Access set up an ODBC link to your spreadsheet and use Append/Make Table queries to bring in the data you need... and that brings me to the next bit!

You can't learn Access the way you can lean Excel, Word and PowerPoint. Access is a tool for relational database design and another for putting data in and out of the database once you've designed it. You need a lot of theory to get it right. The core skill you need is to be able to drive the query designer to the max. When you can do that you'll go a long way towards your objective simply by using the designer in conjunction with Excel (Office Tools-->Analyse it with Excel). Get there first... and only then progress into form and report design.

Getting the schema right will be difficult for you but (as suggested by Banana) start by putting all of the field (column) names in the spreadsheet into one big table (you can do this automatically with a make-table query but you are on a learning curve of at least a day to master that). Post your table as a schema on this forum and we'll show you how to convert it into a relational model... easy!

When I teach Access classes it takes one day to get people going with Access, another day to get them to "expert" level (interactive) and three more days to get the bare bones of programming with VBA. Reading your post you should be able to self-teach what you need in a day (apart from the scema design and we're here to help with that).

Good luck with your project
 
Mike has hit this on the head - spot on
I have had to learn as I went along
a day understanding the prinicples would have made life so much easier

-In some of the post (and you have done this ie you understand the problem and asked correctly )I reply to them show you understand what you are trying to do and you have - strip it down to its simplist element again you have now look at it -
you have done all that these Access Gurus (and I am not one of them ) have asked - with this you will certainly get a comprehensive answer-solution

G
 
just a few thoughts here

First, the SEARCH function of this forum will reveal a few hits on "Geneology" (if you spell it correctly, which I never do... :p )

Second, one thing that might be useful is to understand that importing from Excel to an existing table is sometimes wrong. Sometimes it is better to import to a separate table and let it blow up in your face in order to "experiment" with it - to find bad records and such. Also, once Excel data has been imported, there is no need to repeatedly import it. You can just write queries to append from the imported table (as a temporary) to your new table or tables - with fairly good assurance that you can maintain the format in the destination table regardless of what the original import did. When finally you have drawn it all out, you can delete the temp table. If you use this approach, non-uniform imports from Excel don't matter. It is what you do with the subsequent extraction/append/insert/update queries that become important.

Third, regarding names. You are correct in this case (because the names are not unique) to assign an arbitrary unique key. Here is a guideline for layout. If you are going to EVER want to search by last name, then NEVER EVER import first name, middle initial, last name AS A SINGLE FIELD. It is very easy to import them separately and concatenate them for reports. It would be infernally difficult to import them as a single field and separate them later. This is basically the principle of making your DB's structure support actions you might later want to perform. BUT you are probably NOT doing yourself a favor with that PIDxxx format. Just give them numbers, which take up less space digit for digit than a mixed-format number, and supply the "PID" text on reports as a constant - which it is, isn't it? NEVER store a constant of that class in a table. Always supply it in places where you would SHOW the constant. Simplify.

Tabs vs switch macros: When a form has tabs on it, the general rule of thumb is that all tabs should relate to the same general record. In my case, I had a server database with one record showing me computer configuration, network information, disk information, user (aggregate) information, and maintenance information - separate tabs for each. In some cases, the tabs held sub-forms. In other cases, the tab just showed data straight from the server table. But the point was that the tabs all referred to the same server.

Use a switch macro when you are going to point to someone else by following a relationship between the person you are viewing and the person you wanted to see.

As to dates... there are reasons to go either way. I might build a "significant dates" table to show data of birth, death, marriage, service record, etc. The size of the table isn't that much of an issue but its sparseness might be. Not every person in your DB would have dates for marriage, service separation, graduation from college, etc. etc. etc. But here is the deciding factor... the dates in question are so numerous in possible meaning that either you have to reject storing some of the dates or give yourself the maximum flexibilty in doing so. A parent/child table does that.

The other side of the coin is that it complicates the issue when you want to validate that, say, person A, listed as the father of person B, was actually born before person B. But that sort of thing is possible using multiple JOIN queries to produce a synthetic record telling you what you wanted to see.

As to whether you would assign PID to people not from the town... YES. Consistency is the key to simplifying the relationships you must express, even for persons who might not have any other records.

This leads to issues of just how much you keep in the main table vs. how much supplemental information you would keep in separate tables. There is an onging argument about the meaning of one to one relationships. It seems to me that the people not residing in your town make a case for one/one relationships in your specific case, because you want to minimize things to be kept about persons when they are not proper candidates for full inclusion into your database, but you still want to know their names.

Just remember that one and ONLY one table can be "central" to this whole operation. All other tables will probably be one/one with it, except that since there is a case where some records WON'T exist in other tables, you need to make all such tables one/many. This is because of the "dirty little secret" associated with one/many relationships. They actually encompass one/many, one/one, and one/NONE!

To me, your person's name/ID table is the central table, and all other tables would depend on it when referring to a person by number.

As to how you would express relationships between two people in the same table, look up topic JUNCTION tables in this forum. You will also see good descriptions on this concept as many/many tables.
 
Last edited:

Users who are viewing this thread

Back
Top Bottom