Desperate for help- 2 problems

eyal8r · Aug 30, 2007

Hey guys-
I am building a database for my own use- I download info from the Realtor MLS system and then run queries against it. I don't have control over user inputs whatsoever. I download the data into a DBF file- but can save it as an excel or whatever as well. I then am importing THAT data file into Access 2002. I have run into 2 problems right off the bat...

1- the Field names in the data file I am importing are very difficult to understand. There's nearly 100+ field names, so it could get confusing when I'm programming/creating queries. I'd like to change the field names upon import- or- have some way to match the confusing field names in the data file to the easier field names in the table. What's the best way to go about doing this? I tried using Captions- but they dont display the name of the field when using queries. Is there a field matching wizard upon import (maybe a set template for this) or something?

2- One of the fields I will be using to query a lot is the subdivision name. Since the users type this in, I have no way to control what they type. Sometimes they type in 'Willows', sometimes it's 'The Willows', 'Willows Unit 2', etc. Basically, I need them to be restricted into a common subdivision name upon import. I am guessing I would have a table with the 'acceptable names' in the list- and upon importing, take the ones that do not match and ask me to match them up manually. -OR- have a way to where it keeps track of every mispelled or variation of the name in a list so that it accepts/matches is up in the future without problems. Occasionally there may be a NEW subdivision name, so I would need to add it to the subdivision table. I think this is going to be the hardest part. I assume I am going to use the LIKE statement- but have no clue where to start with this. There are literally THOUSANDS of records going into this- so it needs to be able to recognize them all. Obviously I'd like to cut down on the manual overriding aspects as much as possible.

I think I can handle the rest of it, once I get past this (it's just forms and queries thereafter). But it's these 2 things I'm stuck on. Any ideas?
THANKS!

The_Doc_Man · Aug 30, 2007

Problem #1

Import the EXCEL file into Access keeping the confusing names.

Then write queries to pull out the stuff you want as Insert or Append or Maketable, which would allow you to include this syntax:

{action-verb}.... [oldname] AS [newname], ....

where the action-verb might be INSERT or APPEND as appropriate and the oldname comes from the raw import and the newname is what you will call it in the target table.

This would allow you to selectively populate tables and run all sorts of other queries on the raw file to delete nonsense records before you do the final import. You would do your INSERT and/or APPEND queries LAST. Other things to do first, like addressing your number 2 problem while you still have the junky table to work with.

Problem 2

Oh, my GAWD is that ever a difficult problem. Filtering unguided user input is SUCH a tough problem that if you ever solved it on your own, you could retire with a software product as pervasive as Windows itself. Hell, Bill Gates would buy you out in a heartbeat.

OK, so how might you at least approach this problem?

Consider the import process I described earlier. You've got this humongous table that you want to import to your "real" tables so you can discard the hunk of junk in the ugly format. Before you do that final set of APPEND/INSERT queries that extract the jewels from the rough, so to speak, ... add one more field to the hunk of junk. Call it "SubdivX" - which will be the subdivision after you validate it. For this field, make its default value equal to an EMPTY string ( "" ) or perhaps a blank ( " " ) - note the space between the quotes in the latter case.

Then take that table you described where you would put the common wrong answers for subdivision issues. As you correctly surmise, this might have to be updated frequently to show any new mistakes people start making (just when you got used to the old ones...) To save yourself the headache, start this table having entries in which the correct subdivision name appears in both fields. I.e. if the schlemiel entering the data got it right you can tell that it is right. Doing it this way makes it a little easier.

Do an update query based on this table that shows the correct name as field GoodName and a common incorrect name as field BadName. Then you could step through the hunk of junk table with an update query for that one field, SubDivX.

The update would be to the value of something like this (and look up DLookUp if you are not familiar with it):

DLookUp( "[GoodName]", "SubDivNameRef", "[BadName] = """ & Trim$([subdiv-name-from-hunk-of-junk]) & """" )

You put this in the UpdateTo cell of the query grid for the update query that updates SubDivX field.

OK, you aren't done yet...

Now run a query that shows you only the hunk-of-junk entries that still have blanks in the SubDivX field - side-by-side with the raw subdivision info. You have some choices here, including that you might enter the new variants into your lookup table and try again, OR you can manually enter the corrected answers through this query and update your lookup table later. Whatever floats your boat.

When all is said and done, you can dump your hunk-of-junk table (and don't forget to compact your DB after any large-table deletion).

eyal8r · Aug 30, 2007

ok- just so that I am clear...
I understand the first part of having 2 tables (hunk of junk and good one).

I would then have a separate table to store all the variations of the subdivision names in them. It might look like this:

BadName GoodName

The Willows The Willows
Willows 2 The Willows
Willows Unit AMD The Willows

I would then run a query for any possible matches on the BadName field from the imported table- then substitute in the GoodName field to the imported (HunkJunk) Table's SubDivX field.

Now- all the ones that come up as blank- I have to manually add them into that Subdivision Name table after I figure out where they go. Over time, I would eventually have nearly every name variation typed in, as the subdivision name table grows. Then I wouldn't have to manually enter in too many, except very weird variations or new subdivision names all together.

Is that right?
Makes sense to me! Well- sort of in Theory. Now, I gotta figure out just how to go about all this in practice! LOL.
Thanks for the help!!!

zyxwvu44 · Aug 30, 2007

for the querying, what about something like this? http://allenbrowne.com/AppFindAsUType.html

The_Doc_Man · Aug 31, 2007

Yes, that is pretty much what I had in mind. Over time, your list of known bad abbreviations will grow until very few bad abbreviations need attention. It will get better as it goes along.

The first few iterations of this, you might have to do this update several times, adding new entries to the good/bad translation table as you go. You'll run the update query fewer times the next time, and the next, until the number of unknown subdivisions is small enough that you would be satsified to just do the few remaining updates by hand through the query window that finds the blanks for you.

Desperate for help- 2 problems

eyal8r

Registered User.

The_Doc_Man

Immoderate Moderator

eyal8r

Registered User.

zyxwvu44

Registered User.

The_Doc_Man

Immoderate Moderator

Similar threads

Users who are viewing this thread