Cleaning up old data (removing unwanted characters, etc.)

memilanuk

Registered User.
Local time
Yesterday, 21:49
Joined
Apr 8, 2009
Messages
33
Not sure if this would be better off in another sub-forum or not...

So... I'm dealing with importing and cleaning up old data from a Works DB. I saved it as a CSV file, imported it as a new table, and now I'm working on 'scrubbing' it - cleaning up the cruft and generally making it presentable.

Most of my experience is with Excel, and honestly I'm about half-tempted to export it and do my clean-up there and re-import it.

The scrubbing falls into a couple main areas:

1. The person maintaining the data before insisted on adding an asterisk to the end of the 'Last Name' field to track a specific attribute rather than add another field. Now... I'm having some trouble filtering for '*', being a wild card and all ;)

2. Same person saved the phone numbers in about every conceivable manner - so now I have lots of '/' and '-' in the middle of strings. Any suggestions on how to get rid of 'em?

3. Most (but not all) of the text fields like 'Last Name', 'First Name', 'City', etc. were entered in ALL CAPITALS, which makes the mailing labels look pretty screwy. In Excel I would do some tinkering with ucase(), left(), etc. and use one column (field) as the source for another, and then copy the newly massaged data to a third, leaving the formatting commands behind. Not really sure how to go about it in Access?

I realize there will be a fair amount of hand massaging that has to occur anyways... things such as initials, two-part city names, etc. Starting to wonder if it might be simpler to just convert everything to upper case rather than the other way around. Suggestions?

TIA,

Monte
 
To deal with the asterisk and the other characters, use an update query and try out the Replace function. You should be all go handle all of those issues with that.

For the case issue, if everything is in upper case, you can use the mid function (you will also need to use the Len() function) and the LCase function to update every thing past the first character to lower case.

Take a look at the help file to learn more about these functions.
 
as this is a one time thing - if you ARE more comfortable tidying the data in excel, i would do it there.

access can probably do it quicker, and certainly you can automate it easier, but if its just a one-off theres no real advantage.

you may only be able to achieve some tidying by inspection, anyway
 
Given that there is no guarantee that the Fine gentlemen doing the data entry won't either revert to his old behavior before I get the input masks and data validation fully sorted out or conveniently 'lose' the new file... I may have to do this again. Might as well learn to do it this way if I can save it as a query to update things if it'll save me some headaches down the road.

Update queries, eh? Well that gives me a direction to read in...

Thanks,

Monte
 
Monte,

To give you a little better idea of what I meant, to get rid of that asterisk, create a query on your table. Add the Last Name field (you can add any other fields that you want to update, as many as you need) then make the query into an "Update" type query (use an option from the tool bar). Then in the "Update to" row of the Last Name column, place the following:

Replace([tblTestData]![Item1],"*","")

You can also use the replace function (just modify the example above) to get rid of the characters in the phone number field. You many have to create more that one version to get rid of all of the different characters in the field.

Hope this helps.
 
That was a huge help, thanks! Got the unwanted characters stripped out and the input masks tightened down a skosh for the phone fields, looks much better!

I'll have to spend some more time deciding what to do with the case on the names... I'm thinking it might be less hassle to make everything uppercase and call it a day.
 
I'll have to spend some more time deciding what to do with the case on the names... I'm thinking it might be less hassle to make everything uppercase and call it a day.

Did the StrConv function not work for you?

StrConv({expression},3) will capitalise the first letter of each word.
 
Just haven't had a chance to focus on that. I've been going from more or less ground zero and importing the data to mostly functional with various forms and reports and queries and macros to tie them together over the last two nights. Proper case was moved down the priority list a tad ;) It's coming up fast, though.

Thanks,

Monte
 

Users who are viewing this thread

Back
Top Bottom