Help needed to run monthly cleanup of contact details in membership database

JeffBarker

Registered User.
Local time
Today, 10:18
Joined
Dec 7, 2010
Messages
130
Hi guys,

We've just recently taken on a new membership database from one of our clients, which currently has around 1,000 existing members.

Their database came to us in the shape of an excel spreadsheet with contact details for all members (Title, First Name, Surname, Job Title, Company Name, Address 1 etc), and we've sucked this data in to our main CRM database so that we're able to send out the bi-monthly membership magazine, email circulars and the like.

NB: It's worth noting here that their data has been split out into the relevant fields in our main SiteContacts and SiteInformation tables (contact details in the former, company name and address etc in the latter) for greater use, so although we have their excel spreadsheet sitting in our CRM as a table, that's now just there for information purposes.

Part of our remit is to keep their data as clean as possible, so every month they send us a new version of their excel spreadsheet which contains the data of all their members - and we're required to crosscheck the new data against that which we're currently holding.

The kicker here is that literally any part of a members' details could change - Surname (due to marital status etc), Company Details, Job Title etc...and we're also led to believe that members will receive a new Membership Number each time they renew at the end of their 12-month membership period.

So I'm really stumped on the best way to check the data for any updates or amendments, and I'm hoping that the helpful people in Access World Forums can assist please!!

Thanks in advance!

Jeff.
 
Is the new version a complete list of all their members? If so, why not simply delete the data you have in the tables and import the updated version?
 
Is the new version a complete list of all their members? If so, why not simply delete the data you have in the tables and import the updated version?

Thanks for the reply, TJPoorman! :)

Theoretically, that's the best idea - the latest spread they send us will always be the master version to use - but we'd potentially like to keep a record of any contacts who are no longer members, so although they no longer receive membership benefits they would remain on our system in case they ever joined again.

So in this instance, just removing their magazine subscription from tblCirculation (which holds all our subscription info and links to SiteContacts by ContactID) would suffice.

Currently, the only way we have of identifying members is if their record in SiteInformation has 'CGC Member 2014' in the Description field, and if they also have a corresponding record relating to a CGC Magazine subscription in tblCirculation.

SiteInformation links to SiteContacts via SiteID, and then SiteContacts links to tblCirculation via ContactID.

We do have the previous master spreadsheet, so comparing that against the new one shouldn't be a problem - if that's actually relevant?

EDIT: Thinking more about the fact that we'll be keeping all the previous versions of the master sheet, I've come up with a possible solution!

When we import the new master sheet, we compare it against the previous version and somehow mark any records that have changed with either a chk field or a text box (describing the change(s)), which we then go in and update manually on the system?

It's not the ideal solution, as I'd prefer it to be automated, but realistically I can't see that there will be thousands of changes to the data every month, so it wouldn't be too much of a problem if we had to do it manually?
 
Last edited:
When we import the new master sheet, we compare it against the previous version and somehow mark any records that have changed with either a chk field or a text box (describing the change(s)), which we then go in and update manually on the system?
The problem you have is that everything can potential change - even the membership number and you can't even create a composite key over a number of fields because these can change as well. Anything that is potentially fixed such as date of birth or gender (and even these can be corrected) is insufficient to uniquely identify a single record.

Although this may be sufficient
Currently, the only way we have of identifying members is if their record in SiteInformation has 'CGC Member 2014' in the Description field, and if they also have a corresponding record relating to a CGC Magazine subscription in tblCirculation.
what happens when it changes to 'CGC Member 2015'?

I think you may need to go back to your client and see if they can modify what they are doing so that you can have a unique key - it might be a new number like an autonumber, or retain the existing membership number in another column called 'initial membership number' for example

Once you have that then by comparing the excel data with the db data on this key, you can identify, what is new, what has changed and what 'is no more', i.e. deleted, and take the appropriate action.
 
The problem you have is that everything can potential change - even the membership number and you can't even create a composite key over a number of fields because these can change as well. Anything that is potentially fixed such as date of birth or gender (and even these can be corrected) is insufficient to uniquely identify a single record.

Although this may be sufficient
what happens when it changes to 'CGC Member 2015'?

I think you may need to go back to your client and see if they can modify what they are doing so that you can have a unique key - it might be a new number like an autonumber, or retain the existing membership number in another column called 'initial membership number' for example

Once you have that then by comparing the excel data with the db data on this key, you can identify, what is new, what has changed and what 'is no more', i.e. deleted, and take the appropriate action.

Hi CJ_London, thanks for the reply!

As far as I'm aware, that field will always say 'CGC Member 2014' - at the moment there are no plans to change that, although I guess any additional members from now on will need to be referred to as 'CGC Member 2015'...I hadn't thought of that until now!

I will have a word with my MD, although I think he will be loathe to ask our client to modify the data their end - he very much has a 'can do' approach when it comes to our business (which is good), and feels that we can find a solution for anything to fit around the client!
 
he very much has a 'can do' approach when it comes to our business (which is good), and feels that we can find a solution for anything to fit around the client!
I agree with the 'can do' approach, but to do what you want, you need a unique key and your client is not providing one.

If it helps, I have a client where they send me data and I produce a print ready pdf they can send to their printers and initially I had the same problem, no unique key.

So I took their spreadsheet and added a unique key column (just a number incrementing by 1 for each row) which was then hidden and I locked the worksheet down so they could only enter data in the necessary columns. I also added data validation so they couldn't enter data in the wrong columns. They have about 20000 rows so I created 30000 unique keys - this will be extended once they start to approach this number.

So the spreadsheet looks the same from their perspective and they see and appreciate the data validation as added value.

And I save time because I don't have any data validation issues.
 
I agree with the 'can do' approach, but to do what you want, you need a unique key and your client is not providing one.

If it helps, I have a client where they send me data and I produce a print ready pdf they can send to their printers and initially I had the same problem, no unique key.

So I took their spreadsheet and added a unique key column (just a number incrementing by 1 for each row) which was then hidden and I locked the worksheet down so they could only enter data in the necessary columns. I also added data validation so they couldn't enter data in the wrong columns. They have about 20000 rows so I created 30000 unique keys - this will be extended once they start to approach this number.

So the spreadsheet looks the same from their perspective and they see and appreciate the data validation as added value.

And I save time because I don't have any data validation issues.

Hi CJ_London, thanks again for the reply!

That's a very interesting solution to your problem there, I have to admit I was completely unaware that we could add an autonumber field to Excel, or even data validation - and I can see how that could be applied to us here, providing the data sheet isn't just exported out of a website every month and then forwarded to us.

I think I've found a compromise though (assuming the data is updated manually at client end):

The starters and leavers should be fairly easy for us to identify, as the first list we received from the client had 944 records, while the latest one has 925 - but I had a thought that, somewhere along the line, there must be someone at our client's end entering this information, so I've asked our MD exactly that.

My thinking is that, if there is someone who manually updates data for our client, they must receive some sort of notification from the member to make the change, so perhaps they could copy us in on that information or keep a separate worksheet on the master list detailing changes.

So it could be a decent compromise that doesn't completely put the onus back on the client, but asks for a bit more information each month to help us ensure we run the membership side of things as smoothly as possible.
 
I
have to admit I was completely unaware that we could add an autonumber field to Excel
It is not actually an autonumber field, you just prepopulate it for each row plus some more for new entries.

Assuming the first row of data is 2, enter 1 in A2. Then in A3 enter the formula =A2+1 and copy down as many rows as you want. Then copy and paste special the column onto itself as values to convert the formula to values

downloads from other systems can be frustrating for importing to systems since they frequently don't have a unique ID

If your client is using a download from a website, I would have thought that would have a unique ID within its own dataset otherwise their customer would have to re register every year or every time they changed something, rather than renew/change. It may be that your clients are getting that information but ignoring it since it is a 'meaningless number' from their point of view. If they are not getting it, they may be able to arrange for their download to be modified to include it.

Also consider that your client may be taking a download, reinterpreting it and sending to you (which is what happened with my client) and what you actually require is the uninterpreted data - so you need to trace back to the original source and map each transformation before the data gets to you.

My guess is your client is doing things which are not necessary for what you require - Things like changing column headings, reformatting values or adding calculated values which can all be handled by yourselves.

You have to ask, what is the client changing, and why? And if they don't have a unique ID, how are they deciding that something has changed?

Perhaps you can add value by doing this work yourselves and sending them the spreadsheet rather than the other way round!
 
Perhaps you can add value by doing this work yourselves and sending them the spreadsheet rather than the other way round!

Hello again CJ_London, and thanks for your reply!

So the MD has replied, and apparently the main membership database is operated by our main competitor (which I didn't know) - anyway, long and awkward story short, the data files we receive are not going to be changing any time soon in terms of data, and any requests to change the process would be pointless.

My MD did have an idea, however, of keeping a separate database with the spreads that the client sends us, and running a series of queries against the previous sheets to see what's changed in the latest version, probably using chk boxes or text boxes to say what's changed.

So if you can help with the correct way of running those queries to determine what's changed each time, that would be great! :)
 
It is the same problem - how do you identify a record in one spreadsheet with one in another? Without a unique identifier it is not possible with any degree of certainty.

You could perhaps use the row number - but if they delete a record or insert one in the middle somewhere, that won't work.

You could try firstname/lastname, but what if you have two people with the same name, or a person changes their name

you have the membership id, but that changes every year

You could do sense check inspections - say compare on firstname and date of birth - the odds of that combination repeating is pretty low but having identified a potentially changed record, look at other data in the records to see what else has changed and make a decision - but this could be time consuming.
 
It is the same problem - how do you identify a record in one spreadsheet with one in another? Without a unique identifier it is not possible with any degree of certainty.

You could perhaps use the row number - but if they delete a record or insert one in the middle somewhere, that won't work.

You could try firstname/lastname, but what if you have two people with the same name, or a person changes their name

you have the membership id, but that changes every year

You could do sense check inspections - say compare on firstname and date of birth - the odds of that combination repeating is pretty low but having identified a potentially changed record, look at other data in the records to see what else has changed and make a decision - but this could be time consuming.

Hey CJ_London, thanks again for the response!

Well, for the most part we'll be able to match against individual membership numbers, but obviously sooner or later those will all change as the members renew.

I'm not sure if there's a specific period when memberships run from/to (Jan-Dec, for example) or if a member can sign up in the middle of the year and still get a 12-month membership. I'm hoping it's the latter as that will mean we won't have one month of each year where everyone gets a new membership number!

But yeah, we're both agreed that the way I've been advised to do it will take more time - but the MD is convinced we'll be looking at around 5 changes a month, including new starters/leavers.

Any suggestions on the best way to automate all this in a query or three, please?
 
to find changes your query will look something like
Code:
 SELECT newTable.*
 FROM oldTable INNER JOIN newTable ON oldTable.key1=newTable.key1
 where oldTable.Fld1<>newTable.Fld1 OR oldTable.Fld2<>newTable.Fld2 OR ....

to find additions
Code:
 SELECT newTable.*
 FROM newTable LEFT JOIN oldTable ON oldTable.key1=newTable.key1
 WHERE oldTable.key1 is null

to find deletions
Code:
 SELECT oldTable.*
 FROM oldTable LEFT JOIN newTable ON oldTable.key1=newTable.key1
 WHERE newTable.key1 is null
 
to find changes your query will look something like
Code:
 SELECT newTable.*
 FROM oldTable INNER JOIN newTable ON oldTable.key1=newTable.key1
 where oldTable.Fld1<>newTable.Fld1 OR oldTable.Fld2<>newTable.Fld2 OR ....

Hey CJ_London, this works perfectly, but this query has returned 60 records that have changed in some way - do you know of some way of highlighting only the fields that have changed at all, please?

Thanks in advance!
 
you would need to change the first line of the query to

SELECT *

which will bring both tables through

Then you would need to create a form using this query as a recordsource

you can then either drag the fields of one table or the other or both (depending on your requirements) onto the form

quick tip - select the query but don't open it, click on the Create tab in the ribbon and select more forms - multiple items to create a form quickly

then for each control you would need to conditionally format - you will need to do this one control at a time and some controls cannot be conditionally formatted - e.g. check boxes

To conditionally format, see this link

https://msdn.microsoft.com/en-us/library/office/gg508986(v=office.14).aspx

Basically, for newtable.fld1 you would set

Value ... <>oldtable.fld1

and for oldtable.fld1 you would set

Value ... <>newtable.fld1

then choose a back colour. Tip - if you are showing both old and new tables in your form, use a different colour for each table (e.g. red for newtable fields and yellow for oldtable fields), it will be easier to read
 

Hey CJ_London!

Thanks for all this info, it's really useful - I've managed to cut down the 60 records to the 14 that actually have changes that are relevant to us (contact and site info)!

I should have mentioned that I'm rocking 2007 though...will the above link still be relevant to me?

Also, I have had a thought of having a table in my database that acts as a 'master' table for all of the data - so as new people come on they append there, and it also overwrites the data we already have with any changes to those records...

I thought this way it might be easier to compare any new sheets we receive from the client, instead of potentially have to cross-reference against all of our previously-received sheets?

Do you have any tips on the best way to achieve that, please?
 
will the above link still be relevant to me?
yes - the look might be slightly different, but the principle remains the same

Also, I have had a thought of having a table in my database that acts as a 'master' table for all of the data
I thought that was what you were doing, but investigate normalisation before you develop anything

instead of potentially have to cross-reference against all of our previously-received sheets?
without a unique ID, you will still have the same problem

so as new people come on they append there, and it also overwrites the data we already have with any changes to those records...
I would just append, not update existing records. Ensure your table has a timestamp field (just a datetime field with a default value of now()) so you can see when a record was appended - then you just choose the latest one, but also have the history if required - however you still need to determine a uniqueID - again, see normalisation and also investigate relationship principles, use of primary keys and family keys etc
 
I would just append, not update existing records. Ensure your table has a timestamp field (just a datetime field with a default value of now()) so you can see when a record was appended - then you just choose the latest one, but also have the history if required - however you still need to determine a uniqueID - again, see normalisation and also investigate relationship principles, use of primary keys and family keys etc

Hi CJ_London,

So sorry for not replying to this sooner - been pulled away to a different project so this one has been on the back burner!

In the end I had to create a list of additions/amendments/deletions as per your instructions, but update the records on the database manually...a lovely Saturday job that turned out to be. :(

I will be checking out your latest response next week!
 

Users who are viewing this thread

Back
Top Bottom