Best approach for merging records from one table

Stevele

New member
Local time
Today, 15:27
Joined
Jan 2, 2012
Messages
4
[FONT=&quot]Hello fellow Access programmers and best wishes![/FONT]

[FONT=&quot]I have built quite a lot of queries in the past, but I'm not too sure how to attack this one exactly.[/FONT]

[FONT=&quot]I'm looking to have one central table containing customer data. Until now, instead of using such a central table containing all data, a new record has been created for every exchange with that customer, including all the data that he or she provided at that time, every time again.[/FONT]

[FONT=&quot]I've already created a query to extract all the customer data from those tables. However, there are many duplicates in the table resulting from this query (every customer that has several exchanges will be there several times). Unfortunately the classical duplicate removal techniques won't work if you take everything I'm looking for into account.[/FONT]

[FONT=&quot]So basically I have now one table named Table1 with the following fields:[/FONT]

[FONT=&quot]
Name
Firstname
Address
Postalcode
Town
Mail
Phone
Phone2 (all text fields)
Lastcontact (date)
[/FONT]
[FONT=&quot]Examples of two records:[/FONT]

[FONT=&quot]
Smith - John - Mulholland Drive 1 - 1000 - Washington - johnsmith (at) gmail.com - 04854527 - 157852158 - 01/01/2012
Smith - John - Fifth Avenue 6 - 2000 - New York - johnsmith (at) gmail.com - 02300155 - (Null) - 31/12/2011
[/FONT]
[FONT=&quot]One can assume that all records with the same Name AND Firstname AND (Postalcode OR Mail OR Phone OR Phone2) are referring to the same customer. Phone = Phone2 should also be accepted in the last part.[/FONT]

[FONT=&quot]A basic approach could be to group data and use Max or Min to pick only one (random) value for every field where the different values are not identical. However, I'm looking to go a lot further. I would like to obtain the following Table2:[/FONT]

[FONT=&quot]
AutoID
Name
Firstname
Address
Address2
Address3
Postalcode
Postalcode2
Postalcode3
Town
Town2
Town3
Mail
Mail2
Mail3
Phone
Phone2
Phone3
[/FONT]

[FONT=&quot]The idea behind this table is that no data gets lost: if there are several values for the same field (Postalcode, Address etc) for the same customer, the extra values should be stored in the field Address2, Address3 etc. The field Lastcontact (which contains a date) determines which field will be used (the most recent contact into Address, the next most recent into Address2 etc). So unfortunately "simply" grouping data doesn't help me. [/FONT]

[FONT=&quot]Example of how I would like the two above-mentioned records to be merged:[/FONT]

[FONT=&quot]
(AutoID) - Smith - John - Mulholland Drive 1 - Fifth Avenue 6 - (Null) - 1000 - 2000 - (Null) - Washington - New York - (Null) - johnsmith (at) gmail.com - (Null) - (Null) - 04854527 - 157852158 - 02300155
[/FONT]
[FONT=&quot]I've tried quite a lot of things, but haven't found a solution that really does what I need. Now I'm thinking of using an awful lot of IIfs and Dlookups to make this Table2, but I hope that there is another, more efficient way. I'll take any suggestion that I can develop myself, so I'm not necessarily asking for a complete query. [/FONT]

[FONT=&quot]Many thanks![/FONT]
 
Your first step should be to allocate a ClientID to every record in the existing table. Once you have this the movement of data to the new structure will be straightforward. Never lose this table because you may need to refer to the original data again to resolve any errors.

For the new structure I would highly recommend you move the Addresses and Phone numbers to two related tables. This allows you to store an unlimited number of these records and a complete history. Include a date field from the original record.
 
Your first step should be to allocate a ClientID to every record in the existing table. Once you have this the movement of data to the new structure will be straightforward. Never lose this table because you may need to refer to the original data again to resolve any errors.

For the new structure I would highly recommend you move the Addresses and Phone numbers to two related tables. This allows you to store an unlimited number of these records and a complete history. Include a date field from the original record.
I can of course allocate some ID to the records to the existing table, but I rather thought of doing that with the new table (one customer = one autoID), since the existing table has a lot of duplicates. But how would allocating an ID to the records in the existing table help moving the data?

I might indeed consider creating related tables for the adddresses and phone number, I'd have to check if it is worth the while.

However, I'm looking for an efficient way to move the data in the first place... :)
 
Not RecordID but CustomerIDs. It could be done while you are generating your new CustomerID in the new table. Just write that number to a new field for all the records in the existing table that are to be connected to that ID in the new structure.

This way you can easily find where any of the orignal data was refiled in the new system. That is important as a trail in case you end up with some unexpected matching or conversion errors.

One you have this in place the new and old data is easily joined allowing a very simple transfer to the new structure.

The jobs definitely need to be done in steps if you are to use the related tables for addresses and phones. I highly recommend you do it that way for proper normalization.
 

Users who are viewing this thread

Back
Top Bottom