demetriusad
New member
All,
This is my first time posting, and I come to the forum with a rather nasty duplicate-records dilemma. I am an intern dealing with a department that does not want to migrate to better technology. Here is an example of my issue.
The Contacts table has several duplicates in it that I discovered, based on the AffiliationID, Name, Address, etc. Each of these records contains a unique ID called ContactID.
The Calls table relates to the Contacts table based on the ContactID. This is a one-to-many relationship.
Because of the duplicates in the Contacts table, we have skewed data in the Calls table.
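Roughly, the structure looks like this (the column types and constraint names are my guesses; I don't have the actual design in front of me):

CREATE TABLE Contacts (
    ContactID     LONG CONSTRAINT pkContacts PRIMARY KEY,
    AffiliationID LONG,
    FirstName     TEXT(50),
    LastName      TEXT(50),
    Address       TEXT(100),
    City          TEXT(50),
    State         TEXT(50),
    Zip           TEXT(10)
);

CREATE TABLE Calls (
    CallsID   LONG CONSTRAINT pkCalls PRIMARY KEY,
    ContactID LONG CONSTRAINT fkCallsContacts REFERENCES Contacts (ContactID),
    [Date]    DATETIME,
    [Count]   LONG
);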
For example, the contacts table could have the following records:
ContactID  AffiliationID  FirstName  LastName  Address   City        State  Zip
100        12345          John       Smith     123 Main  Place       Ohio   46239
150        54321          John       Smith     559       Charleston  Maine  45399
175        12345          John       Smith     123 Main  Place       Ohio   46239
179        41454          Rowan      Idie      355 Joh   York        Utah   84757
From this we can see that ContactIDs 100 and 175 are really duplicates.
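The kind of duplicates query I have in mind is something like this (I'm assuming AffiliationID, FirstName, LastName and Address together are enough to say two rows are the same contact):

SELECT AffiliationID, FirstName, LastName, Address,
       Count(*) AS HowMany,
       Min(ContactID) AS LowID,
       Max(ContactID) AS HighID
FROM Contacts
GROUP BY AffiliationID, FirstName, LastName, Address
HAVING Count(*) > 1;

For the sample data above this returns one row: the John Smith at 123 Main, with LowID 100 and HighID 175. (If a contact ever appears more than twice it only shows the lowest and highest IDs, but it at least confirms where the duplicates are.)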
In the Calls table there could be the following records for ContactIDs 100 and 175:
CallsID  ContactID  Date   Count
301      100        11/05  1
305      100        11/22  3
311      175        10/16  5
314      175        12/01  6
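That is where the skew shows up: if anyone totals the calls per contact, John Smith's activity is split across two rows instead of one. For example:

SELECT ContactID, Sum([Count]) AS TotalCalls
FROM Calls
GROUP BY ContactID;

With the sample data that gives 4 calls under ContactID 100 and 11 under 175, when really they all belong to the same person.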
What I need to do is figure out which ContactID I will be keeping as the “good” record and which one I will delete. If I decide to keep ContactID 100 as the good record, then I need to change the ContactID in the Calls table from 175 to 100.
Once the Calls table is fixed to point to the “good” ContactIDs, we can delete the “bad” ContactIDs out of Contacts.
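For the example pair above, the fix would be something along these lines (IDs hard-coded just to show the idea; in practice it would have to be driven by the duplicates query, since there are several of these pairs):

UPDATE Calls
SET ContactID = 100
WHERE ContactID = 175;

DELETE FROM Contacts
WHERE ContactID = 175;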
I started by creating a duplicates query to confirm that I do have duplicates (albeit with unique ContactIDs) in the Contacts table. After that I created a query to find the distinct records; I wanted to do this to separate the good contacts from the duplicates. Ideally I want to keep the most recent record in each group. This is where I am stuck: I need to delete those duplicates somehow and merge their calls under the ContactIDs I keep. I did not design the database; nevertheless, it is a mess structurally.
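The best I've come up with so far is to treat the highest ContactID in each duplicate group as the most recent one (that's an assumption on my part, since I don't see any date-added column) and build two queries. The first picks the keeper in each group; I'd save it as something like qryKeepers (a name I just made up):

SELECT AffiliationID, FirstName, LastName, Address,
       Max(ContactID) AS KeepID
FROM Contacts
GROUP BY AffiliationID, FirstName, LastName, Address;

The second matches every other ContactID in a group to its keeper (saved as, say, qryRemap):

SELECT C.ContactID AS BadID, K.KeepID
FROM Contacts AS C
INNER JOIN qryKeepers AS K
    ON C.AffiliationID = K.AffiliationID
   AND C.FirstName = K.FirstName
   AND C.LastName = K.LastName
   AND C.Address = K.Address
WHERE C.ContactID <> K.KeepID;

From there I think the update would be Calls joined to qryRemap with ContactID set to KeepID, followed by deleting every ContactID that shows up as a BadID, but that is the part I can't get working. Does this approach make sense, or is there a cleaner way to handle it in Access?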