Using vba to remove old duplicates from an Access table (1 Viewer)

lucky245

Registered User.
Local time
Today, 22:29
Joined
Sep 19, 2009
Messages
16
I have 31 tables which as they are updated are appended to the Master table (bad name should call holding but for this example I called it Master).

The Master Table could have upwards of 5000 records depending on when it is updated. I need to remove the duplicates before moving the data to the Output Table.

I can automatically import the data and get it to append the data that has not been appended since last use to the Master Table.

I cannot figure how to efficiently remove the duplicate records.

In the sample database attached the key fields are Spec_number, spec_result, and date_authorised.

if the spec_number and spec_result is the same I need to delete any record (row) which is older that the newest date_authorised.

However if the date_authorised are the same but the spec_result is different then all different spec_results need to be kept.

Basically spec_Number and spec_result uniquely identify the record after which I only need the latest version (newest date_authorised).
 

Attachments

  • CF Data.accdb
    552 KB · Views: 108

theDBguy

I’m here to help
Staff member
Local time
Today, 14:29
Joined
Oct 29, 2018
Messages
21,467
Hi. Just as a test, if you run this query, would it show you all the records you want to keep?
Code:
SELECT * FROM MASTER T1 
INNER JOIN (SELECT Master.Spec_Number, Master.SPEC_RESULTS, Max(Master.Date_last_Auth) AS LastDate
FROM Master
GROUP BY Master.Spec_Number, Master.SPEC_RESULTS
ORDER BY Master.Spec_Number) T2
ON T1.Spec_Number=T2.Spec_Number 
AND T1.Spec_Results=T2.Spec_Results
AND T1.Date_Last_Auth=T2.LastDate;
 

Micron

AWF VIP
Local time
Today, 17:29
Joined
Oct 20, 2018
Messages
3,478
Rather than remove duplicates then append elsewhere, why not just set the target table to not allow dupes via composite index (or composite PK if that is preferred) and just append? Access will balk if this is a query but if you run it anyway, it will tell you how many records were not appended and why. If it's not a query but is vba, just suppress the error. This all assumes you don't wan to pick which record of a dupe gets appended.

Going forward, the "master" table should be designed to prevent dupes if it is to be repeatedly used.
 

lucky245

Registered User.
Local time
Today, 22:29
Joined
Sep 19, 2009
Messages
16
Hi. Just as a test, if you run this query, would it show you all the records you want to keep?
Code:
SELECT * FROM MASTER T1 
INNER JOIN (SELECT Master.Spec_Number, Master.SPEC_RESULTS, Max(Master.Date_last_Auth) AS LastDate
FROM Master
GROUP BY Master.Spec_Number, Master.SPEC_RESULTS
ORDER BY Master.Spec_Number) T2
ON T1.Spec_Number=T2.Spec_Number 
AND T1.Spec_Results=T2.Spec_Results
AND T1.Date_Last_Auth=T2.LastDate;

Yes this works perfectly. I was playing with using a recordset and a cloned recordset but this does the job without all the code.
 

lucky245

Registered User.
Local time
Today, 22:29
Joined
Sep 19, 2009
Messages
16
Rather than remove duplicates then append elsewhere, why not just set the target table to not allow dupes via composite index (or composite PK if that is preferred) and just append? Access will balk if this is a query but if you run it anyway, it will tell you how many records were not appended and why. If it's not a query but is vba, just suppress the error. This all assumes you don't wan to pick which record of a dupe gets appended.

Going forward, the "master" table should be designed to prevent dupes if it is to be repeatedly used.

I do understand where you are coming from however the problem is they are not true duplicates as the later records will be an updated version but the original record is required to ensure tests are in the system. Only a few fields will be identical and I need to be able to remove the old ones once the updated version is released. As there are 100's output each day and often only with minor changes if the system is not viewed for a few days the older records could be repeated hence my term Duplicates.
 

1268

Registered User.
Local time
Today, 16:29
Joined
Oct 11, 2012
Messages
44
If that is the case can you not grab max?

Sent from my SM-G950U using Tapatalk
 

isladogs

MVP / VIP
Local time
Today, 22:29
Joined
Jan 14, 2017
Messages
18,212
In that case, why not update the existing records rather than append new ones and delete the original records.
 

theDBguy

I’m here to help
Staff member
Local time
Today, 14:29
Joined
Oct 29, 2018
Messages
21,467
Yes this works perfectly. I was playing with using a recordset and a cloned recordset but this does the job without all the code.
Hi. Hope you can manage from here. If not, please don't hesitate to holler. Cheers!
 

Mark_

Longboard on the internet
Local time
Today, 14:29
Joined
Sep 12, 2017
Messages
2,111
I'm with Colin for this.

Basic logic is
SELECT Master WHERE Spec_number = yourSpecNumber AND spec_result = YourSpecResult
IF error THEN ' It doesn't exist
AddRecord
Else
UpdateRecord
EndIF

That said, are you SURE there is no reason to keep previous history for a given Spec_Number? I've encountered too many cases where someone says "We don't care about X" only to have them ask later "Why can't you tell me about X??? What's wrong with your program!!". Keeping history is often far easier than deleting it as you'll often find you need to look at it just when you'r sure you won't.
 

lucky245

Registered User.
Local time
Today, 22:29
Joined
Sep 19, 2009
Messages
16
I'm with Colin for this.

Basic logic is
SELECT Master WHERE Spec_number = yourSpecNumber AND spec_result = YourSpecResult
IF error THEN ' It doesn't exist
AddRecord
Else
UpdateRecord
EndIF

That said, are you SURE there is no reason to keep previous history for a given Spec_Number? I've encountered too many cases where someone says "We don't care about X" only to have them ask later "Why can't you tell me about X??? What's wrong with your program!!". Keeping history is often far easier than deleting it as you'll often find you need to look at it just when you'r sure you won't.

The following is in no way meant to be or sound condescending

Ok to explain a little more. The original record could output one line or more depending on current results for specimen as the output delivers 2 lines if 2 comments are added.

This could then change to no result eg only 1 line needed or there may be 3 organisms detected therefore 2 additional lines required. As this result may be out 4 or five times we could end up with 11 lines of data where only 3 are needed. A “no result yet” line is pretty useless once an organism is detected. The reason it’s over Multiple days is that some take longer to detect than others and some have mid stages so the result is output at each stage. The end result is the only result ever needed to treat a patient on and history of its stages will never be useful or they would have been kept in the main system.

Lastly eg 11 lines instead of 3 * 1000s of results will be a waste of space and no doubt slow the search process down exponentially.

Hope that helps
 

gemma-the-husky

Super Moderator
Staff member
Local time
Today, 22:29
Joined
Sep 12, 2006
Messages
15,651
I have to say that this doesn't sound right. If you have duplicate records, there is most likely going to be something amiss with your data structure. A proberly noramlised data structure would detect dupiicates before they can be entered.
 

lucky245

Registered User.
Local time
Today, 22:29
Joined
Sep 19, 2009
Messages
16
I have to say that this doesn't sound right. If you have duplicate records, there is most likely going to be something amiss with your data structure. A proberly noramlised data structure would detect dupiicates before they can be entered.



This problem is more about getting the data in from multiple automated inputs into a flat file table which will be the source file for another system.


Sent from my iPhone using Tapatalk
 

Users who are viewing this thread

Top Bottom