Deleting Duplicate Records

markcrobinson

Registered User.
I realize that there have been several threads on this. I've looked through them and can't find a simple way to delete duplicates.
Running a DELETE query on a Find Duplicates query throws an error that isn't solved by setting Unique Records to Yes.

I found this online and I like the logic; I've plugged in my own table and field names, although I don't understand why the HAVING COUNT(*) > 1 clause throws an error.


Must admit, I don't understand some of the code right after the "With rs".

Of course, if there's an easier way, I'm all ears.

----------------------------------------------------------------------------------
Function Dedupe()
    ' Finds each group of duplicate rows, then deletes the matching rows.
    ' NOTE: as written this deletes EVERY row in a duplicate group,
    ' including the copy you probably want to keep.
    Dim sSql As String
    Dim sOrder_number As String
    Dim sItem_id As String
    Dim sItem_name As String
    Dim sItem_sku As String
    Dim sItem_meta As String
    Dim lRows As Long
    Dim con As ADODB.Connection
    Dim rs As ADODB.Recordset

    ' The table name starts with a digit, so bracket it in the SQL
    sSql = "select Order_number, Item_id, Item_Name, Item_Sku, Item_Meta " _
         & "from [2017WooOrdersIn] " _
         & "group by Order_number, Item_id, Item_Name, Item_Sku, Item_Meta " _
         & "having count(*) > 1"

    Set con = CurrentProject.AccessConnection   ' con was declared but never set
    Set rs = New ADODB.Recordset

    With rs
        Set .ActiveConnection = con
        .CursorLocation = adUseClient
        .CursorType = adOpenForwardOnly
        .LockType = adLockReadOnly
        .Source = sSql
        .Open

        While Not .EOF
            ' The SELECT returns only the five grouped columns,
            ' so read them by name rather than by ordinal position
            sOrder_number = .Fields("Order_number").Value
            sItem_id = .Fields("Item_id").Value
            sItem_name = .Fields("Item_Name").Value
            sItem_sku = .Fields("Item_Sku").Value
            sItem_meta = .Fields("Item_Meta").Value

            ' Delete all rows matching this duplicate group
            ' (will fail if any of the values contains an apostrophe)
            sSql = "delete * " _
                 & "from [2017WooOrdersIn] " _
                 & "where Order_Number = '" & sOrder_number & "' " _
                 & "and Item_Id = '" & sItem_id & "' " _
                 & "and Item_name = '" & sItem_name & "' " _
                 & "and Item_sku = '" & sItem_sku & "' " _
                 & "and Item_meta = '" & sItem_meta & "' "

            con.Execute sSql, lRows

            .MoveNext
        Wend

        .Close
    End With
End Function
 

jdraw

Super Moderator
Staff member
One way to remove duplicates:

Create a new table with all fields.
Identify the field that uniquely identifies each record in the table, set it as the Primary Key (PK), and make sure NO duplicates are allowed on it.

Then using your original data as the source of data, add records to the new table.

Access will not add any duplicates into the new table.

Make sure you understand PK.
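
As a rough sketch, assuming the combination of the five columns from your code is what defines a duplicate, and assuming the new table is called 2017WooOrdersIn_Clean (the name is only for illustration) with a no-duplicates rule on those columns, the append query would look something like:

INSERT INTO [2017WooOrdersIn_Clean] (Order_number, Item_id, Item_Name, Item_Sku, Item_Meta)
SELECT Order_number, Item_id, Item_Name, Item_Sku, Item_Meta
FROM [2017WooOrdersIn];

Access rejects any row that would violate the no-duplicates rule (you get a key-violations warning you can accept), so only one copy of each duplicate group makes it into the new table.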

PS: You can upload a copy of the database if you attach it as a zip file.
 

isladogs

MVP / VIP
It's remarkably complicated to delete duplicates while keeping one record from each group.

Jack's method works well.

Here's another method so you have a choice.
There are 4 steps.

a) Identify the dupes using the duplicate query wizard.
Make sure you include the PK field
I'll call that qryA

b) Create an aggregate query based on qryA to find the first PK of each duplicate record. Call that qryB. These are the records you will keep.

c) Use an unmatched query to find all records in qryA NOT in qryB.
These are the records you are going to DELETE
Call this qryC.

d) Now make a delete query.
If you can delete records in qryC directly, do so, BUT that's unlikely.
So the final step is to create a query linking your original table to qryC,
then change that to a DELETE query and run it (there's a rough SQL sketch of the whole idea below).
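
If it helps to see the shape of it, here's a very rough SQL sketch of the condensed idea (table, field and query names are only illustrative and I'm assuming an autonumber PK called ID; the wizards build the equivalent for you):

qryB - the record to keep from each group:
SELECT Min(ID) AS KeepID
FROM [2017WooOrdersIn]
GROUP BY Order_number, Item_id, Item_Name, Item_Sku, Item_Meta;

Final delete - remove everything that isn't a keeper:
DELETE FROM [2017WooOrdersIn]
WHERE ID NOT IN (SELECT KeepID FROM qryB);

That collapses qryA and qryC into the NOT IN test, but the logic is the same: keep the lowest ID from each duplicate group and delete the rest.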

Actually there is a very important step 5... whether you use my method or Jack's:
Identify the cause of the duplicates & fix it so you hopefully never need to repeat the process

BUT you may wish to create a procedure that runs each query in turn just in case you do have to do it again.
 

Pat Hartman

Super Moderator
Staff member
I use Jack's method. It is the simplest and it doesn't require any code. Just make the table and identify the unique columns. I make a unique index but use an Autonumber as the PK. Then the append query will append only the first of the duplicate records.
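
The unique index is normally set in table Design view but, as a rough sketch, the DDL equivalent (the index name, and the table and column list borrowed from the code above, are just an example) is:

CREATE UNIQUE INDEX idxNoDupes
ON [2017WooOrdersIn_Clean] (Order_number, Item_id, Item_Name, Item_Sku, Item_Meta);

With that in place the append query copies the first occurrence of each combination and skips the rest with a key-violation warning you can accept. One caveat: Long Text (Memo) fields can't be indexed, so if Item_Meta is Long Text it would have to be left out of the index.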
 

Cronk

Registered User.
Me too (I use Jack's method). However, in some instances of data cleansing I've come across situations where additional information is held in other, non-key fields.

For example, one record might have a work phone number and no home phone number, whereas the other has it the other way around.

I have set it up so all the dupes are shown to the user, who can edit the record to be kept before deleting the others.
 

isladogs

MVP / VIP
Normally I would also use Jack's method.
Why do 4 steps when one would do the job?

However, if for some reason you need to keep the existing PK values because they are used as FK fields in other tables, then my alternative is worth considering.

But as previously stated you then need to identify the cause of the duplicates & fix it so you hopefully never need to repeat the process
 

Pat Hartman

Super Moderator
Staff member
ridders,
If you want to keep an existing PK, include the PK in the append query when copying from the old table to the new.

An append query is the ONLY way you can populate an autonumber with an existing value. You cannot do it using DAO/ADO or by typing into the field.
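
For example (table names are only placeholders), an append query like this carries the old autonumber values across into the new table's Autonumber PK:

INSERT INTO [2017WooOrdersIn_Clean] (ID, Order_number, Item_id, Item_Name, Item_Sku, Item_Meta)
SELECT ID, Order_number, Item_id, Item_Name, Item_Sku, Item_Meta
FROM [2017WooOrdersIn];

The Autonumber column simply accepts the values supplied by the append query.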
 

arnelgp

..forever waiting... waiting for jellybean!
You must add an Autonumber field to your table.

then you can delete the duplicate using SQL:

DELETE (SELECT COUNT(*) FROM 2017WooOrdersIn AS T1 WHERE (T1.Order_number & T1.Item_id & T1.Item_Name & T1.Item_Sku & T1.Item_Meta =2017WooOrdersIn.Order_number & 2017WooOrdersIn.Item_id & 2017WooOrdersIn.Item_Name & 2017WooOrdersIn.Item_Sku & T1.Item_Meta) AND T1.ID <= 2017WooOrdersIn.ID) AS Expr1, * FROM 2017WooOrdersIn WHERE ((((SELECT COUNT(*) FROM TABLE1 AS T1 WHERE (T1.Order_number & T1.Item_id & T1.Item_Name & T1.Item_Sku & T1.Item_Meta =2017WooOrdersIn.Order_number & 2017WooOrdersIn.Item_id & 2017WooOrdersIn.Item_Name & 2017WooOrdersIn.Item_Sku & T1.Item_Meta) AND T1.ID <= 2017WooOrdersIn))>1));
 

isladogs

MVP / VIP
Pat Hartman said:
ridders,
If you want to keep an existing PK, include the PK in the append query when copying from the old table to the new.

An append query is the ONLY way you can populate an autonumber with an existing value. You cannot do it using DAO/ADO or by typing into the field.

Agreed.
Mind you, I also stressed including the PK field in my 4-step approach.
As I kept the original table, the PK field values were unchanged.

My approach was offered as an alternative so the OP had a choice of methods.
I didn't suggest it was better ....:)

Hopefully the OP will return to the thread at some point ....
 

markcrobinson

Registered User.
I definitely like this solution best, but when I paste it into a SQL query I get "Syntax error in query expression". In the error message the query ends before "AS Expr1, *".

The autonumber field is named ID

arnelgp said:
You must add an Autonumber field to your table.

then you can delete the duplicate using SQL:

DELETE (SELECT COUNT(*) FROM 2017WooOrdersIn AS T1 WHERE (T1.Order_number & T1.Item_id & T1.Item_Name & T1.Item_Sku & T1.Item_Meta =2017WooOrdersIn.Order_number & 2017WooOrdersIn.Item_id & 2017WooOrdersIn.Item_Name & 2017WooOrdersIn.Item_Sku & T1.Item_Meta) AND T1.ID <= 2017WooOrdersIn.ID) AS Expr1, * FROM 2017WooOrdersIn WHERE ((((SELECT COUNT(*) FROM TABLE1 AS T1 WHERE (T1.Order_number & T1.Item_id & T1.Item_Name & T1.Item_Sku & T1.Item_Meta =2017WooOrdersIn.Order_number & 2017WooOrdersIn.Item_id & 2017WooOrdersIn.Item_Name & 2017WooOrdersIn.Item_Sku & T1.Item_Meta) AND T1.ID <= 2017WooOrdersIn))>1));
 

Pat Hartman

Super Moderator
Staff member
Make sure you make at least two backups (one should be zipped) before you start following the solution you like. And then check carefully that you haven't lost anything. Personally, I prefer the positive approach of adding the records I want rather than deleting the records I don't want.
 

markcrobinson

Registered User.
I got as far as this on my own. It compiles without error:

SELECT Count(*) AS Expr1
FROM 2017WooOrdersIn AS T1, 2017WooOrdersIn
WHERE (
(
(
([T1].[order_number]) &
([t1].[item_id]) &
[T1].[Item_Name] &
[T1].[Item_Sku] &
[T1].[Item_Meta])
=
([2017WooOrdersIn].[order_number]) &
([2017WooOrdersIn].[item_id]) &
([2017WooOrdersIn].[item_name]) &
([2017WooOrdersIn].[item_sku]) &
([2017WooOrdersIn].[item_meta])
)
)
AND T1.ID <= [2017WooOrdersIn].[ID]
;

Stumped on this part:
-------------------------------
AS Expr1, * FROM 2017WooOrdersIn WHERE ((((SELECT COUNT(*) FROM TABLE1 AS T1 WHERE (T1.Order_number & T1.Item_id & T1.Item_Name & T1.Item_Sku & T1.Item_Meta =2017WooOrdersIn.Order_number & 2017WooOrdersIn.Item_id & 2017WooOrdersIn.Item_Name & 2017WooOrdersIn.Item_Sku & T1.Item_Meta) AND T1.ID <= 2017WooOrdersIn))>1));
 

arnelgp

..forever waiting... waiting for jellybean!
I forgot to mention: substitute your table name for "TABLE1".

Anyway, I have made the changes and tested it, and it passed (on my own data).

DELETE
(
SELECT COUNT(*)
FROM
2017WooOrdersIn AS T1
WHERE
(T1.Order_Number & T1.Item_Id & T1.Item_Name & T1.Item_Sku & T1.Item_Meta)
=
(T2.Order_Number & T2.Item_Id & T2.Item_Name & T2.Item_Sku & T2.Item_Meta)
AND
T1.ID <= T2.ID
)
AS Expr1, T2.*
FROM 2017WooOrdersin AS T2
WHERE
(
SELECT COUNT(*)
FROM
2017WooOrdersIn AS T1
WHERE
(T1.Order_Number & T1.Item_Id & T1.Item_Name & T1.Item_Sku & T1.Item_Meta)
=
(T2.Order_Number & T2.Item_Id & T2.Item_Name & T2.Item_Sku & T2.Item_Meta)
AND
T1.ID <= T2.ID
) > 1;
 

markcrobinson

Registered User.
Thank you! Sorry for being such a noob.

The 2017WooOrdersIn table has 14 records with duplicates and 2 records with double dupes, for a total of 34 records. After running this I am expecting 16 records to remain.
Unfortunately, I'm still doing something wrong and running the query deletes all the records.

When I view this query, it adds an "Expr1" field where duplicate items have a 2 in the field and some items with 4 duplicates have a 4 in the field.
It appears to be effectively identifying and counting the duplicates. YAY!

I've uploaded the file in case you have a moment to help me figure it out. If you'll accept me as a contact, I'll PM you the password.
 

Attachments

  • 2017WooOrdersIn.xlsx
    17 KB

Pat Hartman

Super Moderator
Staff member
Let us know when you are ready to try the simple solution.
 

markcrobinson

Registered User.
arnelgp provided the solution in a private message, most likely because he didn't want to highlight my ID-10-T error for the world to scoff at.
The reason I was having trouble with his modified solution was that my ID field was not set to "Indexed: Yes (No Duplicates)".
Once that was fixed, the query worked perfectly.
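
In case anyone else trips over the same thing: the setting is the ID field's Indexed property in table Design view. As a rough sketch, the DDL equivalent (the index name is arbitrary) is:

CREATE UNIQUE INDEX idxID ON [2017WooOrdersIn] (ID);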

In answer to some of the other comments:

Sometimes avoiding duplicates is impossible. I'm importing from a shopping cart download that I can't control. Although I could take the downloaded CSV file and clean it up before importing into Access, this would require an additional, unnecessary step. Running arnelgp's query solves the problem quickly and elegantly.

I've got this process down to a "push one button" application that non-tech people can use to download the shopping cart orders, make the necessary changes to the data, print sales reports, and prepare it for import into Quickbooks.

Yes, I'm appropriately annoyed at the shopping cart people for their duplicate-download bug, but now, with a single SQL statement, I'm free from their tyranny!

Mark Robinson
HandicappedPets.com
 
