Deleting Duplicates

lemo · Nov 28, 2012

hello.
i've been struggling with this seemingly simple task - deleting duplicates from a table.

here is the situation -
i have a table with 6 fields. first is a unique id (auto primary key), the rest 5 are just variables, which are being manually updated/added/deleted on a regular basis. when people update those, sometimes they create duplicates, and even triplicates (theoretically it can go on, but highly unlikely).
the question is - is there a delete query that gets rid of dupes/triples/etc in one convenient swoop?

right now i am using a relatively clunky method - first i create a 'make table query' on the 5 variables grouped, with the counter with >1 criteria and max id -
SELECT Max([Inventory].[ID]) AS [MaxOf ID], [Inventory].[f1], [Inventory].[f2], [Inventory].[f3], [Inventory][f4], [Inventory].[f5] INTO InventoryDupes
FROM [Inventory]
GROUP BY [Inventory].[f1], [Inventory].[f2], [Inventory].[f3], [Inventory][f4], [Inventory].[f5]
HAVING (((Count(Inventory.[f1]))>1))
ORDER BY [Inventory].[f1], [Inventory].[f2], [Inventory].[f3], [Inventory].[f4], [Inventory].[f5];

this gives me a table of dupes only BUT not if i have triplicates, since it finds only 1 max id. then i use this table to run the 'delete query' in this form -

DELETE FROM Inventory
[FONT=&quot]WHERE Inventory.[ID] IN (SELECT InventoryDupes.[MaxOf ID] FROM InventoryDupes)

then, if there were triplicates, i have to reran the whole procedure once more.

thanks in advance for help.
len
[/FONT]

plog · Nov 28, 2012

Flip your thinking on this and it becomes easier. Right now you are identifying records to eliminate, you should instead find records that should remain.

You're query right now is close to giving you that. You should eliminate all fields from your SELECT clause except MaxOfID, remove the HAVING clause and for no reason other than make the SQL simpler and it serves no purpose; eliminate the ORDER BY clause:

Code:

SELECT Max(ID) AS MaxOfID
FROM Inventory
GROUP BY f1, f2, f2, f3, f5;

That query tells you which ID numbers are "unique". Bump it against your Inventory table and all the second, third, fourth, etc. "duplicates" get eliminated.

lemo · Nov 28, 2012

thanks for the quick reply, plog.
i actually thought about it (except i used MIN function, not MAX, to keep older original records), but i am not sure what to do with the result.
what do you mean by 'Bump it against your Inventory table...'?
l

Deleting Duplicates

lemo

Registered User.

plog

Banishment Pending

lemo

Registered User.

Similar threads

Users who are viewing this thread