View Full Version : Deleting duplicates but keeping totals
I have a table of 60,000 records many of which are repeated. Each record has two fields - a url, and a number.
I need to delete all duplicate names, but I need also the totals of the duplicates to be added to the total of the url which will remain.
For example: Name Totals
www.access.com 12
www.access.com 2
www.access.com 9
I need to be left with
www.access.com 23
I can't find a way to do this by QBE. Perhaps sql would manage it? Is it a job for coding? As I am very far from that far on with VBA, would anyone have a code that could be used for this please?
I need to perform this process with many similar tables.
Hopefully
Stressed Out of Warwickshire
Been there and done that, MoP, but I can't find how to make it help. It shows 431 duplicates and then a field with the number of duplicates.
I could manually delete each of the records - but in a table of 25,000 records, it could be a long night's work, and the weekend. And I need to add the totals as well.
Alternatively, I can't just delete the duplicates in the automatic way (by duplicating the table etc as per Access Help) because the totals won't be added up.
It's a bit of a dilemma. But I do appreciate your help.
Mile-O 05-16-2003, 07:19 AM Sorry, deleted my post when I realised hyperlinks weren't included.
Are you wanting to do this programmatically? I know how to do it manually. With a bit of thought I should be able to do it in code.
When I couldn't find your reply, I thought this problem had become a bit too heavy.
I really don't know the answer. I would have thought manually would work, but totalling the totals that will be deleted makes it a challenge. Or impossible.
SQL would probably work, and I'm sure VB would, but it's much further on than my studies so far. Cheers, MoP.
Mile-O 05-16-2003, 07:26 AM Assuming your fields are SiteLink and Total in a table called tblData.
Make a query with this SQL and call it qryInitial.
SELECT SiteLink, Total, CStr([SiteLink]) AS NewLink
FROM tblData;
Make a new query with this SQL and call it qrySecondStep.
SELECT DISTINCTROW First(NewLink) AS [NewLink Field], Count(NewLink) AS NumberOfDups, Sum(Total) AS SumOfTotal, Left([NewLink],InStr(1,[NewLink],"#")-1) AS Site
FROM qryInitial
GROUP BY Left([NewLink],InStr(1,[NewLink],"#")-1)
HAVING (((Count(NewLink))>=1));
One more query with this SQL and call it qryFinal.
SELECT Site, SumOfTotal AS [Number] INTO tblFixed
FROM qrySecondStep;
Run the third query. You'll have a new table called tblFixed.
Enter its design view and change the data type of the field called Site to a hyperlink.
Rename the table if you wish.
All done.
Looks great, MoP. What a star! Being dragged into a meeting, will report back asap. But it looks really good. Very grateful.
Hi, MoP. The first query works fine.
Access stumbles with the second, and the trouble seems to be around
First(NewLink)
Be very grateful if you could look your sql over and see if you can see the error. I've tried any number of variations without success. It needs an expert, and sadly I am still far from that. Cheers
Mile-O 05-19-2003, 01:00 AM Have a look at this
Apologies for the poor naming conventions :rolleyes:
It looks good ... only the problem is that the number of duplicates has to be added to the total.
So the result from your table1 which displays in tblFixed would be 27 for www.mileout.org.uk, because the total would be added to the number of times the url has been repeated.
Similarly www.bt.com would total 9.
Sorry that it isn't all straight forward, MoP. It's really very good of you to work over this. I hope your patience will last a little longer. Thanks enormously.
Mile-O 05-19-2003, 02:38 AM Change the SQL on the Make Table query to this then:
SELECT [Find duplicates for Query1].Site, [SumOfNumber]+[NumberOfDups] AS [Number] INTO tblFixed
FROM [Find duplicates for Query1];
A couple of hours later, and I seem to be back to Saturday's delights where no matter what I did with the second qsl, I was told nothing but "invalid prodedure call". Frustrating is one of the words that leaps to mind.
I just mention this in case you might have thought of something since ...
Mile-O 05-19-2003, 06:11 AM I don't understand where you are going wrong as the example I posted does what you want (after the third SQL statement is amended to what I rewrote above.
You and me both, MoP. I'll get there yet. I do appreciate all that guidance. Must be a mental collapse. But, I think, I'm almost there ...
Access is happy with yours, but won't let me through with the second query. You'll spot the mistake, of course, but I can't. All I get is the message 'invalid procedure call'. It was identical to yours in sql, and now it's identical via qbe. Same result. If you'd take a quick look, that would be brilliant of you. Cheers.
Mile-O 05-19-2003, 08:32 AM Make a copy of your database - remove everything except the table.
Leave only a few duplicate records in the table.
Post it here.
It's zipped down to 137kb but the forum won't accept it. Perhaps it's the fates saying I'll solve it myself given time. But appreciate very much all the guidance.
Mile-O 05-19-2003, 08:54 AM Send it to this address then: Mile-O-Phile (takeahikefatboy@hotmail.com)
WayneRyan 05-19-2003, 12:55 PM kupe,
I think Mile's queries are the best approach, but if
this is really driving you crazy, you can try this
code from a Command Button.
I haven't tested it, but it should work.
BACK UP YOUR DATA FIRST THOUGH !!
Dim dbs As Database
Dim rst As Recordset
Dim sql As String
Dim LastSite As String
Dim ThisTotal As Long
'
' Define a recordset
'
sql = "Select * from tblData Order by SiteLink"
Set rst = dbs.OpenRecordset(sql)
'
' Get the First Record
'
rst.MoveFirst
LastSite = rst!SiteLink
'
' Process all records
'
rst.MoveNext
While Not rst.EOF and Not rst.BOF
'
' If the current record is a duplicate, get it's total + 1 and
' add it to the previous record, then delete it.
' If not a duplicate then save the Key field
'
If rst!SiteLink = LastSite Then
ThisTotal = rst!Total + 1
rst.MovePrevious
rst.Edit
rst!Total = rst!Total + ThisTotal
rst.Update
rst.MoveNext
rst.Delete
Else
LastSite = rst!SiteLink
End If
'
' Move to next record
'
rst.MoveNext
Wend
Set rst = Nothing
Set dbs = Nothing
hth,
Wayne
Trying here again. Thanks for the patience, MoP.
Sorry, try a different approach. (I used to wonder why people changed their names regularly on these forums.)
Mile-O 05-20-2003, 12:45 AM Can you convert it to '97, please?
Mile-O 05-20-2003, 01:09 AM The table you have: tblTable1
You have the field SiteURL which is a text field - was this always a text field? If so, I had thought that the field was a hyperlink field.
Actually, you mentioned that the other day, I think. But, no they are hyperlinked because they are just being totalled. Cheers.
Mile-O 05-20-2003, 01:28 AM I've put the make table query in an autoexec macro.
I'm back. Dragged off to a meeting as I was sending that message, which ought to have confirmed they were not hyperlinked. Will look at this with great interest, and come back. Cheers.
My beginning was not all that dissimilar. (I'm almost too nervous to mention the following for fear it happens again. ) But when grouping the sites for duplicates, there still were duplicates. It was as if the query saw numbers of "hits" in the record, saw they were different, and assumed the string was different too.
I offered the result to the duplicates wizard and it found, I think, about 400 duplicates still remained out of the 24,000 urls.
Despite the great respect I have for you, I was a bit anxious, MoP, when I put yours to the duplicate wizard test.
But, brilliant! Not a one. It's perfect. It's been a painful learning curve this one. I'm really glad you were the teacher, and very grateful for it - and for your patience. Many many thanks.
Mile-O 05-20-2003, 02:45 AM The problem from querying on the original table came from it having an AutoNumber field.
No probs - have fun in Access Land! :cool:
Hi, Wayne, thanks very much for the code and for the trouble you went to. Mile-O-Phile's works really well. I will borrow your code if that's okay and use it in another place. So your thoughtfulness is being put to good use, and is appreciated. Many thanks.
|
|