Separate not discard duplicates

mohobrien

Registered User.
Local time
Yesterday, 23:37
Joined
Dec 28, 2003
Messages
58
I have a table with two fields. Disp_Num represents the name of a parcel of land and PPID represents a unique parcel of land. Sometimes, there are duplicate PPIDs with different Disp_Num. These would represent overlapping areas. The PPID field is sorted alphabetically.
Disp_Num PPID
S-140651 0165000HC7
S-139669 0165000HCS
S-142397 0165000HD0
S-140365 0165000HD8

I want separate tables each with unique PPID only and not discard any overlaps. I do this by looping through the records in the table and seeing if each PPID is different from the record before. If it is different then it writes a counter value in a new field called Counter. If not it increments the counter and checks the next record. It works well using the code below but it is extreemly slow. About 180 records per second. Not too bad for 100K records but I'm heading for some tables with >1 M records. I know access isn't intended for those types of record numbers, but are there better techniques than what I'm using?
Code:
Public Sub AddCounterField(lsdDb As String)

Dim db As DAO.Database
Dim tdf As DAO.TableDef
Dim fld As DAO.Field
Dim rs As DAO.RecordSet
Dim strSQL As String
Dim qdf As DAO.QueryDef
Dim PPIDVal As String
Dim PPIDOldVal As String
Dim Counter As Long
Dim FieldExists As Boolean

On Error GoTo EH
Set db = DBEngine.OpenDatabase(lsdDb)
Set tdf = db.TableDefs("tbltmpNewMinLands")
For Each fld In tdf.Fields
If fld.Name = "Counter" Then
FieldExists = True
End If
Next
If Not FieldExists Then
tdf.Fields.Append tdf.CreateField("Counter", dbLong, 12)
End If
strSQL = "SELECT PPID, Counter FROM tbltmpNewMinLands;"
Set rs = db.OpenRecordset(strSQL)
    PPIDOldVal = ""
    Do While Not rs.EOF
        PPIDVal = rs!PPID
        If PPIDVal <> PPIDOldVal Then
        Counter = 1
        PPIDOldVal = PPIDVal
        Else
        Counter = Counter + 1
        PPIDOldVal = PPIDVal
        End If
            With rs
            .Edit
            rs!Counter = Counter
            'Debug.Print rs!PPID; "  "; Counter
            .Update
            .MoveNext
        End With
    Loop
    rs.Close
    db.Close
    Set fld = Nothing
    Set tdf = Nothing
    Set db = Nothing
Exit Sub
EH:
  MsgBox Err.Number & "  " & Err.Description
      rs.Close
    db.Close
    Set fld = Nothing
    Set tdf = Nothing
    Set db = Nothing
End Sub
 
For a start when you for next through your table def for a specific field name once it finds it you can use Exit For to drop out of the loop. Lets say there are 100 fields and it is field 2 it still performs an additional 98 loops unneccessarily.

You can also get rid of the With - End With statements

Use :

Rs.Edit
Rs("Counter") = Counter
Rs.Update
Rs.MoveNext


Address these two changes and test performance to see if it increases speed

David
 
I agree, I should jump out of the For Next loop once the Counter field is found, however, consistently,
before change, 17 minutes
removing the With-End With but keeping Rs!Counter, 20 minutes
removing the With-End With and changing to rs("Counter"), 29 minutes
reverting to original, 17 minutes.

I tried this twice to be sure.
 
I have heard say that when using recordsets it is best to use Select * instead of nominating certain fields, even with large field counts. Microsoft say it retreives the table quicker. You could try that.
 
21/01/2009 7:48:06 AM Starting....
21/01/2009 8:01:43 AM Finished Adding field
Down to 13.5 minutes.
Substantial improvement. Thanks Dave
 

Users who are viewing this thread

Back
Top Bottom