Fuzzy Business Logic to prevent site duplicates

Rx_

Nothing In Moderation
Local time
Today, 00:48
Joined
Oct 22, 2009
Messages
2,795
Ideas welcome: My Problem -- A table tracks site names in a single text field (with an auto counter for the primary key).
There is a loose business rule to enter a new site name based on some fuzzy rules. So, as new sites are added there are business naming duplicates when people don't communicate.
Example: Critter Well NV 033 <--- > Critter 33 Wells NV could be entered by two different offices and refer to the same site.
These are not actual Database duplicates, but represent a business site naming duplicate. It creates the same kind of problems.
Solution sought: So, as someone enters a new site name, the idea is to use a function to produce some output of all simular names so users can determine that site name already exist. Maybe like a Yahoo search "You typed "<blaa>" did you mean... <bla> or <BllA>
With offices across the nation, when two people enter a new site name, there are 45 associated forms to track other things about that business site. And when they discover the duplicate site name a month later, they just want the database guy to merge the different data (and often conflicting data). With a 1,000 new sites a month across 5 offices, duplicates are growing. Names like "Robert" vs. "Bob" are not an issue since the sites somewhat describe a legal description of sorts.

At least it is a single text field.


If anyone has ideas, please share. In a couple of weeks, I hope to post my solution.
 
Try the following on for size:

Code:
Public Function checkForSimilar(strTableName As String, strFieldName As String, strTestString As String) As String
' Takes a table name, field name and input string as parameters.  Checks for similar phrases in the table fields and
' presents the user with a choice of which entry to use if similar entries are found
 
Dim rst As DAO.Recordset
Dim strMsg As String
Dim strFieldWordsArray As Variant
Dim strInputWordsArray As Variant
Dim i As Integer
Dim j As Integer
Dim varReturn As Variant
Dim blnFoundSimilar As Boolean
Dim strTemp As String
 
strInputWordsArray = Split(strTestString, " ") 'Splits the input string into an array of words
 
Set rst = CurrentDb.OpenRecordset(strTableName)

With rst

    Do While Not .EOF
        strFieldWordsArray = Split(.Fields(strFieldName).Value, " ") 'splits the field string into an array of words
        'For each word in the test string, check to see if it is contained in the Field value string

        For i = LBound(strInputWordsArray) To UBound(strInputWordsArray)
            If instrArray(strFieldWordsArray, strInputWordsArray(i)) = True Then 'indicates that the word was found somewhere in the field value string
                j = j + 1 'increment number of positive hits
            End If
        Next

        If j >= i * 0.75 Then 'indicates that more than 75% of the test string words were found in the field string - change this percentage if you want
            strTemp = .Fields(strFieldName).Value
            strMsg = "A similar entry: " & strTemp & " was found.  Do you want to use this entry instead?"
            blnFoundSimilar = True
            Exit Do
        End If

        .MoveNext
    Loop

End With

rst.Close
Set rst = Nothing

If blnFoundSimilar = True Then
    varReturn = MsgBox(strMsg, vbYesNo, "Similar Entry")

    If varReturn = vbYes Then 'Use the entry found in the database
        checkForSimilar = strTemp
    Else 'Use your new entry
        checkForSimilar = strTestString
    End If

Else 'no similar entry found - use the new one
    checkForSimilar = strTestString
End If
 
End Function
 
Function instrArray(strArray, strWanted, Optional blnCaseCrit As Boolean = False) As Boolean
  'Check for existance of a member within an array
  
  Dim i    As Long
  Dim strA As String
  Dim strB As String
   
10:    instrArray = False
   
20:    For i = UBound(strArray) To LBound(strArray) Step -1
   
30:      Select Case blnCaseCrit
   
           Case Is = True
40:          strA = strArray(i)
50:          strB = strWanted
     
60:        Case Is = False
70:          strA = LCase(strArray(i))
80:          strB = LCase(strWanted)
     
90:      End Select
   
100:     If InStr(1, strA, strB) > 0 Then
110:       instrArray = True
120:       Exit Function
130:     End If
   
140:   Next i
   
End Function
 

Users who are viewing this thread

Back
Top Bottom