Suggestions for Name Algorithm

Lightwave

Dear All

I was wondering if anyone can point me in the direction of any algorithms that would convert a name, e.g.

Barack Obama

into a number, e.g.

345

In some of the sports races I've been timing we sometimes number the competitors, and it would be useful if each number had some hard relationship to the person's name rather than existing only as an entry in a stored database somewhere.

Might be a bad idea, but I thought I'd investigate it anyway.
 
I seem to remember there being a way of converting alphabetical data into numerical data, but I can't remember what it's called. I'll have a Google. Alternatively, could you use each person's UID as their competitor number as well?
 
what you need is called a hashing function, I think

you need something that examines and processes each character, one at a time

the problem is that you need an algorithm that generates a unique result for each name; otherwise any anagram of a name can give the same result, as can many other names. an algorithm that does guarantee uniqueness certainly won't produce a small number such as 345 (although the VideoPlus system seems to produce some surprisingly small numbers at times)

[edited - I've just checked - a hashing function generally maps its input onto a limited number of possible values (termed buckets), so several names can land in the same bucket, and you then need a way of telling the candidates in that bucket apart - the wikipedia article was interesting and thorough]
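
To make the bucket idea concrete, here is a minimal sketch of a position-sensitive hash that folds a name into one of 1,000 buckets, so the result always fits in three digits. The function name NameBucket, the multiplier 31 and the default bucket count are assumptions picked for illustration; they do not come from the wikipedia article or from any built-in library.

```vba
' Hypothetical helper: folds the letters A-Z of a name into a bucket number
' in the range 0 .. Buckets - 1. Non-letters (spaces, hyphens) are ignored.
' Assumes 0 < Buckets < 69,000,000 so the arithmetic stays inside a Long.
Function NameBucket(ByVal FullName As String, Optional ByVal Buckets As Long = 1000) As Long
    Dim i As Long, h As Long, code As Long

    FullName = UCase$(FullName)
    For i = 1 To Len(FullName)
        code = Asc(Mid$(FullName, i, 1))
        If code >= 65 And code <= 90 Then          ' "A" .. "Z"
            ' Multiplying the running value by 31 before adding the next
            ' letter makes the result depend on the order of the letters.
            h = (h * 31 + (code - 64)) Mod Buckets
        End If
    Next i

    NameBucket = h
End Function
```

NameBucket("AB") gives 33 and NameBucket("BA") gives 63, so this pair of anagrams lands in different buckets. With only 1,000 buckets, though, unrelated names will still share a bucket sooner or later.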

I suppose one way of doing this would be to take just the first two chars of each name (BaOb) and convert these to a number in some way, eg
alphabetposition(firstletter)*1 + alphabetposition(secondletter)*2 + alphabetposition(thirdletter)*3 + alphabetposition(fourthletter)*4
it just depends how often something like this produces the same hashed value for different names - and how you THEN distinguish between them.

ie if Barack Obama generates 345, fine - but
if Tommy Smith then also generates 345, how do you resolve the clash?
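
A rough sketch of the weighted-sum idea in VBA may help show how quickly clashes turn up. The function names AlphabetPosition and WeightedNameCode are made up for this example, and treating any non-letter as 0 is an assumption, since the formula does not say what to do with apostrophes or hyphens.

```vba
' Hypothetical helper: position in the alphabet (A=1 .. Z=26), 0 for anything else.
Function AlphabetPosition(ByVal Letter As String) As Long
    Dim code As Long
    If Len(Letter) = 0 Then Exit Function          ' returns 0
    code = Asc(UCase$(Left$(Letter, 1)))
    If code >= 65 And code <= 90 Then AlphabetPosition = code - 64
End Function

' Weighted sum over the first two letters of each name:
' letter 1 * 1 + letter 2 * 2 + letter 3 * 3 + letter 4 * 4.
Function WeightedNameCode(ByVal FirstName As String, ByVal LastName As String) As Long
    Dim prefix As String, i As Long, total As Long

    prefix = Left$(Trim$(FirstName), 2) & Left$(Trim$(LastName), 2)   ' e.g. "BaOb"
    For i = 1 To Len(prefix)
        total = total + AlphabetPosition(Mid$(prefix, i, 1)) * i
    Next i

    WeightedNameCode = total
End Function
```

With this sketch WeightedNameCode("Barack", "Obama") is 2*1 + 1*2 + 15*3 + 2*4 = 57, and the invented name "Dana Allen" gives 4*1 + 1*2 + 1*3 + 12*4 = 57 as well. The result can never exceed 26*(1+2+3+4) = 260, so in any field of more than a couple of hundred competitors a clash is guaranteed.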
 
it's called Soundex. here is an example and a helper function

Code:
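' Returns a Soundex-style code: the first letter of the name plus three digits.
' ByVal so that cleaning the name below does not change the caller's variable.
Function Soundex(ByVal LastName As String) As String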

    Dim i As Integer, j As Integer, Str_Len As Integer
    Dim SCode As String, PrevCode As String, strResult As String, CharTemp As String * 1
    
    If LastName = "" Then
        Soundex = ""
        Exit Function
    End If
    
    If Len(LastName) < 3 Then
        Soundex = LastName
        Exit Function
    End If
    
    LastName = Get_Name(LastName)
    Str_Len = Len(LastName)
    
    ' i = position in the cleaned name, j = number of code digits stored so far
    j = 0
    i = 0
    PrevCode = "0"
    Do While (i < Str_Len And j < 4)
        i = i + 1
        
        CharTemp = Mid$(LastName, i, 1)
               
        Select Case CharTemp
            Case "R"
                SCode = "6"
            Case "M", "N"
                SCode = "5"
            Case "L"
                SCode = "4"
            Case "D", "T"
                SCode = "3"
            Case "C", "G", "J", "K", "Q", "S", "X", "Z"
                SCode = "2"
            Case "B", "F", "P", "V"
                SCode = "1"
            Case Else
                SCode = "0"
        End Select
        
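        ' H and W are transparent: they keep the previous code, so letters with
        ' the same code on either side of them are only counted once
        If CharTemp = "H" Or CharTemp = "W" Then
            SCode = PrevCode
        End If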
        
        If (SCode > "0" Or j = 0) Then
            If (SCode <> PrevCode Or j = 0) Then
                strResult = strResult + SCode
                j = j + 1
            End If
        End If
        
        If j = 0 Then
            j = j + 1
        End If
        
        PrevCode = SCode
    Loop
    
    ' Pad with zeros so that short names still yield three digits after the first letter
    i = j
    Do While (i <= 4)
        strResult = strResult + "0"
        i = i + 1
    Loop
    
    Soundex = Left(LastName, 1) + Mid$(strResult, 2, 3)

End Function


'------------------------------------------------
'                                                |
'  This function gets the name and cleans it up  |
'  so that it can be soundexed.                  |
'                                                |
'------------------------------------------------

Function Get_Name(inLastName As String) As String
Dim i As Integer, Str_Len As Integer
Dim LastName As String, Str1 As String, Str2 As String, ch As String * 1, inString As String

inString = UCase$(Trim(inLastName))
Str_Len = Len(inString)

    If (Mid$(inString, 1, 3) = "ST.") Then
        inString = "SAINT" + Right$(inString, Str_Len - 3)
        Str_Len = Str_Len + 2
    End If
    
    If (Mid$(inString, 1, 3) = "ST ") Then
        inString = "SAINT" + Right$(inString, Str_Len - 3)
        Str_Len = Str_Len + 2
    End If
      
    For i = 1 To Str_Len
        ch = Mid$(inString, i, 1)

        If (ch >= "A" And ch <= "Z") Then
            LastName = LastName + ch
        End If
        
        ' Stop at the first comma (e.g. "SMITH, JOHN" keeps only "SMITH")
        If ch = "," Then
            Exit For
        End If
        
    Next i
    
Get_Name = LastName

End Function
 
Soundex is really cool - the algorithm is nearly a century old

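It also won't generate unique values though (Robert and Rupert, for example, both come out as R163) - in fact, there isn't going to be an algorithm that generates unique, shorter output for every possible input. This is the pigeonhole principle: standard Soundex has at most 26 x 7 x 7 x 7 = 8,918 possible codes, so any list with more distinct surnames than that must contain a clash.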
 
