Fuzzy Search

llkhoutx

I'm using the Soundex algorithm (free on the Internet at http://allenbrowne.com/vba-Soundex.html), which does a phonetic "fuzzy" comparison of names. Unfortunately, with this method, multiple distinct names produce the same result.
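
To make the collision problem concrete, here is a minimal sketch of standard American Soundex, written in Python for brevity rather than VBA. It is not the Allen Browne function linked above, and feeding it a whole company name instead of a single surname is an assumption made purely for illustration.

```python
# Letter-to-digit table for standard American Soundex.
CODES = {}
for digit, letters in enumerate(["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(name: str) -> str:
    """Return a 4-character Soundex code, or "" if the input has no letters."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W are skipped and do not separate equal codes
        code = CODES.get(ch, "")  # vowels (and Y) have no code
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
        if len(result) == 4:
            break
    return (result + "000")[:4]


print(soundex("Robert"), soundex("Rupert"))         # R163 R163
print(soundex("x corp"), soundex("x corporation"))  # X610 X616
```

Under this sketch, Robert and Rupert share one code even though they are different names, while x corp and x corporation get different codes, so Soundex on its own can both over-match and under-match.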

Examples of the names I want to match:


ABC, INC. ABC CO INC
aaa bbb ccc aaa bbb
x corp x corporation

Does anyone have any ideas? There are too many rows to compare by inspection.

Thank you in advance for your responses.
 

vbaInet

Examples of the names I want to match:


ABC, INC. ABC CO INC
aaa bbb ccc aaa bbb
x corp x corporation
I don't quite follow which one is your search criteria and which are your results and in what fields. Please elaborate.
 

llkhoutx

I'm trying to match names in the left column to names in the right column.
 

jdraw

Could you please elaborate on the answer you gave to vbaInet's request?

Could you give us 4 or 5 actual records, and show the contents of Left Column and Right Column? Be specific. There appear to be 4 or 5 columns in your response.

You can set up some strategies.

Is the Right Column a substring of the Left Column?
You could remove spaces and punctuation and try the match on all characters, or on the length of the shorter of Left Column and Right Column.
You could try the leftmost 5 characters, or 8, or whatever length you choose.
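
The strategies above could be sketched roughly as follows, in Python for brevity (the same logic should port to VBA). The function names and the default prefix length of 5 are illustrative assumptions.

```python
def normalize(name: str) -> str:
    """Upper-case the name and drop spaces, punctuation and other non-alphanumerics."""
    return "".join(ch for ch in name.upper() if ch.isalnum())


def is_substring_match(left: str, right: str) -> bool:
    """True if either normalized name is contained in the other."""
    a, b = normalize(left), normalize(right)
    return bool(a) and bool(b) and (a in b or b in a)


def prefix_match(left: str, right: str, n: int = 5) -> bool:
    """True if the leftmost n normalized characters agree.

    n is capped at the length of the shorter name.
    """
    a, b = normalize(left), normalize(right)
    n = min(n, len(a), len(b))
    return n > 0 and a[:n] == b[:n]


print(is_substring_match("x corp", "x corporation"))  # True
print(is_substring_match("ABC, INC.", "ABC CO INC"))   # False
print(prefix_match("ABC, INC.", "ABC CO INC", n=3))    # True
```

Note that the substring test misses ABC, INC. against ABC CO INC because the extra CO sits in the middle, which is where a shorter prefix or an edit-distance measure may help.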

Since they are "fuzzy" matches, what constitutes a successful match?

Other methods such as Levenshtein distance may be useful.

See:
http://forums.aspfree.com/microsoft-...ss-101773.html
http://www.vbforums.com/showthread.php?t=518095
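
For reference, the Levenshtein distance mentioned above counts the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. A minimal sketch in Python (not taken from either linked thread) might look like this:

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, keeping only two rows of the DP table."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to save memory
    previous = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))      # 3
print(levenshtein("XCORP", "XCORPORATION"))  # 7
```

A raw distance penalizes length differences heavily, so in practice it may be worth dividing by the length of the longer string before comparing against a cutoff.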
 

spikepl

Fuzzy search requires a fuzzy definition :D
 

llkhoutx

My example was clear to me.

Table A - 1 column
ABC, INC.
aaa bbb ccc
x corp

Table B - 2 columns
ABC CO INC       25
aaa bbb          76
x corporation    42

I want ABC, INC. to match ABC CO INC and return 25, aaa bbb ccc to return 76, and x corp to return 42.
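
One possible way to express that lookup, sketched in Python rather than Access SQL or VBA: strip punctuation, then pick the Table B row with the highest similarity score. The scoring here uses `difflib.SequenceMatcher` from the Python standard library, which is a different measure from Soundex or Levenshtein, and the 0.5 threshold and the `best_match` name are arbitrary assumptions for illustration.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Upper-case the name and keep only letters and digits."""
    return "".join(ch for ch in name.upper() if ch.isalnum())


def best_match(name, rows, threshold=0.5):
    """Return the value of the most similar row, or None if nothing is close enough.

    rows is a list of (name, value) pairs standing in for Table B.
    """
    target = normalize(name)
    best_value, best_score = None, 0.0
    for row_name, value in rows:
        score = SequenceMatcher(None, target, normalize(row_name)).ratio()
        if score > best_score:
            best_value, best_score = value, score
    return best_value if best_score >= threshold else None


table_b = [("ABC CO INC", 25), ("aaa bbb", 76), ("x corporation", 42)]

for name in ["ABC, INC.", "aaa bbb ccc", "x corp"]:
    print(name, "->", best_match(name, table_b))
```

On these three sample rows the sketch returns 25, 76 and 42 respectively; real data would need the threshold tuned and the borderline scores reviewed by hand.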

The Soundex phonetic match returns too many duplicates.

I'll take a look at the Levenshtein distance method.

Thanks for the suggestions.
 

vbaInet

I don't think the Soundex algorithm has that in its logic. If you say you're getting duplicates, have you considered removing the duplicates?
 
