ryan.gillies
Registered User.
- Local time
- Tomorrow, 06:14
- Joined
- Apr 8, 2011
- Messages
- 53
Hi all
I'm attempting Fuzzy Matching for the first time and have two tables of data to match - a master file of company names (tMaster) and a list of user-inputted company names (tUser) from a survey. As you can imagine the user-inputted table has suffered greatly in the data integrity department.
I want to try and match as many of the fields from tUser against tMaster as possible. Some of the worst case examples of the data I'm facing are as follows:
Currently I have both tables in an unjoined query, in which I set the criteria for soundex([user name]) to soundex([master name]). It seems to be working reasonably well and has matched the above examples I gave. However it has had some unwanted results, such as matching PPI-Claimback.co.uk to PPI Claimline, and Kevin McGee to Key Financial Claims Ltd.
Has anyone used Soundex successfully on something like this before? Any tips on how to go about fine tuning my results?
Thanks in advance!
I'm attempting Fuzzy Matching for the first time and have two tables of data to match - a master file of company names (tMaster) and a list of user-inputted company names (tUser) from a survey. As you can imagine the user-inputted table has suffered greatly in the data integrity department.
I want to try and match as many of the fields from tUser against tMaster as possible. Some of the worst case examples of the data I'm facing are as follows:
- Investor Compensation (Claims) LLP match to
- Inveator Compensation
- Investor Compensation Claims LLP
- Investor Compensation LLP
- i-Sm@rt Consumer Services Ltd match to
- I-Sm@art Consumer Solutions
- Isamrt Consumer Solutions
- I-Sm@rt Consumer Services
Currently I have both tables in an unjoined query, in which I set the criteria for soundex([user name]) to soundex([master name]). It seems to be working reasonably well and has matched the above examples I gave. However it has had some unwanted results, such as matching PPI-Claimback.co.uk to PPI Claimline, and Kevin McGee to Key Financial Claims Ltd.
Has anyone used Soundex successfully on something like this before? Any tips on how to go about fine tuning my results?
Thanks in advance!