Matching Text Strings?

Tommy B

Hi Guys,

I am soon to start a new job (hurray!) and my first task will be to look at their marketing/leads database. Basically, it takes a list of addresses from mail-outs and promotions and matches those addresses to the actual sales, which lets them measure the relative success of a campaign. At the moment I believe the system first trims the names/addresses and then creates two tables whose primary key is the combination of, say, Address1&Address2&Address3 etc., and then matches them. They may also be doing this with select statements. The process takes a long time, as there are a lot of records to compare. I believe that some of the data is on SQL Server and the rest is in the form of imported flat files. Has anyone out there done anything similar? Can anyone think of a quicker method of string comparison/matching? Any thoughts greatly appreciated :p

Cheers,

Tommy B
 
Usually this is done some other way than by comparing the raw addresses. Companies that do this in bulk use some sort of hashing algorithm to compress the address info into shorter strings. And I'm not talking about Trim functions, either. Once you have the strings hashed, you can compare the hashes as a first-pass filter and only look at the full addresses where the hashes agree.
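
To make the idea concrete, here is a rough Python sketch. It is only an illustration, not what any particular company does: the function names (normalise, address_key, match_campaign_to_sales), the record layout, and the choice of a truncated SHA-256 as the hash are all assumptions of mine.

```python
import hashlib
import re


def normalise(*parts):
    """Join address lines, upper-case them, drop punctuation, collapse spaces."""
    joined = " ".join(p for p in parts if p)
    cleaned = re.sub(r"[^A-Z0-9 ]", "", joined.upper())
    return " ".join(cleaned.split())


def address_key(*parts):
    """Hash the normalised address into a short fixed-length key."""
    digest = hashlib.sha256(normalise(*parts).encode("utf-8")).hexdigest()
    return digest[:16]  # truncated, so the key is a filter, not proof of equality


def match_campaign_to_sales(mailouts, sales):
    """Return (mailout, sale) pairs whose addresses match.

    Each record is assumed to be a dict with an "address" tuple of lines.
    """
    # Index the sales once by hash key instead of comparing every pair.
    index = {}
    for sale in sales:
        index.setdefault(address_key(*sale["address"]), []).append(sale)

    matches = []
    for mailout in mailouts:
        wanted = normalise(*mailout["address"])
        for sale in index.get(address_key(*mailout["address"]), []):
            # Confirm candidates that passed the hash filter.
            if normalise(*sale["address"]) == wanted:
                matches.append((mailout, sale))
    return matches
```

The point of the index is that each mail-out costs one dictionary lookup rather than a scan over every sale. On SQL Server the equivalent would be a stored, indexed key column, but how you would build that depends on your setup.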

Further, I don't know about your situation, but in the USA the third address line is usually City, State, Zip (postal code, for our UK and European members not familiar with ZIP codes). There are many ways of compressing this too, including a lookup table for that line, because (unlike the first two address lines) it is the one most likely to be repeated A LOT. In that case the key for the third address line might just be a number.
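
A lookup table for the city/state/zip line could look something like the Python sketch below. Again, this is just one way to do it: build_locality_lookup and locality_id are made-up names, and the normalisation (upper-case, commas treated as spaces) is an assumption that real data would probably need more of.

```python
def _clean(line):
    """Upper-case the line, treat commas as spaces, collapse whitespace."""
    return " ".join(line.upper().replace(",", " ").split())


def build_locality_lookup(lines):
    """Give each distinct city/state/zip line a small integer ID."""
    lookup = {}
    for line in lines:
        key = _clean(line)
        if key not in lookup:
            lookup[key] = len(lookup) + 1  # IDs start at 1
    return lookup


def locality_id(lookup, line):
    """Return the ID for a line, or None if it is not in the table."""
    return lookup.get(_clean(line))


lookup = build_locality_lookup([
    "Austin, TX 78701",
    "austin,  tx 78701",   # same place, different formatting
    "Dallas, TX 75201",
])
```

Comparing two small integers is much cheaper than comparing two long strings, and the repeated text is stored only once.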

You might try a web search on 'text hashing algorithms' to see if you get a useful article on the subject.
 
Many thanks for the heads-up Doc Man. I'll go hunting on that subject now :p

Cheers mate,

Tommy B :D
 
