match letters in one field to letters in another and display % match

Only just noticed your other replies.

Show me one result line of real data based on your character-by-character function and I will give you a circumstance where it could fail.

So the replacement, does it do it to the string on the left or on the right? Also, where is the string on the right coming from a different table? I think you mentioned this earlier.
 
Show me one result line of real data based on your character-by-character function and I will give you a circumstance where it could fail.

))))))))))))))))))) i can give you many scenarios where it would fail, that's why i'm looking for a better way.

So the replacement, does it do it to the string on the left or on the right? Also, where is the string on the right coming from a different table? I think you mentioned this earlier.

it replaces it on both sides so that way they both end up with the same abbreviations. the data on the left comes from a website, i get a list of facilities from a website and then try to match them up to facilities in our database to see if they're in our network.

so the data on the right comes from our database.
 
heres the data

Code:
Matched%	FacilityName	FacName
50.00%	St. Mary’s Warrick Hospital	Audibel Better Hearing Center
55.32%	St. John Hospital and Medical Center	J J Humes Md And Assoc
55.38%	Columbia St. Mary’s Hospital Columbia	Sacred Heart Rehabilitation Institute
56.25%	St. John Macomb-Oakland Hospital	Oakland Industrial Clinic
56.72%	St. John Macomb-Oakland Hospital	Occupational Health Services
59.38%	St.Vincent Seton Specialty Hospital	St Elizabeth Medical Center
60.61%	Saint Agnes Hospital	St Agnes Home Care
61.82%	St.Vincent Indianapolis Hospital	St Vincent Mobile Mammography
63.83%	Lourdes Hospital	Our Lady Of Lourdes Memorial Hospital
65.38%	Mount St. Mary’s Hospital and Health Center 	Mt St Marys Hospital
65.57%	Columbia St. Mary’s Hospital Columbia	CSMCP Housecalls For The Homebound
67.74%	St.Vincent Indianapolis Hospital	St Vincent Hospital & Health Care Center
68.09%	Columbia St. Mary’s Hospital Columbia	Columbia Hospital
68.57%	Saint Agnes Hospital	St Agnes Healthcare
68.85%	University Medical Center Brackenridge	Dell Childrens Medical Center
72.34%	St. Mary’s of Michigan Medical Center	St Marys of Michigan
73.17%	St. Mary’s Hospital at Amsterdam	St Marys Hospital
75.86%	Mount St. Mary’s Hospital and Health Center 	Mount Saint Marys Hospital
76.36%	Carondelet Holy Cross Hospital	Carondelet Holy Cross  Swing Beds
77.27%	Carondelet Holy Cross Hospital	Holy Cross Hospital
77.55%	St. John Hospital and Medical Center	St John Medical Center
77.78%	St. Mary’s Warrick Hospital	St Marys Warrick
82.76%	St. Mary\'s of Michigan Standish Hospital	Standish Community Hospital
84.62%	Columbia St. Mary’s Hospital Ozaukee 	St Marys Hospital-Ozaukee
85.71%	St.Vincent Seton Specialty Hospital	St Vincent Seton Specialy Hospital
86.21%	Columbia St. Mary’s Hospital Milwaukee	St Marys Hospital Of Milwaukee
89.80%	Providence Health Center	Providence Healthcare Network
90.00%	St. John Macomb-Oakland Hospital	St John Macomb-Oakland Hospital
97.22%	University Medical Center Brackenridge	University Medical Center at Brackenridge
 
Right, we're getting somewhere.

Do you perform the replacement in the function just before the search, or you've already replaced them in your tables?
 
and here's the best example of failure

St. Mary’s Warrick Hospital Audibel Better Hearing Center


it matched 50% because some letters matched but in reality no words matched. but this method can be somewhat useful if i can come up with the cut off %. like if it's under 60% then it's not a match.
i don't know, this is all very shady and i don't like it.

i will speak to the lady tht matches this by hand on Monday and will pick her brain, get the most detailed step by step description of how she decides what's a match and what's not and try to replicate it

like for example, in carondelet, why did she choose holy cross and not the swing beds and so on. maybe then i can come up with something
 
Right, we're getting somewhere.

Do you perform the replacement in the function just before the search, or you've already replaced them in your tables?

i didn't do it in my tables yet, but it's a separate function so i can do it whenever.
 
one more thing, i'm matching on the last 2 columns, forgot to mention that
they're the first 2 columns without commas, spaces, slashes, dashes and any non-letter and non-number characters

Code:
Matched%	FacilityName	FacName	FacilityNameNS	NameNS
50.00%	St. Mary’s Warrick Hospital	Audibel Better Hearing Center	StMarysWarrickHospital	AudibelBetterHearingCenter
55.32%	St. John Hospital and Medical Center	J J Humes Md And Assoc	StJohnHospitalandMedicalCenter	JJHumesMdAndAssoc
55.38%	Columbia St. Mary’s Hospital Columbia	Sacred Heart Rehabilitation Institute	ColumbiaStMarysHospitalColumbia	SacredHeartRehabilitationInstitute
56.25%	St. John Macomb-Oakland Hospital	Oakland Industrial Clinic	StJohnMacombOaklandHospitalMadisonHeights	OaklandIndustrialClinic
56.72%	St. John Macomb-Oakland Hospital	Occupational Health Services	StJohnMacombOaklandHospitalMadisonHeights	OccupationalHealthServices
59.38%	St.Vincent Seton Specialty Hospital	St Elizabeth Medical Center	StVincentSetonSpecialtyHospitalLafayette	StElizabethMedicalCenter
60.61%	Saint Agnes Hospital	St Agnes Home Care	SaintAgnesHospital	StAgnesHomeCare
61.82%	St.Vincent Indianapolis Hospital	St Vincent Mobile Mammography	StVincentIndianapolisHospital	StVincentMobileMammography
63.83%	Lourdes Hospital	Our Lady Of Lourdes Memorial Hospital	LourdesHospital	OurLadyOfLourdesMemorialHospital
65.38%	Mount St. Mary’s Hospital and Health Center 	Mt St Marys Hospital	MountStMarysHospitalandHealthCenter	MtStMarysHospital
65.57%	Columbia St. Mary’s Hospital Columbia	CSMCP Housecalls For The Homebound	ColumbiaStMarysHospitalColumbia	CSMCPHousecallsForTheHomebound
67.74%	St.Vincent Indianapolis Hospital	St Vincent Hospital & Health Care Center	StVincentIndianapolisHospital	StVincentHospitalHealthCareCenter
68.09%	Columbia St. Mary’s Hospital Columbia	Columbia Hospital	ColumbiaStMarysHospitalColumbia	ColumbiaHospital
68.57%	Saint Agnes Hospital	St Agnes Healthcare	SaintAgnesHospital	StAgnesHealthcare
68.85%	University Medical Center Brackenridge	Dell Childrens Medical Center	UniversityMedicalCenterBrackenridge	DellChildrensMedicalCenter
72.34%	St. Mary’s of Michigan Medical Center	St Marys of Michigan	StMarysofMichiganMedicalCenter	StMarysofMichigan
73.17%	St. Mary’s Hospital at Amsterdam	St Marys Hospital	StMarysHospitalatAmsterdam	StMarysHospital
75.86%	Mount St. Mary’s Hospital and Health Center 	Mount Saint Marys Hospital	MountStMarysHospitalandHealthCenter	MountSaintMarysHospital
76.36%	Carondelet Holy Cross Hospital	Carondelet Holy Cross  Swing Beds	CarondeletHolyCrossHospital	CarondeletHolyCrossSwingBeds
77.27%	Carondelet Holy Cross Hospital	Holy Cross Hospital	CarondeletHolyCrossHospital	HolyCrossHospital
77.55%	St. John Hospital and Medical Center	St John Medical Center	StJohnHospitalandMedicalCenter	StJohnMedicalCenter
77.78%	St. Mary’s Warrick Hospital	St Marys Warrick	StMarysWarrickHospital	StMarysWarrick
82.76%	St. Mary\'s of Michigan Standish Hospital	Standish Community Hospital	StMarysofMichiganStandishHospital	StandishCommunityHospital
84.62%	Columbia St. Mary’s Hospital Ozaukee 	St Marys Hospital-Ozaukee	ColumbiaStMarysHospitalOzaukee	StMarysHospitalOzaukee
85.71%	St.Vincent Seton Specialty Hospital	St Vincent Seton Specialy Hospital	StVincentSetonSpecialtyHospitalLafayette	StVincentSetonSpecialyHospital
86.21%	Columbia St. Mary’s Hospital Milwaukee	St Marys Hospital Of Milwaukee	ColumbiaStMarysHospitalMilwaukee	StMarysHospitalOfMilwaukee
89.80%	Providence Health Center	Providence Healthcare Network	ProvidenceHealthCenter	ProvidenceHealthcareNetwork
90.00%	St. John Macomb-Oakland Hospital	St John Macomb-Oakland Hospital	StJohnMacombOaklandHospitalWarren	StJohnMacombOaklandHospital
97.22%	University Medical Center Brackenridge	University Medical Center at Brackenridge	UniversityMedicalCenterBrackenridge	UniversityMedicalCenteratBrackenridge
 
Things would run faster if the Replacement was done on the table, but I don't think you can do that.

You perform the replacement on both strings before you run into the search to match against words. I will have a play with replaced data and get back to you.
 
Things would run faster if the Replacement was done on the table, but I don't think you can do that.

which table? the table in our database or my tables? i can't do it in our database, i do it in my tables. but how would it be faster if i did it in our database?

You perform the replacement on both strings before you run into the search to match against words.

what do you mean by this?
sorry, english is not my first language
you're saying that i should replace both strings (which i do) before i do what?
 
It's like this. For everytime you want to perform a match, in the function you have to:

1. Replace the abbrevs in both strings
2. Then check both strings

If however the abbrevs had already been replaced, then you simply perform step 2. That's minus one step which will equal faster computation. Don't worry about this because you can't do it since it's how the data should be.

Before I go any further, I need clarity on something:

Does the order in which the words appear matter? That is will this match be correct?

Code:
100% Saint Mary's Warrick Hospital               Mary's Warrick Saint Hospital
Should that be 0% or 100%?
 
oh, i got it, yes, that's what i do. i replace all the abbreviations everywhere first and then go on to the matching part

it's 100% match, order doesn't matter
 
you don't have to do the work for me, just pointing me the right way is good enough. if you can explain whatever i don't get - even better, but you don't have to do the whole thing)))))))))))
besides, i like learning and i like challenges
i mean if you got so into it that you're doing it for yourself as well - then it's different, but if you're doing it just for me - then no need, really, you've helped so much already. you can just tell me what you had in mind and i will try to write it.
up to you
thank you so so much either way
 
Here it is:

1. Perform the replace of both strings and save into a string
2. Split the strings into an array for both
3. Perform the comparison word-by-word

The last code I gave you does that already, except you need to incorporate the Replace in there.

I have to think how the number of words on the right would affect that on the left.
 
So what would this be?

Code:
Saint Mary's Warrick Hospital           Mary's Warrick Saint Hospital [B]The[/B]
 
i'm not sure if this has already been dealt with, but comparing words for similarity can also be done by a number of compression algorithms.

One example is the soundex alogrithm which removes many of the insiginficant letters (normally vowels I think) to leave a word "kernel" which can be used for comparison

thus smith, smyth smythe etc probably all produce an equivalent soundex code.



try wikipedia for this, and similar functions - you may find this more practical than the way you are currently examining.
 
What has actually been partially re-invented here in this thread is referred to as Levenshtein Distance. Here's an explanation of it and a VB implementation, easily adaptable to Access:

http://www.merriampark.com/ld.htm

It's also implemented in VBA as Simil().

It can also be useful in combination with Soundex and Soundex2 and other methods of fuzzy matching.

It's a big topic, actually, and most of the modern techniques for this are not really available in Access. That is, you know that Google is not using any of those methods in creating its indexes and cross-references.
 
i'm not sure if this has already been dealt with, but comparing words for similarity can also be done by a number of compression algorithms.

One example is the soundex alogrithm which removes many of the insiginficant letters (normally vowels I think) to leave a word "kernel" which can be used for comparison

thus smith, smyth smythe etc probably all produce an equivalent soundex code.



try wikipedia for this, and similar functions - you may find this more practical than the way you are currently examining.


is that what it's called, soundex alogrithm?
 
What has actually been partially re-invented here in this thread is referred to as Levenshtein Distance. Here's an explanation of it and a VB implementation, easily adaptable to Access:

http://www.merriampark.com/ld.htm

It's also implemented in VBA as Simil().

It can also be useful in combination with Soundex and Soundex2 and other methods of fuzzy matching.

It's a big topic, actually, and most of the modern techniques for this are not really available in Access. That is, you know that Google is not using any of those methods in creating its indexes and cross-references.

hey, you're in new york!!!!!!!
thank you so much for the link
what is google using, do you know?
can you give me an example of a modern technique that's not available in access?
 

Users who are viewing this thread

Back
Top Bottom