odd duplicates

a.mlw.walker

Hi. I have two tables on which I ran a duplicates query. I removed the duplicates using an "Is Null" criterion and kept the result as a new table.
I know there are still more duplicates, but the names may not be exactly the same, say "barclays" and "barclays plc". So I ran another query on both tables to show the first five letters of the names in table 1 and table 2. I can then run another "Is Null" query to find the non-duplicates there. These are the unique values.

But even if two names match on the first five letters, the next letter may differ, for instance "barcleys tractors" and "barclays bank". So these aren't duplicates.
So can anyone come up with a way of taking the names that match on the first five letters, and maybe those that match on up to 10 letters, and working out which ones are unique? My brain is frying.

thanks
alex
 
Use the LIKE operator, e.g. LIKE "*barcl*" OR LIKE "*bercl*".
 
Mmm. This would work if I knew which companies might be duplicates. I want it to show me any company that is almost like another one. I don't know which ones are duplicates.
 
There is no substitute for manual inspection when you face this specific problem. It has been around for a long time: it was first reported in text parsing when "natural language" experiments began on computers in the 1960s, or even earlier.

The best you could do is define a query that pulls the Left$() of the field in question and builds an inspection table sorted by this extract first and the entire name field second. Pick a size for the extracted substring. From this list, you would be able to identify the records with problems. Because of the string function, I'm not sure whether the query would be directly updateable.
 
Yeah, that's what I want: a filtered-down table that I can inspect. But 120,000 records is a lot. What method would you suggest?
 
You could try using a Soundex function. This converts a word to a short code: the first letter followed by digits for the consonants, skipping the vowels a, e, i, o, u.
 
Create a query and call a Soundex function on the name.
You may then find that many similar-sounding words are given the same code.

Then you can use another query to filter the records.
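For anyone who wants to see what such a function does, here is a rough Python sketch of the commonly described American Soundex rules, plus a grouping step that plays the role of the second query. It is an illustration, not Access code: the names `soundex` and `group_by_soundex` and the sample data are assumptions, and in Access you would need an equivalent function of your own.

```python
from collections import defaultdict

# Consonant groups and their Soundex digits; vowels and y get no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return a 4-character code: first letter plus up to three digits."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate two equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
        if len(result) == 4:
            break
    return result.ljust(4, "0")


def group_by_soundex(names):
    """Map each code to its sorted names, keeping only codes shared by 2+ names."""
    groups = defaultdict(list)
    for name in names:
        groups[soundex(name)].append(name)
    return {code: sorted(found) for code, found in groups.items() if len(found) > 1}


# Hypothetical sample data, not taken from the poster's tables.
names = ["Barclays", "Barclays PLC", "Barcleys Tractors", "Berclays Bank", "Lloyds Bank"]
for code, found in group_by_soundex(names).items():
    print(code, found)
```

Note that the code only covers the first few consonant sounds, so in this sample all four "B" names get the same code, the genuinely different companies included. It narrows down what you have to inspect; it does not decide for you which ones are duplicates.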
 
