Finding partial matches using SQL

marthacasting

Registered User.
Local time
Yesterday, 20:19
Joined
Oct 17, 2011
Messages
67
I need to run a Duplicate Query that searches for LastName duplicates and FirstName duplicates--however, I want to find partial matches for the FirstName, finding those with the first 3 letters the same. (For example, it would pick up both Sam Smith and Samuel Smith.)

Did I read this has to be done in SQL? (This is my first foray into SQL!)

Thank you!
 
You don't give your table nor field names so I will use TableName, FirstName and LastName for those.

Code:
SELECT LastName, SUBSTRING(FirstName, 1, 3) AS FirstName3, COUNT ([LastName]) AS Records
FROM TableName
GROUP BY LastName, SUBSTRING(FirstName, 1, 3)
ORDER BY COUNT([LastName]) DESC;

That will tell you how many records have the same last name and first 3 characters of the first name the same.
 
Assuming you are using access you'll need to substitute 'mid' for 'substring'
 
Thank you! I've been trying it now. For some reason it tells me there is a syntax error in my FROM clause. (The table name is Student Mailing List. I have tried it with and without spaces, with and without caps!)
 
if you have spaces in you table or field names you nee to use square brackets

[Student Mailing List]
 
THANK YOU, THANK YOU Both!!
I did get the query to run! Yay! That's the good news.
The not-so-good news is that it is running all LastNames and ALL FirstNames (but using just the first 3 letters of the FirstNames. I was hoping to have it be that it would pick up only those names that had duplicate LastNames and duplicate for three letters in FirstNames.

Example: It would pick up
Sam Johnson & Samuel Johnson
but not
Sam Johns & Sam Johnson

Is this possible?!

Again, my thanks.
 
My SQL should do that. What is your SQL?
 
[FONT=trebuchet ms, sans-serif]SELECT LastName, MID(FirstName, 1, 3) AS FirstName3, COUNT ([LastName]) AS Records[/FONT]
[FONT=trebuchet ms, sans-serif]FROM [Student Mailing List][/FONT]
[FONT=trebuchet ms, sans-serif]GROUP BY LastName, MID(FirstName, 1, 3)[/FONT]
[FONT=trebuchet ms, sans-serif]ORDER BY COUNT([LastName]) DESC;[/FONT]
 
Sam Johns and Sam Johnson will not be picked up as duplicates in the query you posted.
 
When I ran the query, I got EVERY name, meaning every LastName and the first 3 letters of every FirstName!
 
You need to put a filter in to exclude records where there is only one record
Code:
SELECT LastName, MID(FirstName, 1, 3) AS FirstName3, COUNT ([LastName]) AS Records
FROM [Student Mailing List]
GROUP BY LastName, MID(FirstName, 1, 3)
ORDER BY COUNT([LastName]) DESC
WHERE COUNT([LastName])>1
 
THANK YOU so very much for all of your help and patience!
When I type that in, I get the message: Syntax error in ORDER BY clause.
If I could respectfully ask for further guidance?
 
The post from CJ_London has the order by, group by, where in the wrong order.

Try this:


SELECT LastName, MID(FirstName, 1, 3) AS FirstName3, COUNT ([LastName]) AS Records
FROM [Student Mailing List]
GROUP BY LastName, MID(FirstName, 1, 3)
HAVING COUNT([LastName])>1
ORDER BY COUNT([LastName]) DESC
 
Thank you! That definitely narrows the list down. However, I am still getting LastNames that show up only once, and first three letters of all FirstNames, irrespective if they there is more than one.

For example, I will be getting back:
Brown, Joe
Buster, Abe
Buster, Bill
Buzy, Jon
Buzy, Jonathon

What I was hoping for (since there are 10s of 1000s of names in the db!) to have a query that would know that I am looking for (only) Buzy, Jon and Buzy, Jonathon.

Does this make sense? Is it even possible?
 
Again, I thank you for your help and consideration.
I have to confess that I don't see the correlation. Again, this whole SQL thing is completely new to me!
Much appreciated . . .
 

Users who are viewing this thread

Back
Top Bottom