INNER JOIN query - need to exclude duplicate entries in control table

jameslagan

Registered User.
Local time
Today, 11:24
Joined
Jan 13, 2012
Messages
10
I have a table with people that have disease X - tbl:[Disease X]that are being matched to disease-free "controls" tbl:[Controls] ensuring that they match gender and location.

Currently I have an inner join that looks something like this:

SELECT blah blah blah

FROM [Controls] INNER JOIN [Disease X] ON ([Controls].Location = [Disease X].Location) AND ([Controls].Gender = [Disease X].Gender);

Easy!

This returns multiple controls for each disease X. Naturally, some of which are actually the same control.

For instance, [Disease X] ID 1 might match with [Controls] ID 3, but [Disease X] ID 4 might also match with [Controls ID] 3 (as they are the same gender and location).

Is there any way of ensuring that once a control has been matched once with a Disease X - they are then no longer allowed to match with another Disease X?

Perhaps using DISTINCT, UNIQUE or DISTINCTROW might be an option but I just don't know how to use these.

Many thanks for any help on this subject,

Cheers,

James
 
What exactly do you want the output of this query to look like? My initial suggestion is to make this an aggregate query and GROUP BY all the fields you want to eliminate duplicates on.
 
I will look into the use of grouping tomorrow.

I want the output of the query to be:

Disease X 1 -> Matched Control 1
Disease X 1 -> Matched Control 2
Disease X 1 -> Matched Control 4
Disease X 2 -> Matched Control 2 <----- I do not want to use this control again
Disease X 2 -> Matched Control 3
Disease X 2 -> Matched Control 5

etc...

Does that help?

James
 
Not really, less generic data examples would be helpful.

When you use data like 'Disease X 1', Disease X 2' you've assumed an order that doesn't really exist in your data. I'm trying to find out how you want to determine which Disease X record gets shown as assigned the Matched Control data. Essentially home come you don't want Matched Control 2 to show for Disease X 2 (and all other Disease X's), but you do want it to show for Disease X 1.

Actual data would be helpful.
 
Thank you for your continued help. I can try and get some data in due course.

Your point of "how come you don't want Matched Control 2 for Disease X 2" is because there is a large excess of matched controls and once one is "used up" on another disease X I don't want it to appear again as there are plenty of others controls to choose from. I feel that the power of the study would be improved if controls weren't duplicated.
 
Actually, my point wasn't why only one gets to show the Matched Control, it was how do you determine which one.
 
It's a good question.

I guess it would simply be first come first served. So ID 1 get's a pick of them all and ID 100 has much less to choose from.

It is possible that I do not need to worry about having duplicate controls - it is not the end of the world - I just wondered that by avoid duplicates I would get a more statistically meaningful result.
 

Users who are viewing this thread

Back
Top Bottom