Selecting only records where both of two values exist

With those database constraints I was wrong. Cheeky's solution will work. Sorry.
 
SQL:
SELECT    c.*
FROM     Contacts c
WHERE EXISTS (SELECT 'x'
                FROM tblContact_Categories cc1  
                INNER JOIN tblCategoryTypes t
                    ON cc1.CategoryTypeID = t.CategoryTypeID
                WHERE t.[Category Type]  = 'car'
                AND cc1.ContactID = c.ContactID)
AND EXISTS (SELECT 'x'
                FROM tblContact_Categories cc1  
                INNER JOIN tblCategoryTypes t
                    ON cc1.CategoryTypeID = t.CategoryTypeID
                WHERE t.[Category Type]  = 'bike'
                AND cc1.ContactID = c.ContactID)
Winner.
 
Rather than calling the EXISTS predicate and multiple subqueries, which could be tedious if the value list were a long one, the attached little demo file illustrates how rows can be returned in the context of a binary relationship by selecting values in a multi-select list box. In the demo this is done by filtering a report's output by means of the following criterion:

Code:
strCriteria = "StoreID IN(SELECT StoreID " & _
    "FROM StoreStock " & _
    "WHERE StockID IN(" & strStockIDList & ") " & _
    "GROUP BY StoreID " & _
    "HAVING COUNT(*) = " & ctrl.ItemsSelected.Count & ")"

To do the same thing completely in a query would be:

SQL:
SELECT
    Store,
    Stock
FROM
    Stocks
    INNER JOIN (
        Stores
        INNER JOIN StoreStock ON Stores.StoreID = StoreStock.StoreID
    ) ON Stocks.StockID = StoreStock.StockID
WHERE
    Stores.StoreID IN (
        SELECT
            StoreID
        FROM
            StoreStock
            INNER JOIN Stocks ON Storestock.StockID = Stocks.StockID
        WHERE
            Stock IN ("Apples", "Oranges")
        GROUP BY
            StoreID
        HAVING
            COUNT(*) = (
                SELECT
                    COUNT(*)
                FROM
                    Stocks
                WHERE
                    Stock IN ("Apples", "Oranges")
            )
    );

This is really just an example of relational division, which was one of Codd's original algebraic operators when he first proposed the database relational model back in 1970. The usual published examples of this generally divide one table into another, but the principle is much the same with a value list.
 

Attachments

No, for GDPR reasons. :(
That can be easily fixed on a copy of the DB, so not a great excuse :(
 
on the basis that a contact can only have one qualification per category (i.e. a contact cannot have more than one bike qualification), this can simply be done with joins

Code:
SELECT A. ContactID
FROM tbl_Contact_Category As A INNER JOIN tbl_Contact_Category As B ON A.ContactID=B.ContactID
WHERE A.[Category Type] = 'Bike' AND B.[Category Type]='car'

Or if you want all the contact details

Code:
SELECT Contacts.*
FROM (tbl_Contact_Category As A INNER JOIN tbl_Contact_Category As B ON A.ContactID=B.ContactID) INNER JOIN Contacts ON Contacts.ContactID=A.ContactID
WHERE A.[Category Type] = 'Bike' AND B.[Category Type]='car'
 
CJ, Since CategoryType should not be a field in tblContact_Category, use CategoryTypeID in WHERE clause for the first SQL.

This approach involves a Cartesian relationship which may be slower than nested query.
 
I was going by the original post - Agree using FK would be better. I don’t see a Cartesian query in my suggestion
 
Really? 2 Q and then another? Just to do this?
Yes. You have three separate questions so they need to be asked separately. Does A exist (save answer)? Does B exist (save answer)? And finally are both A and B true?

The logically simple solution is as proposed. But as you can see, a single more complex, multi-part query can produce the result using Count(*)
 
CJ, a self-join on only ContactID will involve Cartesian product. If a contact has both a car and bike record, joining will return 4 records, not 2. If there are other categories, the multiplication gets worse. If there are 3 types, then 9 records result. Use of WHERE clause with AND operator restricts output to display of 1. In a large dataset, Cartesian product can cause very slow performance.
 
Last edited:
Please clarify what you mean by Cartesian product. My understanding is no joins
 
Cartesian Product: the number of rows in the result-set is the product of the number of rows of the two tables.

Joining on ContactID reduces the multiplication but doesn't eliminate. Joining restricts which records will associate and thereafter a Cartesian product is employed.

In a table where ContactID 1 has 3 records, a self-join means ContactID 1 has 3 records in each dataset so the output without filter criteria will be 3 x 3 = 9.
 
Last edited:
so my qualifier is irrelevant? Perhaps need to look at the query plan to see how it is optimised
 
Please clarify what you mean by Cartesian product. My understanding is no joins

The term Cartesian derives from the name of the French mathematician and philosopher René Descartes. A major step in the development of mathematics was his introduction of a coordinate system for defining points on a Euclidean plane by the intersection of two straight axes. This unified algebra and geometry by enabling algebraic definitions and analyses to be applied to geometric shapes.

The Cartesian product of two sets comprises all of the ordered pairs of the elements of each set. Think of a chess board. The two sets have values A to H and 1 to 8 respectively. The ordered pairs are thus A1,A2,A3...H8. The Cartesian coordinates of any square on the board are defined by the intersection of the two axes, e.g. C4. This is one of the ordered pairs which comprise the Cartesian product of the sets. In relational terms we can think of this square being the restriction of the Cartesian product to x=C AND y=4. Prior to the SQL97 standard this was how tables were joined, by including the JOIN criteria in the WHERE clause. However, this could not handle OUTER JOINs and the JOIN clause was introduced to allow these. I can remember at that time many SQL developers persisted in putting the JOIN criteria in the WHERE cause, particularly if other restrictions were being applied by criteria in the WHERE clause.

The use of the Cartesian product is particularly useful still where we have two sets where the full set of the ordered pairs is to be returned as a preliminary to further restriction. I had an interesting example of this some years ago when helping out a MEDEVAC unit who needed to generate check lists for medications carried on their aircraft. The Cartesian product of the Aircraft and Medications tables provided the starting point from which they could then elect which to carry on each aircraft.

Note that mathematically speaking the Cartesian product of two sets comprises the ordered pairs of the elements of each set. Mathematics does allow for unordered pairs, however, and often this is what will be returned when we include two tables in a query without a specific JOIN. I'm not aware of a term for all unordered pairs, and we tend to use the term Cartesian product rather loosely to cover both.
 

Users who are viewing this thread

Back
Top Bottom