Useful TIP-Using <Where> and <Having> in the same query. (1 Viewer)

plog

Banishment Pending
Local time
Today, 04:50
Joined
May 11, 2011
Messages
11,646
Reader's Digested version:

WHERE - use when applying criteria to individual records
HAVING - use when applying criteria to an aggregation of records (MIN(), SUM(), COUNT(), AVG(), etc.)

...WHERE Table.FirstName = 'Jeff'
...WHERE Table.Quantity > 7

...HAVING SUM(Table.Quantity) > 100
...HAVING MAX(Table.Score) < 50
 

Pat Hartman

Super Moderator
Staff member
Local time
Today, 05:50
Joined
Feb 19, 2002
Messages
43,275
WHERE - Before aggregation
HAVING - After aggregation

When you make a totals query, Access ASSUMES you want a HAVING. It is up to you to change it to a WHERE if that is what you need.

If you are mistakenly using HAVING when you should be using WHERE, fixing the query could lead to a dramatic speed improvement.
 

ByteMyzer

AWF VIP
Local time
Today, 02:50
Joined
May 3, 2004
Messages
1,409
Here is a more concrete illustration of the differences between HAVING and WHERE clauses.

Table Name: StudentTests
Code:
StudentID | TestID | TestDate
----------+--------+----------
10123     | 1012   |  3/1/2022
10123     | 1023   |  9/6/2022
10123     | 1035   |11/17/2022
10123     | 1057   | 2/27/2023
10135     | 1012   |  3/1/2022
10135     | 1023   |  9/6/2022

SELECT ST.StudentID, MAX(ST.TestDate) AS TestDate
FROM StudentTests AS ST
GROUP BY ST.StudentID;

Result:
StudentID | TestDate
----------+----------
10123     | 2/27/2023
10135     |  9/6/2022

Task: Write a query that shows, for tests that took place in 2022, each Student ID with the latest test date.
Code:
SELECT ST.StudentID, MAX(ST.TestDate) AS TestDate
FROM StudentTests AS ST
WHERE ST.TestDate BETWEEN #1/1/2022# AND #12/31/2022#
GROUP BY ST.StudentID;

Result:
StudentID | TestDate
----------+----------
10123     |11/17/2022
10135     |  9/6/2022

Task: Write a query that shows each Student ID and the latest test date for whom the latest test date occurred in 2022.
Code:
SELECT ST.StudentID, MAX(ST.TestDate) AS TestDate
FROM StudentTests AS ST
GROUP BY ST.StudentID
HAVING MAX(ST.TestDate) BETWEEN #1/1/2022# AND #12/31/2022#;

Result:
StudentID | TestDate
----------+----------
10135     |  9/6/2022
 

ebs17

Well-known member
Local time
Today, 11:50
Joined
Feb 7, 2020
Messages
1,946
Without specifically testing @ByteMyzer's example, I'll say his statement is wrong, because different query logic was stored. Different logic, different results (depending on given data).

In such a simple design, with the same logic the result should always be the same, regardless of whether you use WHERE or HAVING.

The huge difference between the two lies in the order in which queries are evaluated and the resulting performance.

WHERE takes place immediately with the execution of FROM including JOINs and reduces the amount of data available. Only then does grouping take place. Grouping is a complex process because each record within the recordset under consideration must be compared with every other record to ensure that the content of the field or field combination is identical.
It makes a dramatic difference whether you use WHERE to reduce a set of, for example, 10,000 records to 300 and then group them or whether the grouping has to run across all 10,000 records; the number of comparison operations is significantly different.
When grouping, the support of an index would often be used and the comparison process mentioned would be accelerated again. Often, however, groupings are not just done using key fields or specifically indexed fields, but rather using a long list of fields where no index is guaranteed to be available.
A lot of work takes a lot of time.

HAVING then takes place after grouping and on aggregates formed.

Conclusion: With a simple operation (WHERE) you reduce the amount of data before executing a more complex operation and thus increase efficiency. This cannot happen when using HAVING.
 
Last edited:

isladogs

MVP / VIP
Local time
Today, 10:50
Joined
Jan 14, 2017
Messages
18,225
Adding on to the detailed explanation given by @ebs17, see my article comparing the efficiency (speed) of using WHERE vs HAVING

For the differences to be significant, use indexing on the fields used in the filtering.
 

ByteMyzer

AWF VIP
Local time
Today, 02:50
Joined
May 3, 2004
Messages
1,409
Without specifically testing @ByteMyzer's example, I'll say his statement is wrong, because different query logic was stored. Different logic, different results (depending on given data).
Why not test it, then, and prove me wrong?

ebs17 said:
In such a simple design, with the same logic the result should always be the same, regardless of whether you use WHERE or HAVING.
The result will not be the same between the two queries. As @Pat Hartman pointed out,
Pat Hartman said:
WHERE - Before aggregation
HAVING - After aggregation

In my example with:
Code:
WHERE ST.TestDate BETWEEN #1/1/2022# AND #12/31/2022#
...the criterion is applied to the dataset before the grouping takes place, so you will see the results for the two students, ONLY from the tests with the dates taking place in 2022.

In my example with:
Code:
HAVING MAX(ST.TestDate) BETWEEN #1/1/2022# AND #12/31/2022#;
The grouping takes place, with the MAX derivations on the TestDate field, and THEN the criterion is applied to the resulting dataset. So, because the maximum test date for Student ID 10123 is 2/27/2023, no result is returned in the HAVING query for this student.
 

Josef P.

Well-known member
Local time
Today, 11:50
Joined
Feb 2, 2023
Messages
826
WHERE - Before aggregation
HAVING - After aggregation
I would extend this with:
Having only for conditions with aggregate functions - everything else belongs in the Where section.

And now I'll add some confusion or clarity, depending on how you want to look at it ;)
The DBEngine (Jet/ACE) makes a Where out of Having when creating the show plan, if the condition allows it, such as a filter on a certain value.

Example:
Query TestTab_GroupBy_Where:
SQL:
SELECT T, Count(*) AS Cnt
FROM TestTab
WHERE N = 1
GROUP BY T

ShowPlan output:
Code:
--- TestTab_GroupBy_Where ---

- Inputs to Query -
Table 'TestTab'
- End inputs to Query -

01) Restrict rows of table TestTab
      using rushmore
      for expression "N=1"
02) Group result of '01)'

Query TestTab_GroupBy_Having:
SQL:
SELECT T, Count(*) AS Cnt
FROM TestTab
GROUP BY T, N
HAVING N=1;

ShowPlan output:
Code:
--- TestTab_GroupBy_Having ---

- Inputs to Query -
Table 'TestTab'
- End inputs to Query -

01) Restrict rows of table TestTab
      using rushmore
      for expression "N=1"
02) Group result of '01)'

Nevertheless, I always recommend using Where for "normal" conditions, as this makes the query logic more visible.

/edit: Colin shows this well in his article. (Link in #6)
 
Last edited:

ByteMyzer

AWF VIP
Local time
Today, 02:50
Joined
May 3, 2004
Messages
1,409
@Josef P.
What does the plan show for the following query?
SQL:
SELECT T, Count(*) AS Cnt
FROM TestTab
GROUP BY T, N
HAVING Count(*) > 1;
 

Josef P.

Well-known member
Local time
Today, 11:50
Joined
Feb 2, 2023
Messages
826
Unfortunately only Group table ... Showplan does not show the use of Having Count(*)>1.
Code:
- Inputs to Query -
Table 'TestTab'
- End inputs to Query -

01) Group table 'TestTab'

You can see this better in the SQL server:
Code:
SELECT T, Count(*) AS Cnt
FROM TestTab
WHERE N = 1
GROUP BY T
HAVING count(*)>1;


SELECT T, Count(*) AS Cnt
FROM TestTab
GROUP BY T, N
HAVING N=1 and count(*)>1;

=>
Code:
|--Filter(WHERE:([Expr1002]>(1)))
       |--Compute Scalar(DEFINE:([Expr1002]=CONVERT_IMPLICIT(int,[Expr1005],0)))
            |--Stream Aggregate(GROUP BY:([TestDb].[dbo].[TestTab].[T]) DEFINE:([Expr1005]=Count(*)))
                 |--Index Seek(OBJECT:([TestDb].[dbo].[TestTab].[IX_N]), SEEK:([TestDb].[dbo].[TestTab].[N]=(1)) ORDERED FORWARD)

  |--Filter(WHERE:([Expr1002]>(1)))
       |--Compute Scalar(DEFINE:([Expr1002]=CONVERT_IMPLICIT(int,[Expr1005],0)))
            |--Stream Aggregate(GROUP BY:([TestDb].[dbo].[TestTab].[T]) DEFINE:([Expr1005]=Count(*)))
                 |--Index Seek(OBJECT:([TestDb].[dbo].[TestTab].[IX_N]), SEEK:([TestDb].[dbo].[TestTab].[N]=(1)) ORDERED FORWARD)
 
Last edited:

Pat Hartman

Super Moderator
Staff member
Local time
Today, 05:50
Joined
Feb 19, 2002
Messages
43,275
Nevertheless, I always recommend using Where for "normal" conditions, as this makes the query logic more visible.
The two are used for different reasons and are not interchangeable. You use WHERE to select/eliminate detail records. This can reduce the aggregation process dramatically depending on the criteria.

HAVING is used on aggregated data, not on details. So, you would most commonly use it to find customers whose average order was > 1000 or Orders that were more than $400. In both examples, data needs to be aggregated before you can apply the criteria. When you are looking for Orders placed in Dec that were for more than $400, you need both a WHERE and a HAVING. The WHERE filters the details down to only orders placed in Dec. Then that smaller set of data is summed and the criteria applied. If you used only a HAVING rather than both, then the query engine would have to aggregate possibly 10's of thousands of orders over many years and then only at the end chose the ones from last December that were > $400. The WHERE lets you eliminate all records other than those from last Dec. Then you only have to sum the remaining records and use the HAVING to make the final selection.

The problem Access developers run into is that the QBE ASSUMES you need a HAVING when you are working with a totals query. That means YOU, the developer, MUST make a conscious determination that a WHERE is called for and change the HAVING to a WHERE. The QBE is not smart enough to figure out that a WHERE might be more efficient or that you actually should use both clauses.
 

ebs17

Well-known member
Local time
Today, 11:50
Joined
Feb 7, 2020
Messages
1,946
The QBE is not smart enough
As Josef showed, the SQL optimizer is correspondingly smarter and can in many cases clean up the mess that the QBE and thus another Access component creates.
But I also wouldn't want to rely on errors being compensated for internally.

That means YOU, the developer, MUST make a conscious determination that a WHERE is called for and change the HAVING to a WHERE.
But that doesn't show any help or operating instructions. The real problem is that many believe and are encouraged to believe that the QBE produces decent SQL code that can be adopted without criticism - and there is no need to really bother with the actual SQL programming language.
I can only shake my head.
 

Pat Hartman

Super Moderator
Staff member
Local time
Today, 05:50
Joined
Feb 19, 2002
Messages
43,275
Why are you shaking your head? You think that Joseph proved that the Access query engine fixes flaws of the QBE so what's the problem with using the QBE? I disagree BTW. I just recently fixed a problem caused by using HAVING rather than WHERE for a poster. In Joseph's query, the N=1 criteria applied to a column that was not included in the aggregated data so yes, the engine is smart enough to fix that problem. The N data values would have been summarized away by the time that the HAVING ran so they would never have been available for the HAVING to operate on. The two queries may end up with the same plan because there aren't enough values of N in the index to make using the index first more efficient.

Colin has done extensive testing of queries using a very large set of data. It is only with large sets of data that you can base assumptions like the above on. If 90% of the N's are = 1, there is no point in using an index. I don't know what the break point is but there needs to be enough distinct values of N to be sure that using an index will be more efficient than a full scan after the aggregation.
 

ebs17

Well-known member
Local time
Today, 11:50
Joined
Feb 7, 2020
Messages
1,946
so what's the problem with using the QBE?
1) An error is first generated that needs to be corrected.
2) The belief of many that the QBE does everything right, along with a quasi-encouragement to rely on this assistant and not really bother with the native SQL programming language. This “habit” is then carried on over and over again.
 

Users who are viewing this thread

Top Bottom