Solved Aggregate Query Returns Different Values When a Field is Included not in Group By (1 Viewer)

gemma-the-husky

Super Moderator
Staff member
Local time
Today, 00:31
Joined
Sep 12, 2006
Messages
15,658
@MajP

Ok. I decided to have a look, and I can see the problem now.
So here's a thing. If I include the field [goal balance] in the non-working query, you get the same erroneous result, but if I add the field [games played] it magically shows 5 games for every team and works correctly.

I actually split the home games and away games into 2 separate queries, and unioned the queries, and still got your result. The individual queries were correct, but the union was incorrect. With a bit of trial and error, it actually shows the wrong result UNLESS [games played] is included. It actually reports 3 home and 1 away for team 1. Is there something special about the [games played] field that would cause this?

I then looked at the games and noted that team 1 had 2 home games that were both won 1-0. I then checked all the team results and team 2 had the same, but no other team had the same score in multiple games. I didn't check period 2, but I expect team 10 had 2 duplicate results on Period 2. (yes - I just checked and team 10 had 2 2-0 home wins)

Maybe the selection of columns from the union query treats these otherwise matching scores/results as duplicates unless you add something to the selection that forces it not to disregard duplicates, and the [games played] count may be the only field that can do that. Would that make sense?
 

MajP

You've got your good things, and you've got mine.
Local time
Yesterday, 19:31
Joined
May 21, 2018
Messages
8,529
Maybe the selection of columns from the union query treats these otherwise matching scores/results as duplicates unless you add something to the selection that forces it not to disregard duplicates, and the [games played] count may be the only field that can do that. Would that make sense?
I cannot say it makes sense, but that is what is exactly happening. The Union query by itself returns all results correctly. In the Union query, you cannot have a duplicate if you include matchid, and teamID (or team type Home or Away).

So it has to do with the execution plan. I would assume it executes the union query then the aggregate, but you never know how it really executes. It appears the selection of fields in the aggregate determines the selection of fields to use in the union. This causes the Union to not be sufficient and a Union All is needed. (Even though the Union query is correct by itself (140 records)).

So since the aggregate only includes PeriodeID, TeamID, and TotalPoints (which is based on the calculation of Score and OpponentScore), then the only fields selected from the Union query are PeriodID, TeamID, OpponentScore, Score,. So now you would have duplicates as you point out.

Bottom line a Union All is needed in a Union query, if you are going to then use a subset of those fields in other queries. Interestingly, SQL Server executes the query in the way I desired.
Lesson learned always default to a Union All if you plan to base other queries off the union query, unless you purposely want to drop potential duplicates.
 
Last edited:

gemma-the-husky

Super Moderator
Staff member
Local time
Today, 00:31
Joined
Sep 12, 2006
Messages
15,658
Here we are. Back to School (dataschool). Exactly what you said. Who knew?
Certainly not me. I only vaguely realised there WAS a union All alternative before this thread, and definitely did not appreciate there were these subtleties involved.

Difference between UNION and UNION ALL (dataschool.com)
 
Last edited:

nector

Member
Local time
Today, 02:31
Joined
Jan 21, 2020
Messages
368
The query for complete stats works fine, but any attempt to use that query seems to cause this strange problem
qryTeamStatsByPeriod

This complete data is correct as far as I can tell which is just an aggregate of the union on Team and Period
qryTeamStatsByPeriod


PeriodIDTeamIDTotalPointsGoalBalanceGamesPlayedHomeMatchesAwayMatchesGoalsForGoalsAgainstWinsLossesTies
1​
1​
15​
8​
5​
4​
1​
9​
1​
5​
0​
0​
1​
2​
11​
3​
5​
3​
2​
5​
2​
3​
0​
2​
1​
8​
10​
4​
5​
3​
2​
8​
4​
3​
1​
1​
1​
4​
9​
5​
5​
2​
3​
8​
3​
3​
2​
0​
1​
7​
8​
4​
5​
3​
2​
9​
5​
2​
1​
2​
1​
3​
8​
3​
5​
3​
2​
7​
4​
2​
1​
2​
1​
5​
8​
2​
5​
3​
2​
7​
5​
2​
1​
2​
1​
6​
7​
0​
5​
2​
3​
6​
6​
2​
2​
1​
1​
9​
4​
-2​
5​
2​
3​
4​
6​
0​
1​
4​
1​
14​
4​
-4​
5​
3​
2​
4​
8​
1​
3​
1​
1​
10​
4​
-6​
5​
2​
3​
3​
9​
1​
3​
1​
This is only grouped on TeamID and Period ID. However if I remove certain fields or try to use this query in another query it fails.

See qryTotalPointsByPeriod_Problem

Looking at the information required, a single table will not work unless someone is so genius in access, the possible and easy way is to create the listed tables below:

(1) tblteam - Main table
(2) tblscores - lookup
(3) tblmarch available - look up with dates
(4) tblLeague - lookup

Then make sure all your tables are indexed, then your queries will have no problem to give you what you want. The same principle we use when developing an accounting system will surely apply otherwise with only one table it will surely not work.

Regards

Chris
 

gemma-the-husky

Super Moderator
Staff member
Local time
Today, 00:31
Joined
Sep 12, 2006
Messages
15,658
^^^^
We've identified the real problem now, which was appreciating a technical issue with a UNION query.

In your case, with an accounting system, it's easy to see the same issue arising. If you had separate queries to select debit postings and credit postings, and a UNION of the two queries for the overall result, then that might fail to work as you expect, because of duplications within the sub-set of data you try to use.
 

MajP

You've got your good things, and you've got mine.
Local time
Yesterday, 19:31
Joined
May 21, 2018
Messages
8,529
Looking at the information required, a single table will not work unless someone is so genius in access, the possible and easy way is to create the listed tables below:

(1) tblteam - Main table
(2) tblscores - lookup
(3) tblmarch available - look up with dates
(4) tblLeague - lookup

Then make sure all your tables are indexed, then your queries will have no problem to give you what you want. The same principle we use when developing an accounting system will surely apply otherwise with only one table it will surely not work
No, not the issue, not the problem, not the solution. There is not a single table, but a temp table is needed for the solution. There is a tblPeriod, tblRound, tblTeam, and tblMatches. They are all from the same league so league table not needed.
In theory tblMatches is not normalized. In that table there is a AwayTeamID, HomeTeamID, AwayScore, and HomeScore. I imagine 9 times out of 10 most people would do it this way. But yes a more normalized would be
tblMatchScores with
-- matchID_FK
-- TeamID_FK
--HomeAway ' home or away
-- Score

Although that is more normalized IMO it is a lot more work. This actually makes data validation harder, requires some code, and makes a more difficult UI. You have to ensure through code that two and only two records created for each match and that one is marked home and the other away. IMO, this is a rare case where using a normalizing query is easier than creating a slightly more normalized design.
Either way, the normalizing Union query works fine returning all the correct results.

The issue (unknown to me) is that if you create a union query using "Union" and not "Union All", you may get correct results. However if you then create a query based on the Union query (such as an aggregate) and do not include all the fields in the original query you can create a condition where the original query requires a Union All, even though the original query did not.

Bottom Line, in this case a Union All is needed to ensure that queries built on the Union query are correct, even though the Union query is correct. My guess is that the execution plan is not as one would expect. They do not necessarily run in order from top down. The Union query does not get executed in entirety and then fields are chosen for the aggregate, and the aggregate runs. The execution plan probably selects only the fields in the aggregate to include in the union.

IMO,the original request cannot be done without a temp table, or if so it is beyond my abilities in SQL which are pretty good. I challenge anyone to do it in Access. I think in SQL server you may have a chance.

The ranking of teams follow these rules in order.
1. Total Points for all mateches (win 3, tie 1, lose 0)

Now the hard part.
2. If teams have the same amount of total points then ordered by Points only in those matches between teams with the same total points. (example Team 3,5,7 have 8 Total points)
3. If teams with the same total points, have the same points for matches played within the tie group, then ranked by point differential in the tie group

4. Then ranked by highest Goals For in all matches
5. Then ranked by lowest Goals against in all matches.

The solution was to create a temp table that saves the results from step 2 and 3 (Points within a tie group, point differential in tie group).
 

Minty

AWF VIP
Local time
Today, 00:31
Joined
Jul 26, 2013
Messages
10,371
Rather off-topic, but you could probably do this in SQL Server using Row_Number or Rank functions a little easier, and probably in one pretty complicated query, using a CTE for the recursion required.

I'm not volunteering though - it's Saturday and the pub is open. :D🍺
 

MajP

You've got your good things, and you've got mine.
Local time
Yesterday, 19:31
Joined
May 21, 2018
Messages
8,529
Rather off-topic, but you could probably do this in SQL Server using Row_Number or Rank functions a little easier, and probably in one pretty complicated query, using a CTE for the recursion required.

I'm not volunteering though - it's Saturday and the pub is open. :D🍺
I do think your right that is what would be required in SS. I do not think it can be done in Access.
 

Users who are viewing this thread

Top Bottom