Query optimization: Why does Access use temporary index instead of index or rushmore for a JOIN query with SUM? (1 Viewer)

otto

New member
Local time
Today, 15:29
Joined
Aug 29, 2019
Messages
12
Hello all,

I have two tables:
tblSales with the fields
ID,
GroupID,
SalesTypeID,
CompanyID,
Amount

And tblCompany
with the fields
ID, TypeID

The underlined fields are PKs, the bold ones have indexes.

1756306518959.png


Based on these two tables, I created the query qrySalesOfCompanyType2, which returns the sales totals for all companies with company type = 2.

SQL:
SELECT
    S.CompanyID,
    SUM(S.Amount) AS Amount
FROM
    tblSales AS S
    INNER JOIN tblCompany AS C ON S.CompanyID = C.ID
WHERE
    C.TypeID = 2
GROUP BY
    S.CompanyID;

The query works as expected.

The problem or question is about the query plan or performance:
If I run the query and see the plan of JETShowPlan, it looks like this:

1756303209326.png


Questions:
1. Why does Access sort in step 02?
2. And why does it use a temporary index for the JOIN instead of the existing index or Rushmore? I suspect the second part of the question relates to the sorting.
3. Does anyone have any idea why this is happening and how we could optimize the query?

Thanks for any help :)
 
Last edited:
To get an answer about how Rushmore fits in to this, if you are seeing Gemini responses on Google questions, type "MS Access rushmore" and read up on what it does. Gemini's response is quite detailed.

To answer at least some parts of your question directly, Access does what it does because it was programmed to do it that way. To Access, what it is doing DOES optimize the query. Why does ANY program do what it does? Because it was written that way. Sort of like Jessica Rabbit in the movie Who Framed Roger Rabbit? "I'm not a bad girl... I'm just drawn that way."

The temporary index would be chosen if no existing index conforms to what it is trying to do. I can't tell you why yours doesn't because we can't see it directly - though you do show the query and have a description of the fields.

Access performs the sort because establishing an index works better if you already know your data set will be sorted.
 
M.Wolfe wrote the article, maybe he has a better explanation.
 
Thank you for your quick reply.

Do you know a better way to formulate the query to improve performance?
 
it says on the ouput textfile that it create temp index on the expression:

S.CompanyID = C.ID

always index on the joining fields and on the Where clause field of your query.
 
As I mentioned before, the fields already have indexes. So I'm wondering why a temporary index is created instead of using the existing one.
 
Unfortunately, there is little documentation on internals of the JET/ACE-Engine and the the SHOWPLAN output.
So, my explanations are meant more as educated guesses than absolute facts.

1. Why does Access sort in step 02?
Because it makes the subsequent join in 03 easier (less resource intensive) if both sets of records joined are sorted in the same way.

2. And why does it use a temporary index for the JOIN instead of the existing index or Rushmore? I suspect the second part of the question relates to the sorting.
There is no existing index on the ID column for the subset of companies with TypeId=2. The primary key cannot be used directly, because it also includes companies with TypeId not =2.

3. Does anyone have any idea why this is happening and how we could optimize the query?
SQL:
CREATE UNIQUE INDEX idxCompany_TypeID_ID ON tblCompany (TypeId, ID);
If my above guesses are correct, this could make the DB engine use this index instead of the sort and the temporary index.
However, if this actually improves performance or not depends on the cardinality of data in TypeId. If the cardinality is very low, the optimizer might decide to ignore the new index altogether.
 
I chalk this up to "Access does this in its infinite wisdom". The query engine is written to handle all queries, and the optimizer is quite limited. The result will be sometimes remarkably good performance, sometimes remarkably bad.

I don't see how you can rewrite your SQL better. It is too simple.

If you have the option: upsize your database to SQL Server (the free Express edition will suffice) and review the execution plan that an enterprise class database engine will come up with.
 
Because it makes the subsequent join in 03 easier (less resource intensive) if both sets of records joined are sorted in the same way.

There is no existing index on the ID column for the subset of companies with TypeId=2. The primary key cannot be used directly, because it also includes companies with TypeId not =2.
Even if you omit the WHERE clause, it creates temp index -> Same showplan output.
Only when you remove the sum does it use the existing index. So, I think the creation of the temp index is due to aggregation.

SQL:
CREATE UNIQUE INDEX idxCompany_TypeID_ID ON tblCompany (TypeId, ID);
If my above guesses are correct, this could make the DB engine use this index instead of the sort and the temporary index.
However, if this actually improves performance or not depends on the cardinality of data in TypeId. If the cardinality is very low, the optimizer might decide to ignore the new index altogether.
As you have guessed, the new index doesn't bring any changes.

Thank you very much for your detailed answer.
 
Last edited:
I don't see how you can rewrite your SQL better. It is too simple.
So, then I leave it as it is.

If you have the option: upsize your database to SQL Server (the free Express edition will suffice) and review the execution plan that an enterprise class database engine will come up with.
Unfortunately I don't have the option. Anyways it's always good to learn new things about Access and make use of it in the future. :)

Thank you for your advice.
 
I agree with the earlier comment that it is highly unlikely you could further optimize your query design

I spent a lot of time researching query execution plans several years ago including the conditions in which Rushmore is utilized.
There are a few minimum conditions. If my memory is correct these include:
  • all fields used in joins and where / having clauses must be indexed
  • simple filter expressions
  • no datatype conversions
  • no aggregate functions
Several articles that may be of interest:






and an article by Mike Wolfe:

Sometimes Rushmore isn't used where the optimizer finds no benefit in doing so.
Its also true that there are situations where using indexes can cause a query to run more slowly
e.g. indexing fields used for aggregation - see the 6th link above
 
Last edited:
Perhaps a temporary index is “faster” than a seek on an existing index followed by a lookup for one field ("Amount").

Just as an idea: please try testing with an index (2 fields) on CompanyID and Amount.
 

Users who are viewing this thread

  • Back
    Top Bottom