Stumped by slow query

snow-raven · Apr 12, 2018

Hello kind forum users, I have a query in a new database I'm developing that persists in being slow, in spite of all I've tried to address it. I've gone through the "100 Tips for Faster Microsoft Access Databases", "The Twenty-Something Rules For Access Development", and all of Allen Browne's relevant tips, to no avail. Maybe someone here can spot my problem.

The query is intended to evaluate sets of sample gradation data. It is the third of four queries that must pass data from a table to perform necessary calculations to classify the sample. The first two queries are fast - even zippy!
Things I've tried:
-I have made each of those queries as streamlined as possible (removing joins, excess fields) while maintaining key fields and indices, and
-I've connected them in as straightforward a manner as I can in the Relationship window.
-My database is normalized as well as I know how.
-I rewrote several of the queries to follow suggestions about subqueries, nz() functions, nested IIf statements, etc.
- I've tried shifting subqueries between the four sequential queries to speed and/or simplify. I think I've isolated the slowness to this third query. I can't shift any more pieces, as this query refers to sorted/summed subqueries in the previous one.

It's much slower now. Constructive suggestions would be much appreciated:

Code:

SELECT DISTINCT Sample_Gradation_RunSum.Grad_ID, Sample_Gradation_RunSum.Samp_ID, Sample_Gradation_RunSum.Sieve_Size, Sample_Gradation_RunSum.Sieve_Log10, Sample_Gradation_RunSum.Percent_Pass, Sample_Gradation_RunSum.Previous_Sieve, 

(SELECT (Percent_Pass) 
FROM Sample_Gradation_RunSum AS Pass1
WHERE Pass1.Sieve_Size = Sample_Gradation_RunSum.Previous_Sieve
AND Pass1.Samp_ID = Sample_Gradation_RunSum.Samp_ID) AS Previous_Pass, 

(SELECT (Sieve_Log10) 
FROM Sample_Gradation_RunSum AS Sieve1
WHERE Sieve1.Sieve_Size = Sample_Gradation_RunSum.Previous_Sieve
AND Sieve1.Samp_ID = Sample_Gradation_RunSum.Samp_ID) AS Last_Sieve_Log10, 

([Percent_Pass]-[Previous_Pass])/([Sieve_Log10]-[Last_Sieve_Log10]) AS Grad_m, 

[Percent_Pass]-([Grad_m]*[Sieve_Log10]) AS Grad_b, 

IIf([Grad_m]=0,CDbl(0),(0.6-[Grad_b])/[Grad_m]) AS aD60, 

IIf([Grad_m]=0,CDbl(0),(0.5-[Grad_b])/[Grad_m]) AS aD50, 

IIf([Grad_m]=0,CDbl(0),(0.3-[Grad_b])/[Grad_m]) AS aD30, 

IIf([Grad_m]=0,CDbl(0),(0.1-[Grad_b])/[Grad_m]) AS aD10

FROM Sample_Gradation_RunSum;

Additional information about the general structure:
Table: Sample_Intervals (holds all sample-specific data)
Table: Sample_Gradation_Intervals (hold test data, generally 14 records per sample)
Query: Sample_Intervals_Gradation_Query (queries Sample_Intervals for data entry form)
- five calculations are performed, two with IIf statements to avoid DIV#0 errors
- two fields are sorted, no aggregates
Query: Sample_Gradation_Query (Queries Sample_Gradation_Intervals for data entry subform, allows data entry)
- two fields are sorted, no aggregates or calculations
Query: Sample Gradation_Retained (Queries Sample_Intervals_Gradation_Query and Sample_Gradation_Intervals to perform calculations on gradation interval data. Separated from previous query as this one would not allow data entry.)
- two calculations are performed, one with IIf statements to avoid DIV#0 errors
- no aggregates or sorting
Query: Sample Gradation_Run_Sum (Queries Sample Gradation_Retained)
- two running total calculations are performed
- one subquery "Previous_Sieve" references the previous record to pull a value to allow calculations between subsequent records in the next query (compare data from sieve size 2 to data from sieve size 1)
Query: Sample Gradation_Row_Math (Query in question - Queries Sample Gradation_Run_Sum)
- performs calculations to determine the slope/intercept defined by the Sieve_Log10 and Percent_Pass of two adjacent records, as defined by the "Previous_Sieve" relationship.
- performs four calculations to determine slope/intercept, including two SELECT subqueries that pull the values from the previous record
- also performs four IIf calculations that use the slope/intercept values. IIf statements are to prevent DIV#0 errors where slope = 0. I orginally had these in a separate query, but later combined them in the hopes of increasing efficiency overall
Query: Sample_Ds (Queries Sample Gradation_Row_Math to select minimum value)
- four queries that select the minimum value from the four IIf statements in previous query, meeting a WHERE clause of greater than a constant value
Further queries need to use these Sample_Ds values!

arnelgp · Apr 12, 2018

You can do better with your query since you are only using two query.
Separate the two sub query into another query and dont filter yet.
Join the resulting query to the orig wury.

snow-raven · Apr 12, 2018

Thanks for the quick reply, arnelgp. Unfortunately, one of the things I've also tried is breaking it down into pieces. I've had it as ONLY the Previous_Pass subquery, and it is still weirdly slow. The preceeding query has three subqueries as well, could that be part of the problem? It loads very quickly, however.

Code:

SELECT Sample_Gradation_Retained.Grad_ID, Sample_Gradation_Retained.Samp_ID, Sample_Gradation_Retained.Sieve_Size, Sample_Gradation_Retained.Sieve_Log10, Sample_Gradation_Retained.Grad_Wt_Ret, 

(SELECT Sum(Run_wt.Grad_Wt_Ret)
FROM Sample_Gradation_Retained AS Run_wt
WHERE
Run_wt.Samp_ID = Sample_Gradation_Retained.Samp_ID 
AND Run_wt.Sieve_Size >= Sample_Gradation_Retained.Sieve_Size) AS Grad_Wt_Cum, Sample_Gradation_Retained.Grad_per_Ret, 

(SELECT Sum(Run_per.Grad_per_Ret)
FROM Sample_Gradation_Retained AS Run_per
WHERE
Run_per.Samp_ID = Sample_Gradation_Retained.Samp_ID 
AND Run_per.Sieve_Size >= Sample_Gradation_Retained.Sieve_Size) AS Grad_per_Cum, 1-[Grad_per_Cum] AS Percent_Pass, 

(SELECT MIN (Sieve_Size)  
FROM Sample_Gradation_Retained 
AS Sieve1 
WHERE Sieve1.Sieve_Size > Sample_Gradation_Retained.Sieve_Size  
AND Sieve1.Samp_ID = Sample_Gradation_Retained.Samp_ID) AS Previous_Sieve


FROM Sample_Gradation_Retained
ORDER BY Sample_Gradation_Retained.Samp_ID, Sample_Gradation_Retained.Sieve_Size DESC;

I'll edit my original post to include the things I've tried. I thought I'd been detailed, but I've been working on this for two weeks, so bits escape me.

Minty · Apr 12, 2018

Are all the fields you have Criteria or Aggregate clauses on indexed ?
Lack of proper indexes can cause things to bog down very quickly where you have effectively nested queries.

I'm with Arne as well - filter the results later should speed things up.
I would store each sub query as a separate query, and join them in order to see what stage is causing the slow down.

My understanding is that the access engine will optimize a stored query better than a multistep sub query.

isladogs · Apr 12, 2018

just to get a better picture, how long does the query take to run and roughly how records does it contain.
Also how many records in each of the source tables/queries

snow-raven · Apr 12, 2018

Thanks all,

To answer more fully, I have updated my original post with the sequence of tables & queries. My database currently contains 518 records in Sample_Gradation_Intervals and 284 records in Sample_Intervals. The queries mostly chain off of Sample_Gradation_Intervals, so they each contain 518 records, until the last one selects a minimum value for each tested sample.

The first four queries take less than a second each to run. Sample Gradation_Row_Math currently takes 3 minute 20 seconds.

Samp_ID is my indexed primary key in Sample_Intervals and Grad_ID is my indexed primary key in Sample_Gradation_Intervals. Sample_Intervals is indexed on a few other fields relating to parent tables. Sample_Gradation_Intervals is also indexed on Sieve_Size and SAMP_ID, and on the two collectively, per the recommendation of Analyze Performance.

I will try rearranging the queries again, but I still feel like I'm missing something key for so few records to take such a ridiculously long time to process...

The_Doc_Man · Apr 12, 2018

I might separate the queries and then, rather than have sub-queries, run a JOIN between the outer query and the separated aggregate queries.

I also note that you are running aggregates within the sub-queries but you aren't aggregating anything. So you are aggregating everything selected by the WHERE clauses for each internal iteration of the process. If instead you made these one-run aggregates with ORDER BY clauses, you can build a JOIN among queries, but what I think will happen is that you will run one summation query for the WHOLE TABLE, then the next and the next (and last but not least the MIN). THEN do a JOIN between the main and the subs.

When I try to imagine it your way, I see as many summations being run as there are distinct WHERE groupings. I think if you do it my way, you run five queries - one main and four subs. But hey, I could be wrong.

isladogs · Apr 12, 2018

I'd be fairly confident that the three layers of subquery are the issue.

Some other thoughts:
Are all the fields used in table joins, WHERE and ORDER BY clauses indexed?

Are all datatypes optimal? For example are you using
Memo fields when text would suffice
Text fields with max size of 255 when 20 would be enough
Double numbers when single or integer or byte would be ok.... Etc

CJ_London · Apr 12, 2018

you will need to do this in sql since the query gui won't display the join (highlighted in red). But you can significantly improve your last query example

Based on your last example, I've aliased the tables to make it more readably. I also cannot test it so there may be the odd typo

Code:

SELECT SGR.Grad_ID, SGR.Samp_ID, SGR.Sieve_Size, SGR.Sieve_Log10, SGR.Grad_Wt_Ret, sum(RWP.Grad_Wt_Cum) AS Grad_Wt_Cum, SGR.Grad_per_Ret, sum(RWP.Grad_per_Ret) AS Grad_per_Cum, 1-sum(RWP.Grad_per_Ret) AS Percent_Pass, Min(s1.Sieve_size) AS Previous_Sieve

FROM (Sample_Gradation_Retained  SGR 
    INNER JOIN (SELECT Samp_ID, Sieve_Size, Grad_Wt_Ret,Grad_per_Ret FROM Sample_Gradation_Retained) AS RWP ON SGR.Samp_ID = RWP.Samp_ID AND RWP.Sieve_Size [COLOR=red]>= [/COLOR]SGR.Sieve_Size)
        INNER JOIN (SELECT Samp_ID, Sieve_Size FROM Sample_Gradation_Retained) AS S1 ON SGR.Samp_ID = S1.Samp_ID AND S1.Sieve_Size[COLOR=red] >[/COLOR] SGR.Sieve_Size

GROUP BY SGR.Grad_ID, SGR.Samp_ID, SGR.Sieve_Size, SGR.Sieve_Log10, SGR.Grad_Wt_Ret, SGR.Grad_per_Ret

ORDER BY Samp_ID, Sieve_Size DESC;

to do this quickly, drag your Sample_Gradation_Retained onto the query grid three times, alias as I have suggested or not as you wish. join the three tables on samp_id and sieve_size.

Drag down the fields and group/sum as required.

Now go into sql view and modify the join requirement for sieve size. If you go back to the query gui you will get an error message that the join is not recognised and will be dropped. No worry, do your changes, and add the join back in as above.

As previously advised by others, ensure Samp_ID and Sieve_Size are indexed

Stumped by slow query

snow-raven

Registered User.

arnelgp

..forever waiting... waiting for jellybean!

snow-raven

Registered User.

Minty

AWF VIP

isladogs

MVP / VIP

snow-raven

Registered User.

The_Doc_Man

Immoderate Moderator

isladogs

MVP / VIP

CJ_London

Super Moderator

Similar threads

Users who are viewing this thread