VBA Nested SQL Efficiency

Vigilante001

A Jolly Fellow...
Local time
Today, 01:35
Joined
May 16, 2008
Messages
28
VBA Nested SQL Efficiency (SOLVED)

Hello Gurus,

I have the following SQL code which gets the results I need. Unfortunately, it takes a long time to run. I am familiar enough with SQL to build complex queries, but in this case I'm not confident that I did so in the most efficient manner. Any suggestions on how to improve it?

Code:
  [COLOR=#943634]SELECT p.Date, p.FirstNameInt, p.LastNameInt, p.DOBChild, p.DatabaseID, p.InteractionType, s.[Phone Number][/COLOR]
  [COLOR=#943634]FROM [Interaction Table] AS p INNER JOIN SampleData AS s ON p.DatabaseID = s.AutoNumber[/COLOR]
  [COLOR=#943634]WHERE ((p.Date Between #2/1/2010# And #2/28/2010#) AND [/COLOR]
  [COLOR=#943634](p.DatabaseID In[/COLOR] [COLOR=#00B050](SELECT b.DatabaseID [/COLOR]
  [COLOR=#00B050]                                FROM[/COLOR] [COLOR=#E36C0A](SELECT a.databaseID, a.FirstNameInt, a.LastNameInt, a.txtBehind, a.UTD431331, a.chkUnassessed [/COLOR]
  [COLOR=#E36C0A]                                             FROM [Interaction Table] AS a [/COLOR]
  [COLOR=#E36C0A]                                            WHERE (a.Date Between #2/1/2010# And #2/28/2010#) AND [/COLOR]
  [COLOR=#E36C0A]                                            (a.InteractionType='visit' Or a.InteractionType='Assessed') AND [/COLOR]
  [COLOR=#E36C0A]                                             a.txtBehind=True[/COLOR]
  [COLOR=#E36C0A]                                         )[/COLOR] [COLOR=#00B050]AS b[/COLOR] 
                                  [COLOR=#00B050]WHERE b.DatabaseID NOT IN[/COLOR] [COLOR=#AC5208](SELECT t.DatabaseID [/COLOR]
  [COLOR=#AC5208]                                                                FROM [Interaction Table] AS t [/COLOR]
  [COLOR=#AC5208]                                                               WHERE (t.Date Between #12/1/2009# And #5/28/2010#) AND [/COLOR]
  [COLOR=#AC5208]                                                                             t.InteractionType = 'Post-Visit Call')[/COLOR]
  [COLOR=#943634]) AND p.InteractionType='Visit')[/COLOR]  
  [COLOR=#943634]ORDER BY p.Date;[/COLOR]
Thanks for any help offered!
 
Last edited:
One major flaw, wll not actually a flaw, but a noticable mistake is use the word Date as a field name in a table. This is an Access reserved word and because it is not even bracketed can influence the way Access treats this iten in the SQL statement.
 
At first I thought you could combine into one select statement, but then realised you have probably got more than one instance of DatabaseID in [Interaction Table].

I find access to be very slow when combining more than one query. I usually do maketables instead and then use your 2 new temp tables, instead of your nested select statements in your original "p.DatabaseID In" and "not in" clauses. That should speed it up alot, even including the time it takes to do the maketable queries.
 
You've got two IN clauses which act almost like an array, so in terms of an array you've got an inner and outer loop.

Look into replacing your IN clauses with INNER JOINs

OR

Create a query (i.e. just one) to run your three subqueries and use an INNER JOIN to the Main query.

Also remember to check that the ID fields and/or join fields are indexed. Bear in mind that joining on number fields will also run faster on text fields.
 
Don't know how many records you have in your interaction table but you are loading it 3 times so the query is processing the table three instances seperately and applying a filter it each intance then conjoining them together and finally it sorting them in its entirety. I think that this is the source of the problem.
 
I don't see why--since all your queries are based on the same table--that you need subqueries at all. I'd only use a subquery to work with data that's not available to my main query.
 
Excellent feedback all... I'm on the road to NYC, so once I'm in front of my computer I'll give these a try and report back. Thank you all!
 
I have to admit, I think the good folks have been a little harsh on your implementation.
(Not saying they're being nasty - but that your apparent SQL convolution is not without cause).
But then you do have some fairly convoluted requirements (i.e. an entirely different set of criteria if InteractionType might be 'Assessed' instead of 'visit'. :-s

There's an old rule of thumb, join - don't corrolate.
The In operator in itself isn't evil - but neither is it usually a preferable choice. Add corrolation into the mix an it's inevitably relatively poor performing.
However your nested Not In is the devil in your detail. That can not be index optimised by the engine. So, hence, "Join!" is the inevitable cry.

However ANSI joins (certainly in Jet anyway) can bring with them an inherent likelihood of non-updatabilty if any part is non-updatable or not key joined).
So you need to weigh up if these results are for information (display) or you hope to work directly within them.
Personally, it's rare that I would query (other than in the very most basic syntax) and then want to edit those results.
(FWIW non-ANSI joins would always be non-updatable in Jet, but not so in the server RDBMS platforms).

Total aircode naturally:
Code:
SELECT p.Date, p.FirstNameInt, p.LastNameInt, p.DOBChild, p.DatabaseID, p.InteractionType, s.[Phone Number]
FROM 
  ([Interaction Table] p 
    INNER JOIN 
  SampleData s 
    ON p.DatabaseID = s.AutoNumber)
    INNER JOIN
  (SELECT a.databaseID
   FROM [Interaction Table] a 
       LEFT JOIN
    (SELECT t.DatabaseID 
     FROM [Interaction Table] AS t 
     WHERE (t.Date Between #12/1/2009# And #5/28/2010#) AND 
          t.InteractionType = 'Post-Visit Call') X1
       ON a.databaseID = X1.DatabaseID
   WHERE (a.Date Between #2/1/2010# And #2/28/2010#) AND a.txtBehind=True AND 
         (a.InteractionType='visit' Or a.InteractionType='Assessed') AND
         X1.DatabaseID Is Null
  ) X
    ON p.DatabaseID = X.DatabaseID
WHERE (p.Date Between #2/1/2010# And #2/28/2010#) 
    AND p.InteractionType='Visit'
ORDER BY p.Date;

Your involved criteria requirements will likely always make this less than ideal performance wise though.

Cheers.
 
Last edited:
Assuming the [interaction table] is a pretty big one that has an Index on your DatabaseID, it may well prove very usefull to join your SampleData into each of the sub queries...

Also, with the very simular: (p.Date Between #2/1/2010# And #2/28/2010#)
and (p.Date Between #2/1/2010# And #2/28/2010#)

It may be worth a shot to use a sum(IIF()) structure...
That way instead of doing 2 queries you only do one, counting the number of times the 'Visit' and the 'Assessed' happen

Visits: Sum(iif(InteractionType='Visit',1,0))
Assessed: Sum(iif(InteractionType='Assessed',1,0))

Then put a criteria on that for >= 1

Another thing to consider, is your Date field indexed?
Also on your InteractionType? Indexed?

General note: Having a column like InteractionType, with strings, this column sounds like it has a limited number of values and might be a candidate for a Lookup/Combo?
 
Thanks all for the great insight!

vbaInet & LPurvis, I'm going to try the JOIN modifications tomorrow and see if that runs better (I imagine so...).

Just to clarify - and also to potentially clarify me if the code doesn't reflect what I thought it did - the code should be applying the same criteria to both Visit & Assessed interactions. I'll provide the concept behind it because the code is verbose.. although I think most distilled it from the code:

1.There are two tables:
Sample Data - Indexed on AutoNumber, contains clients & their demographic information. (aka, phone number here)
Interaction Table - contains any interaction for a client in SampleData, indexed by Autonumber, but related to SampleData by the DatabaseID field.

2. The goal of the query is to return a list of dates for clients that meet the following criteria:
a. They have a "Visit" or "Assessed on ImpactSIIS" entry within the date range
b. They also do NOT have an interaction called "Post-visit Call"
(aka, they have been assessed at visit or online, but not yet followed up with)

That was the intent of the code... I saw some comment on the different processes for "Visit" and "Assessed". There shouldn't be a difference... so if there is in the code, definitely point it out. Egad! (:o)

Namliam, you are correct about the the InteractionType field having limited values (6 to be exact). What did you mean by a lookup/combo codewise? I'm familiar with the access object sense of a combobox...

Again, thanks to you all. I'll post back with the results...
 
Well, in your subquery you have:
(a.InteractionType='visit' Or a.InteractionType='Assessed') AND a.txtBehind=True
but in the main query only:
p.InteractionType='Visit')
apart from the matching dates of course.
That's the difference I was refering to. How the two behave differently.

Ultimately, you'll know if your query was giving the correct results (despite it being slow).
If it was, then so should the joined version. (Don't know if it will be plug and play though).

Cheers.
 
What I mean, if the data is entered manually... You will run into problems, if instead of entering "Visit" or "Vsiit" or "Visti" people / users can choose from a combobox to enter their data.

You would store the ID of the combo value in your table, should reduce some overhead, though probably very little.
 
Gotcha... yep, that was limited to a list from the start. Gotta love human error.
 
Woot!

Alright! Works like a charm now; The JOIN statements did the trick. I'd like to thank you all for the insight! I'd rep many of you, but since I'm limited to one I'll give it to Namlaim|mailmaN because of the in-depth answer with sample code showing JOIN examples.

Here was the resultant code:

Code:
 [COLOR=#943634]SELECT o.DatabaseID, o.Date, o.FirstNameInt, o.LastNameInt, o.DOBChild, a.[Phone Number] [/COLOR]
  [COLOR=#943634]              FROM [/COLOR][COLOR=#984806]([/COLOR][COLOR=#00B050]([/COLOR]
  [COLOR=#00B050]                     SELECT g.DatabaseID, g.date, g.FirstNameInt, g.LastNameInt, g.DOBChild [/COLOR]
  [COLOR=#00B050]                     FROM [Interaction Table] AS g _[/COLOR]
  [COLOR=#00B050]                     WHERE (g.InteractionType = 'Visit' OR g.InteractionType = 'Assessed on ImpactSIIS') AND (g.Date Between #3/1/2010# And #3/31/2010#)[/COLOR]
  [COLOR=#00B050]                     ) [/COLOR][COLOR=#984806]AS o INNER JOIN SampleData AS a ON o.DatabaseID = a.AutoNumber[/COLOR]
  [COLOR=#984806]                    )[/COLOR][COLOR=#943634] LEFT JOIN [/COLOR][COLOR=#E36C0A]([/COLOR]
  [COLOR=#E36C0A]                                 SELECT a.DatabaseID [/COLOR]
  [COLOR=#E36C0A]                                 FROM [Interaction Table] AS a [/COLOR]
  [COLOR=#E36C0A]                                 WHERE a.InteractionType = 'Post-Visit Call' AND [/COLOR]
  [COLOR=#E36C0A]                                 (a.Date Between #1/1/2010# And #5/31/2010#)[/COLOR]
  [COLOR=#E36C0A]                                 )[/COLOR][COLOR=#943634] AS i ON o.DatabaseID = i.DatabaseID WHERE i.DatabaseID Is Null [/COLOR]
  [COLOR=#943634]ORDER BY o.Date ASC;[/COLOR]

Again, thanks to all!
 
I'll give it to Namlaim|mailmaN because of the in-depth answer with sample code showing JOIN examples.
I think you mean Lee, not me... however now you force/tempt me to 'expand' my input a bit :P

Mostly for (my?) readability I changed the layout a bit... also changed 2 things:
Code:
SELECT o.DatabaseID, o.Date, o.FirstNameInt, o.LastNameInt, o.DOBChild, a.[Phone Number] 
FROM ((
       SELECT g.DatabaseID, g.date, g.FirstNameInt, g.LastNameInt, g.DOBChild 
       FROM [Interaction Table] AS g 
       [COLOR="Red"]INNER JOIN SampleData AS a ON g.DatabaseID = a.AutoNumber[/COLOR]
       WHERE (   g.InteractionType = 'Visit' 
              OR g.InteractionType = 'Assessed on ImpactSIIS') 
         AND g.Date Between #3/1/2010# And #3/31/2010#
      ) AS o 
     ) 
LEFT JOIN (
           SELECT a.DatabaseID 
           FROM [Interaction Table] AS a 
           [COLOR="Blue"]INNER JOIN SampleData AS b ON a.DatabaseID = b.AutoNumber[/COLOR]
           WHERE a.InteractionType = 'Post-Visit Call' 
             AND a.Date Between #1/1/2010# And #5/31/2010#
          ) AS i ON o.DatabaseID = i.DatabaseID 
WHERE i.DatabaseID Is Null 
ORDER BY o.Date ASC;

The red inner join I moved up, this should return less records from the (first) subquery (o), thought the optimization might already handle this... It also reads better (to me)

The blue join I added, as it will result in less records parsing thru the Left Join, only the sampledata needs to be joined.

PLEASE be aware (it has been mentioned) that DATE is a reserved word and shouldnt be used, same for AutoNumber (I think)

This would be my solution to this same problem (air code)
Code:
Select DatabaseID, Sum(Visits) as CountVisits, Sum(PostVisit) as CountPostVisits
from (
      SELECT DatabaseID
           , IIf(    [date] Between #3/1/2010# And #3/31/2010# 
                 And (    [InteractionType]='Visit' 
                       Or [InteractionType]='Assessed on ImpactSIIS'),1,0) AS Visits
           , IIf(    [Date] Between #1/1/2010# And #5/31/2010#
                     And [InteractionType]='Post-Visit Call' ,1,0) AS PostVisit
      FROM [Interaction Table] as i
      INNER JOIN SampleData AS s ON i.DatabaseID = s.AutoNumber
      Where [Date] Between #1/1/2010# And #5/31/2010#
        And (   [InteractionType]='Post-Visit Call' 
             OR [InteractionType]='Visit' 
             Or [InteractionType]='Assessed on ImpactSIIS')
     ) as x
GROUP BY DatabaseID
HAVING sum(visits)>=1 
   AND Sum(PostVisit)=0;
This can even fit into one query but it gets a bit complicated to write with aircode as you have to repeat the IIF in the Having.

This should list all that your looking for, without the need to look at the same table two times
 
That's what I get for posting while tired. Thanks Lee for your in depth example, and to you mailman for your expanded post. I'll check out the ideas later today.
 

Users who are viewing this thread

Back
Top Bottom