"Top 2 per Category" Query Mind-Bogglingly Slow

lemo

Registered User.
Local time
Today, 16:45
Joined
Apr 30, 2008
Messages
187
i am running a query to produce top 2 dates in each "prop id" category (see SQL below).
the progress bar hasn't moved even a fraction of a millimeter while i was out to lunch.
is it normal? how slow can it go?!
the dbase is only about 47000 records strong.
l

SELECT S.[prop id], S.date
FROM dbase1 S
WHERE (((S.[year])>=2006) AND ((S.date) IN (SELECT
TOP 2 T.date from [dbase1] T WHERE T.[prop id] =
S.[prop id] Order by T.date DESC)))
ORDER BY S.[prop id], S.date DESC;
 
Subselects like this are MURDER....

You can do this in a 3 stage query that will be much faster.

But first
a) Dont use reserved words as columnnames (DATE)
b) Dont use spaces or special characters in columnnames
c) Use prefixes for tables (ie. tblDbase1)

1)
Select the max Date from your table per prop id
2)
Add your table and Query1 to the design, join the tables on Prop id
Select the max date where date <> Maxdate from query1
3)
Add your table and Query2 to the design, again join on prop id
Select all from your table where Date > maxdate from query2.

happy query-ing :)
 
Namliam, completely forgot to thank you, sorry!
well, thank you.

top 2, part 2:
left for a day, came back in the morning - still running. at about noon, it finished. 23 hours to output about 6500 rows! and then i still had to use Excel to arrange the output.. i could have copy-pasted faster..

i'll definitely use Excel next time, number of records permitting.

thanks again,
l
 
Two questions:
1) Is the query being run on a local table in the database, or on a linked table?
2) Are the prop id and date fields indexed?

As a test, I created a database that has a local table with the prop id, year and date fields only, and generated 47,000 random records. I then tried your query.

With no indexes, the query took forever to run. Then I created a non-unique index on the prop id and date together, reran the query, and it returned the results in under 10 seconds.
 
You're a lot more patient than I am. I'd never have made it 23 hours! :p

I find this a little interesting, as I have a more complex top-by-group query running on SQL Server that returns 10-15 records virtually instantly. The main table in my case contains over 150k records. While SQL Server is certainly faster than Access, something seems haywire to make this take 23 hours. Are the fields indexed? Is this on a network? Front end/back end?
 
i am purely an end-user of this database, all i see is an Access local version on my machine. so, i am not sure what type it is - SQL, Access, etc. i am more of an Excel user, i just use Access to pull data which usually end up in Excel. i'll ask the Information dept..

1 - if 'linked table' means that it is somewhere out there on the server, then yes, this is a linked table. i did try to run this query on a local version though, it was still slow (became reasonably fast only when i reduced the table to about 3000 records).
2 - neither field is indexed.

i guess the fact that the fields are not indexed is a problem?
good to know, thanks much.
so, ByteMyzer, you just concatenated the two variables and indexed the resulting one?
l
 
Linked table is indeed "somewhere out on the server"

Indexes really help...
 
I used the Index Builder to create an index containing the two fields (see attached image).
 

Attachments

  • Index_Builder.JPG
    Index_Builder.JPG
    34.7 KB · Views: 213
Namliam, i tried your 3-stage suggestion, and it worked, thanks.
one question - do i have to run those 3 queries separately, or there is a way to combine them so i can run the whole thing in one shot?

ByteMyzer, i tried your indexing suggestion, but for now am getting big fat blank for the output. which is no surprise, naturally, since i don't know anything about indexing. i'll read up on them when i have a chance, seems like it's a must.

on a side note - not only the query ran for 23 hours, it also returned a few blanks for the top second entries. all these blanks were in cases where the new data were entered in the database during those 23 hours. i am guessing the query was confused by the fact that what it thought was the latest date suddenly became the second latest..

thanks a bunch, as always,
l
 
You dont need to run them all, just run the final query.

Did you time the 3 step approach? I am curious to know... ;)
 
the 3rd query whipped out the answer in about 12 endless seconds..
l
 
LMAO... 12 seconds vs 24 hours???

HAHAHAHAHAHAHAHA the query master Rulz!
 

Users who are viewing this thread

Back
Top Bottom