Dealing With Large Queries

chergh

Hey folks,

I have a SQL Server backend database from which I extract data. One of the tables I query is the audit log table, which I use to look at how long it takes tickets to move through a particular process. This is a reasonably large table with ~2 million rows. I only want to extract records relating to my company's Qlist process. I use an intermediate Access database, and after a bit of manipulation in the Access db I output it to Excel. Ideally I would do this with something like:

Code:
INSERT INTO table
SELECT stuff FROM serverTable WHERE ID IN (SELECT ID FROM otherTable)

This just causes a "malloc" error to be returned. So instead I use this code.

Code:
strSQL1 = "SELECT DISTINCT Incident_Number FROM tbl_qlist"

Set rs1 = db.OpenRecordset(strSQL1)

Do While Not rs1.EOF

    Set rs2 = db.OpenRecordset("SELECT Entry_ID FROM HPD_Help_Desk " & _
                               "WHERE Incident_Number = '" & rs1!Incident_Number & "'")

    If Not rs2.EOF Then
        strEntryID = rs2!Entry_ID

        strSQL2 = "INSERT INTO tbl_qlist_comp ( Incident_Number, Log_Field, Create_Date ) " & _
                  "SELECT '" & rs1!Incident_Number & "' AS Incident_Number, [Log] AS Log_Field, Create_Date " & _
                  "FROM HPD_HelpDesk_AuditLogSystem " & _
                  "WHERE HPD_HelpDesk_AuditLogSystem.Original_Request_ID = '" & strEntryID & "'"

        db.Execute strSQL2
    End If

    rs2.Close
    rs1.MoveNext

Loop

rs1.Close

This is slow as hell so does anyone have any suggestions on how to improve this?

Doing anything on the SQL Server database is a no-no; the DBA just doesn't allow it.
 
Drop all the criteria, bring it all over to a local mdb, and filter/manipulate it there?
 
I wonder if it's slow because you're using dynamic SQL.

Why not use a parameter query instead?
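For what a parameter query might look like here, below is a sketch as a saved Access (Jet) parameter query, reusing the table names from the post above. The query would be saved under a name of your choosing; the parameter name prmIncidentNumber is invented for illustration.

Code:
```sql
PARAMETERS [prmIncidentNumber] Text ( 255 );
INSERT INTO tbl_qlist_comp ( Incident_Number, Log_Field, Create_Date )
SELECT [prmIncidentNumber] AS Incident_Number, [Log] AS Log_Field, Create_Date
FROM HPD_HelpDesk_AuditLogSystem
WHERE Original_Request_ID IN
      (SELECT Entry_ID FROM HPD_Help_Desk
       WHERE Incident_Number = [prmIncidentNumber]);
```

From VBA you would open it as a QueryDef, set Parameters("prmIncidentNumber"), and call Execute; the plan gets compiled once instead of re-parsed per loop iteration.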
 
Could certainly, at least, do with a breakdown of which tables are local and which are on the server:
tbl_qlist - local?
tbl_qlist_comp - local?
HPD_Help_Desk - server?
HPD_HelpDesk_AuditLogSystem - server?

If so - tbl_qlist is used to gather the required Incident Numbers upon which you want to report?
 
Yep Leigh you've got it right there.

Essentially I have to identify a list of incident numbers that have been through the qlist process using:

Code:
SELECT     C1 AS Request_ID, 
               C450 AS Original_Request_ID, 
               C459 AS [Log], 
               C1000000161 AS Incident_Number, 
               C452 AS Audit_Date, 
               C3 AS Create_Date, 
               C6 AS Modified_Date
FROM         dbo.T1445
WHERE     (C450 LIKE 'inc%') AND 
              (C459 LIKE '%Q%') AND 
              (C459 NOT LIKE '%NQ%') AND 
              (C459 LIKE '%Categorization Tier 3%')

The above is a server side view.

This view is used to create the table tbl_qlist. I then use the incident number to get the entry ID from the table HPD_Help_Desk, and I use the entry ID to extract the data from HPD_HelpDesk_AuditLogSystem.

If I try to import the tables locally it errors out with malloc errors. As for the parameter query, I'm not quite sure how I would set it up for this; using VBA and dynamic SQL was a lot easier to get my head round.

I've not actually looked at this for about 6 months; higher priorities came along, and now I'm trying to remember why I did things the way they are.
 
Can you guesstimate the number of rows you'd usually end up with in tbl_qlist by way of a listing of incident numbers?
(i.e. how many incident numbers does a user usually want to report on at once?)
 
Well, the purpose of this is really to allow the generation of metrics on the qlist process, so management can have on-demand reporting of how the process is working. Once it's in Access I do a bit of manipulation, then spit it out into Excel and create pretty graphs and pivot tables for the senior management to look at. So really, all existing incidents have to be reported on each time.
 
So tbl_qlist effectively lists every incident number?
(How many is that though?)
 
The best way to do stuff like this is to let the server do the heavy lifting.

Some query like .... (quick cut together)
Code:
INSERT INTO tbl_qlist_comp ( Incident_Number, Log_Field, Create_Date )
SELECT Incident_Number, [Log] AS Log_Field, Create_Date
FROM HPD_HelpDesk_AuditLogSystem
WHERE HPD_HelpDesk_AuditLogSystem.Original_Request_ID IN (
    SELECT C1 AS Request_ID
    FROM dbo.T1445
    JOIN HPD_Help_Desk ON ....
    WHERE (C450 LIKE 'inc%') AND
          (C459 LIKE '%Q%') AND
          (C459 NOT LIKE '%NQ%') AND
          (C459 LIKE '%Categorization Tier 3%')
)
That may, will and does contain errors, but you get the idea I think...

The idea is to cut as much traffic as possible between your Access DB and the server: one statement to the server, one dataflow/recordset back to the DB.

Also note your query uses 4 LIKE clauses. LIKE (especially with a leading % wildcard) prevents the use of indexes, if any exist on those fields.
It is much better to use IN ('','','') if there is a relatively limited number of possibilities, or to use a "lookup table" on the server to make a join (NOT an outer join, but an inner JOIN).
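As a rough sketch of the lookup-table idea, using the column names from the earlier view (the table dbo.Qlist_Codes is invented for illustration, and this pattern only works where a column holds exact values; it cannot replace the substring LIKEs on the memo field C459):

Code:
```sql
-- Hypothetical server-side lookup table dbo.Qlist_Codes holds the ~800
-- qlist incident numbers. An inner join drops non-matching rows and can
-- use an index on C1000000161, unlike a chain of LIKE scans.
SELECT t.C1 AS Request_ID
FROM dbo.T1445 AS t
INNER JOIN dbo.Qlist_Codes AS q
        ON t.C1000000161 = q.Incident_Number
WHERE t.C450 LIKE 'inc%'
```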

Good luck
 
tbl_qlist returns ~800 unique incident numbers; very few tickets actually go through the qlist process, which is our process for fixing bugs that have made it into released versions of firmware and hardware. This equates to around 64,000 records being returned into tbl_qlist_comp.

It seems quite obvious now that the reason this takes ages is that it will be executing 800 queries.
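For what it's worth, the 800-iteration loop from the first post could in principle collapse into a single Access query over the linked tables, something like the untested sketch below (table names taken from the earlier posts; whether Jet pushes the joins to the server or pulls rows locally depends on how the tables are linked):

Code:
```sql
-- One set-based statement instead of 800 round trips.
INSERT INTO tbl_qlist_comp ( Incident_Number, Log_Field, Create_Date )
SELECT q.Incident_Number, a.[Log] AS Log_Field, a.Create_Date
FROM (tbl_qlist AS q
      INNER JOIN HPD_Help_Desk AS h
              ON h.Incident_Number = q.Incident_Number)
      INNER JOIN HPD_HelpDesk_AuditLogSystem AS a
              ON a.Original_Request_ID = h.Entry_ID;
```

Access requires the extra parentheses when chaining more than one join in a FROM clause.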

I would prefer to do this server side, but the DBA for this system won't allow any changes to the DB unless they're really needed, and trying to convince him of that is a pain; it's the customer services DB.

Might try using a WHERE SOMETHING IN (SELECT TOP 50 PERCENT ...) split into two halves and see if that avoids the 'malloc' errors.
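One way to sketch that split in Access SQL: there is no BOTTOM keyword, so the second half reverses the sort order instead (column and table names reused from earlier posts; tbl_local is invented):

Code:
```sql
-- First half, ordered ascending.
INSERT INTO tbl_local ( Entry_ID )
SELECT TOP 50 PERCENT Entry_ID
FROM HPD_Help_Desk
ORDER BY Entry_ID ASC;

-- "Bottom" half: same query with the sort reversed.
INSERT INTO tbl_local ( Entry_ID )
SELECT TOP 50 PERCENT Entry_ID
FROM HPD_Help_Desk
ORDER BY Entry_ID DESC;
```

Watch the boundary: with an odd row count (or ties on the sort column) the middle row can land in both halves, so de-duplicate afterwards.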
 
Doing something like the big fetch query I suggested, and taking care of the "LIKE" stuff in your WHERE clause, will greatly improve your speed/response time.

If your tables T1445 and HPD_Help_Desk have a 1:M relationship, with M being a good number of records, you may also be helped by making your WHERE clause into a subquery instead of applying it after the join.

Also, if your tickets have a sequential number and only new(er) tickets are added for your bug fixing, you can limit the records that need to be fetched by doing an incremental update from the database.
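A sketch of that incremental idea, reusing the table names from this thread (it assumes Create_Date in the audit log only ever grows, and that tbl_qlist_comp already holds at least one row):

Code:
```sql
-- Incremental fetch: only pull audit rows newer than what is already
-- stored locally, instead of re-fetching all ~64,000 records each run.
INSERT INTO tbl_qlist_comp ( Incident_Number, Log_Field, Create_Date )
SELECT h.Incident_Number, a.[Log] AS Log_Field, a.Create_Date
FROM HPD_Help_Desk AS h
     INNER JOIN HPD_HelpDesk_AuditLogSystem AS a
             ON a.Original_Request_ID = h.Entry_ID
WHERE a.Create_Date > (SELECT MAX(Create_Date) FROM tbl_qlist_comp);
```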
 
I will look at getting rid of the LIKE on C450. The other LIKE clauses are needed to return only qlist items, and C459 is a memo field so it won't be indexed anyway; getting rid of them would cause >2 million records to be returned.
 
like stuff ... it's a memo field
*ouch*

LIKE on a memo field, that is a performance NIGHTMARE any way you look at it! If you are forced to use that, forget about the C450; it will not make (much) difference in that case :(

Sending the one query into the server tho will help. :D
 
