Append query takes forever to run!

PaulSpell

I have a table containing 450,000 records (yes, I know it's probably pushing the limits of Access) and need to perform an operation that summarises certain records and appends them to another table containing about 2,000-3,000 records.

Not surprisingly, it takes quite a while (5 hours in fact). I want to make sure that I have done everything I can to make this operation run as fast and as smoothly as possible.

Therefore I am in the process of making the following changes to all of the tables used in the append query:

1. set up additional indexes.
2. change several 4-character string fields to long integers (they store things like years and ID numbers).
3. set the 'Use Transaction' property in the append query to No.

I am hoping the above changes will make quite a dent in the time taken to perform this task. Can anyone suggest anything else that might help?

At present the table sits within the database; would it make any difference if the records were read from a flat file instead?

Also, is it likely to be any quicker if, instead of using an append query, I use VBA to write the new records?
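For reference, change 1 in the list above (the extra indexes) can be scripted rather than set up through the table designer. This is only a sketch with placeholder table and field names - the idea is to index the fields used in the joins and in the GROUP BY, running one statement per query (or via CurrentDb.Execute):

CREATE INDEX idxSynd ON tblBigTable (Synd);
CREATE INDEX idxSyndYOA ON tblBigTable (Synd, YOA);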
 
Besides the indexes, the only real speed advantage comes from performing the operation on fewer records. So if you are only doing something to a portion of your 450K rows, create a select query to pull those rows only, and then base your operational queries on that. It sounds like you may be scanning your 450K rows more than once, and that leads to long run times. There are other factors, like network speed, server usage and sharing, but often cutting the number of rows down is your best solution.
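As a sketch of that (the table, field and parameter names here are placeholders, not your actual objects), a saved select query along these lines would pull only the rows for the period you are processing, and the summary/append queries could then be built on top of it:

SELECT *
FROM tblBigTable
WHERE ReportDate BETWEEN [pStartDate] AND [pEndDate];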
 
In addition to FoFa's reply, you may be able to archive old records that are not needed, perhaps culling them by financial year or something.

Can you post why you have so many in one table and are they all needed as "current"?

Col
 
The problem is that the append query needs ALL of the data in this table. It is all current data and is used to produce summary info that is appended to history tables.

However, the process could be broken down into several stages by filtering out date ranges. I could write the date ranges to a temp table and then loop through these records passing ranges to the append procedure with each iteration. Do you think this might work better?
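As a rough sketch of that idea (the temp table, query and parameter names below are invented, and qryAppendSummary is assumed to be a parameterised version of the append query), the loop could look something like this in VBA/DAO:

Dim db As DAO.Database
Dim rs As DAO.Recordset
Dim qdf As DAO.QueryDef

Set db = CurrentDb
Set rs = db.OpenRecordset("tblDateRanges", dbOpenSnapshot)   'one row per date range
Set qdf = db.QueryDefs("qryAppendSummary")

Do While Not rs.EOF
    qdf.Parameters("pStart") = rs!StartDate                  'pass the range to the append query
    qdf.Parameters("pEnd") = rs!EndDate
    qdf.Execute dbFailOnError
    rs.MoveNext
Loop

rs.Close
Set qdf = Nothing
Set db = Nothing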
 
Another option to cut the time is to copy the relevant tables to a local front end, do the donkey work on the front end, then import the results back into the back end - this is the quickest Access combination there is. It may save a huge chunk of time.
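If you went down that route, something along these lines could automate the copy (the path and table names are only examples):

' Pull a local working copy of the big table in from the back end
DoCmd.TransferDatabase acImport, "Microsoft Access", _
    "\\server\share\Backend.mdb", acTable, "tblBigTable", "tblBigTable_Local"

' ...run the summary/append queries against tblBigTable_Local,
' appending the results back through the existing linked summary table...

' Then throw away the local copy
DoCmd.DeleteObject acTable, "tblBigTable_Local"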
 
Fizzio said:
copy the relevant tables to a local

The trouble is that I am trying to automate as much of this as possible as I will be leaving this job in a few months and someone else will need to maintain the database.

Without an in-depth knowledge of the database they will not be able to do this (copy to a local drive). Also, access to the live version of the database will be restricted to user forms rather than anything behind the scenes (I don't want other people tinkering and messing things up).

Therefore this process has to run across the network.
 
If the data is historic data, why do you re-run it every time? Does it get amended and need to be re-run?

Col
 
The 450k records are all new; they relate to the past quarter. They get summarised and then appended to history tables. This takes place each quarter in order to give a full historical picture of how the numbers are developing.

The history tables are used for numerous reporting and analysis purposes.
 
Oh right I see, 450k records per quarter. Hmmm, have to think on that one.

Col
 
OK, where are these 450K rows coming from? Can you summarize them at the source?
If not, let's think about what is happening.
Read a bunch of rows, sort them, summarize them.
Sorting that many rows is most likely the biggest time consumer in the process, and the engine has to sort them in order to summarize them. If you can bring them in properly sorted and keep them in order (for example, by using an autonumber ID field), you could write a VBA routine to read sequentially through the data, summarize it manually and put the results in a temp table. Reading through once is a lot less I/O than sorting the rows.
Just a thought.
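A bare-bones sketch of that single-pass idea in VBA/DAO might look like the following. It assumes the rows are already in grouping order (e.g. imported grouped and kept in sequence by an autonumber called ID), that the grouping key is a single text field, and that every name below is a placeholder:

Dim db As DAO.Database
Dim rsIn As DAO.Recordset
Dim rsOut As DAO.Recordset
Dim sKey As String
Dim dTotal As Double

Set db = CurrentDb
'Read in autonumber order so no large sort is required
Set rsIn = db.OpenRecordset("SELECT ID, GroupKey, Est_NCF FROM tblBigTable ORDER BY ID", dbOpenSnapshot)
Set rsOut = db.OpenRecordset("tblTempSummary", dbOpenDynaset)

Do While Not rsIn.EOF
    If rsIn!GroupKey <> sKey Then
        If sKey <> "" Then          'write out the previous group's total
            rsOut.AddNew
            rsOut!GroupKey = sKey
            rsOut!NCF = dTotal
            rsOut.Update
        End If
        sKey = rsIn!GroupKey
        dTotal = 0
    End If
    dTotal = dTotal + Nz(rsIn!Est_NCF, 0)
    rsIn.MoveNext
Loop

If sKey <> "" Then                  'don't forget the last group
    rsOut.AddNew
    rsOut!GroupKey = sKey
    rsOut!NCF = dTotal
    rsOut.Update
End If

rsIn.Close
rsOut.Close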
 
If you import the data and use an autonumber, you can keep the data in the proper order by the autonumber. I use this method all the time when bringing in outside data I need to keep in a certain order.
 
Pat Hartman said:
Copying the db to the local drive and running the process there will substantially decrease the processing time.

Pat, thanks for your reply. I realise that running this locally will make a huge difference to performance, but I am building this for someone else who will need to run the process each quarter, and I don't want them to have to keep copying the database backwards and forwards.

I will post a copy of the query tomorrow so that you can see what it is doing.
 
Here is the query:

INSERT INTO tblRepBase_SyndNCFSummary
    ( QtrID, Synd, YOA, Capacity, YrEndResult, SQRQtrlyResult, NCF, ActPerc, EstPerc )
SELECT
    RetCurrQtr() AS QtrID,
    tblSynds_SyndYearsMaster.Synd,
    tblSynds_SyndYearsMaster.YOA,
    tblSynds_StandingInfo.Capacity,
    tblSynds_StandingInfo.SyndResult AS YrEndResult,
    tblSynds_QrtlyFinancialInfo.FcastResult AS SQRQtrlyResult,
    Sum(IIf(IsNull([Est_NCF]),0,[Est_NCF])) AS NCF,
    qryRepBase_NCFNamesPtns.ActPerc,
    qryRepBase_NCFNamesPtns.EstPerc
FROM (((tblSynds_SyndYearsMaster
    LEFT JOIN tblMbrCallProjections
        ON (tblSynds_SyndYearsMaster.Synd = tblMbrCallProjections.Synd)
        AND (tblSynds_SyndYearsMaster.YOA = tblMbrCallProjections.YOA))
    LEFT JOIN qryRepBase_NCFNamesPtns
        ON tblSynds_SyndYearsMaster.SynYr = qryRepBase_NCFNamesPtns.SynYr)
    LEFT JOIN tblSynds_QrtlyFinancialInfo
        ON tblSynds_SyndYearsMaster.SynYr = tblSynds_QrtlyFinancialInfo.SynYr)
    LEFT JOIN tblSynds_StandingInfo
        ON tblSynds_SyndYearsMaster.SynYr = tblSynds_StandingInfo.SynYr
GROUP BY
    RetCurrQtr(),
    tblSynds_SyndYearsMaster.Synd,
    tblSynds_SyndYearsMaster.YOA,
    tblSynds_StandingInfo.Capacity,
    tblSynds_StandingInfo.SyndResult,
    tblSynds_QrtlyFinancialInfo.FcastResult,
    qryRepBase_NCFNamesPtns.ActPerc,
    qryRepBase_NCFNamesPtns.EstPerc
ORDER BY
    tblSynds_SyndYearsMaster.Synd,
    tblSynds_SyndYearsMaster.YOA;

The 450k table links to 4 other tables each containing a couple of thousand records. The query sums the NCF field and appends the relevant data to "tblRepBase_SyndNCFSummary" (which contains about 2000 records).

Any suggestions about how to get this mess to run faster will be gratefully received. Also let me know if you need any explanations.
 
Actually, I've just had a thought (it doesn't happen very often): at the moment the query pulls in all of the associated data it needs to append to "tblRepBase_SyndNCFSummary" in one go by joining all of the tables together.

Will the process work quicker if I break this down into several update queries? In other words, populate "tblRepBase_SyndNCFSummary" with the data from the 450k table first, then use four more queries to join from "tblRepBase_SyndNCFSummary" to each other table in turn and update the fields in "tblRepBase_SyndNCFSummary".

What do you think?
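To make the idea concrete, the pattern would be roughly as below. This is only an outline: the real join keys go through tblSynds_SyndYearsMaster, so the summary table would need to carry something like SynYr (or Synd plus YOA) for the update queries to join on.

Stage 1 - summarise the 450k table straight into the summary table:

INSERT INTO tblRepBase_SyndNCFSummary ( QtrID, Synd, YOA, NCF )
SELECT RetCurrQtr(), Synd, YOA, Sum(IIf(IsNull([Est_NCF]),0,[Est_NCF]))
FROM tblMbrCallProjections
GROUP BY Synd, YOA;

Stage 2 - one query per lookup table, pulling the related fields across, e.g.:

UPDATE tblRepBase_SyndNCFSummary INNER JOIN tblSynds_StandingInfo
    ON tblRepBase_SyndNCFSummary.SynYr = tblSynds_StandingInfo.SynYr
SET tblRepBase_SyndNCFSummary.Capacity = tblSynds_StandingInfo.Capacity,
    tblRepBase_SyndNCFSummary.YrEndResult = tblSynds_StandingInfo.SyndResult;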
 
Usually that helps with large queries. Also, why do you have an ORDER BY in your insert? You can order the data when you select it after it has been inserted. On top of that, the engine has to sort by your GROUP BY fields to perform the grouping, so you are performing multiple sorts in this query. Try removing the ORDER BY to see how big a difference it makes, then consider breaking the query apart if that does not gain you much.
 
Just to let you know, I broke the process down into two stages. The first is a query that groups on the three fields required from the 450k table and appends the data to a temp table. The second joins the temp table to the other tables and appends the relevant data to the summary table.

The upshot is that this process now runs in a few minutes rather than a few hours, as before.
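In case it helps anyone else, the shape of the two stages is roughly this (abridged - the real second stage carries all of the fields from the query posted earlier, and the temp table name and grouping fields here are my own shorthand):

Stage 1 - group the 450k table and append to a temp table:

INSERT INTO tblTempNCF ( Synd, YOA, NCF )
SELECT Synd, YOA, Sum(IIf(IsNull([Est_NCF]),0,[Est_NCF]))
FROM tblMbrCallProjections
GROUP BY Synd, YOA;

Stage 2 - join the small temp table to the other tables and append to the summary table:

INSERT INTO tblRepBase_SyndNCFSummary ( QtrID, Synd, YOA, Capacity, NCF )
SELECT RetCurrQtr(), M.Synd, M.YOA, S.Capacity, T.NCF
FROM (tblSynds_SyndYearsMaster AS M
    LEFT JOIN tblTempNCF AS T
        ON (M.Synd = T.Synd) AND (M.YOA = T.YOA))
    LEFT JOIN tblSynds_StandingInfo AS S
        ON M.SynYr = S.SynYr;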
 
