loop gets progressively slower

frederik

Registered User.
Local time
Today, 14:56
Joined
Feb 5, 2008
Messages
28
Hi,

I have an Access program that basically creates Excel files from a query on an Access table.
The program loops and, on each iteration:

- it queries another year
- it transfers the results of that query to a temporary Excel file X (using DoCmd.TransferSpreadsheet acExport)
- it copies some columns from the temporary source Excel file X to another (definitive) Excel file Y
- in Excel file Y, it pastes the columns after the columns of the previous year, so Excel file Y gets bigger each loop.

What I notice is that each loop takes more and more time to complete.
At the first loop, it takes 28 seconds to complete, which is relatively fast.
At loop 10, it takes 49 seconds.
Loop 20, it needs 72 seconds.
At loop 30, we are talking about 100 seconds.
As you can see, the growth in duration is linear, certainly not exponential.

After putting some timers in my code, I found out which part of the code is causing the increasing processing time: it is the actual copying:

Code:
For year = a To b

'preprocessing code

'copying:
x.Worksheets(1).Columns(9).Copy Destination:=y.Worksheets(sheetnaam).Columns(start_kolom + 0)
x.Worksheets(1).Columns(10).Copy Destination:=y.Worksheets(sheetnaam).Columns(start_kolom + 1)
x.Worksheets(1).Columns(11).Copy Destination:=y.Worksheets(sheetnaam).Columns(start_kolom + 2)
x.Worksheets(1).Columns(12).Copy Destination:=y.Worksheets(sheetnaam).Columns(start_kolom + 3)
x.Worksheets(1).Columns(16).Copy Destination:=y.Worksheets(sheetnaam).Columns(start_kolom + 4)

'postprocessing code

Next

I’m not a VBA expert, so I’m wondering whether the method I used is the most efficient one.
Is it normal that in situations like this - where the Excel file grows each loop - I have to accept the increasing processing time?
Or are there more efficient ways to do the copy between the 2 Excel files, reducing the processing time?
(Ideally, each loop would take the same amount of processing time.)

Full code in attachment (did some modifications for reasons of identification)
 

Attachments

  • code.txt
    15.9 KB · Views: 473

Minty

AWF VIP
Local time
Today, 13:56
Joined
Jul 26, 2013
Messages
10,371
Check (use Task Manager) that you aren't leaving more and more instances of Excel open.
If you are, your available memory will be gobbled up and things will slow down.

If that's the case, it might be prudent to make sure you explicitly open and then close the Excel object.

Also, it would be more efficient to copy columns 9-12 in one hit, as they are going to a matching-sized destination.
And while you are at it: copying an entire column means looking at over a million cells; it's much better to find out what the used range is and only copy that.
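A minimal sketch of that one-hit, used-range copy, assuming `x` and `y` are the open Workbook objects and `sheetnaam`/`start_kolom` are as in the original post (with early binding, so `xlUp` is available):

```vba
' Sketch only - x, y, sheetnaam and start_kolom are assumed from the original code
Dim lastRow As Long
With x.Worksheets(1)
    ' find the last used row in column 9 instead of copying a whole column
    lastRow = .Cells(.Rows.Count, 9).End(xlUp).Row

    ' columns 9-12 in a single copy, limited to the rows actually used
    .Range(.Cells(1, 9), .Cells(lastRow, 12)).Copy _
        Destination:=y.Worksheets(sheetnaam).Cells(1, start_kolom)

    ' column 16 separately, since it is not adjacent to 9-12
    .Range(.Cells(1, 16), .Cells(lastRow, 16)).Copy _
        Destination:=y.Worksheets(sheetnaam).Cells(1, start_kolom + 4)
End With
```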
 

arnelgp

..forever waiting... waiting for jellybean!
Local time
Today, 20:56
Joined
May 7, 2009
Messages
19,237
can you try this:
 

Attachments

  • code.txt
    15.6 KB · Views: 479

frederik

Registered User.
Local time
Today, 14:56
Joined
Feb 5, 2008
Messages
28
Thank you Arnelgp and Minty for your reply.

There is/was no problem of multiple instances of the Excel Object with the original code.
To see the effect, I changed some code as suggested by Arnelgp, but there was no improvement.

What did make a big difference was not copying the entire column(s), but only a range of values:

Code:
Set yy = y.Worksheets(sheetnaam)
Set xx = x.Worksheets(1)

yy.Range(yy.Cells(1, start_kolom), yy.Cells(LastRow, start_kolom + 3)).Value = xx.Range(xx.Cells(1, 9), xx.Cells(LastRow, 12)).Value
yy.Range(yy.Cells(1, start_kolom + 4), yy.Cells(LastRow, start_kolom + 4)).Value = xx.Range(xx.Cells(1, 16), xx.Cells(LastRow, 16)).Value
                              
Set yy = Nothing
Set xx = Nothing

These were the runtimes (to compare with the first post):
loop 1: 16 seconds
loop 10: 18 seconds
loop 20: 21 seconds
loop 30: 27 seconds

Total runtime of my program was over 1 hour before this optimization; now it takes only 32 minutes.
As you can see, every loop still gets progressively slower, but I can live with that.
I consider this a logical consequence of the Excel file getting bigger each loop.
 

Pat Hartman

Super Moderator
Staff member
Local time
Today, 08:56
Joined
Feb 19, 2002
Messages
43,264
Are you sure you can't collect all the data with a single query? Perhaps a union query if a simple select won't do it. Then you can do a single export and be done with it.

If you can post your selection criteria for each iteration, we might be able to offer an alternative.
 

Minty

AWF VIP
Local time
Today, 13:56
Joined
Jul 26, 2013
Messages
10,371
A couple of final thoughts: can't you collect this data all in one go, or at least present it for export in one go?

And turn off automatic calculation in the spreadsheets until the last data is entered.
 

frederik

Registered User.
Local time
Today, 14:56
Joined
Feb 5, 2008
Messages
28
Thank you for your replies, Minty & Pat.

The base query is as follows:

Code:
' note: the second join to tabR is aliased as tabR2, so the table can be joined twice
queryTemp2 = "SELECT tabF.ref, " _
                & "tabR.NL, tabR.FR, tabR.niveau, " _
                & "tabF.rub, " _
                & "tabR2.[rub NL], " _
                & "tabR2.[rub FR], " _
                & "queryTemp1.YEAR, " _
                & "queryTemp1.M_A, " _
                & "queryTemp1.M_B, " _
                & "queryTemp1.M_C, " _
                & "queryTemp1.M_D, " _
                & "queryTemp1.M_E, " _
                & "queryTemp1.M_F, " _
                & "queryTemp1.M_G, " _
                & "queryTemp1.M_H " _
                & "FROM (((tabF " _
                & "LEFT JOIN queryTemp1 " _
                & "ON (tabF.rub = queryTemp1.rub) " _
                & "AND (tabF.ref = queryTemp1.ref)) " _
                & "LEFT JOIN tabR ON tabF.ref = tabR.ref) " _
                & "LEFT JOIN tabR AS tabR2 ON tabF.rub = tabR2.rub) " _
                & "WHERE tabR.niveau = " & rsRefniveau("ID") & " " _
                & "AND tabF.jaar = " & rsJaar("Jaar") & " " _
                & "ORDER BY tabF.ref, tabR.SRT; "



The final output in Excel should be a sheet where, for each year, 5 variables are presented.


So for the first year or loop, 1980, you have:

(the labels => these are copied in the first loop)
Column A, which contains variable ref
Column B, which contains variable NL
Column C, which contains variable rub
Column D, which contains variable rub NL

(the values)
Column F, which contains variable M_A
Column G, which contains variable M_B
Column H, which contains variable M_C
Column I, which contains variable M_D
Column J, which contains variable M_H

Next loop, year 1981, you have:

(column K = empty column)

(the values)
Column L, which contains variable M_A
Column M, which contains variable M_B
Column N, which contains variable M_C
Column O, which contains variable M_D
Column P, which contains variable M_H

Next loop, year 1982, you have:

(column Q = empty column)
(the values)
Column R, which contains variable M_A
Column S, which contains variable M_B
Column T, which contains variable M_C
Column U, which contains variable M_D
Column V, which contains variable M_H

Next loop year 1983

(and so on)

You can see the wanted output here.
I'm very curious how this could be done with just one query.
 
Last edited:

Pat Hartman

Super Moderator
Staff member
Local time
Today, 08:56
Joined
Feb 19, 2002
Messages
43,264
Given that your layout is anti-relational, it would be difficult to get a single query with that many columns.

One thing might be to export a separate csv for each year. Then, to populate the spreadsheet, you would open all the csv files and write one row at a time, taking the first group of columns from the first file, the second group from the second file, etc. This is pretty convoluted, but it means you only have to write the spreadsheet once, one row at a time with ALL the columns.

To load the arrays (or create the separate csv files), create a query that selects ALL the years. Then in a recordset loop, add year1 to array1, year2 to array2, etc. Don't name anything with specific years. Use Year1, Year2, etc so that the data selection controls the years and the code doesn't care.

Instead of separate files, you could use arrays. They would be faster since there is no I/O involved and I/O is the most expensive operation you can do in any program. It really depends on if you have enough memory to hold all the arrays without a problem.

I'm assuming that the rows are always consistent across all years, or you would already have had alignment issues with your current method.
 

frederik

Registered User.
Local time
Today, 14:56
Joined
Feb 5, 2008
Messages
28
Thank you @Pat Hartman for your advice

I'm very interested in the idea of loading all values in arrays.
If I understand it correctly, one array contains the variables (year, varA, varB, varC, varD) for one specific year.
I have to loop through the recordset (which contains all years), and for each new year I have to create a new array, right?

This is the code I would use for loading 1 specific year into an array:

Code:
Dim rstData As DAO.Recordset
Dim arrYear As Variant

Set rstData = CurrentDb.OpenRecordset("select year, varA, VarB, VarC, VarD from querytemp2 where year = " & rsYears("Year"))
arrYear = rstData.GetRows(rstData.RecordCount)

How could I change this code so I can make it work with a loop "For year = 1 To nYears"?
It feels like I have to create another, one-dimensional array where I can store the arrays for each specific year.
Like an array of arrays, but as you can see, it is not 100% clear to me how to do that (the following code does not work):

Code:
Dim rstData As DAO.Recordset
Dim arrYear As Variant
Dim arrAllYears as Variant

For i to nYears
    Set rstData = CurrentDb.OpenRecordset("select year, varA, VarB, VarC, VarD from querytemp2 where year = " & rsYears("Year"))
    arrYear = rstData.GetRows(rstData.RecordCount)
    arrAllYears(i) = arrYear

rsYears.movenext
Next
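For reference, a sketch of how this loop could be made to compile and run: `arrAllYears` must be dimensioned first, and a Variant element can then hold each year's sub-array. This assumes `rsYears` and `nYears` exist as in the original code and is untested against the actual database:

```vba
Dim rstData As DAO.Recordset
Dim arrAllYears() As Variant
Dim i As Long

ReDim arrAllYears(1 To nYears)   ' one slot per year

rsYears.MoveFirst
For i = 1 To nYears
    Set rstData = CurrentDb.OpenRecordset( _
        "SELECT year, varA, varB, varC, varD FROM queryTemp2 " & _
        "WHERE year = " & rsYears("Year"))

    ' RecordCount is only reliable after visiting the last record
    rstData.MoveLast
    rstData.MoveFirst

    arrAllYears(i) = rstData.GetRows(rstData.RecordCount)  ' a Variant can hold an array
    rstData.Close
    Set rstData = Nothing

    rsYears.MoveNext
Next
```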

Thank you for your advice
 

gemma-the-husky

Super Moderator
Staff member
Local time
Today, 13:56
Joined
Sep 12, 2006
Messages
15,653
Code:
For i = 1 To nYears
    Set rstData = CurrentDb.OpenRecordset("select year, varA, VarB, VarC, VarD from querytemp2 where year = " & rsYears("Year"))
    arrYear = rstData.GetRows(rstData.RecordCount)
    rstData.close 'I am sure you should do this each loop
    set rstdata = nothing

    arrAllYears(i) = arrYear
    rsYears.movenext
Next

I am sure you ought to close the recordset as above.
Why do you keep using querytemp2?
Why not have a query that includes the year as a criterion, rather than keep filtering a bigger query? It might not help - but I am not sure.

Also:
What does the "1 to nYears" loop do?
Where does nYears get used within the loop?
 

Pat Hartman

Super Moderator
Staff member
Local time
Today, 08:56
Joined
Feb 19, 2002
Messages
43,264
I didn't have time to look at all of these, but you might find even more. You need a two-dimensional array because you have rows and columns.

http://www.worldbestlearningcenter.com/index_files/Access-vba-array-two-dimensions.htm (cute but might not be useful)

 

Isaac

Lifelong Learner
Local time
Today, 05:56
Joined
Mar 14, 2017
Messages
8,777
Make sure to turn off the Excel application's ScreenUpdating and set Calculation to xlManual before running the code, then turn them back on after it finishes. This will generally make the whole thing run faster.
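A minimal sketch, assuming `xlApp` is the Excel.Application object driving the workbooks (with early binding; in late-bound code the constants are -4135 and -4105):

```vba
' xlApp is assumed to be the Excel.Application instance used by the export code
xlApp.ScreenUpdating = False
xlApp.Calculation = xlCalculationManual      ' -4135 if late bound

' ... run the export / copy loop here ...

xlApp.Calculation = xlCalculationAutomatic   ' -4105 if late bound
xlApp.ScreenUpdating = True
```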
 

frederik

Registered User.
Local time
Today, 14:56
Joined
Feb 5, 2008
Messages
28
Thank you very much @Pat Hartman for all this documentation, very interesting!
Arrays are a subject I'm not that familiar with, which is why I appreciate your support so much.

I'm also convinced I need a two-dimensional array - one array for each year (as you stated in your post #8).

The code for each year would look like this:

Code:
'First year 1980

Set rstData = CurrentDb.OpenRecordset("select year, varA, VarB, VarC, VarD from sourcetable where year = 1980")
arrYear_year1 = rstData.GetRows(rstData.RecordCount)

'Second year 1981

Set rstData = CurrentDb.OpenRecordset("select year, varA, VarB, VarC, VarD from sourcetable where year = 1981")
arrYear_year2 = rstData.GetRows(rstData.RecordCount)

'Third year 1982

Set rstData = CurrentDb.OpenRecordset("select year, varA, VarB, VarC, VarD from sourcetable where year = 1982")
arrYear_year3 = rstData.GetRows(rstData.RecordCount)

' and so on until 2020 ...

As you can see, this calls for a For ... Next loop.
I know how to modify the SQL code so that the year increments on each iteration.
But what about the next line? How can I make arrYear_yearX depend on the index i?

Code:
'rsYears is a select distinct of all available years
rsYears.movefirst

For i=1 to nYears
 
  

    Set rstData = CurrentDb.OpenRecordset("select year, varA, VarB, VarC, VarD from sourcetable where year = " & rsYears("Year"))
    arrYear_yearX = rstData.GetRows(rstData.RecordCount)  '=> how to make this in function of the index i ?
    Set rstData = nothing

    rsYears.movenext
Next
 
Last edited:

Pat Hartman

Super Moderator
Staff member
Local time
Today, 08:56
Joined
Feb 19, 2002
Messages
43,264
This one looks like what you are attempting.
Recordset.GetRows method (DAO) | Microsoft Docs
GetRows Method (DAO) - Microsoft DAO 3.60 Documentation


I have never used this method, and based on those examples I don't think I would. It seems you must specify the number of rows to return, but you can ask for more than are retrieved and will only get what was retrieved, so just use a fixed number, since you already know how many rows you are getting (or your original example wouldn't be working).

You will need to create n arrays. One for each year. Don't hardcode the years. Just name them ary1, ary2, etc.

Something simpler just occurred to me. Do NOT select by year. Just load ALL the years you want into a single two-dimensional array. Then, as you populate the spreadsheet, adjust the index so you append the columns from row 0, row + 100 (the number of rows for each year), row + 200, etc. That way you write ONE row for all years at a time, and that part can be in a loop. You need to keep two counters: one for the rows within a single year and a second for where you are in the year list. After all years have been written for the first row, increment the single-year row count to the second row; then it is row + 100 for the second year, row + 200 for the third year, etc.

Also, as someone already mentioned, if you have the spreadsheet visible, it's entertaining to watch but really slows down the processing.
 

frederik

Registered User.
Local time
Today, 14:56
Joined
Feb 5, 2008
Messages
28
Thank you very much again @Pat Hartman .

This is the result of putting all the data (all 38 years, each with 18K rows of data) into just one array and copying the data to an Excel sheet.

This is the code I used to load the data into an array. This goes relatively fast.

Code:
'load all data to an array
Set rsResults = CurrentDb.OpenRecordset(sourcedata)
arrResults = rsResults.GetRows(rsResults.RecordCount)
Set rsResults = Nothing

This is the code I used to copy the data from the array to the Excel sheet:

Code:
With y.Worksheets(sheetnaam)

    For cc = 0 To nRows - 1

        ' copy the dimensions (label columns) to the sheet
        .Cells(cc + 3, 1) = arrResults(0, cc)
        .Cells(cc + 3, 2) = arrResults(1, cc)
        .Cells(cc + 3, 3) = arrResults(4, cc)
        .Cells(cc + 3, 4) = arrResults(5, cc)

        ' copy the data for each year to the sheet
        ' (adding rr shifts each year's block one extra column to the right,
        '  which leaves the empty separator column between years)
        For rr = 0 To nYears - 1
            .Cells(cc + 3, rr + start_kolom) = arrResults(7, cc + rr * nRows)
            .Cells(cc + 3, rr + start_kolom + 1) = arrResults(9, cc + rr * nRows)
            .Cells(cc + 3, rr + start_kolom + 2) = arrResults(10, cc + rr * nRows)
            .Cells(cc + 3, rr + start_kolom + 3) = arrResults(11, cc + rr * nRows)
            .Cells(cc + 3, rr + start_kolom + 4) = arrResults(15, cc + rr * nRows)

            start_kolom = start_kolom + 5
        Next

        start_kolom = 6
    Next

End With

This code does the job very well, but it is slow.
To give an idea: after 15 minutes of running time, only 20% of the array had been copied.
(I turned off screen updating and set Visible = False.)

I guess that's because it copies cell by cell instead of copying a range.
With arrays, I did not find a way to copy a whole row or column to a sheet in just one instruction.
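For what it's worth, a Range can be filled from a two-dimensional Variant array in a single assignment; since GetRows returns its array as (field, row), the block has to be flipped into (row, column) order first. A sketch for the four label columns only, assuming `arrResults`, `nRows`, `y` and `sheetnaam` as in the code above:

```vba
Dim buf() As Variant
Dim cc As Long
ReDim buf(1 To nRows, 1 To 4)          ' rows x columns, worksheet orientation

' GetRows delivers arrResults(field, row); flip it into (row, column)
For cc = 0 To nRows - 1
    buf(cc + 1, 1) = arrResults(0, cc)
    buf(cc + 1, 2) = arrResults(1, cc)
    buf(cc + 1, 3) = arrResults(4, cc)
    buf(cc + 1, 4) = arrResults(5, cc)
Next

' one COM call instead of nRows * 4 individual cell writes
y.Worksheets(sheetnaam).Cells(3, 1).Resize(nRows, 4).Value = buf
```

The in-memory transpose loop is cheap; it is the per-cell OLE calls that dominate the runtime, so collapsing them into one Range assignment is where the speedup comes from.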

The solution presented in post #4 of this thread is the quickest so far and usable in a production environment.

I want to thank everybody here for giving me the necessary insights over the last couple of days.
 

Pat Hartman

Super Moderator
Staff member
Local time
Today, 08:56
Joined
Feb 19, 2002
Messages
43,264
I'm glad this was faster. (18,000 rows * 38 years * number of fields) is a lot of data to fill cell by cell.

I'll bet that if, instead of writing to Excel, you wrote to a flat file with a .csv extension, and then opened it in Excel once the csv file was closed, it would be even faster. It would eliminate all the OLE automation, which is what slows this process down.

Now that you've got the code working, it will be easy to change it to open a file with a .csv extension and write to it. You will be doing only 18,000 write operations. To build each record, simply concatenate the fields and separate them with commas. If any of the fields are text, enclose them in quotes to be compliant with the CSV standard.
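A sketch of that approach with VBA's native file I/O, assuming `arrResults` and `nRows` as before; the output path and field indices are placeholders, and this minimal version does not handle embedded quotes or commas inside field values:

```vba
Dim f As Integer
Dim cc As Long
Dim rec As String

f = FreeFile
Open "C:\temp\export.csv" For Output As #f   ' hypothetical path

For cc = 0 To nRows - 1
    ' quote the text fields, separate everything with commas
    rec = """" & arrResults(0, cc) & """," & _
          """" & arrResults(1, cc) & """," & _
          arrResults(7, cc)
    Print #f, rec
Next

Close #f
```

Native `Print #` writes go straight to disk, so the cost per row is a fraction of an OLE automation call into Excel.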
 
