Fastest way to write to a text file

Guus2005

AWF VIP
Local time
Today, 18:13
Joined
Jun 26, 2007
Messages
2,645
I need to write about a million records to a text file.
There are several ways to accomplish this, and of course I need the fastest.

Method #1
Code:
Dim fs As Object, a As Object
Set fs = CreateObject("Scripting.FileSystemObject")
Set a = fs.CreateTextFile("c:\testfile.txt", True)  ' True = overwrite
a.WriteLine "This is a test."
a.Close

Method #2
Code:
Open "c:\testfile.txt" For Output As #1
Print #1, "This is a test"
Close #1

Method #3
Code:
Dim objStream As Object
Set objStream = CreateObject("ADODB.Stream")
With objStream
    .Open                   ' open a new stream
    .Type = 2               ' adTypeText
    .Charset = "utf-8"      ' write UTF-8
    .Position = 0           ' from the start
    .LineSeparator = -1     ' adCRLF: close every record with CRLF
    .WriteText "This is a test", 1  ' adWriteLine
    .SaveToFile "c:\testfile.txt", 2  ' adSaveCreateOverWrite
    .Close
End With

The string that needs to be written is concatenated using FastCat.
I expect that writing a single line takes more time than writing n lines at once, so I am writing n lines at once.
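Roughly like this sketch - the chunk size of 1000 is arbitrary, and GetNextRecord is just a placeholder for whatever builds one record:
Code:
' Sketch: buffer the records and flush them in chunks of 1000.
Dim f As Integer, i As Long, buf As String
f = FreeFile
Open "c:\testfile.txt" For Output As #f
For i = 1 To 1000000
    buf = buf & GetNextRecord(i) & vbCrLf  ' GetNextRecord is hypothetical
    If i Mod 1000 = 0 Then
        Print #f, buf;       ' trailing ; because buf already ends in CRLF
        buf = vbNullString
    End If
Next i
If Len(buf) > 0 Then Print #f, buf;        ' flush the remainder
Close #f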

But I need advice on the fastest method. Is there a fourth method that is even quicker?

Thanks!
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 12:13
Joined
Feb 28, 2001
Messages
26,999
How OFTEN do you need to do this? One-off? Daily? Weekly? Monthly? There comes a point of diminishing returns: if you are going to do this once, who CARES that it took a few minutes longer than it could have? But if it is a daily action (or more often), then speed becomes an issue.

There is also the issue that you have to consider where this data stream originates. You not only have the speed of WRITING to consider, but also the speed of READING the data source plus any required computations such as string manipulation (even counting a FastCat option). Remember that anything you do from Access has to go through memory even if it isn't going to stay there for very long.

Also, with a million records you have to consider disk buffering. How big is this data set (in bytes)? You also need to consider factors like system resources, which frequently get tapped out by really big files. I've seen at least half a dozen cases in this forum in the last five or six years where really big files just flat-out gagged the Windows file system.

Your guess - that writing single lines is slow - is accurate. Each write operation, whether it covers a single line or a block of lines, involves subroutine calls to system routines, which often means a context switch, since the system I/O routines live in a .DLL file - particularly in the case of the FileSystemObject. Therefore, if you CAN write multiple lines at once, you save a LOT of context switches.

Here is the biggest "gotcha" of them all. If you are using VBA, you are interpreting pseudo-code. Can this possibly be written in C++ or VB? Those are true compiled languages that produce machine code; VBA does not. Pseudo-code emulation is ALWAYS slower than machine-code execution.
 

Minty

AWF VIP
Local time
Today, 17:13
Joined
Jul 26, 2013
Messages
10,355
This is purely a hypothesis, but can you not build the entire output in an array, or in a single-field recordset where each record is one line of text (which would be quick), and then dump the lot into a text file in one go?

Trying to avoid doing anything a million times has to be quicker if it can be done once?
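Something like this sketch, perhaps - assuming the whole file fits comfortably in memory, with a placeholder for the real record text:
Code:
' Sketch: collect every line in an array, Join once, write once.
Dim arrLines() As String, i As Long, f As Integer
ReDim arrLines(1 To 1000000)
For i = 1 To 1000000
    arrLines(i) = "record " & i        ' placeholder for the real record
Next i
f = FreeFile
Open "c:\testfile.txt" For Output As #f
Print #f, Join(arrLines, vbCrLf)       ' one write for the whole file
Close #f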
 

sonic8

AWF VIP
Local time
Today, 18:13
Joined
Oct 27, 2015
Messages
998
Here is the biggest "gotcha" of them all. If you are using VBA, you are interpreting pseudo-code. Can this possibly be written in C++ or VB? Those are true compiled languages that produce machine code; VBA does not. Pseudo-code emulation is ALWAYS slower than machine-code execution.
I think the other factors you mentioned will have far greater influence on the overall performance than whether the code is compiled to machine code.
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 12:13
Joined
Feb 28, 2001
Messages
26,999
@sonic8 - if this is monthly or weekly, you are absolutely right. If it is more often, there is a BIG differential. It is the reason why we NEVER tell anyone to do VBA Recordset operations if a query is possible. The query engine is compiled whereas VBA is not. Isladogs has done many studies on this and the result always favors queries. By a LOT.
 

Pat Hartman

Super Moderator
Staff member
Local time
Today, 13:13
Joined
Feb 19, 2002
Messages
42,971
I know this is silly, but have you tried TransferText? I would assume it is as optimized as you can get, for both reading and writing.
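For reference, a rough sketch of the call; "MyExportSpec" and "tblRecords" are placeholder names, and the saved export specification has to be created manually first:
Code:
' Sketch: export via a saved export specification (fixed-width, no delimiters).
DoCmd.TransferText acExportFixed, "MyExportSpec", "tblRecords", "c:\testfile.txt", False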
 

Pat Hartman

Super Moderator
Staff member
Local time
Today, 13:13
Joined
Feb 19, 2002
Messages
42,971
PS: I've used TransferText with multi-million row tables and not had a problem. Nothing is lightning fast with a million records, but I wouldn't say the transfer was slow either.
 

sonic8

AWF VIP
Local time
Today, 18:13
Joined
Oct 27, 2015
Messages
998
It is the reason why we NEVER tell anyone to do VBA Recordset operations if a query is possible. The query engine is compiled whereas VBA is not.
While I agree with the advice, your reason is not *the* reason, only one of several. Others are that a query reduces read/write operations to the bare minimum, reduces network traffic (particularly with a server-side database), and lets the database engine use multi-threading.
Isladogs has done many studies on this and the result always favors queries. By a LOT.
I've done my own research. ;-)
The above is correct for the vast majority of operations. However, I have a case study showing the opposite for one specific scenario.

I know this is silly but have you tried TransferText?
Despite (or rather because!) its simplicity, this is the best answer in this thread so far!

@Guus2005, I think it is up to you to try out the different approaches and measure their performance. Please update this thread with your results.
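A minimal timing harness for the comparison - just wrap each method like this:
Code:
' Sketch: rough timing with Timer (seconds since midnight, ~10 ms resolution).
Dim t As Single
t = Timer
' ... run one of the export methods here ...
Debug.Print "Elapsed:"; Timer - t; "seconds"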
 

Guus2005

AWF VIP
Local time
Today, 18:13
Joined
Jun 26, 2007
Messages
2,645
What @The_Doc_Man said is very true. Speed is relative. If you are waiting for it to happen, it takes a lot of time.
The solution needs to run once a month. So that is not very often.

This is a record in the output file.
Code:
1000S00001020A001000000000001030N000111040N0009100000101080A0001E1100A0005KLIJN1120N0008197811071200A00069999ZZ
Where:
...
1100A0005KLIJN is a string containing a name of 5 characters.
1120N000819781107 is a birthdate of 8 characters: YYYYMMDD
1200A00069999ZZ is the zipcode of 6 characters: NNNNAA

TransferText requires a separator - a space, comma, tab, semicolon, etc. - so that is not an option: separators are not allowed in this format.
It would otherwise be a very nice option, but I can't use it.

I can build the whole record and put it in a single table field, then use TransferText to export just that field.
Because it is a single field, no separators are required. :cool:

Thanks for your time!
 

arnelgp

..forever waiting... waiting for jellybean!
Local time
Tomorrow, 01:13
Joined
May 7, 2009
Messages
19,169
you can create a Query with an Expression that concatenates those fields, then export the Query instead.
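Roughly like this - the field, spec and query names are placeholders, and the export spec must define no text qualifier so the single column is written as-is:
Code:
' Sketch: qryExport builds each record as ONE expression, e.g.
'   SELECT Veld1 & Veld2 & Format(Geboortedatum, "yyyymmdd") & Postcode AS Regel
'   FROM tblRecords;
' then the single-column query is exported without separators.
DoCmd.TransferText acExportDelim, "SingleColumnSpec", "qryExport", "c:\testfile.txt", False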
 

Galaxiom

Super Moderator
Staff member
Local time
Tomorrow, 04:13
Joined
Jan 20, 2009
Messages
12,849
Put your data in SQL Server, generate the output with a query (as arnelgp said), and use the Bulk Copy utility (bcp).

I use this technique to export half a gigabyte of data to a text file in a few seconds. Performance may vary depending on the data.
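A rough sketch of driving bcp from VBA; the server, database, and view names are placeholders:
Code:
' Sketch: shell out to bcp. -c = character mode, -T = trusted connection.
Dim cmd As String
cmd = "bcp ""SELECT Regel FROM dbo.vwExport"" queryout c:\testfile.txt -c -S MyServer -d MyDb -T"
Shell "cmd.exe /c " & cmd, vbHide      ' note: Shell returns immediately (asynchronous)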
 

Guus2005

AWF VIP
Local time
Today, 18:13
Joined
Jun 26, 2007
Messages
2,645
No. You can also export as a fixed width column format without a separator. - That seems to describe your desired output.
The record length can vary from 50 to 400 characters, which is not very fixed.
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 12:13
Joined
Feb 28, 2001
Messages
26,999
This is always going to be a case of "pay me now, pay me later" - or at least I think it will be that way. To export the string, you must BUILD the string, which means you must assemble the components, which means you must get the components. Not to mention that you must format the string if the components have leading zeros but started life as numbers. I can't tell from what you posted where one "field" starts and another one ends. However, any way you count it, that will be a LOT of string manipulation.

There is also the issue that you can't put a string that long in a single field; at least, not monolithically. Short Text fields hit the wall at 255 characters. Oh, you can write multiple strings of 255 bytes each (you go up to 400 bytes per record, you said). Unless, of course, you make the output into Memo/Long Text fields and just concatenate the shorter parts into the long text field.

Here is where an Export of some flavor comes in. I would say "just build a report," but the problem is that reports want to make their output conform to the chosen printer dimensions. I don't think you want to build a CSV, because that format is defined with punctuation and field-size "baggage", so it might not work.

I might look into the way you open the file and use Open For Output with the Len=n option, because that Len argument leads to buffering. If you write a lot of records but buffer the hell out of what you are doing, you may save calls to the final output driver by bunching the output into LARGER chunks before the output's internal actions actually finish. If Access buffers the records, you can write them using VBA but they won't actually reach the file right away. When they do, a LOT gets written to the file at one time, and that part is compiled code (not your emulated pseudo-code that is VBA). Therefore, there is a CHANCE of some time savings here by deferring part of the output operation. I'm tossing out ideas here with no guarantee that this will work for you - but if you get desperate enough, MAYBE this will help.
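A sketch of what I mean - the 32767 is just a large buffer value, not a tested recommendation:
Code:
' Sketch: on a sequential file, Len sets the buffer size in characters.
Dim f As Integer
f = FreeFile
Open "c:\testfile.txt" For Output As #f Len = 32767
' ... Print #f for each record; the runtime flushes in large chunks ...
Close #f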
 

Pat Hartman

Super Moderator
Staff member
Local time
Today, 13:13
Joined
Feb 19, 2002
Messages
42,971
TransferText requires a separator.
Do you think I would have recommended it if that were true? You might want to walk through an export ONCE manually just to see how an export spec is built and what options you have.

If you are writing a file where individual rows have different formats, then no automatic method will work. You might have mentioned that initially.

When I write our 837P EDI billing file, which can easily get to a million rows, I use
Code:
fsoFile.WriteLine strRecord

The key when working with loops is to make sure that you are not doing anything inside the loop that could be done ONCE outside the loop. If the loop runs 100 times, it doesn't matter what you do, but when it runs a million times, it will be like watching paint dry if you are sloppy.
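A sketch of the shape I mean - BuildRecord is a placeholder for whatever per-row work is unavoidable:
Code:
' Sketch: hoist everything invariant out of the loop.
Dim fso As Object, f As Object, i As Long
Set fso = CreateObject("Scripting.FileSystemObject")   ' created ONCE
Set f = fso.CreateTextFile("c:\testfile.txt", True)    ' opened ONCE
For i = 1 To 1000000
    f.WriteLine BuildRecord(i)   ' only true per-row work stays inside
Next i
f.Close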
 
