Solved Reading ANSI encoded files

KitaYama

I have a field in a table that contains the full path to PBF files.
How can I show the contents of these ANSI encoded files in a textbox on a form?

When I read the contents of the file in VBA and assign it to the textbox, the textbox remains empty.
If I open the file in Notepad and save it as UTF-8, then I can read it and set a textbox to show the contents.

I was thinking of using a function in the form's On Current event to save a temp copy of the file in UTF-8 encoding, then read the temp file.
But I couldn't find a way to save a file with a specific encoding.

The code that I use to read them :

Code:
Public Function ReadTextFile_All(ThisFile As String) As String

    Dim FSO As New Scripting.FileSystemObject
    Dim txtStream As TextStream

    If Not FSO.FileExists(ThisFile) Then Exit Function

    Set txtStream = FSO.OpenTextFile(ThisFile)
    ReadTextFile_All = txtStream.ReadAll
     
    txtStream.Close
    Set txtStream = Nothing
    Set FSO = Nothing

End Function

Some info about PBF files:
PBF files are special files written in NC language to talk to 5-axis manufacturing machines. They are ANSI encoded and need special editors to edit the contents.
If you open them in Notepad you will see something like the screenshot below. Even showing just this in a textbox would suit us fine.

2024-06-15_07-57-47.jpg
 

If you can, you might consider posting a sample file in case someone is inclined to do some experiments.
 
Done. And thanks for looking into this.
3 Files are attached to #1.

Thanks again.
 
Thanks. Otherwise, perhaps another option is to batch convert the files to utf-8 for your database.
 
another option is to batch convert the files to utf-8 for your database.
Do you know any software/program that can do the job?
I googled and couldn't find any.
Can VBA be an option?

Thanks.

Edit: I found this, but now I have to search how to run it in Notepad++
 
When I did a search earlier, I saw UTFCast mentioned.
 
I was able to batch convert the files from ANSI to UTF-8.
I'll leave the thread marked as unsolved, because I'm still interested to see if there's a way to do it with VBA.

Here's the process, in case anyone faces the same problem in the future.
  1. In Notepad++, select Plugins in the menu bar, then select Plugins Admin....
  2. Find PythonScript in the list, tick it and click Install. The PythonScript plugin will be added.
  3. Click Plugins in the menu bar, point to Python Script and select New Script from the submenu.
  4. Copy the following code, paste it into the newly created script, and change the path to your folder.
  5. Click Save in the toolbar to save the changes.
  6. Click Plugins in the menu bar, point to Python Script and select Show Console. (optional step)
  7. Click Plugins in the menu bar, point to Python Script, then Scripts, and select the script you saved in step 5.
  8. You're done.
Python:
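# -*- coding: utf-8 -*-
# Runs inside Notepad++ via the PythonScript plugin (Python 2 syntax).
import os
from Npp import notepad, console

filePathSrc = "Full path to your folder"

# Python 2: turn the byte string into a unicode path so non-ASCII folder names work.
filePathSrc = filePathSrc.decode('utf-8')
os.chdir(filePathSrc)
for root, dirs, files in os.walk(".", topdown = False):
    for fn in files:
        # Only process files with the .PBF extension (case-insensitive).
        if fn.upper().endswith('.PBF'):
            notepad.open(os.path.join(root, fn))
            # Same as choosing Encoding > Convert to UTF-8 in the menu.
            notepad.runMenuCommand("Encoding", "Convert to UTF-8")
            notepad.save()
            console.write('File ' + fn + ' saved. Closing ... \n')
            notepad.close()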
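
For readers who would rather not drive Notepad++, the same batch conversion can be sketched in plain Python 3 (not the PythonScript plugin). This is only a rough sketch under assumptions: the function names, the `.pbf` extension filter, and the `cp1252` source code page are illustrative choices, not something stated in this thread, and it only makes sense for files that really are ANSI text. It writes UTF-8 copies into a separate folder and leaves the originals untouched, so a failed run cannot damage the source files.

```python
import os


def ansi_bytes_to_utf8(data, source_encoding="cp1252"):
    """Decode bytes from the assumed ANSI code page and re-encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


def convert_folder(src_dir, dst_dir, extension=".pbf", source_encoding="cp1252"):
    """Write UTF-8 copies of matching files below dst_dir; originals stay untouched.

    Returns two lists of source paths: (converted, failed).
    """
    converted, failed = [], []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            if not name.lower().endswith(extension):
                continue
            src_path = os.path.join(root, name)
            # Mirror the source folder layout below dst_dir.
            out_dir = os.path.join(dst_dir, os.path.relpath(root, src_dir))
            os.makedirs(out_dir, exist_ok=True)
            try:
                with open(src_path, "rb") as src:
                    utf8_data = ansi_bytes_to_utf8(src.read(), source_encoding)
            except UnicodeDecodeError:
                # A byte is not defined in the assumed code page: skip and report.
                failed.append(src_path)
                continue
            with open(os.path.join(out_dir, name), "wb") as dst:
                dst.write(utf8_data)
            converted.append(src_path)
    return converted, failed
```

The destination folder should sit outside the source folder, otherwise a second run would pick up the already converted copies. Files that end up in the `failed` list contain bytes the assumed code page cannot decode, which is a hint that they may not be plain text at all.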
 
KitaYama, you are going the other way than I did, because I had a genealogy database that used to work on ASCII formatted files downloaded from Ancestry.COM - but a few years ago they changed from ASCII to UTF-8 and my scanner blew up. I had to roll my own file converter to strip the extended characters because my DB was not originally based in UNICODE. Conversion is NOT a fast process, particularly since the GEDCOM (Genealogy specialized Entity Attribute Value format) includes some very long records. A 2300-person family tree is WELL over a megabyte worth of downloads. And you almost have to go byte-by-byte to catch the extended characters.

For anyone who is wondering, UTF-8 is a variable-length encoding built on 8-bit units. Every ASCII (ordinary text) character keeps its 7-bit ASCII code and occupies one byte. Extended characters are introduced by a lead byte with the high bit set, followed by one to three continuation bytes, giving 2-byte, 3-byte, or 4-byte sequences. Since the majority of the characters in English language text files still ARE 7-bit ASCII and thus fit into 1 byte, UTF-8 takes nearly half the space of a 2-byte encoding such as UTF-16 for those files.
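
A quick way to see those byte counts for yourself is the small Python sketch below. The helper name `encoded_sizes` and the sample strings are made up for illustration; the UTF-16 figure is included only as a point of comparison for the space saving.

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) needed to store text, without a BOM."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


samples = [
    "A",           # plain ASCII letter
    "\u00e9",      # e with acute accent
    "\u20ac",      # euro sign
    "\U0001F600",  # an emoji outside the Basic Multilingual Plane
]

for ch in samples:
    utf8_size, utf16_size = encoded_sizes(ch)
    print(repr(ch), "UTF-8:", utf8_size, "byte(s)", "UTF-16:", utf16_size, "byte(s)")

# Pure ASCII text: one byte per character in UTF-8, two in UTF-16.
print(encoded_sizes("G01 X10 Y20"))  # prints (11, 22)
```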
 
I couldn't quite figure out from your dialog whether the problem has been solved or not, but here would be a quick and easy VBA solution for 'displaying' the content:

Code:
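Public Sub TestPbfFile()
    ' Requires a reference to a Microsoft ActiveX Data Objects library.
    With New ADODB.Stream
        ' Read the file as raw bytes, so no text decoding is attempted.
        .Type = ADODB.StreamTypeEnum.adTypeBinary
        .Open
        .LoadFromFile "c:\Mazak250\4019.PBF"

        Dim byteArray() As Byte
        byteArray = .Read(ADODB.StreamReadEnum.adReadAll)
        .Close

        ' Replace every NUL byte (0) with a space (32).
        Dim xIndex As Long
        For xIndex = LBound(byteArray) To UBound(byteArray)
            If byteArray(xIndex) = 0 Then byteArray(xIndex) = 32
        Next xIndex

        ' Convert the ANSI bytes to a VBA (Unicode) string and show it.
        MsgBox StrConv(byteArray, vbUnicode)
    End With
End Sub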

Maybe it helps.
 
I couldn't quite figure out from your dialog whether the problem has been solved or not
I hoped to find a way to show ANSI encoded files in a textbox, but ended up re-encoding them to UTF-8.
So my problem is actually not solved; I only found an unreliable workaround. Not only does changing the encoding of nearly half a million files take a while, but if something goes wrong for some reason, we're facing a huge problem.

I'll check whether we have any luck with your solution.
Thanks for the help.
 
@AHeyne It's perfect.
No, I was wrong. It's more than perfect.
I really can't find the words to thank you.
I absolutely don't understand the code, but will try to do some research to see how it works.

A million thanks for the help.
Best regards.
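
For anyone else puzzling over how AHeyne's routine works, here is the same idea as a rough Python sketch: read the file as raw bytes, replace every NUL byte (0) with a space (32), then decode the bytes with the ANSI code page. The function names and the `cp1252` code page are assumptions for illustration; VBA's `StrConv(..., vbUnicode)` uses whatever the system's ANSI code page is. The NUL bytes are presumably what kept the textbox empty, since many Windows controls treat a NUL character as the end of the string.

```python
def bytes_to_display_text(data, ansi_encoding="cp1252"):
    """Turn raw file bytes into text that a plain text control can show."""
    # Step 1: replace NUL bytes (0) with spaces (32), as the VBA loop does.
    cleaned = data.replace(b"\x00", b" ")
    # Step 2: decode with the assumed ANSI code page; bytes that the code page
    # does not define become U+FFFD instead of raising an error.
    return cleaned.decode(ansi_encoding, errors="replace")


def read_file_for_display(path, ansi_encoding="cp1252"):
    """Read a file in binary mode and return displayable text."""
    with open(path, "rb") as handle:
        return bytes_to_display_text(handle.read(), ansi_encoding)
```

Because the file is opened in binary mode, nothing is interpreted or cut short while reading; the only changes are the NUL replacement and the final decode.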
 
How can I show the contents of these ANSI encoded files in a textbox on a form?
How did you get the idea that these files are ANSI encoded? They are not! The files you provided in the ZIP archive are binary files.
That's the very simple reason why @AHeyne's approach of reading them as binary files works.
You could probably also use the built-in Open ... For Binary statement to read the files without a dependency on ADO.
 
How did you get the idea that these files are ANSI encoded?
When I open the file, Windows Notepad tells me it's ANSI encoded:

2024-06-17_07-47-50.jpg


When I save it and change the encoding from ANSI to UTF-8, I can then view the contents in a text box.
So I simply believed what Windows was telling me they are.

2024-06-17_07-49-40.jpg


I don't know, they may be binary as you said. All I know is what Notepad shows as their encoding.

I will check your offered link too, even though I have a working method.
Thanks for help.
 
All I know is what notepad shows as their encode.
ANSI text files do not have a mark or property indicating that they are ANSI encoded. So, whenever Notepad cannot find any indication of an encoding, e.g., a Unicode BOM, it assumes the file is ANSI encoded. It then opens the file as if it were encoded with the current ANSI code page.
This also applies to binary files, which have no indicator for a text encoding either, because they simply are not text files.

As an experiment, open the ZIP file Mazak250.zip you attached to this thread in Notepad. It will also show "ANSI" as encoding even though it is clearly a binary file.
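
The fallback described in this post can be illustrated with a short Python sketch. It is a deliberately simplified model, not Notepad's actual detection logic: the function name and the returned labels are invented for the example, and only the UTF-8 and UTF-16 byte order marks are checked.

```python
import codecs


def guess_encoding_from_bom(data):
    """Guess an encoding label from the first bytes of a file.

    Anything without a recognised byte order mark is reported as ANSI,
    whether it is genuine ANSI text or arbitrary binary data.
    """
    if data.startswith(codecs.BOM_UTF8):      # EF BB BF
        return "UTF-8 with BOM"
    if data.startswith(codecs.BOM_UTF16_LE):  # FF FE
        return "UTF-16 LE"
    if data.startswith(codecs.BOM_UTF16_BE):  # FE FF
        return "UTF-16 BE"
    return "ANSI (assumed)"


print(guess_encoding_from_bom(b"\xef\xbb\xbfhello"))  # text saved with a UTF-8 BOM
print(guess_encoding_from_bom(b"plain text"))         # no BOM: falls back to ANSI
print(guess_encoding_from_bom(b"PK\x03\x04"))         # start of a ZIP file: also ANSI
```

The last call mirrors the ZIP experiment: the bytes are not text, yet the only label a BOM check can fall back to is ANSI.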
 
