Fixing a csv File Before Importing Data (1 Viewer)

namliam

The Mailman - AWF VIP
Local time
Today, 15:18
Joined
Aug 11, 2003
Messages
11,695
Print #2, Left(myString, 223)
Print #2, Mid(myString, 225)

Splits out the first (corrupted) line into to seperate lines

I missed the part where the first bit is not needed, but that can easily be skipped changing the Left to a MID.

Air code but should work:
Code:
Sub mytest()
    Dim myString As String
    Open "c:\mytest.txt" For Input As #1
    Open "c:\myOutput.txt" For Output As #2
    Line Input #1, myString
    Print #2, mid(myString, 46, 198)
    Print #2, Mid(myString, 225)
    Line Input #1, myString
    Do While Not EOF(1)
        Print #2, myString
        Line Input #1, myString
    Loop
    Print #2, myString
    Close #1
    Close #2
End Sub
 

boblarson

Smeghead
Local time
Today, 06:18
Joined
Jan 12, 2001
Messages
32,059
Print #2, Left(myString, 223)
Print #2, Mid(myString, 225)

Splits out the first (corrupted) line into to seperate lines

The only thing is, is that the position of the letters is not necessarily going to be the same each time. Read posts #2 and #3 again and see that I had asked if it was going to be consistent. The answer was that it was always going to be there but it wouldn't necessarily be the same exact line.
 

scotthutchings

Registered User.
Local time
Today, 06:18
Joined
Mar 26, 2010
Messages
96
I just wanted to clarify that the output file is not corrupt, per se. It is simply in a configuration that I don't know how to extract and transfer the data that is contained in the header row.

Also, How can I flag the file if the correction has already been done so it doesn't try to do it again? Would it work to insert text into the file at the beginning "Corrected" or something like that so when the transfer starts, if it finds this code word(s), it skips the file correction process?

Because the last header will always be "No Products found for selected project", can I do a search for this phrase and then insert a line break at the end and then resave the file? If so, how would I do it?

Thanks for everyone's help on this...:eek:

Here are a couple of sample files.
 

Attachments

  • 1productsummaryv2.txt
    9.8 KB · Views: 185
  • 2productsummaryv2.txt
    24.2 KB · Views: 249
Last edited:

sonof27

Registered User.
Local time
Today, 23:18
Joined
Sep 28, 2010
Messages
29
Wow way to make things easy. :eek: These two example files are different and would need to be dealt with differently.

There seems to be some anomilies with the actual data, the second file seems to contain " between more " so the import throws up errors. Also you forgot to tell people there is a totals line at the end of the data that would also need to be removed.

Too good for me, sorry.
 

namliam

The Mailman - AWF VIP
Local time
Today, 15:18
Joined
Aug 11, 2003
Messages
11,695
I just wanted to clarify that the output file is not corrupt, per se.
It seems to me to be a corrupted version of a CSV no questions asked.
Could be a configuration some place, perhaps even omit the header(s) if possible.

Also, How can I flag the file if the correction has already been done so it doesn't try to do it again? Would it work to insert text into the file at the beginning "Corrected" or something like that so when the transfer starts, if it finds this code word(s), it skips the file correction process?
IMHO you shouldnt save it as the same file, instead (like my code does) you create a new file which you import leaving the original file ... the original file.

Because the last header will always be "No Products found for selected project", can I do a search for this phrase and then insert a line break at the end and then resave the file? If so, how would I do it?
Not that hard to do with Instr...


Here are a couple of sample files.
The sample files are not consistant (enough) to be treated (exactly) the same...
The #1 is OK, comparing to your sample in the OP, however the #2 is missing some of the header and missing the "s around the fields and stuff

Now it is possible but... it gets more and more fiddly/messy to deal with properly.

There is an additional problem with File #2 is that the first line seems to be split in 2 lines, However... excel doesnt seem to mind, I havent tried access, but if excel dont mind access probably wont mind.
Also like sonof27 says, there is / seems to be a footer line in file #2, which I do not have a clue of how to deal with... Guess we can check for ",Totals:," assuming that will not be used elsewhere... yet again... more "mess" :p

if not for these problem... this code will work:
Code:
Sub mytest()
    Dim myString As String
    Dim LastCollLocation As Integer
    Dim LastCollName As String
    Dim HeaderStart As Integer
    Dim AreThereQuotes As Boolean
    Open "c:\1productsummaryv2[1].txt" For Input As #1
    Open "c:\myOutput.txt" For Output As #2
    Line Input #1, myString
    HeaderStart = InStr(1, myString, "MFG")
    LastCollName = "No Products found for selected project"
    LastCollLocation = InStr(1, myString, LastCollName)
    If HeaderStart = 1 Then
        AreThereQuotes = False
    Else
        AreThereQuotes = Mid(myString, HeaderStart - 1, 1) = """"
        HeaderStart = HeaderStart - 1 ' decrease by 1 to account for the extra " at the start
        LastCollLocation = LastCollLocation + 1 ' increase by 1 to account for the extra " at the end
    End If
    Debug.Print Mid(myString, HeaderStart, LastCollLocation + Len(LastCollName) - HeaderStart)
    Debug.Print Mid(myString, LastCollLocation + Len(LastCollName) + 1)
    Print #2, Mid(myString, HeaderStart, LastCollLocation + Len(LastCollName) - HeaderStart)
    Print #2, Mid(myString, LastCollLocation + Len(LastCollName) + 1)
    Line Input #1, myString
    Do While Not EOF(1)
        Print #2, myString
        Line Input #1, myString
    Loop
    ' insert some stuff to deal with the totals part of the last line.
    Print #2, myString
    Close #1
    Close #2
End Sub

We will need to add some logic to deal with the CRLF inside the first line/record but above does (nearly) take care of business IMHO.

It is not using the AreThereQuotes variable but that might be needed when handling the totals part of the last line.
 
Last edited:

Users who are viewing this thread

Top Bottom