OK, what I am trying to do is use VBA code to save an Acrobat Version 7.0 PDF as a text file so it can be pulled into a database using Monarch. Saving the file as text works, and here is the code I use to do that:
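' Late-bound Acrobat automation objects
Dim AcroXApp As Object
Dim AcroXAVDoc As Object
Dim AcroXPDDoc As Object
' Start Acrobat and keep its window hidden
Set AcroXApp = CreateObject("AcroExch.App")
AcroXApp.Hide
' Open the PDF (PDF_PATH and filename are defined elsewhere)
Set AcroXAVDoc = CreateObject("AcroExch.AVDoc")
AcroXAVDoc.Open PDF_PATH & filename, "Acrobat"
AcroXAVDoc.BringToFront
Set AcroXPDDoc = AcroXAVDoc.GetPDDoc
' Use the document's JavaScript object to save it as plain text
Dim jsObj As Object
Set jsObj = AcroXPDDoc.GetJSObject
jsObj.SaveAs OUTPUT_PATH & OutputFile, "com.adobe.acrobat.plain-text"
' Close the PDF without saving changes, then quit Acrobat
AcroXAVDoc.Close False
AcroXApp.Hide
AcroXApp.Exit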
As I said, the code itself runs perfectly and saves the PDF as a plain text file. The problem is that the original PDF is laid out in neat columns, but the resulting text file loses all of that whitespace and puts only a single space between column fields. That is not good enough for my needs: some of the fields contain spaces themselves, so I cannot import the file as space-delimited.
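To illustrate the ambiguity, here is a small sketch using a made-up row (the sample text and the Sub name are hypothetical, not taken from the real data). The row has three fields, but the first field contains a space, so splitting on a single space yields four tokens.

```vb
' Hypothetical sample row: three fields, the first of which contains a space.
Sub ShowSplitAmbiguity()
    Dim rowText As String
    Dim parts() As String

    rowText = "ACME SUPPLY 1042 19.95"
    parts = Split(rowText, " ")

    ' Split returns a zero-based array, so add 1 to get the token count.
    ' The count is 4 even though the row only has 3 fields.
    Debug.Print UBound(parts) + 1
End Sub
```

With the original column spacing preserved, the fields could be read by position instead, which is why the lost whitespace matters.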
My current workaround is to save the file as HTML version 3.20 instead of text, by changing "com.adobe.acrobat.plain-text" to "com.adobe.acrobat.html-3-20", and to use Monarch to analyze that file. This mostly works, but I have had many bugs to work out because of inconsistencies in the way Acrobat saves the file as HTML.
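For illustration, the switch between the two output formats could be wrapped in a helper like the one below. This is only a sketch built from the calls already shown above, not the exact code in use; the Sub name ExportPdf and its parameters pdfFile, outFile, and convId are hypothetical.

```vb
' Sketch only: ExportPdf and its parameter names are hypothetical.
' The Acrobat calls are the same ones used in the code above.
Sub ExportPdf(pdfFile As String, outFile As String, convId As String)
    Dim AcroXApp As Object
    Dim AcroXAVDoc As Object
    Dim AcroXPDDoc As Object
    Dim jsObj As Object

    ' Start Acrobat hidden and open the PDF
    Set AcroXApp = CreateObject("AcroExch.App")
    AcroXApp.Hide
    Set AcroXAVDoc = CreateObject("AcroExch.AVDoc")
    AcroXAVDoc.Open pdfFile, "Acrobat"

    ' Save through the JavaScript object using the requested conversion ID
    Set AcroXPDDoc = AcroXAVDoc.GetPDDoc
    Set jsObj = AcroXPDDoc.GetJSObject
    jsObj.SaveAs outFile, convId

    ' Close the PDF without saving changes, then quit Acrobat
    AcroXAVDoc.Close False
    AcroXApp.Exit
End Sub
```

It would then be called with either conversion ID, for example: ExportPdf PDF_PATH & filename, OUTPUT_PATH & OutputFile, "com.adobe.acrobat.html-3-20"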
The application I've built does work for the most part, but I am concerned that it will break if it receives some odd data, because as built it is somewhat fragile. As it is, I need 7 Monarch models just to gather all of the data correctly.
Is there any way to save the PDF as a text file while still retaining the column format and whitespace of the original PDF? That would solve almost all of my problems. I use Acrobat 6.0 Standard, Monarch Pro 6.00, and MS Access 2000. I appreciate any help anyone can give me.