Processing HTML files

  • Thread starter: cyware

Hi,

I have more than 1000 HTML files that all contain the same header, which is useless to me.

Is there a way to open an HTML file in text mode, delete a string from it (the sequence of characters that makes up the useless tags), and then save it?

I plan to make a loop in order to "clean up" all the files.
Any replies would be greatly appreciated.
 
I wrote this piece of code two years ago. Modify it to fit your needs. Some parts of the code were cut out.

Function CleanUp()
    Dim strLink As String
    Dim strAllText As String
    Dim I As Integer            ' output flag: 1 = write lines, 0 = skip them
    Dim strOld As String, strNew As String
    Dim lngStart As Long, lngEnd As Long

    'strOld = "h:\html\" & strHTML
    'strNew = "h:\html\old\" & strHTML

    Open "h:\html\120757.html" For Input As #1
    Open "h:\html\new120757.html" For Output As #2
    I = 0    ' nothing is written until a marker line switches output on
    Do While Not EOF(1)
        Line Input #1, strLink
        strAllText = strLink
        ' Start from now on: each marker line is blanked and switches output on or off
        If strAllText = "<![endif]-->" Then
            strAllText = ""
            I = 1
        ElseIf strAllText = "<!--[if gte mso 9]><xml>" Then
            strAllText = ""
            I = 0
        ElseIf Left(Trim(strAllText), 59) = "<p class=MsoNormal align=center style='text-align:center'>[" Then
            strAllText = ""
            I = 0
        ElseIf strAllText = "<div class=MsoNormal align=center style='text-align:center'>" Then
            I = 1    ' this line is kept and output is switched on
        ElseIf strAllText = "<div class=Section1>" Then
            strAllText = ""
            I = 1
        ' Clean Calvin Smith ad out
        ' Note: the length passed to Left (54) must equal the length of the literal it is compared with
        ElseIf Left(Trim(strAllText), 54) = "<p><a href=""http://home/"">" Then
            strAllText = ""
            I = 0
        'ElseIf InStr(strAllText, "http://home.sprint") <> 0 Then
        '    strAllText = ""
        '    I = 0
        ElseIf InStr(strAllText, "href=""http://home/""><span style='text-decoration:") <> 0 Then
            strAllText = ""
            I = 0
        ElseIf InStr(strAllText, "http://home/sig.gif") <> 0 Then
            strAllText = ""
            I = 1
        ElseIf strAllText = "<p><![if !supportEmptyParas]> <![endif]><o:p></o:p></p>" Then
            strAllText = ""
            I = 1
        End If

        ' Replace the follow-up link with a credit line
        If strAllText = "<p><a name=postfp>Post a Followup</a></p>" Then
            strAllText = "<p><A HREF=""http://members.mweb.co.th/specialvoa/index.htm"">Modified by Tim K.</a></p>"
        End If

        If I = 1 Then
            ' Cut out everything from the site URL through "/messages/".
            ' Both markers must be present; otherwise the line is written unchanged.
            ' (Explicit checks replace the original On Error Resume Next.)
            lngStart = InStr(strAllText, "http://www./")
            lngEnd = InStr(strAllText, "/messages/")
            If lngStart > 0 And lngEnd > 0 Then
                strAllText = Left(strAllText, lngStart - 1) & _
                             Mid(strAllText, lngEnd + 10)
            End If
            Print #2, strAllText
        End If
    Loop
    Close #1
    Close #2    ' the output file must be closed too, or buffered lines may not be flushed

    'FileCopy strOld, strNew
    'Kill strOld

End Function
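
The function above works on one hard-coded file. Since the question was about looping over 1000+ files and deleting one fixed string, here is a rough sketch of how that loop might look. It is untested against your files and rests on a few assumptions: the names CleanAllFiles, SRC_DIR, OUT_DIR and strHeader are made up for illustration, the folder paths and the header text are placeholders, the output folder already exists, and your VBA version has the Replace function (Access 2000 or later). It reads each file in one go rather than line by line, so a header that spans several lines can be removed as long as strHeader matches it exactly, line breaks included.

```vba
' Hypothetical helper: removes one fixed string from every .html file in a folder
' and writes the cleaned copies to a separate folder.
Sub CleanAllFiles()
    Const SRC_DIR As String = "h:\html\"        ' assumed source folder
    Const OUT_DIR As String = "h:\html\new\"    ' assumed output folder, must already exist
    Dim strHeader As String
    Dim strFile As String
    Dim strText As String
    Dim intIn As Integer, intOut As Integer

    ' Placeholder: put the exact text you want deleted here
    strHeader = "<div class=Section1>"

    strFile = Dir(SRC_DIR & "*.html")           ' first matching file name, "" if none
    Do While strFile <> ""
        ' Read the whole file into one string
        intIn = FreeFile
        Open SRC_DIR & strFile For Binary Access Read As #intIn
        strText = Space$(LOF(intIn))
        Get #intIn, , strText
        Close #intIn

        ' Delete every occurrence of the unwanted string
        strText = Replace(strText, strHeader, "")

        ' Write the cleaned text; the trailing semicolon stops Print adding a line break
        intOut = FreeFile
        Open OUT_DIR & strFile For Output As #intOut
        Print #intOut, strText;
        Close #intOut

        strFile = Dir()                         ' next matching file name
    Loop
End Sub
```

Writing to a separate folder keeps the originals intact, so you can compare a few files before deleting anything.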
 
Thanks a lot.
You saved me!