Importing Html into Access via VBA

tmd_63

Registered User.
Local time
Today, 03:43
Joined
Jul 30, 2008
Messages
25
I need help in a multiple html import of data into access.
I have over 600 html files that include multiple (around 7) <table></table> entries and I want to import all the data included inside each html file into a single record in an Access database.
Has anyone every produced any free code that can be used to import all this information easily.
Any help would be greatfully recieved.
P.S. Access 2000 please.
 
Last edited:
So you want all the HTML in a single record or all the data in the HTML table in a single record?
 
I have separate html files and I want the data contained within those files stored as a single record for each html file.
So file 0106.html will be stored in record 0106. But each html has a sub-nested table-/table sets of 6 tables inside a further single table. Inside each table are td-/td datasets which are to be stored in each record column.
The format is the same so, for example, the first td-/td is the name, the second is the system number etc. but not all the fields have data and could be blank.
I just need some code to open the records in a folder, read the text, find the first 'td' and store the data until the next '/td', the loop until it has found the last '/td' then save the record. Then get the next file until all the files have been read.
 
I have two approaches for you, I assume you want the first one...

OPTION 1
Merge all of your HTML documents into a single file and manually link that HTML document to MS Access as a linked table. You can then write queries off of that HTML table directly


OPTION 2
write looping code to generate connection strings for each HTML file, parse the data values from each HTML file, then append those parsed values to an MS Access table.

This second approach requires that each recordset reads the HTML code into a string variable and that that string variable is used to parse field values

(1) use instr functions to find the character string positions for <td> and <\td>
(2) use <td> and </td> positions (modified to by either 4 or 5 characters to adjust for the width of the HTML tag) to compose arguments for use with a midstr function to extract field values.
(3) I would assign each extracted field value to a variable and
(4) use those variables to create an SQL append query once all field values were identified
(5) docmd.runsql will execute an sql string
(6) be careful to include case handling for nulls to prevent docmd.runsql errors, and/or allow zero length strings in the table definition to prevent write failures

Though it is not especially elegant, I'll share a code snippet I wrote years ago to tokenize a string based upon a comma delimiter. Hopefully it will give you the gist of what you want to be doing to parse the HTML This code snippet reads a string, tokenizes it by comma delimiters, then passes each tokenized value into an array. The code snippet does not go through building a SQL statement, but I assume you already kow how to do this once you have isolated field values. My response does not go into using MS Access



Public Const intArraySize As Integer = 25

Public Function Tokenize(strString As Variant)
'//////////////////////////////////////////////////////////////////
'// Function:
'//////////////////////////////////////////////////////////////////
'// Author: tranchemontaigne
'// Created:
'// 27 December 2005
'// Modified:
'// 22 October 2007: documented function with comments
'//////////////////////////////////////////////////////////////////
'// Description:
'//
'//
'//
'//////////////////////////////////////////////////////////////////
'// Requirements:
'// Microsoft Visual Basic for Applications
'// Microsoft Access 9.0 Object Library
'//
'//////////////////////////////////////////////////////////////////
'// Input:
'// Variable Description
'// ======================================
'// gstrObject Module object producing the error
'// gstrCode Code block within module producing error
'// gstrMessage Error message
'//
'//////////////////////////////////////////////////////////////////
'// Output:
'//
'//////////////////////////////////////////////////////////////////
'declare variables
Dim TokenArray(1 To intArraySize) As String
Dim intCharPosition As Integer
Dim intStartCharPosition As Integer
Dim intNextCharPosition As Integer
Dim intTokenLength As Integer
Dim intLoopCounter As Integer
Dim intStringLength As Integer
'test for null argument
If IsNull(strString) Then
For intLoopCounter = 1 To intArraySize
TokenArray(intLoopCounter) = ""
Next
Else
'initialize variables
intStringLength = Len(strString) + 1
intStartCharPosition = 1
intNextCharPosition = InStr((intStartCharPosition + 1), strString, ",")
If (intNextCharPosition < intStartCharPosition) Or (intNextCharPosition = 0) Then
intNextCharPosition = intStringLength
End If


For intLoopCounter = 1 To intArraySize
If (intNextCharPosition < intStartCharPosition) Or (intNextCharPosition = intStartCharPosition) Then
intNextCharPosition = intStringLength
Else
intTokenLength = (intNextCharPosition) - (intStartCharPosition)
TokenArray(intLoopCounter) = Mid(strString, intStartCharPosition, intTokenLength)
intStartCharPosition = (intNextCharPosition + 1)
'Debug.Print "TokenArray" & intLoopCounter & " " & TokenArray(intLoopCounter)
intNextCharPosition = InStr((intStartCharPosition + 1), strString, ",")
End If

Next
End If
Tokenize = TokenArray
End Function
________
Weed
 
Last edited:

Users who are viewing this thread

Back
Top Bottom