Read XML webpage from Access WebBrowser Object

Tezcatlipoca

Related to this thread.

Apologies if this seems like a second post, but the above thread had two issues, the first of which - to which the thread title relates - is solved.

I've decided to start a second thread for the latter problem with a more relevant thread title purely so, if a solution is found, this thread will show up in future searches and help people with a similar problem.

Hokay, as you can see from the linked thread, I have a form. The form has a WebBrowser object on it. I also have an unbound textbox and a button.

The user can enter a number into that textbox, click the button, then the WebBrowser object automatically goes to the static webpage, appending the textbox contents to the URL.
In this case, the static webpage is an ASP page that is used to perform an online search from a database separate to my own, and the number the user puts into the textbox directly relates to the member number of the person they are seeking in the other database. This all works great.

Now, when that person has been found, the WebBrowser object shows an XML webpage with a number of fields on it (Number, Name, Postcode, and so on). This works, but, frankly, looks very messy.
In addition, I need to be able to use parts of that returned data elsewhere in my project...

...which is where the thread comes in. I'm hunting for a way, any way, to strip the data off that search results webpage contained in the WebBrowser object, and update other text boxes with it.

So, for example, I would have an unbound textbox on the form called txtPostCode. When the user does a search on, say, member 12345 and the postcode is listed in the XML as <POSTCODE>W1 4JH</POSTCODE>, then txtPostCode would automatically get updated with this data. The same would go for Name, and so on.

Does anybody know how to get VB to grab those details from the WebBrowser object?
 
i was given this basic code to scrape a web page into an access app
url is the webpage you need to read
you will need to set a few variables

sresponse is the final string, which you will then need to parse yourself

there are various ways of handling the screen scrape - hence the bits commented out - there are more examples on this site

hope this helps


Code:
Function Scrape() As String
Dim bResponse() As Byte
Dim sResponse As String
Dim XMLHTTP As Object
Dim url As String

' url must be set to the address of the page before the request is sent

Set XMLHTTP = CreateObject("MSXML2.XMLHTTP")

' False = synchronous: send does not return until the response has arrived
XMLHTTP.Open "GET", url, False
XMLHTTP.send

' Other ways of reading the response, left commented out:
'    bResponse = XMLHTTP.responseStream
'    sResponse = XMLHTTP.responseText

bResponse = XMLHTTP.responseBody
Set XMLHTTP = Nothing

' Convert the byte array to a VBA string (assumes a single-byte encoding)
sResponse = StrConv(bResponse, vbUnicode)
MsgBox sResponse
Scrape = sResponse
End Function
 
You know, Gemma, given the degree to which you've helped me and others, you really should be in line for a medal :)

Thanks for the post; it's given me something to start investigating to see if I can get it up and running for my own project. Obviously, if I do, I'll post the results here to ensure anyone else with the same problem has access to the answer.
 
Have tinkered with this a little now, but not getting especially far, to be honest.

If I understand it correctly, bResponse is one of the fields I'm wanting to grab data from (such as PostCode), so I presumably need to define X bResponse strings, where X is the number of fields I want to get...?

I think the parsing thing looks to be pretty straightforward, and, once I've got it as a string, I can easily fire it at a chosen textbox; it's just the stripping it out bit that I'm having a bit of a headache with.

Did you ever get it working for your own project, Gemma, and would it be possible to post up an example DB?
 
If you are using the activex web browser you can use

Code:
somestring = Browser.Document.Body.Innertext
and that will dump all the text into the variable.

if you have multiple results you can loop using lbound and ubound if you fill each of your results into an array. Parsing is easy with xml, it's just tedious, as long as the data is uniform there shouldn't be too many problems.
 
you need a string that represents the URL (including any embedded logins/passwords etc)

so have a line

url = " etc "

try pasting this directly into your web browser to make sure it works

if it does then instead of displaying the website, the code should store the same html content in the variable sresponse

i used this to read map refs from google maps initially - and there are optional settings (in the url string) to return either a xml file, or a csv string.
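
When the returned string is XML rather than HTML, one option is to hand it to an XML parser instead of searching it by hand. The following is only a sketch, assuming the string is well-formed XML and that MSXML is installed; GetXmlValue is a made-up name and not part of the code above. It swaps string searching for the MSXML DOM, which also reads the contents of CDATA sections as ordinary text.

Code:
' Hypothetical helper: returns the text of the first node called strNode
Public Function GetXmlValue(strXml As String, strNode As String) As String
    Dim xmlDoc As Object
    Dim xmlNode As Object

    Set xmlDoc = CreateObject("MSXML2.DOMDocument")
    xmlDoc.async = False

    ' loadXML returns False if the string is not well-formed XML
    If xmlDoc.loadXML(strXml) Then
        ' "//" searches the whole document for the first matching element
        Set xmlNode = xmlDoc.selectSingleNode("//" & strNode)
        If Not xmlNode Is Nothing Then GetXmlValue = Trim(xmlNode.Text)
    End If
End Function

It would be called along the lines of GetXmlValue(sResponse, "PostCode"), and returns an empty string if the XML does not load or the node is missing.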
 
Thanks ASherbuck, that has worked beautifully.

To summarise what I have, on the grounds that this thread may well help other people with the same issue:

I have a WebBrowser object - WebBrowser0 - hidden on the form.
I also have two unbound text boxes, txtMemSearch and txtResults (which is invisible). The user punches the desired member number into the former.
Finally, there is a button, btnSearch, that has the following OnClick code attached:

Code:
Private Sub btnSearch_Click()
    Dim Text As String
    Me!WebBrowser0.Navigate "http://www.mydomain.co.uk/search.asp?id=NLAcllu9&memberno=" & Me.txtMemSearch
    Text = WebBrowser0.Document.Body.InnerText
    Me.txtResults = Text
End Sub

When the user makes a search, the online service searches the other database on that number. If found, it returns an XML page in the following format:

Code:
<Member>
  <MembershipNumber>12345</MembershipNumber> 
  <MemType>TYPE</MemType> 
  <MemTypeDesc>DESCRIPTION</MemTypeDesc> 
- <Surname>
- <![CDATA[ SURNAME
  ]]> 
  </Surname>
- <Forenames>
- <![CDATA[ FORENAME
  ]]> 
  </Forenames>
  <PostCode>POSTCODE</PostCode> 
  <ClearedDate>DATE</ClearedDate> 
  </Member>

This data is then lifted out and automatically pasted as text into the invisible textbox txtResults.

Finally, I have a series of visible text boxes on the form, one for each XML category (so Postcode, Surname, and so on). All I need to do now is add some code to lift the data specific to a particular category out of txtResults and put it into the correct field.

Thanks again for all your help, guys, and hopefully this thread will end up being of use to others who want to strip data from webpages that have been embedded into their Access projects.
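
One caution on btnSearch_Click: Navigate returns straight away, so Document can still hold the previous page when InnerText is read. A minimal sketch of one way round that, assuming the control exposes the standard WebBrowser Busy and ReadyState properties; the 30 second limit is an arbitrary choice.

Code:
Private Sub btnSearch_Click()
    Const READYSTATE_COMPLETE As Long = 4
    Dim sngStart As Single

    Me!WebBrowser0.Navigate "http://www.mydomain.co.uk/search.asp?id=NLAcllu9&memberno=" & Me.txtMemSearch

    ' Wait for the page to finish loading before touching Document
    sngStart = Timer
    Do While Me!WebBrowser0.Busy Or Me!WebBrowser0.ReadyState <> READYSTATE_COMPLETE
        DoEvents
        If Timer - sngStart > 30 Then          ' arbitrary limit, in seconds
            MsgBox "The search page did not finish loading."
            Exit Sub
        End If
    Loop

    Me.txtResults = Me!WebBrowser0.Document.Body.InnerText
End Sub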
 
The attached example will give a better idea of the effect I'm trying to achieve...
 

T,

Use a function like this:

Me.PostCode = GetTag(Me.Results, "{PostCode}")
Me.Number = GetTag(Me.Results, "{Number}")

Code:
Public Function GetTag(strText As String, strTag As String) As String
Dim intStart As Long
Dim intEnd As Long
'
' Look for where the Tag starts
' If not found return an empty string
'
intStart = Instr(1, strText, strTag)
If intStart = 0 Then
   GetTag = ""
   Exit Sub
End If
'
' The data value lies "after" the tag
'
intStart = IntStart + Len(strTag)
'
' The end of the data is two-characters "before" the next "{/"
'
intEnd = Instr(intStart, strText, "{/") - 2

GetTag = Mid(strText, StrStart, StrEnd - strStart)

End Function

I didn't test it, but it should be close.

hth,
Wayne
 
Sorry WayneRyan, but your code produces invalid function and invalid call errors for me, specifically GetTag = Mid(strText, strStart, StrEnd - strStart) is returned as an invalid call or function, and

If intStart = 0 Then
GetTag = ""
Exit Sub
End If

crashes out with Exit Sub not being valid in a function.

:confused:
 
T,

I was close, it was just "air-code".

Anyway, from the debugger:

?gettag(Me.Results, "<MemTypeDesc>")
DESCRIPTION

Or to populate the textboxes:

Me.PostCode = gettag(Me.Results, "<PostCode>")

Code:
Public Function GetTag(strText As String, strTag As String) As String
Dim intStart As Long
Dim intEnd As Long
'
' Look for where the Tag starts
' If not found return an empty string
'
intStart = InStr(1, strText, strTag)
If intStart = 0 Then
   GetTag = ""
   Exit Function
End If
'
' The data value lies "after" the tag
'
intStart = intStart + Len(strTag)
'
' The data ends just before the next "</"
' If there is no closing tag return an empty string
'
intEnd = InStr(intStart, strText, "</")
If intEnd = 0 Then
   GetTag = ""
   Exit Function
End If

GetTag = Mid(strText, intStart, intEnd - intStart)

End Function

You could put the function in a Public Module and extract anything
between the <SomeKey>.....</SomeKey> tags.

hth,
Wayne
 
Thanks, WayneRyan, for your help so far. I punched your code into VB as a function, set up the calls on a button, and now have a situation where I can search the test XML page, then click a separate button to have your code grab the returned data from inside the textbox and pass it to the relevant fields!

There are only two problems cropping up, which I've tried to solve myself but can't seem to get working:

1) Currently, the user punches in the desired XML page and clicks a Search button. The embedded WebBrowser goes and fetches that page, and the AfterUpdate code of the browser passes the text to the Results box.
What I have currently is a second button with your code attached to grab the data from the results box and put it into the relevant fields, but what I'd really like to do is eliminate this stage altogether. The user should only have to click the Search button, then the whole operation will automate and the results get returned to the correct boxes.
I've tried to add your code to an AfterUpdate of the Results box, but it doesn't seem to fire.

2) This one is a little more serious. Your code works marvelously for the data that is contained between <tag> and </tag> tags. One bit of my data, however, is CDATA, and follows the format <Name><![CDATA[ Dave ]]></Name>. I've tried to tinker with your Function code to get it to take this CDATA bit into account, but can't get it working. Do I need to introduce an If clause that states if I am filling the Name field then I have to hunt between the <![CDATA[ and ]]> tags as well as the <Name> and </Name> tags? Note that in the proper search, there are two of these tags, ForeName and Surname, so I don't think a single command to grab all data between CDATA tags will work?
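
On point 1: in Access a control's AfterUpdate event only fires when the user changes the value, not when code assigns it, which would explain why nothing happens. A rough sketch of one way round it is to call the fill routine directly after the results box is set. FillFields and txtMemType are made-up names; GetTag is the function posted earlier in the thread.

Code:
' Hypothetical helper: copies values from the results box into the visible fields
Private Sub FillFields()
    Dim strResults As String

    strResults = Nz(Me.txtResults, "")      ' avoid passing Null to GetTag

    Me.txtPostCode = GetTag(strResults, "<PostCode>")
    Me.txtMemType = GetTag(strResults, "<MemType>")   ' txtMemType is an assumed control name
End Sub

' Then, wherever the code currently fills the results box:
'     Me.txtResults = Me!WebBrowser0.Document.Body.InnerText
'     FillFields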

Updated example attached.
 

Quick bump in the hope somebody has an answer on this.

The only issue remaining is point 2 in the post above; how to screen scrape data from within an XML tag that includes a CDATA tag?

For all the other, non-CDATA, tags, WayneRyan's code works beautifully; I just can't grab the last details I need, and have attempted to tweak WR's original code without much success!
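
For what it's worth, here is an untested sketch of one possible approach: take everything between the opening tag and its own closing tag, then strip the CDATA wrapper if there is one. GetTagValue is a made-up name, and it assumes the tag is passed in the same form as for GetTag, e.g. "<Surname>".

Code:
Public Function GetTagValue(strText As String, strTag As String) As String
    Const CDATA_OPEN As String = "<![CDATA["
    Const CDATA_CLOSE As String = "]]>"
    Dim strClose As String
    Dim strValue As String
    Dim lngStart As Long
    Dim lngEnd As Long

    ' Build the matching closing tag, e.g. "<Surname>" -> "</Surname>"
    strClose = "</" & Mid(strTag, 2)

    lngStart = InStr(1, strText, strTag)
    If lngStart = 0 Then Exit Function          ' tag not found: returns ""
    lngStart = lngStart + Len(strTag)

    lngEnd = InStr(lngStart, strText, strClose)
    If lngEnd = 0 Then Exit Function            ' no closing tag: returns ""

    strValue = Mid(strText, lngStart, lngEnd - lngStart)

    ' If the value is wrapped in a CDATA section, keep only what is inside it
    lngStart = InStr(1, strValue, CDATA_OPEN)
    If lngStart > 0 Then
        lngStart = lngStart + Len(CDATA_OPEN)
        lngEnd = InStr(lngStart, strValue, CDATA_CLOSE)
        If lngEnd > 0 Then strValue = Mid(strValue, lngStart, lngEnd - lngStart)
    End If

    ' Drop line breaks and surrounding spaces left over from the page layout
    strValue = Replace(Replace(strValue, vbCr, ""), vbLf, "")
    GetTagValue = Trim(strValue)
End Function

Because each call names its own tag, ForeName and Surname can be fetched separately, e.g. Me.txtSurname = GetTagValue(Me.txtResults, "<Surname>") (txtSurname being an assumed control name), and plain tags such as <PostCode> go through the same function unchanged.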
 
