There are certainly a lot of questions out there about the internet and VBA. I use internet explorer quite often with VBA to navigate and perform other various automation tasks. This thread will hopefully give you the basics of navigating the internet with IE through VBA code.
Which browser do I use to read webpage source code?
They're making it easier to read through stuff, but I would recommend using Firefox to read source code. It is formatted in a very friendly way and you can find data very easily. Internet Explorer is very poor and I would not recommend using it unless you have IE8. IE8 has been modified greatly for web developers, and it is probably better for readability and navigation now than Firefox is. It is color coded and line numbered, which is a huge improvement by Microsoft (one of the few for sure!)
First thing to know is how to open IE in Access. Just like any other application, the best way is probably to create the object and make it visible. Like this:
dim ie as object
set ie = createobject("internetexplorer.application")
ie.visible = true
Navigating from webpage to webpage can be done like this:
After navigating to a page, it is always a good idea to pause the code until the page completely loads. This code will do that after the navigation:
You may also want to manipulate some data while you're browsing the internet. This all has to be done by using HTML code. Some of the tasks I do with it are things like pulling values from pages, populating textboxes, and submitting forms. You can do a number of other things, but these are the most common, at least for me. Here is an outline of how to do each of these:
How do I get a value from a page?
Values are stored in a variety of elements in HTML. They can be in tables, textboxes, links, and others. Regardless of where the target value is, you need to know how to reference an HTML control. The most common way is probably by it's ID. You can get an element's value from it's ID with the following syntax:
I don't believe there is any easy way to identify elements on a web page other than viewing it's source code. To do this, right click on the page and select "View Source" from the menu:
Then you need to find the ID of the element you want to manipulate. In this above example, the ID of the textbox happens to be "q"
. So, if there was indeed a value in the search box, you can pull it into Access like this:
the HTML source code that tells us the ID is found by viewing the page source. There you will find this:
<input autocomplete="off" maxlength=2048 name=q size=55 title="Google Search" value="">
Elements are generally listed in HTML in the order that you see them appear on the page. Therefore, if you're searching for the ID of a textbox, and the page code is 1000 lines long, use the find
feature to find some text that you see on the page that is relatively close to the textbox object. In this example, I searched for the phrase "Google Search"
. I did this because it is the caption (value) of the submit button that is right below the search box that I wanted to find. It just so happens that in the code, the element that comes right before the button's code is the search box! Easily found....
How do I put a value into a texbox?
Say you want to dynamically put a value in the google search box. You would need to use this code:
How do I check a checkbox?
ie.document.getElementById("q").value = "My Search Text"
- you want to opt-in for a mailing list when submitting a form. Chances are, the HTML will look something like:
<input type="checkbox", name="optin", id="optin" />
<div id="optinLabel">CLICK HERE TO RECEIVE OUR NEWSLETTERS</div>
here is how to usually
mark the box (code not tested in VBA, only in an HTML editor Dreamweaver):
How do I click a radio button?
ie.document.getElementById("optin").checked = true
Here is the way I did it one time:
How do I click on TABS?
ie.document.getElementById("buttonName").value = "on"
This one is a little bit tricky. From what I can tell, there are plenty of ways to write this sort of stuff. Some websites use javascriipt functions with tab number argument variables, and then use them to run the script that takes you to the tab you want. Other websites will simply create tables that look like tab strips, like this source code shows (this is one table row from the entire table):
<td width="10" valign="top"><div align="center"></div></td>
<td colspan="2" valign="top"><a href="../index.html">
<img src="../images2/nav_home_up.gif" width="99"
height="30" border="0"></a><a href="../products2/products.htm">
<img src="../images2/nav_products_up.gif" width="100" height="30"
<img src="../images2/nav_services_up.gif" width="100" height="30"
<img src="../images2/nav_learning_up.gif" width="100" height="30"
<img src="../images2/nav_forum_down.gif" width="100" height="30"
<img src="../images2/nav_contact_up.gif" width="100" height="30"
<td valign="top"> </td>
as you can see, what you see on the screen looks like a tap strip, but it's really a bunch of images behind anchors that take you to the different pages. Notice too, that all the images have a 100px width and a 30px height, to make it look uniform. to click on tabs that are actually tables, you have to get the URL behind the tab image and navigate to it.
How do I choose a value in a combo box list?
Combo boxes are usually called SELECT elements in HTML code. Typical syntax will look like this:
<option value="" selected="selected">(any prefix)</option>
<option value="-1">(no prefix)</option>
<option value="Question" class="" >Question</option>
<option value="Solved" class="" >Solved</option>
<option value="Tip" class="" >Tip</option>
To change the value, you can usually set the value of the combo box by giving it one of its option values, like this:
ie.document.getElementById("prefixchoice").value = "Solved"
I pulled this code from the Search Page's HTML on this forum as an example. So, if you were to run that code, you would be searching for all the threads that have been officially "Solved" (provided you actually pushed the Search button too). List Boxes (multi-select or not) are manipulated the same way.
How do I click a button?
There are a couple of ways to do this. In the google example, the ID of the search button is "btnG"
. So, in order to search google, you need to click it, like this:
Another way to do the same thing is to submit the parent form of the button. In this case, there is only one form on the page, and the index for HTML forms starts at 0, just like the default option base of an Access array. So, in order to submit the google form by using this method, you would need to use this:
In some cases though, these methods are not readily available for any number of reasons. Here is another way to submit a form using the "click" method of the form's button element:
This is a basic overview of how to get starting navigating the web with VBA.
You can also do this sort of stuff with other browsers like Firefox, Netscape, etc... However, it is much too complicated for the scope of this FAQ. And in all reality, it is much easier and more efficient to perform tasks like these with internet explorer, because that program is built into windows, and you can simply retrieve the program object by using the CreateObject
method, as with all other Microsoft Office applications in VBA.
I just had a requirement whereby I had to search 10,000 webpages for some information that was always located in an HTML table. Before this burden was put upon me, I had no idea how to query tables within a webpage, but in order to get paid, I had to figure it out. So....those who are looking at this FAQ can also benefit from what I did. In VBA, you get an HTML's table cell value like this (tag name for a table cell = <td>
elTableCells = ie.Document.getElementsByTagName("td")
For the requirement I had to fulfill, there was one table on every page that I accessed. But...the information in the table cell was quite lengthly. Luckily though, the info that I wanted was always preceeded by a uniform text string: Country:
. Thus, I could use VBA's text extraction functions to get the information I was looking for within the table's innerHTML. Below is some of the code I had to use to get the info across pages I had to access:
do until rs.eof
elTableCells = ie.Document.getElementsByTagName("td")
rs!LOCATION= Trim(Mid(elTableCells.innerHTML, InStr(elTableCells.innerHTML, _
"<BR><U><EM>Country: </U></EM>: ") + 27, _
(InStr(elTableCells.innerHTML, "<br><u><em>Director</u></em>:") - 1) - _
(InStr(elTableCells.innerHTML, "<BR><U><EM>Country: </U></EM>: ") + 27)))