Hello,
I am relatively new to using VBA for web scraping. I am trying to collect information from behind a password-protected site and have been successful for the most part. There is one problem that I have not been able to figure out.
I want to obtain the names of linked files, but the names do not appear anywhere in the HTML. When I manually click the link in Internet Explorer, the file name does appear in the download prompt, and I want to capture it in a variable.
I am able to trigger the event so that the "Do you want to open or save ..." pop-up appears. However, I do not know how to grab the save-as name that it shows, or whether that is even possible.
Here is a *stupid* example that ends in the save as window appearing.
I mention why it is stupid in the code below, but will repeat here. My real example does NOT have the file name in the HTML or URL used to trigger the file download - the example below does (just pretend it doesn't!).
Thank you for any ideas on how to get the file name!
Code:
Public Sub Example01()
    Dim dbsSCF As DAO.Database
    Dim rstSCF As DAO.Recordset
    Dim IE As Object    ' late-bound InternetExplorer.Application
    Dim sw As StopWatch ' custom timer class; EndTimer returns milliseconds

    Set dbsSCF = CurrentDb
    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True

    ' Not necessary, but this is the web page where the file link exists
    IE.Navigate "https://cran.r-project.org/package=abc"

    ' Wait while the page loads, giving up after 60 seconds
    Set sw = New StopWatch
    sw.StartTimer
    Do While (IE.Busy Or IE.ReadyState <> 4)
        DoEvents ' yield so the loop does not freeze the host application
        If sw.EndTimer / 1000 > 60 Then
            MsgBox "Webpage taking too long to load; please check connection and try again."
            Exit Sub
        End If
    Loop

    ' Note: this is a stupid example because the file name is part of the URL
    ' The real example I am looking at does not have the file name as part of the URL
    IE.Navigate "https://cran.r-project.org/web/packages/abc/../../../bin/windows/contrib/3.4/abc_2.1.zip"

    ' What I want is to be able to get the file name some other way.
    ' IE.Quit
    ' Set IE = Nothing
End Sub
As a last resort, I know I could simply download the file and then get the name from that, but this whole process is a loop and I would need to be able to link 10+ downloaded files with the loop iteration.
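One direction that might avoid the download entirely: when the file name is not in the URL, the name shown in the "open or save" prompt usually comes from the Content-Disposition response header. The sketch below is only that, a sketch under assumptions. GetSuggestedFileName is a hypothetical helper name, it assumes the server answers a HEAD request (some only answer GET), and it assumes the request is allowed to reuse the logged-in Internet Explorer session, which may not hold for every site. The parsing is deliberately simple and ignores the RFC 5987 `filename*=` form.

```vba
' Hypothetical helper: returns the file name the server suggests for a URL,
' or an empty string if no Content-Disposition file name is sent.
Public Function GetSuggestedFileName(ByVal url As String) As String
    Dim http As Object
    Dim header As String
    Dim pos As Long

    Set http = CreateObject("MSXML2.XMLHTTP.6.0")
    http.Open "HEAD", url, False ' synchronous; HEAD avoids fetching the file body
    http.send

    ' Typical value: attachment; filename="abc_2.1.zip"
    header = http.getResponseHeader("Content-Disposition") & ""

    pos = InStr(1, header, "filename=", vbTextCompare)
    If pos > 0 Then
        header = Mid$(header, pos + Len("filename="))
        ' Drop any parameters that follow the file name
        pos = InStr(header, ";")
        If pos > 0 Then header = Left$(header, pos - 1)
        ' Strip surrounding whitespace and quotes
        GetSuggestedFileName = Replace(Trim$(header), """", "")
    End If
End Function
```

Inside the loop this could be called as `fileName = GetSuggestedFileName(linkUrl)`, where linkUrl is whatever URL the link triggers, so each name stays tied to its loop iteration without saving anything to disk.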