extracting data from the URL and then saving into MS Access table (1 Viewer)

talha

New member
Local time
Today, 02:14
Joined
May 9, 2020
Messages
22
Hi,

Is there a way I can automatically extract contents from the URL and save them in the table?

Here is a sample URL:

jobs.nvoids.com/job_details.jsp?id=1246753

I want to extract data daily from about 100 URLs, and I don't want to do it one by one.

Thanks
 

arnelgp

..forever waiting... waiting for jellybean!
Local time
Today, 15:14
Joined
May 7, 2009
Messages
19,243
some demo, Run test() sub and the email will be saved to responseT table.
 

Attachments

  • saveFromURL.accdb
    512 KB · Views: 36

GPGeorge

Grover Park George
Local time
Today, 00:14
Joined
Nov 25, 2004
Messages
1,867
I am open to the options... it can be in a table, or we can hardcode it to VBA, etc.
Access is used to create relational database applications, which are characterized primarily as places where data is stored in tables.
 

talha

New member
Local time
Today, 02:14
Joined
May 9, 2020
Messages
22
some demo, Run test() sub and the email will be saved to responseT table.
Hi, thanks for the solution... it is about 90 percent of what I was looking for.

Could you please guide on

A) how to run this code for 100 URLs that I can save in a table
B) Why is it saying email is protected, and how can I extract email addresses?

Thanks again.

I am super excited.
 

arnelgp

..forever waiting... waiting for jellybean!
Local time
Today, 15:14
Joined
May 7, 2009
Messages
19,243
put the URLs to another table. then Loop through each records in the table (using it's recordset) and passing the URL to the sub.
 

talha

New member
Local time
Today, 02:14
Joined
May 9, 2020
Messages
22
put the URLs to another table. then Loop through each records in the table (using it's recordset) and passing the URL to the sub.
And how to extract email address as well
 

arnelgp

..forever waiting... waiting for jellybean!
Local time
Today, 15:14
Joined
May 7, 2009
Messages
19,243
here is a "raw" dump to table.
 

Attachments

  • saveFromURL.accdb
    584 KB · Views: 29

talha

New member
Local time
Today, 02:14
Joined
May 9, 2020
Messages
22
here is a "raw" dump to table.
Thanks again for your assistance

Here is what I did
1. Created a new Table (Name: URLT)
2. Saved a list of URLs to the table URLT, under the column URL_To_Call
3. Loop through the table to extract data... here is the code

Private Sub test1()
Dim strURL As String

Dim db As Database
Dim rst As DAO.Recordset

Set db = CurrentDb
Set rst = db.OpenRecordset("URLT")

Do Until rst.EOF
Call SaveHTMLTableToLocalTable(rst!URL_To_Call)

rst.MoveNext
Loop
rst.Close


Issues:
1. for every record, I am getting the message: HTML Table saved to local table successfully... forcing me to click okay button. How can I suppress this

2. if the URL is invalid then it throws the run time error 5
If the URL is invalid, I still want the system to procee






End Sub
 

arnelgp

..forever waiting... waiting for jellybean!
Local time
Today, 15:14
Joined
May 7, 2009
Messages
19,243
add Rem in front of Msgbox:
Code:
..
''
    Rem MsgBox "HTML table saved to local table successfully."
..

then change your code to this to show that the process has been completed.

Code:
Private Sub test1()
Dim strURL As String

Dim db As Database
Dim rst As DAO.Recordset

Set db = CurrentDb
Set rst = db.OpenRecordset("URLT")

Do Until rst.EOF
Call SaveHTMLTableToLocalTable(rst!URL_To_Call)

rst.MoveNext
Loop
rst.Close
MsgBox "HTML table saved to local table successfully."
End Sub
 

talha

New member
Local time
Today, 02:14
Joined
May 9, 2020
Messages
22
that fixed my first issue by I am still having challenges with this one

2. if the URL is invalid then it throws the run time error 5
If the URL is invalid, I still want the system to proceed

1711377111581.png
 

arnelgp

..forever waiting... waiting for jellybean!
Local time
Today, 15:14
Joined
May 7, 2009
Messages
19,243
i change the variable i and j to Long Integer.
copy and overwrite your old Sub:
Code:
Sub SaveHTMLTableToLocalTable(ByVal strURL As String)

    Dim objHTTP As Object
    Dim db As DAO.Database
    Dim rs As DAO.Recordset
    Dim iRow As Integer
    Dim iCol As Integer
    Dim nNum As Long, i As Long, j As Long
    Dim var As Variant, content As String
    ' Create a new instance of the XMLHTTP object
    Set objHTTP = CreateObject("MSXML2.XMLHTTP")

    ' Open the URL
    objHTTP.Open "GET", strURL, False

    ' Send the request
    objHTTP.send

    ' Get the response text (HTML content)
    Dim htmlContent As String
    htmlContent = objHTTP.responseText

    ' Close the connection
    objHTTP.abort

    ' Release the object
    Set objHTTP = Nothing

    
    i = InStr(1, htmlContent, "<tr>")
    htmlContent = Trim$(Mid$(htmlContent, i))
    i = InStrRev(htmlContent, "</tr>")
    htmlContent = Trim$(Left$(htmlContent, i - 1))
    
    var = Split(htmlContent, "<tr>")
    
    ' Open the database
    Set db = CurrentDb

    nNum = Val(DMax("Reference", "responseT") & "") + 1
    
    ' Open the table to which you want to save the HTML table data
    Set rs = db.OpenRecordset("responseT", dbOpenDynaset)
    
    For j = 0 To UBound(var)
    
        content = var(j)
        content = Replace$(Replace$(Replace$(content, "<td>", ""), "</td>", ""), "</tr>", "")
        content = Trim$(Replace$(content, Chr(9), ""))
        If Len(content) Then
            i = InStr(1, content, "?email=")
            If i <> 0 Then
                content = Mid$(content, i + Len("?email="))
                content = Left$(content, InStrRev(content, """>") - 1)
            Else
                If InStr(1, content, "<br />") <> 0 Then
                    content = PlainText(content)
                End If
            End If
            rs.AddNew
            rs.Fields("Reference") = nNum
            rs.Fields("Message").Value = content
            rs.Update
        End If
    Next j

    ' Close the recordset
    rs.Close

    ' Release the objects
    Set rs = Nothing
    Set db = Nothing

    Rem MsgBox "HTML table saved to local table successfully."

End Sub
 

talha

New member
Local time
Today, 02:14
Joined
May 9, 2020
Messages
22
The code is working fine with just one last issue...

If all the URLs are valid, then there is no error message and it works like it should be
if any of the URL is not valid then the code works but throws following message at the end

1711397958368.png


at this line
1711397990256.png
 

talha

New member
Local time
Today, 02:14
Joined
May 9, 2020
Messages
22
Here is the file that I am using for reference
 

Attachments

  • saveFromURL (1).accdb
    416 KB · Views: 21

Josef P.

Well-known member
Local time
Today, 09:14
Joined
Feb 2, 2023
Messages
826
Maybe it is? There is hardly any other way to generate this error with the mid function.

Code to check:
Code:
    i = InStr(1, htmlContent, "<tr>")
    Debug.Print "<tr> pos: "; i
    Debug.Assert i > 0
    htmlContent = Trim$(Mid$(htmlContent, i))
 

talha

New member
Local time
Today, 02:14
Joined
May 9, 2020
Messages
22
Maybe it is? There is hardly any other way to generate this error with the mid function.

Code to check:
Code:
    i = InStr(1, htmlContent, "<tr>")
    Debug.Print "<tr> pos: "; i
    Debug.Assert i > 0
    htmlContent = Trim$(Mid$(htmlContent, i))
although off the topic, but it might be relevant, The other thing that I noticed is that when running test2() and test3() I got error message in test2() as potentially it hit more invalid URLs


1711413918763.png
 

arnelgp

..forever waiting... waiting for jellybean!
Local time
Today, 15:14
Joined
May 7, 2009
Messages
19,243
test again.
 

Attachments

  • saveFromURL (1).accdb
    448 KB · Views: 20

Users who are viewing this thread

Top Bottom