• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Extract data from website

inddon

Member
Hello There,

I would like to know how can I extract website data using VBA. Currently the below manual procesure is followed. I would like to do this using excel VBA

The following is the manual procedure followed:
1. Go to website: https://www.msc.com/nld
59463

2. Click Tab "TRACK A SHIPMENT" and Enter Container Number 'CAIU2704696' and click Search (Magnifying icon)
59468

3. Results are displayed
59470

4. Data not found: If the Container Number is not found then the message 'No Matching tracking infomration. Please try again' is displayed
59471



In Step 3, I would like to extract the following data along with their Header in a worksheet:
A. Field Final POD ETA (in Column A1)
B. Section Movments (all records) ( From Column A5 onwards)

If data is not found then print a message in the worksheet "Data not found"


Look forward to hearing from you.


Many thanks & regards,

Don
 

Stevie

Active Member
Hi inddon,
I had a go and managed to get a defined tracking number to output as you requested.
I fell into an error when attempting to deal with the 'Sign up for our Newsletter' pop-up. This may not be an issue for you if you system is signed up already?
Regardless, this worked for me and should get you most of the way there:

You will need to enable 'Microsoft Internet Controls' and 'Microsoft HTML Object Library' in Tools>References.
Code:
Sub a()
    Dim trackNum As String, IE As Object, table As MSHTML.HTMLTable, finalPODETA As String, row As Double, col As Double
    On Error GoTo error1
    trackNum = "CAIU2704696"
    Set IE = CreateObject("InternetExplorer.Application")
    IE.navigate "https://www.msc.com/nld#"
    IE.Visible = True
    While IE.busy
        DoEvents
    Wend
    Call delay(2)
    IE.Document.getElementById("ctl00_ctl00_Header_TrackSearch_txtBolSearch_TextField").Value = trackNum
    IE.Document.getElementById("ctl00_ctl00_Header_TrackSearch_hlkSearch").Click
    Call delay(2)
    finalPODETA = IE.Document.getElementById("ctl00_ctl00_plcMain_plcMain_rptBOL_ctl00_rptContainers_ctl01_pnlContainer").getElementsByClassName("containerStats singleRowTable table-equal-3")(0).getElementsByTagName("td")(2).textContent
    finalPODETA = Trim(finalPODETA)
    Set table = IE.Document.getElementById("ctl00_ctl00_plcMain_plcMain_rptBOL_ctl00_rptContainers_ctl01_pnlContainer").getElementsByClassName("resultTable")(0)
    For row = 0 To table.Rows.Length - 1
        For col = 0 To 4
            Sheet1.Cells(5 + row, col + 1).Value = table.Rows(row).Cells(col).innerText
        Next col
    Next row
    Sheets("Sheet1").Range("A1") = finalPODETA
    Exit Sub
error1:
    MsgBox "Tracking data retrieval failed, please check accuracy of tracking number."
End Sub
Function delay(seconds As Long)
    Dim endTime As Date
    endTime = DateAdd("s", seconds, Now())
    Do While Now() < endTime
        DoEvents
    Loop
End Function
If the above was helpful, please click 'Like!' on the bottom right.

Stevie ^.^

Edit: added reference requirements.
 

inddon

Member
Hi inddon,
I had a go and managed to get a defined tracking number to output as you requested.
I fell into an error when attempting to deal with the 'Sign up for our Newsletter' pop-up. This may not be an issue for you if you system is signed up already?
Regardless, this worked for me and should get you most of the way there:

You will need to enable 'Microsoft Internet Controls' and 'Microsoft HTML Object Library' in Tools>References.
Code:
Sub a()
    Dim trackNum As String, IE As Object, table As MSHTML.HTMLTable, finalPODETA As String, row As Double, col As Double
    On Error GoTo error1
    trackNum = "CAIU2704696"
    Set IE = CreateObject("InternetExplorer.Application")
    IE.navigate "https://www.msc.com/nld#"
    IE.Visible = True
    While IE.busy
        DoEvents
    Wend
    Call delay(2)
    IE.Document.getElementById("ctl00_ctl00_Header_TrackSearch_txtBolSearch_TextField").Value = trackNum
    IE.Document.getElementById("ctl00_ctl00_Header_TrackSearch_hlkSearch").Click
    Call delay(2)
    finalPODETA = IE.Document.getElementById("ctl00_ctl00_plcMain_plcMain_rptBOL_ctl00_rptContainers_ctl01_pnlContainer").getElementsByClassName("containerStats singleRowTable table-equal-3")(0).getElementsByTagName("td")(2).textContent
    finalPODETA = Trim(finalPODETA)
    Set table = IE.Document.getElementById("ctl00_ctl00_plcMain_plcMain_rptBOL_ctl00_rptContainers_ctl01_pnlContainer").getElementsByClassName("resultTable")(0)
    For row = 0 To table.Rows.Length - 1
        For col = 0 To 4
            Sheet1.Cells(5 + row, col + 1).Value = table.Rows(row).Cells(col).innerText
        Next col
    Next row
    Sheets("Sheet1").Range("A1") = finalPODETA
    Exit Sub
error1:
    MsgBox "Tracking data retrieval failed, please check accuracy of tracking number."
End Sub
Function delay(seconds As Long)
    Dim endTime As Date
    endTime = DateAdd("s", seconds, Now())
    Do While Now() < endTime
        DoEvents
    Loop
End Function
If the above was helpful, please click 'Like!' on the bottom right.

Stevie ^.^

Edit: added reference requirements.

Hi Stevie,

Thank you for your solution.

It did work as expected. However, I would like the results to be displayed in the worksheet without the browser being opened.

I came across a solution which does a data extract without opening the browser. It uses XML, something like March L had mentioned.
https://chandoo.org/forum/threads/receiving-data-from-a-website-with-excel-vba.39824/

Code:
Sub GetWebData()
    Dim elem As Object, trow As Object
    Dim R&, C&, S$

    With New XMLHTTP60
        .Open "POST", "https://yenibeygir.com/at/getatdetaytab", False
        .setRequestHeader "content-type", "application/x-www-form-urlencoded; charset=UTF-8"
        .send "tab=galopTab&id=15673"
        S = .responseText
    End With

    With New HTMLDocument
        .body.innerHTML = S

        For Each elem In .getElementsByClassName("at_Galoplar")(0).Rows
            For Each trow In elem.Cells
                C = C + 1: Cells(R + 1, C) = trow.innerText
            Next trow
            C = 0: R = R + 1
        Next elem
    End With
End Sub

Is it possible to do in the above manner?

Appreciate your help

Look forward to hearing from you.

Regards,
Don
 

Chihiro

Excel Ninja
The site uses post-back to display queried data. Meaning, without browser, you'd need to have access to their API.

Since it does not look like you do... you'll need to pilot the browser. You could set browser .Visible = False to not display the browser.
 

inddon

Member
The site uses post-back to display queried data. Meaning, without browser, you'd need to have access to their API.

Since it does not look like you do... you'll need to pilot the browser. You could set browser .Visible = False to not display the browser.
Thank you Chihiro for your recommendation.

How to recognise, the code uses an API?

Regards,
Don
 

Chihiro

Excel Ninja
Last edited:

Marc L

Excel Ninja
Inspecting the website request, all data seem to be within the text response but the issue is the request's URL …​
 

Chihiro

Excel Ninja
It's a postback req, that generate's special url along with all the attributes from current view state of the site. Then that info is poted back to the server and page redrawn with data returned. Meaning, it's a lot of info that requires parsing and may not work without browser state. Hence, my earlier post.
 

Marc L

Excel Ninja
Yep ! ('cause of parameters & cookies & … , response text is a classic html page)​
inddon, if you can share another container # for a comparative test but as Chihiro yet explained, I'm not very optimistic …​
 
Top