
How to parse data like a list instead of a single line?

shahin

Active Member
I've built a scraper to parse data from a website, and it works. However, the result comes out as a single line, whereas I want it as a list. The attached image below shows what I mean. Thanks for any help with this.

This is what I've tried so far:

Code:
Sub edmunds()
    ' Needs references to "Microsoft Internet Controls" and "Microsoft HTML Object Library"
    Dim IE As New InternetExplorer, HTML As HTMLDocument
    Dim elem As Object

    With IE
        .Visible = False
        .navigate "https://www.edmunds.com/ford/escape/2017/cost-to-own/?zip=43215"
        ' Wait for the page to load, then pause a further 5 seconds
        Do While .Busy = True Or .readyState <> 4: DoEvents: Loop
        Application.Wait Now + TimeValue("0:00:05")
        Set HTML = .document
    End With

    ' Prints the full text of each "li" item on one line
    For Each elem In HTML.getElementById("tco_detail_data").getElementsByTagName("li")
        Debug.Print elem.innerText
    Next elem
    IE.Quit
End Sub
 

Attachments

  • Untitled.jpg
Try this code
Code:
Sub Parse_Data_Like_List_Edmunds_Website()
    ' Needs references to "Microsoft Internet Controls" and "Microsoft HTML Object Library"
    Dim ie          As New InternetExplorer
    Dim html        As HTMLDocument
    Dim elem        As Object
    Dim e           As Object
    Dim s           As String

    With ie
        .Visible = False
        .navigate "https://www.edmunds.com/ford/escape/2017/cost-to-own/?zip=43215"
        ' Wait for the page to load, then pause a further 5 seconds
        Do While .Busy = True Or .readyState <> 4: DoEvents: Loop
        Application.Wait Now + TimeValue("00:00:05")
        Set html = .document
    End With

    For Each elem In html.getElementById("tco_detail_data").getElementsByTagName("li")
        s = ""
        ' Join the text of the "li" items nested inside elem, separated by tabs
        For Each e In elem.getElementsByTagName("li")
            s = IIf(s = "", "", s & vbTab) & e.innerText
        Next e
        ' Items without nested "li" tags leave s empty and are skipped
        If s <> "" Then Debug.Print s
    Next elem

    ie.Quit
End Sub
 
Thanks, YasserKhalil, for your solution. The approach shown above does give me the expected results. However, I would like to parse the data the way we usually grab tabular data (as when harvesting data from a table).
 
So try populating the worksheet with the data like this:
Code:
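Sub Parse_Data_Like_List_Edmunds_Website()
    ' Needs references to "Microsoft Internet Controls" and "Microsoft HTML Object Library"
    Dim ie          As New InternetExplorer
    Dim html        As HTMLDocument
    Dim elem        As Object
    Dim e           As Object
    Dim f           As Boolean   ' True when elem has nested "li" items
    Dim r           As Long      ' Number of rows written so far
    Dim c           As Long      ' Current column

    With ie
        .Visible = False
        .navigate "https://www.edmunds.com/ford/escape/2017/cost-to-own/?zip=43215"
        ' Wait for the page to load, then pause a further 5 seconds
        Do While .Busy = True Or .readyState <> 4: DoEvents: Loop
        Application.Wait Now + TimeValue("00:00:05")
        Set html = .document
    End With

    For Each elem In html.getElementById("tco_detail_data").getElementsByTagName("li")
        c = 1: f = False

        ' Write each nested "li" into the next column of the current row (active sheet)
        For Each e In elem.getElementsByTagName("li")
            Cells(r + 1, c).Value = e.innerText
            c = c + 1
            f = True
        Next e

        ' Move to the next row only if something was written
        If f Then r = r + 1
    Next elem

    ie.Quit
End Sub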
 
Without the Boolean variable f, the output would contain empty rows in between, which is why that variable is used.
On each pass of the outer loop f is reset to False. If the object elem contains nested "li" tags, the inner loop sets f to True, and only then is the row counter r increased by one (r = r + 1).
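
As an alternative to the flag, here is a minimal sketch that checks the length of the nested collection instead. The helper name WriteNestedItems and its two parameters are hypothetical and not part of the code above; it assumes the container element has already been fetched from the loaded document.

```vba
' Hypothetical helper: writes one worksheet row per item that has
' nested "li" children and skips items that have none.
Sub WriteNestedItems(ByVal container As Object, ByVal ws As Worksheet)
    Dim elem As Object
    Dim kids As Object
    Dim r As Long
    Dim c As Long

    For Each elem In container.getElementsByTagName("li")
        Set kids = elem.getElementsByTagName("li")
        ' Length is 0 for items without nested "li" tags, so no blank rows appear
        If kids.Length > 0 Then
            r = r + 1
            For c = 0 To kids.Length - 1
                ws.Cells(r, c + 1).Value = kids.Item(c).innerText
            Next c
        End If
    Next elem
End Sub
```

It could be called as WriteNestedItems html.getElementById("tco_detail_data"), ActiveSheet. Passing the worksheet explicitly also avoids relying on whichever sheet happens to be active.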
 
One last thing I'd like to know on this topic: how is it valid to use ".getElementsByTagName("li")" in two nested For loops that appear to serve the same purpose? It works flawlessly, though.
 
That's because the inner loop works on different "li" tags: the ones nested inside the current item, which are treated as sub-lists.
You know more than I do about the structure of websites.
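
To see the effect in isolation, here is a small sketch that runs against made-up markup held in memory instead of the live page. The id "demo" and the list contents are assumptions for illustration only; the thread does not show the real structure of the Edmunds page.

```vba
Sub NestedLiDemo()
    ' Needs a reference to "Microsoft HTML Object Library"
    Dim doc As New HTMLDocument
    Dim elem As Object

    ' Made-up markup: one outer item holding a two-item sub-list
    doc.body.innerHTML = "<ul id=""demo""><li>Row<ul><li>A</li><li>B</li></ul></li></ul>"

    ' Called on the container, getElementsByTagName returns every descendant
    ' "li": the outer item and both nested ones. Called on a single item, it
    ' returns only the "li" tags nested inside that item.
    For Each elem In doc.getElementById("demo").getElementsByTagName("li")
        Debug.Print elem.getElementsByTagName("li").Length
    Next elem
End Sub
```

The outer item should report its two nested items, while each nested item should report none. That is why both loops can use the same call, and why the answers above skip items whose inner loop finds nothing.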
 