Trouble scraping links in the right way

Discussion in 'VBA Macros' started by shahin, Mar 12, 2018.

  1. shahin (Active Member, 865 messages)

    I've written some VBA code to scrape addresses from a webpage. The page has 20 links, and the address I want to parse sits behind each one. My intention is to click each link to reveal the information and then parse it. When run, the parser is supposed to scrape the address from the first link, then move on to the next link and repeat the process. However, the script below does not work correctly: it clicks each link but scrapes the same information (the data for the first link) over and over again. How can I fix it?

    First I tried this:

    Code (vb):

    Sub Scrape_Items()
        URL$ = "http://www.incometaxindia.gov.in/Pages/utilities/exempted-institutions.aspx"
        Dim post As Object, elem As Object

        With CreateObject("InternetExplorer.Application")
            .Visible = True
            .navigate URL
            While .Busy = True Or .ReadyState < 4: DoEvents: Wend
               
            For Each post In .Document.getElementsByClassName("fc-blue")
                post.Click
                Set elem = .Document.getElementsByClassName("exempted-detail")(0).getElementsByTagName("span")(0)
                r = r + 1: Cells(r, 1) = elem.innerText
                Application.Wait Now + TimeValue("00:00:05")
            Next post
        End With
    End Sub
     
    Then I tried this, but no luck; the results are always the same:

    Code (vb):

    Sub Scrape_Items()
        URL$ = "http://www.incometaxindia.gov.in/Pages/utilities/exempted-institutions.aspx"
        Dim post As Object, elem As Object, ldic As Object, key As Variant
     
        Set ldic = CreateObject("Scripting.Dictionary")

        With CreateObject("InternetExplorer.Application")
            .Visible = True
            .navigate URL
            While .Busy = True Or .ReadyState < 4: DoEvents: Wend
               
            For Each post In .Document.getElementsByClassName("fc-blue")
                ldic(post) = 1
            Next post
       
            For Each key In ldic.Keys
                key.Click
                Set elem = .Document.getElementsByClassName("exempted-detail")(0).getElementsByTagName("span")(0)
                r = r + 1: Cells(r, 1) = elem.innerText
                Application.Wait Now + TimeValue("00:00:03")
            Next key
        End With
    End Sub

     
    Last edited: Mar 12, 2018
  2. Chihiro (Excel Ninja, 4,658 messages)

    Study the page source. There's no need to expand each item before scraping; the data is already in the source.

    Also, because you set elem to a fixed object (index 0 of the "exempted-detail" collection), each pass through the loop gets exactly the same info.

    Code (vb):
    Sub Scrape_Items()
        Dim Url As String, elem As Object
        Dim i As Long, r As Long

        Url = "http://www.incometaxindia.gov.in/Pages/utilities/exempted-institutions.aspx"

        With CreateObject("InternetExplorer.Application")
            .Visible = True
            .navigate Url
            ' Wait until the page has finished loading
            While .Busy = True Or .ReadyState < 4: DoEvents: Wend

            ' Grab every detail block once, then index into the collection
            Set elem = .Document.getElementsByClassName("exempted-detail")
            For i = 0 To elem.Length - 1
                ' Write the first span of each block to the next row in column A
                r = r + 1: Cells(r, 1) = elem(i).getElementsByTagName("span")(0).innerText
            Next i
        End With
    End Sub
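
    If you would rather keep the shape of the original loop over the links, the fixed-index problem can be sketched as below. This is only a sketch and has not been tested against the live page: it assumes the n-th "fc-blue" link corresponds to the n-th "exempted-detail" block, and the Sub name Scrape_Items_ByIndex is just a placeholder.

    Code (vb):
    Sub Scrape_Items_ByIndex()
        Dim Url As String, links As Object, details As Object
        Dim i As Long, r As Long

        Url = "http://www.incometaxindia.gov.in/Pages/utilities/exempted-institutions.aspx"

        With CreateObject("InternetExplorer.Application")
            .Visible = True
            .navigate Url
            ' Wait until the page has finished loading
            While .Busy = True Or .ReadyState < 4: DoEvents: Wend

            Set links = .Document.getElementsByClassName("fc-blue")
            Set details = .Document.getElementsByClassName("exempted-detail")

            For i = 0 To links.Length - 1
                ' ASSUMPTION: link i and detail block i belong together.
                ' Using (i) instead of (0) is what stops the repeated first result.
                If i < details.Length Then
                    r = r + 1
                    Cells(r, 1) = details(i).getElementsByTagName("span")(0).innerText
                End If
            Next i
        End With
    End Sub

    The only real change from the first attempt is the index: (0) always points at the first block, while (i) moves along with the loop. No click is needed because the text is already in the page source.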
  3. shahin (Active Member, 865 messages)

    Yes, it is. Lots of love, sir. Thanks a lot.
