My scraper is unable to click all the links cyclically in a webpage

shahin

I've written a script in VBA to open a webpage, click on the "a" tags under the class name "domino-viewentry", and finally quit the browser when all the links have been clicked. However, after clicking on the first link my scraper throws a "permission denied" error. I suppose that, after clicking on the first link, my scraper can't go back to its earlier position to chase the next link. How can I fix this?

Here is the script:
Code:
Sub Click_Tag()

    Dim post As Object
    With CreateObject("InternetExplorer.Application")
        .Visible = True
        .navigate "http://www.siicex-caaarem.org.mx/Bases/TIGIE2007.nsf/4caa80bd19d9258006256b050078593c/"
        While .readyState < 4: DoEvents: Wend
        For Each post In .document.getElementsByClassName("domino-viewentry")
            With post.getElementsByTagName("a")
                If .Length Then .Item(0).Click
                Application.Wait Now + TimeValue("00:00:02")
            End With
        Next post
        .Quit
    End With
   
End Sub
 
Simply put, you can't do it the way you are attempting.
In HTML, <a> denotes a hyperlink.

So once you click it, you have already navigated away from the initial page. The post object is therefore no longer valid, and you get "permission denied".

Instead, first fill a dictionary or other collection with the list of hrefs that you want to navigate to, then loop through the collection.

Something like...
Code:
Sub Click_Tag()
    Dim lDic As Object
    Dim post As Object
    Dim key As Variant
    Set lDic = CreateObject("Scripting.Dictionary")
    With CreateObject("InternetExplorer.Application")
        .Visible = True
        .Navigate "http://www.siicex-caaarem.org.mx/Bases/TIGIE2007.nsf/4caa80bd19d9258006256b050078593c/"
        While .Busy Or .readyState < 4: DoEvents: Wend
        ' Collect the hrefs first; the elements stop being valid once the page changes
        For Each post In .document.getElementsByClassName("domino-viewentry")
            With post.getElementsByTagName("a")
                If .Length Then
                    lDic(.Item(0).href) = 1   ' key = URL, item is only a placeholder
                End If
            End With
        Next post
        ' Visit each collected URL in turn
        For Each key In lDic.Keys
            .Navigate key
            While .Busy Or .readyState < 4: DoEvents: Wend
            Application.Wait Now + TimeValue("00:00:02")
        Next key
        .Quit
    End With
    Set lDic = Nothing
End Sub
 
@sir Chihiro, every time I create a thread in this forum, I check whether you are online, because any solution I get from you becomes legendary and never fails. Thanks a lot. I'm still frightened by the use of the scripting dictionary, though!
 
As an alternative to the dictionary, you can join the .href values into one string, using a delimiter that does not occur in the URL strings.

Then, at the end, split the string on that delimiter and loop through the resulting array elements.
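
A minimal sketch of that join-and-split idea, in case it helps. The sub name, the "|" delimiter and the example.com URLs are placeholders I made up for illustration; in the real macro the strings would come from .Item(0).href inside the loop, and you should check that your chosen delimiter really cannot appear in your URLs.

```vba
Sub Join_Split_Demo()
    Const DELIM As String = "|"   ' assumption: never occurs in the URLs
    Dim links As String
    Dim parts() As String
    Dim i As Long

    ' Stand-ins for the href values gathered from the page (hypothetical URLs)
    links = links & DELIM & "http://example.com/a"
    links = links & DELIM & "http://example.com/b"

    ' Drop the leading delimiter, then split back into an array
    parts = Split(Mid$(links, Len(DELIM) + 1), DELIM)

    ' Loop through the array; in the real macro this would be .Navigate parts(i)
    For i = LBound(parts) To UBound(parts)
        Debug.Print parts(i)
    Next i
End Sub
```

Unlike the dictionary version, this keeps duplicate links, so the same page may be visited more than once if it is linked twice.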
 
@sir Chihiro, I tried but couldn't decipher this portion: "lDic(.Item(0).href) = 1". Why "1"? Sorry for my ignorance.
 
It's just a placeholder.

A dictionary entry has two components, Key and Item. Since I'm only interested in getting a unique list of links (the Keys), I really don't care what's held as the Item. So use whatever you like in place of 1; it won't matter in this case.
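
To see the placeholder idea in isolation, here is a small sketch. The sub name and the example.com URLs are made up for illustration; only the Scripting.Dictionary behaviour is the point.

```vba
Sub Dictionary_Demo()
    Dim dic As Object
    Dim k As Variant
    Set dic = CreateObject("Scripting.Dictionary")

    ' The item can be anything; only the key (the URL) matters here
    dic("http://example.com/a") = 1
    dic("http://example.com/b") = "anything"
    dic("http://example.com/a") = 1   ' same key again: overwrites, adds no duplicate

    Debug.Print dic.Count             ' 2 unique keys
    For Each k In dic.Keys
        Debug.Print k
    Next k

    Set dic = Nothing
End Sub
```

Assigning to a key that already exists just replaces its item, which is why the list of Keys ends up unique without any extra checking.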
 