
How to parse links conditionally?


In this IE case, Replace behaves like Split (which is not the same as when using a request),
so it is better to check whether the first character is "/" and, if so, use the Mid or Right VBA text function …
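
A minimal sketch of that suggestion, assuming the goal is to drop only a leading "/" and keep every other slash. The names StripLeadingSlash and DemoStripLeadingSlash are hypothetical and do not come from the thread:

Code:
' Hypothetical helper: removes a single leading "/" and leaves the rest of the link untouched.
Function StripLeadingSlash(ByVal link As String) As String
    If Left$(link, 1) = "/" Then
        ' Mid$ with no length argument returns everything from position 2 onward
        StripLeadingSlash = Mid$(link, 2)
    Else
        StripLeadingSlash = link
    End If
End Function

Sub DemoStripLeadingSlash()
    Dim itemlinks As Variant, link As Variant

    itemlinks = Array("/contact/", "/contact.html", "/contact/main/categories.html", "contact/", "contact.html")
    For Each link In itemlinks
        ' Output goes to the Immediate window (Ctrl+G in the VBA editor)
        Debug.Print StripLeadingSlash(CStr(link))
    Next link
End Sub

Testing only the first character with Left$ leaves links such as "contact/" unchanged, whereas Replace(link, "/", "") removes every slash in the string.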
 
I tried the approach below, which is useless here because Replace removes every "/" rather than only the leading one. The "Mid" and "Right" functions are a real struggle for me.

Code:
Sub GetValidLinks()
    Dim itemlinks As Variant, link As Variant, desiredlink$
  
    itemlinks = [{"/contact/","/contact.html","/contact/main/categories.html","contact/","contact.html"}]
    For Each link In itemlinks
        desiredlink = Replace(link, "/", "")
        MsgBox desiredlink
    Next link
End Sub
 
I finally found a solution that handles the expected element more robustly:
Code:
Sub Get_Conditional_Links()
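    ' Early binding: needs references to "Microsoft Internet Controls" and "Microsoft HTML Object Library"
    Dim IE As New InternetExplorer, HTML As HTMLDocument
    Dim post As Object, elem As Object, newlink As String
    Dim linklists As Variant, link As Variant
    Dim R As Long                    ' output row counter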

    linklists = [{"http://www.innovaprint.com.sg/","https://www.plexure.com.sg/","http://www.mount-zion.biz/","http://www.cityscape.com.sg/"}]

    For Each link In linklists
        With IE
            .Visible = True
            .navigate link
            While .Busy = True Or .readyState < 4: DoEvents: Wend
            Set HTML = .document
        End With

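        newlink = ""                 ' reset so the previous site's link is not reused

        ' First choice: an anchor whose text contains "contact" (case-insensitive)
        For Each post In HTML.getElementsByTagName("a")
            If InStr(1, post.innerText, "contact", vbTextCompare) > 0 Then newlink = post.getAttribute("href"): Exit For
        Next post

        ' Fallback: an anchor whose text contains "about"
        If newlink = "" Then
            For Each elem In HTML.getElementsByTagName("a")
                If InStr(1, elem.innerText, "about", vbTextCompare) > 0 Then newlink = elem.getAttribute("href"): Exit For
            Next elem
        End If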
       
        R = R + 1: Cells(R, 1) = newlink
    Next link
End Sub
 