• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

How to fix the loop to traverse unknown page numbers?

shahin

Active Member
I've written a macro to collect some data from a webpage. The webpage displays it's data (as the default content is not I'm after) according to the user input. However, using chrome developer tool, I could find out that only appending the preferable parameters [Page Number, Location ID, Speciality] to the URL can lead me to the accurate destination.

1. Main URL : "https://www.mortgageandfinancehelp.com.au/find-accredited-broker/"
2. Preferable parameters: Needing change the "Location ID" (which are from 3004 to 3007) and the "Page Number". The rest are OKAY.
3. "Executable Link" which I've found out from chrome dev tools and used in my below scraper: "https://www.mortgageandfinancehelp....=1&query=&location=3006&expertise=Residential"

The page number used in the "Executable Link" are generated according to the load more button from the original page. The thing is I wanna control this page number in such a way so that my scraper can parse data from different pages no matter how many pages are there (it may be one, two or several).

This is what I've tried so far. It is a working one:
Code:
Sub controlling_loops()
   ''first portion of url
    Const URL As String = "https://www.mortgageandfinancehelp.com.au/find-accredited-broker/?page="
    Dim IE As New InternetExplorer, HTML As HTMLDocument, post As Object
    Dim location_id As Variant, Id_input As Variant, I As Long, str_append As String

    location_id = [{"3004","3005","3006","3007"}]

    For Each Id_input In location_id
        For I = 1 To 2            ''page number
            ''another portion of url
            str_append = I & "&query=&location=" & Id_input & "&expertise=Residential"
        
            With IE
                .Visible = False
                .navigate URL & str_append    ''complete link
                While .Busy = True Or .readyState < 4: DoEvents: Wend
                Set HTML = .document
            End With

            Application.Wait Now + TimeValue("00:00:05")

            For Each post In HTML.getElementsByClassName("broker-tile")
                With post.getElementsByClassName("broker-name")
                    If .Length Then r = r + 1: Cells(r, 1) = .Item(0).innerText
                End With
                With post.querySelectorAll(".broker-contact p a[href^='mailto:']")
                    If .Length Then Cells(r, 2) = .Item(0).innerText
                End With
            Next post
        
        Next I
    Next Id_input
End Sub

Last of all: I don't know If the way I have defined my loop is the ideal one. If any better idea comes along, I'll be very happy to comply.
 
Last edited:
Back
Top