shahin
Active Member
Hi there!! Hope you all are doing well. I've written a script in vba to scrape some movie names from a torrent site. The site has displayed it's content in several pages through pagination. When the next page link is clicked then along with usual address this very portion "?page=2" gets appended to the url and it increases like ("?page=3","?page=4" and so on) with the incremental click. You might have already noticed that it starts it's next page from "page=2" instead of "page=1". In fact, there is no "page=1" which means the default address is serving like "page=1". However, my intention is to use something like "if" block within my script so that my scraped next page urls are exactly like below. I've tried to give a nudge to make it work but it starts from "page=1" and at the same time misses "page=2". Any help on this will be highly appreciated. What I've done so far can be seen uncommenting the lines.
Here is my main code:
If i do nothing and run the above script then the partial result is like:
If i try to run the script uncommenting the lines, this is what i get:
Results I'm expecting:
Here is my main code:
Code:
Sub yts()
Const mlink As String = "https://yts.ag"
Dim http As New XMLHTTP60, html As New HTMLDocument
Dim new_link As String
With http
.Open "GET", "https://yts.ag/browse-movies", False
.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36"
.send
html.body.innerHTML = .responseText
End With
With html.getElementsByClassName("tsc_pagination")(0).getElementsByTagName("a")
For n = 2 To .Length - 1
' If n = 2 Then
' new_link = mlink & Split(.item(n).href, ":")(1)
' new_link = Left(new_link, Len(new_link) - 1) & "1"
' Else
' new_link = mlink & Split(.item(n).href, ":")(1)
' End If
' Cells(n, 1) = new_link
Cells(n, 1) = .item(n).href
Next n
End With
End Sub
If i do nothing and run the above script then the partial result is like:
Code:
about:/browse-movies?page=2
about:/browse-movies?page=3
about:/browse-movies?page=4
about:/browse-movies?page=5
If i try to run the script uncommenting the lines, this is what i get:
Code:
https://yts.ag/browse-movies?page=1
https://yts.ag/browse-movies?page=3
https://yts.ag/browse-movies?page=4
https://yts.ag/browse-movies?page=5
Results I'm expecting:
Code:
https://yts.ag/browse-movies?page=1
https://yts.ag/browse-movies?page=2
https://yts.ag/browse-movies?page=3
https://yts.ag/browse-movies?page=4
https://yts.ag/browse-movies?page=5
Last edited: