shahin
Active Member
I've written a script in VBA to parse the links leading to the next pages from a torrent site. The script scrapes them fine. However, the issue I'm facing is that a couple of duplicate links come along in the result. I can handle that by hardcoding an offset against the "Length" property, but some websites have lots of pagination links, and in those cases it is tedious to handle duplicates with hardcoded numbers. My question is whether there is any technique with which I can parse only the unique links?
I've tried with:
Code:
Sub TorrentData()
    Dim http As New XMLHTTP60, html As New HTMLDocument, post As Object
    Dim x As Long

    ' Fetch the listing page and load it into an HTMLDocument
    With http
        .Open "GET", "https://yts.ag/browse-movies", False
        .send
        html.body.innerHTML = .responseText
    End With

    ' Write every pagination link whose href contains "page" to column A
    For Each post In html.getElementsByClassName("tsc_pagination")(0).getElementsByTagName("a")
        If InStr(post.href, "page") > 0 Then
            x = x + 1: Cells(x, 1) = post.href
        End If
    Next post
End Sub
This is the version I'd rather not go with, because it hardcodes an offset against the Length property. It does handle the duplicates in this case, though!
Code:
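Sub TorrentData()
    Dim http As New XMLHTTP60, html As New HTMLDocument
    Dim N As Long, x As Long

    ' Fetch the listing page and load it into an HTMLDocument
    With http
        .Open "GET", "https://yts.ag/browse-movies", False
        .send
        html.body.innerHTML = .responseText
    End With

    ' The loop stops two anchors short of the end; this hardcoded
    ' offset is what drops the duplicates on this particular page
    With html.getElementsByClassName("tsc_pagination")(0).getElementsByTagName("a")
        For N = 0 To .Length - 3
            If InStr(.item(N).href, "page") > 0 Then
                x = x + 1: Cells(x, 1) = .item(N).href
            End If
        Next N
    End With
End Sub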
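For what it's worth, one technique that might avoid the hardcoded offset is to remember every href already written and skip any repeat. Below is a minimal sketch of that idea, not a tested solution. It assumes a late-bound Scripting.Dictionary (Windows only, and not part of either version above), the same references as the other subs (Microsoft XML v6.0 and Microsoft HTML Object Library), and that the page still exposes a "tsc_pagination" block. The names TorrentDataUnique, seen and link are made up for illustration.

```vba
Sub TorrentDataUnique()
    Dim http As New XMLHTTP60, html As New HTMLDocument
    Dim post As Object, seen As Object
    Dim link As String, x As Long

    ' Assumption: Scripting.Dictionary is available (late-bound, Windows only)
    Set seen = CreateObject("Scripting.Dictionary")

    ' Same request as in the versions above
    With http
        .Open "GET", "https://yts.ag/browse-movies", False
        .send
        html.body.innerHTML = .responseText
    End With

    For Each post In html.getElementsByClassName("tsc_pagination")(0).getElementsByTagName("a")
        link = post.href
        If InStr(link, "page") > 0 Then
            ' Only write a link the first time it is seen
            If Not seen.Exists(link) Then
                seen.Add link, True
                x = x + 1: Cells(x, 1) = link
            End If
        End If
    Next post
End Sub
```

Since the check is on the href text itself, it should not matter how many pagination links a site has or where the repeated ones sit in the collection.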
Pictures with duplicate links: