Usage of proxy in vba scraper

shahin

I've written some code in VBA to harvest movie names and years from a torrent site. The script runs fine and returns the data I expect. However, I'm not sure whether the way I've set the proxy is correct. I've also passed two proxies to the scraper. I've googled a lot about using proxies in VBA but couldn't find satisfactory information. All I want to know at this point is whether what I'm doing is the right way. Thanks in advance.

Here is what I've written:

Code:
Sub Proxy_Usage()
    Dim http As New ServerXMLHTTP60, html As New HTMLDocument
    Dim post As HTMLHtmlElement
 
    With http
        .Open "GET", "https://yts.ag/browse-movies", False
        .setRequestHeader "Content-Type", "text/html; charset=utf-8"
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36"
        .setProxy 1, "61.233.25.166:80", "46.101.27.218:8118"
        .send
        html.body.innerHTML = .responseText
    End With
 
    For Each post In html.getElementsByClassName("browse-movie-bottom")
        With post.getElementsByClassName("browse-movie-title")
            x = x + 1: Cells(x, 1) = .item(0).innerText
        End With
        With post.getElementsByClassName("browse-movie-year")
            If .Length Then Cells(x, 2) = .item(0).innerText
        End With
    Next post
End Sub

I've tried the same approach with Amazon and found it working as well.
Code:
Sub Proxy_Usage()
    Dim http As New ServerXMLHTTP60, html As New HTMLDocument
    Dim elem As Object
  
    With http
        .Open "GET", "https://www.amazon.com/Books/s?ie=UTF8&page=1&rh=n%3A283155&srs=9187220011", False
        .setRequestHeader "Content-Type", "text/html; charset=utf-8"
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36"
        .setProxy 1, "61.233.25.166:80", "46.101.27.218:8118"
        .send
        html.body.innerHTML = .responseText
    End With
  
    For Each elem In html.getElementsByClassName("s-access-title")
        i = i + 1: Cells(i, 1) = elem.innerText
    Next elem
End Sub
 
Found the solution. For the proxy to actually be used, the first argument of setProxy has to be 2 instead of 1 (1 asks for a direct connection, so the proxy arguments are ignored):
Code:
.setProxy 2, "61.233.25.166:80", "46.101.27.218:8118"

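Also make sure the proxies are still live; several sites provide free proxies. Note that the third argument is treated as a bypass list rather than a second proxy, so only the first address is actually used as the proxy. That's it.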
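
For anyone landing here later, here is a minimal sketch of the same call with the magic numbers spelled out. It assumes late binding via CreateObject, so the constants are declared locally; the names PROXY_SET_DIRECT and PROXY_SET_PROXY and the sub name are my own, chosen to mirror the SXH_PROXY_SET_DIRECT and SXH_PROXY_SET_PROXY values from the MSXML documentation. The proxy address is just the one used above and may no longer be live.

```vba
Sub Proxy_Setting_Sketch()
    ' Local stand-ins for the MSXML proxy setting values (names are assumptions)
    Const PROXY_SET_DIRECT As Long = 1   ' connect directly, proxy arguments ignored
    Const PROXY_SET_PROXY As Long = 2    ' route the request through the given proxy

    Dim http As Object
    Set http = CreateObject("MSXML2.ServerXMLHTTP.6.0")

    With http
        .Open "GET", "http://www.lagado.com/proxy-test", False
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        ' Second argument: the proxy as "host:port".
        ' The optional third argument would be a bypass list, so it is left out here.
        .setProxy PROXY_SET_PROXY, "61.233.25.166:80"
        .send
        Debug.Print .Status, Len(.responseText)
    End With
End Sub
```

Swapping PROXY_SET_PROXY for PROXY_SET_DIRECT in the setProxy line should reproduce what my first attempt was doing, namely going out without any proxy.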
 
I found a site that shows which IP address a request came from, so it can be used to check whether the proxy is really being used.

I tried a free proxy in the scraper below.
Code:
Sub Proxy_Usage()
    Dim elem As Object, S$

    With New ServerXMLHTTP60
        .Open "GET", "http://www.lagado.com/proxy-test", False
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        .setProxy 2, "89.38.146.15:80"
        .send
        S = .responseText
    End With
 
    With New HTMLDocument
        .body.innerHTML = S
        Set elem = .querySelectorAll(".main-panel p")(1)
        [A1] = elem.innerText
    End With
End Sub

This is the result you may get upon running the above script.
Code:
The request appears to have originated from ip address 89.38.146.15
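
Since free proxies go offline all the time, it can help to try a few of them in turn. Below is a rough sketch of that idea, not something tested against every failure mode: FetchViaFirstLiveProxy is a hypothetical helper name, the timeout values are arbitrary, and the proxy addresses in the usage line are simply the ones from this thread.

```vba
' Hypothetical helper: returns the response text fetched through the first
' proxy in the list that answers with status 200, or "" if none does.
Function FetchViaFirstLiveProxy(url As String, proxies As Variant) As String
    Const PROXY_SET_PROXY As Long = 2
    Dim http As Object, p As Variant

    For Each p In proxies
        Set http = CreateObject("MSXML2.ServerXMLHTTP.6.0")

        On Error Resume Next            ' a dead proxy raises a run-time error on send
        ' resolve, connect, send, receive timeouts in milliseconds (arbitrary values)
        http.setTimeouts 5000, 5000, 10000, 10000
        http.Open "GET", url, False
        http.setProxy PROXY_SET_PROXY, CStr(p)
        http.send

        If Err.Number = 0 Then
            If http.Status = 200 Then
                FetchViaFirstLiveProxy = http.responseText
                On Error GoTo 0
                Exit Function
            End If
        End If
        On Error GoTo 0                 ' also clears Err before the next proxy
    Next p
End Function

Sub Try_Proxies()
    Dim S As String
    S = FetchViaFirstLiveProxy("http://www.lagado.com/proxy-test", _
                               Array("89.38.146.15:80", "61.233.25.166:80"))
    If Len(S) = 0 Then Debug.Print "No proxy answered" Else Debug.Print Len(S) & " characters received"
End Sub
```

The returned text can then be loaded into an HTMLDocument and parsed exactly as in the script above.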
 