Usage of proxy in vba scraper

Discussion in 'VBA Macros' started by shahin, Sep 15, 2017.

  1. shahin

    shahin Active Member

    Messages:
    549
    I've written some code in VBA to harvest movie names and years from a torrent site. The script works well. Although the scraper itself leaves no room for complaint, I'm still unsure whether the way I've set the proxy is correct. Moreover, I've set two proxies in my scraper. I've googled a lot about using proxies in VBA but didn't find any satisfactory information. At this point, all I want to know is whether what I'm doing is right. Thanks in advance.

    Here is what I've written:

    Code (vb):

    Sub Proxy_Usage()
        ' Requires references to "Microsoft XML, v6.0" and "Microsoft HTML Object Library".
        Dim http As New ServerXMLHTTP60, html As New HTMLDocument
        Dim post As HTMLHtmlElement
        Dim x As Long   ' current output row

        ' Fetch the page synchronously.
        With http
            .Open "GET", "https://yts.ag/browse-movies", False
            .setRequestHeader "Content-Type", "text/html; charset=utf-8"
            .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36"
            ' Proxy setup in question.
            .setProxy 1, "61.233.25.166:80", "46.101.27.218:8118"
            .send
            html.body.innerHTML = .responseText
        End With

        ' Write each movie title to column A and its year to column B.
        For Each post In html.getElementsByClassName("browse-movie-bottom")
            With post.getElementsByClassName("browse-movie-title")
                x = x + 1: Cells(x, 1) = .item(0).innerText
            End With
            With post.getElementsByClassName("browse-movie-year")
                If .Length Then Cells(x, 2) = .item(0).innerText
            End With
        Next post
    End Sub
     
    I've tried it with Amazon and found that it works there as well.
    Code (vb):

    Sub Proxy_Usage()
        ' Requires references to "Microsoft XML, v6.0" and "Microsoft HTML Object Library".
        Dim http As New ServerXMLHTTP60, html As New HTMLDocument
        Dim elem As Object
        Dim i As Long   ' current output row

        ' Fetch the page synchronously.
        With http
            .Open "GET", "https://www.amazon.com/Books/s?ie=UTF8&page=1&rh=n%3A283155&srs=9187220011", False
            .setRequestHeader "Content-Type", "text/html; charset=utf-8"
            .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36"
            ' Proxy setup in question.
            .setProxy 1, "61.233.25.166:80", "46.101.27.218:8118"
            .send
            html.body.innerHTML = .responseText
        End With

        ' Write each book title to column A.
        For Each elem In html.getElementsByClassName("s-access-title")
            i = i + 1: Cells(i, 1) = elem.innerText
        Next elem
    End Sub
     
    Last edited: Sep 15, 2017
  2. shahin

    shahin Active Member

    Messages:
    549
    Found the solution. To actually route the request through the proxy, the first argument of setProxy has to be changed from 1 to 2:
    Code (vb):

    .setProxy 2, "61.233.25.166:80", "46.101.27.218:8118"
    Also make sure the proxies are actually live. Several sites provide free proxies. That's it.
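
    For reference, here is a minimal sketch of the fix above with named values for the first argument of setProxy. In MSXML's ServerXMLHTTP, 1 means a direct connection (any proxy argument is ignored) and 2 means "use the proxy given in the second argument". The third argument is documented as a bypass list, not as a second proxy, so the sketch passes an empty string there and tries the two addresses one after the other instead. The names FetchViaProxy and Proxy_Usage_Fixed, and the fallback loop itself, are assumptions made for illustration and are not part of the original posts.

```vb
Option Explicit

' Values accepted by ServerXMLHTTP.setProxy as its first argument.
Private Const SXH_PROXY_SET_DIRECT As Long = 1   ' connect directly, ignore any proxy
Private Const SXH_PROXY_SET_PROXY As Long = 2    ' use the proxy passed as 2nd argument

' Hypothetical helper: tries each proxy in turn and returns the first
' response body that comes back with HTTP status 200, or "" if none does.
' Requires a reference to "Microsoft XML, v6.0".
Function FetchViaProxy(ByVal url As String, proxies As Variant) As String
    Dim http As MSXML2.ServerXMLHTTP60
    Dim i As Long

    For i = LBound(proxies) To UBound(proxies)
        Set http = New MSXML2.ServerXMLHTTP60
        On Error Resume Next            ' a dead proxy raises a run-time error on send
        http.Open "GET", url, False
        http.setProxy SXH_PROXY_SET_PROXY, CStr(proxies(i)), ""
        http.send
        If Err.Number = 0 Then
            If http.Status = 200 Then
                FetchViaProxy = http.responseText
                On Error GoTo 0
                Exit Function
            End If
        End If
        On Error GoTo 0                 ' also clears Err before the next attempt
    Next i
End Function

Sub Proxy_Usage_Fixed()
    Dim body As String

    ' Proxy addresses are the ones from the posts above; they may no longer be live.
    body = FetchViaProxy("https://yts.ag/browse-movies", _
                         Array("61.233.25.166:80", "46.101.27.218:8118"))
    If Len(body) = 0 Then MsgBox "No proxy returned a response."
End Sub
```

    The returned string can be assigned to html.body.innerHTML and parsed exactly as in the first post.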
