1. Welcome to Chandoo.org Forums. Short message for you

    Hi Guest,

    Thanks for joining Chandoo.org forums. We are here to make you awesome in Excel. Before you post your first question, please read this short introduction guide. When posting or responding to questions please remember our values at Chandoo.org are: Humility, Passion, Fun, Awesomeness, Simplicity, Sharing Remember that we have people here for whom English is not there first language and we need to allow for this in our dealings.

    Yours,
    Chandoo
  2. Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  3. When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Download Tweets from Twitter

Discussion in 'VBA Macros' started by Nebu, Oct 12, 2017.

  1. Nebu

    Nebu Excel Ninja

    Messages:
    2,053
    Hi All:

    I am trying to download tweets by a particular user to excel I am interested in extracting

    • Date
    • The commodity name
    • and the dollar values from the tweet
    As per the attached screen shot.
    upload_2017-10-12_15-53-12.png
    The Twitter Url is going to be static (https://twitter.com/The_AMEGroup).

    Is there a way to do it using VBA? I know there are lot of third party softwares available for this, but at this point in time we do not have enough work to invest in a third party software.

    I really appreciate if anyone can shed some light on this. Let me know if you want any further details.

    Thanks
    Nebu
    ThrottleWorks likes this.
  2. NARAYANK991

    NARAYANK991 Excel Ninja

    Messages:
    15,924
  3. Nebu

    Nebu Excel Ninja

    Messages:
    2,053
    Hi Narayan:

    I have gone through this website, the formula he provided will work of the first and last name of the tweeter and will not work for company accounts. Unfortunately this is not what I am looking for. Thanks for your suggestion.

    Regards
    Nebu
    ThrottleWorks likes this.
  4. Marc L

    Marc L Excel Ninja

    Messages:
    3,276
    Hi !

    Nebu, as you can check with your webbrowser inner inspector tool,
    Commodity Name & Dollar value are not in separate elements
    but within the same element …

    As a starter :​
    Code (vb):
    Sub Demo1()
                 Dim oElt As Object
        With CreateObject("WinHttp.WinHttpRequest.5.1")
            .Open "GET", "https://twitter.com/The_AMEGroup", False
            .setRequestHeader "DNT", "1"
             On Error Resume Next
            .send
             If Err.Number Then Beep: Exit Sub
             On Error GoTo 0
         If .Status <> 200 Then Beep: Exit Sub
             T$ = .responseText
        End With
        With CreateObject("HTMLFile")
                .Write T
            For Each oElt In .getElementsByTagName("P")
                  If oElt.className = "TweetTextSize TweetTextSize--normal js-tweet-text tweet-text" Then
                    Debug.Print vbLf & oElt.parentNode.parentNode.Children(0).Children(1).Children(0).Children(0).innertext;
                    Debug.Print " : "; Split(oElt.innertext, "/")(0)
                  End If
            Next
        End With
    End Sub
    shahin, YasserKhalil and Nebu like this.
  5. Nebu

    Nebu Excel Ninja

    Messages:
    2,053
    Hi Marc:

    Excellent!!!

    This is working fine for me . I can use formulas on this extract to get what I need for my analysis.However, it extracts only first few tweets on to the spreadsheet. Is there a way to dynamically decide on the no:eek:f tweets to be extracted? say, first 100 tweets.

    Appreciate all the support Marc.

    Regards
    Nebu
    ThrottleWorks likes this.
  6. Marc L

    Marc L Excel Ninja

    Messages:
    3,276
    It's the issue with JScript in webpage, all is not present at the initial load.
    As you must already notice when you surf the page …

    So try to see within your webbrowser inner inspector tool at network section
    which request is loaded when you reach end of scroll bar …

    Must see :

    Using the F12 developer tools

    Analyzing your webpage's network traffic
    ThrottleWorks likes this.
  7. Nebu

    Nebu Excel Ninja

    Messages:
    2,053
    Hi Marc:

    Is there a way to add a line of code (potentially break time for 10-15 seconds) which will allow a considerable chunk of the web page to load before scrapping the data from the site? Does that make a difference to the results?

    Disclaimer: I am not a programmer, my background is finance. Whatever VBA code I know is all self taught. Having said that I can pick up some of the basic jargons you guys use.

    Thanks
    Nebu
    ThrottleWorks likes this.
  8. Marc L

    Marc L Excel Ninja

    Messages:
    3,276

    If you just observe how this webpage works
    - either manually (scroll down) or via inspector tool (http request) -
    you must see a break time don't change anything …
    ThrottleWorks likes this.
  9. shahin

    shahin Active Member

    Messages:
    595
    Try this as well. These days, I've been a great fan of IE. All you need to do is apply string manipulation on the scraped portion to clean the data.

    Code (vb):

    Sub Get_Result()
        Dim IE As New InternetExplorer, html As HTMLDocument
     
        With IE
            .Visible = False
            .navigate "https://twitter.com/The_AMEGroup"
            Do Until .readyState = READYSTATE_COMPLETE: Loop
            Set html = .document
        End With
     
        Application.Wait (Now + TimeValue("0:00:05"))

        For Each posts In html.getElementsByClassName("content")
            With posts.getElementsByClassName("_timestamp")
                If .Length Then row = row + 1: Cells(row + 1, 1) = .item(0).innerText
            End With

            With posts.getElementsByClassName("TweetTextSize")
                If .Length Then Cells(row + 1, 2) = .item(0).innerText
            End With
        Next posts
     
        IE.Quit
    End Sub
     
    ThrottleWorks, Nebu and YasserKhalil like this.
  10. Nebu

    Nebu Excel Ninja

    Messages:
    2,053
    Hi Shahin:

    This is working fine ( I had to add VBA reference library to get it working), but the only issue is that it is scrubbing only the first few lines of the webpage as @Marc L pointed out in his post the web page is not loading soon enough to capture the data. Is there a way to overcome this problem? I am not looking for the entire page content since it is twitter it keep on loading more and more tweets. I am typically looking to scrub data for a month at a time it may run into 450-500 tweets.

    Edit:

    I have managed to get the data I want. I increased the waiting time to 1 minute and manually scrolled the webpage to the point from where I wanted to capture data.I tried using "IE.parentwindow.scrollby" but, is not working the way I wanted it to be.

    Thanks
    Nebu
    Last edited: Oct 16, 2017
  11. shahin

    shahin Active Member

    Messages:
    595
    Hi there!!! Your issue is fixed. Try the below script and get all the tweets. Just wait until no more tweets are left and the browser closes.
    Code (vb):

    Sub Get_Result()
        Dim IE As New InternetExplorer, html As HTMLDocument
        Dim storage As Object, posts As Object
        With IE
            .Visible = True
            .navigate "https://twitter.com/The_AMEGroup"
            Do Until .readyState = READYSTATE_COMPLETE: Loop
            Set html = .document
        End With
         
        For i = 1 To 50
            On Error Resume Next
            Set storage = html.getElementsByClassName("content")
            If Rank Is Nothing Then
                GoTo Skip
            End If
            Exit For
    Skip:
            SendKeys "{end}"
            Application.Wait (Now() + TimeValue("00:00:005"))
        Next i
           
        For Each posts In storage
            With posts.getElementsByClassName("_timestamp")
                If .Length Then row = row + 1: Cells(row + 1, 1) = .item(0).innerText
            End With

            With posts.getElementsByClassName("TweetTextSize")
                If .Length Then Cells(row + 1, 2) = .item(0).innerText
            End With
        Next posts
        IE.Quit
    End Sub
     
    Last edited: Oct 17, 2017
    Marc L and Nebu like this.
  12. Nebu

    Nebu Excel Ninja

    Messages:
    2,053
    Hi Shahin:

    This is working fine, I increased the loop from 20 to 25 and its giving me all the tweets I am looking for. Thanks for this quick turn around.

    Regards
    Nebu
  13. shahin

    shahin Active Member

    Messages:
    595
    @Nebu, Where did you find the number "20" in my loop? I used "50" instead. Thanks.
    Last edited: Oct 17, 2017

Share This Page