1. Welcome to Chandoo.org Forums. Short message for you

    Hi Guest,

    Thanks for joining Chandoo.org forums. We are here to make you awesome in Excel. Before you post your first question, please read this short introduction guide. When posting or responding to questions please remember our values at Chandoo.org are: Humility, Passion, Fun, Awesomeness, Simplicity, Sharing Remember that we have people here for whom English is not there first language and we need to allow for this in our dealings.

    Yours,
    Chandoo
  2. Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  3. When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Copy items from dropbox

Discussion in 'VBA Macros' started by YasserKhalil, Dec 8, 2017.

  1. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
  2. shahin

    shahin Active Member

    Messages:
    880
    Hi there!! Give it a go. It will fetch you all the urls you are after.
    Code (vb):

    Sub torrent_info()
        Dim IE As New InternetExplorer, html As HTMLDocument, post As Object, n_url As String
       
        With IE
            .Visible = True
            .navigate "https://www.dropbox.com/sh/v4zcawgs32v7qc9/AAALfDRjpNOT8NnOTL_4XAXja?dl=0"
            Do Until .readyState = READYSTATE_COMPLETE: Loop
            Set html = .document
        End With
       
        For Each post In html.getElementsByClassName("sl-grid-cell")
            With post.getElementsByTagName("a")
                If .Length Then row = row + 1: Cells(row, 1) = .Item(0).href
            End With
        Next post
       
        IE.Quit
    End Sub
     
    YasserKhalil likes this.
  3. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    You are brilliant .. I already tried to use that code but couldn't reach a point as I have no great experience at scraping and html structures
    Thank you very much my friend
  4. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    How can I grap the file name too?
    Just point me where to add lines so as to learn ...
  5. shahin

    shahin Active Member

    Messages:
    880
    If I like your comment then I'm approving that I'm brilliant but in reality I'm not. Thanks YasserKhalil, for being generous.
  6. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    At last I tried it and succeed
    Code (vb):
    If .Length Then r = r + 1: Cells(r, 1) = .Item(0).href: Cells(r, 2) = .Item(0).Title
    shahin likes this.
  7. shahin

    shahin Active Member

    Messages:
    880
    Yes you did it. This is how Marc L taught me to avoid any IndexError. However, don't make it oneliner either.
    YasserKhalil likes this.
  8. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    Hello again
    How to achieve the same task without relying on IE .. I mean using XML or something similar which I think will be faster and more reliable than using IE
  9. shahin

    shahin Active Member

    Messages:
    880
    If you use chrome then write this address in the chrome address bar "chrome://settings/content/javascript" and then disable javascript. When you are done then get back to dropbox page and reload it again. You will see that the page, i meant dropbox has nothing to display that means it's totally blank. The dropbox page is in reality javascript encrypted. So, you can't scrape items which are generated dynamically using "xmlhttp" request. Because, by the time the page loads it's content , the "xmlhttp" request has already been processed. Hope it helps.
    Last edited: Dec 10, 2017
    YasserKhalil likes this.
  10. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    Thanks a lot for reply. So there is no other way except for piloting IE
    What if there are a lot of files (the files in this case would appear in several pages ..!!?)
    How can we deal with that case ..
    I hope to find another way away from using IE directly
  11. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    I have a look at the source of the page and found that the links are related to this string "?dl=0", "file_id" when I pressed Ctrl + F and searched for that string I could found the four files existing. Can this helps in extracting relying on HTML source ...
  12. shahin

    shahin Active Member

    Messages:
    880
    I'm a great fan of IE nowadays. IE can do impossible stuffs. However, alternatively you can go for selenium (vba add-in). Few days back Marc L showed a method something like ".write". I'm not sure but i suppose there is a possibility with that to handle this.
  13. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    Can you show me how to store the html source in a string variable ..?
  14. shahin

    shahin Active Member

    Messages:
    880
    If this is what you meant:
    Code (vb):

    Sub dropbox_info()
        Dim http As New XMLHTTP60, str_items As Variant
       
        With http
            .Open "GET", "https://www.dropbox.com/sh/v4zcawgs32v7qc9/AAALfDRjpNOT8NnOTL_4XAXja?dl=0", False
            .send
            str_items = .responseText
        End With
       
        Debug.Print str_items
    End Sub
     
    YasserKhalil likes this.
  15. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    Thank you very much for this great code ..
  16. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    After I have copied the string from the immediate window, I put it into text file but found that it is different from that on chrome source !!
  17. shahin

    shahin Active Member

    Messages:
    880
    This is what i told you about earlier that xmlhttp request can't catch dynamically generated content.
    YasserKhalil likes this.
  18. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    But when I opened the text file where I stored the html string, I found the links there .. so it would possible to grab those link .. But I don't know exactly how to do that
    shahin likes this.
  19. shahin

    shahin Active Member

    Messages:
    880
    Could you upload that .txt file? I'll try.
  20. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    I just copied the immediate window in a text file. That's all and opened it using notepad++

    Attached Files:

    • s.txt
      File size:
      107.3 KB
      Views:
      5
  21. Marc L

    Marc L Excel Ninja

    Messages:
    4,010
    Hi !

    Yasser, it's just about extracting part of a text
    via VBA basics text functions, at beginner level …
    YasserKhalil likes this.
  22. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    Thanks sir. I will try to work on that and if I get any problems I will inform you
  23. Marc L

    Marc L Excel Ninja

    Messages:
    4,010

    As again here it's a bad wild cross posting behavior,
    first join links for each forum you created same post !

    Must see forum rules and read this !
  24. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    Thanks a lot
    The other threads may be similar but not the exact threads. I try to just take only small points in each thread but they are different (at least in my point of view)
  25. shahin

    shahin Active Member

    Messages:
    880
    Hi YasserKhalil!! You perhaps remember that few days back we tried to build a scraper using some monster string to scrape some data traversing multiple pages using "POST" request or something but we were having data only from the first page. If you remember the link of that thread of yours then drop the link here. Perhaps, I've got a workaround to deal with that. Thanks.

Share This Page