• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Unable to parse all the links from a webpage

shahin

Active Member
Can't figure out how can I get all the company links from the page used in my code. Running my script I get only 20 links. The page has got lazyloading method that is why I can't get all of them. Any input on this will be highly appreciated. I've tried so far with:
Code:
Sub Web_Data()
Const lnk = "http://fortune.com"
Dim http As New XMLHTTP60, html As New HTMLDocument
Dim topics As Object, topic As Object

With http
    .Open "GET", "http://fortune.com/fortune500/list/", False
    .send
    html.body.innerHTML = .responseText
End With

For Each topic In html.getElementsByClassName("small-12 column row")
    x = x + 1
    With topic.getElementsByTagName("a")
        If .Length Then Cells(x, 1) = lnk & Split(.item(0).href, "about:")(1)
    End With
Next topic
Set html = Nothing: Set topics = Nothing
End Sub
 
Hi, shahin!
hourglass-3d-model-3ds-fbx-blend-dae.png

Regards!
 
Thanks SirJB7, for the picture. Metaphorically identical stuff. It does represent my problem but I don't know how to make the browser wait until the content is fully loaded using xmlhttp.
 
Thanks Marc L, for your response. For the last couple of hours I've been waiting for this moment. Finally, you got back. Anyways, as for the answer to your question: I have double-checked my code several times to find out If I have made any mistake but unfortunately I can't figure out. Class name seems to be right to me.
 

Sorry, it was a misreading of your need …

So you can only get the first 20 links as in the page within any webbrowser
without the $999 for the complete list !
 
Yes it is. I tried to find out if there is any hidden pagination in the url but unable to do that either.
 
Hi, shahin!
I didn't intend to be metaphoric, just suggested to try placing a delay, a wait, or a loop doing nothing... in order to wait for the full page load.
Regards!
 

Hi !

No needs any wait 'cause its a request with False parameter in open.
As source URL has only 20 items …
 
@Marc L,
I found it very tricky compare to those I've worked so far with. At this point I can't think anything to move along.
 
Thanks Narayan, for your sharp and effective finding. You have been a great help when in need. I'm trying myself with the chrome developer tool to figure out how you discover that link. If you don't take it as a botheration then you might wanna give me a hint how to find out such links (for future reference). Thanks a trillion.
 
Hi ,

I used the Chrome Developer tool to look at the requests that were being sent ; as you scroll down the page , you can see the additional requests that were made.

Narayan
 
Thanks Narayan, for your instruction and guideline. I wanted to scrape the name of organizations and the ceo of each organizations. Successful for the first 100 names. Is there any way I can make use of the api in my script to roll until 1000 records are exhausted? Btw, I tried inputting 1000 in the url but it gives 100 records at best. Here is the working code for first 100 records:
Code:
Sub Web_Data()
Dim http As New XMLHTTP60
Dim str As Variant

With http
    .Open "GET", "http://fortune.com/api/v2/list/2013055/expand/item/ranking/asc/0/100", False
    .send
    str = Split(.responseText, "fullname"":""")
End With

N = UBound(str)

For L = 1 To N
    Cells(L + 1, 1) = Split(str(L), """")(0)
    Cells(L + 1, 2) = Split(Split(str(L), "ceo"":""")(1), """")(0)
Next L
End Sub
Once again, hats off to you Narayan.
 
Eventually, I got a workaround. Now It gives me 101 to 900 records. Here is the code.
Code:
Sub Web_Data()
Dim http As New XMLHTTP60
Dim str As Variant
For x = 1 To 20
    With http
        .Open "GET", "http://fortune.com/api/v2/list/2013055/expand/item/ranking/asc/" & (x * 50) & "/50", False
        .send
        str = Split(.responseText, "fullname"":""")
    End With
  
    N = UBound(str)
  
    For L = 1 To N
        y = y + 1
        Cells(y, 1) = Split(str(L), """")(0)
        Cells(y, 2) = Split(Split(str(L), "ceo"":""")(1), """")(0)
    Next L
Next x
End Sub
 
Back
Top