
Using HTMLHtmlElement once in a For loop while parsing web data

shahin

Hi there everyone! I've finished building a scraper that parses web data very smoothly. However, my code uses HTMLHtmlElement twice within a For loop to get the work done, and it doesn't look well organized. Is it possible to use that object only once in the For loop to make the code cleaner? I'm pasting the code here for your consideration. Thanks in advance.
Code:
Sub ParseHackerNews()
Dim http As New ServerXMLHTTP60, html As New HTMLDocument
Dim topics As Object, posts As Object, post As HTMLHtmlElement, data As HTMLHtmlElement

With http
    .Open "GET", "https://news.ycombinator.com/", False
    .send
    html.body.innerHTML = .responseText
End With

Set topics = html.getElementsByClassName("athing")
Set posts = html.getElementsByClassName("subtext")
On Error Resume Next
For x = 0 To topics.Length - 1
    Set data = topics(x)
    i = i + 1
    Cells(i, 1).Value = data.getElementsByClassName("storylink")(0).innerText
    Cells(i, 2).Value = data.getElementsByClassName("sitestr")(0).innerText

    Set post = posts(x)
    Cells(i, 3).Value = post.getElementsByClassName("score")(0).innerText
    Cells(i, 4).Value = post.getElementsByClassName("hnuser")(0).innerText
Next x
    Set http = Nothing: html = Nothing: topics = Nothing: posts = Nothing
End Sub
 

Hi !

As already mentioned, at least in your previous thread, you never need
an object variable when you use a With statement …
 
Hi Marc L, glad to see your comment. My intention is to work through every possible approach so that, when the time is right, I can apply whichever one fits. I could have got things done the way I wrote it here, but every pattern breaks in some situation, because HTML elements are not always structured the way you would like. Thanks.
 
Hi Marc L, I've rewritten my code to use HTMLHtmlElement once, and it works nicely, but I wanted to keep the structure I started with: two Set statements, each assigning an HTMLHtmlElement object variable, both used in a single For loop. Here is my newly written code:
Code:
Sub HackerNews()
Dim html As New HTMLDocument
Dim topics As Object, post As HTMLHtmlElement

With CreateObject("MSXML2.serverXMLHTTP")
    .Open "GET", "https://news.ycombinator.com/", False
    .send
    html.body.innerHTML = .responseText
End With
Set topics = html.getElementsByClassName("athing")
For Each post In topics
    x = x + 1
    Cells(x, 1) = post.getElementsByClassName("storylink")(0).innerText
    Cells(x, 2) = post.getElementsByClassName("sitestr")(0).innerText
    Cells(x, 3) = post.NextSibling.getElementsByTagName("span")(0).innerText
    Cells(x, 4) = post.NextSibling.getElementsByTagName("a")(0).innerText
Next post
Set html = Nothing: Set topics = Nothing
End Sub
 
Actually, I learnt from you to release object variables. However, I overthought it this time: I assumed that after writing Set once I could set the remaining variables to Nothing just by separating them with colons, and that is how the mistake happened. Moreover, I do not write Option Explicit at the top of my modules, which is why this mistake went undetected.
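For anyone reading later, here is a minimal sketch of the cleanup pattern with Set repeated for every object variable. The procedure and variable names are hypothetical and use only the built-in Collection type, so nothing here depends on the scraper itself. Note that Option Explicit only forces every variable to be declared (for example, loop counters such as x and i); the missing Set keywords still have to be checked by eye.
Code:
Option Explicit

Sub ReleaseObjectsSketch()
    ' Hypothetical names, for illustration only
    Dim firstList As Collection, secondList As Collection

    Set firstList = New Collection
    Set secondList = New Collection

    ' A colon only separates statements, so each object
    ' assignment needs its own Set keyword
    Set firstList = Nothing: Set secondList = Nothing

    ' Check that both variables were actually released
    Debug.Print firstList Is Nothing, secondList Is Nothing
End Sub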
 