How to produce results with the below method?

shahin

To get the first text embedded within the class mentioned below, we usually use the method below or something similar:
Code:
Sub fetch_specific_text()
    Dim IE As New InternetExplorer, html As New HTMLDocument
    Dim post As Object

    With IE
        .Visible = True
        .navigate "https://stackoverflow.com/questions/"
        ' Wait until the page has loaded; DoEvents keeps the host application responsive
        Do While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Loop
        Set html = .document
    End With

    For Each post In html.getElementsByClassName("summary")(0).getElementsByClassName("excerpt")
        Debug.Print post.innerText
    Next post
    IE.Quit
End Sub

However, when I try it like this, it doesn't print anything, and there is no error either. What is the correct way to make the approach I've started below work?

Code:
Sub fetch_specific_text()
    Dim IE As New InternetExplorer, html As New HTMLDocument
    Dim post As Object

    With IE
        .Visible = True
        .navigate "https://stackoverflow.com/questions/"
        ' Wait until the page has loaded; DoEvents keeps the host application responsive
        Do While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Loop
        Set html = .document
    End With

    For Each post In html.getElementsByClassName("summary")
        If post.className = "excerpt" Then Debug.Print post.innerText: Exit For
    Next post
    IE.Quit
End Sub
 
Look at your For Each loop. The first one loops through the elements in ".getElementsByClassName("excerpt")", but your second only loops through the elements in ".getElementsByClassName("summary")".

You either need an additional loop within it, or you need to set the collection to the right level at the top of the loop.
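
For illustration, the "additional loop within" option might look roughly like the sketch below. It assumes the same references as the code above (Microsoft Internet Controls and Microsoft HTML Object Library) and that the excerpt is a "div" whose class attribute is exactly "excerpt". The Sub name, the variable names and the "Finished" label are made up for this example.

```vba
Sub fetch_first_excerpt_nested()
    Dim IE As New InternetExplorer
    Dim html As HTMLDocument
    Dim summary As Object, child As Object

    With IE
        .Visible = True
        .navigate "https://stackoverflow.com/questions/"
        ' Wait for the page to load without freezing the host application
        Do While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Loop
        Set html = .document
    End With

    ' Outer loop: every element with class "summary"
    For Each summary In html.getElementsByClassName("summary")
        ' Inner loop: the div elements nested inside that summary
        For Each child In summary.getElementsByTagName("div")
            If child.className = "excerpt" Then
                Debug.Print child.innerText
                GoTo Finished   ' stop after the first match
            End If
        Next child
    Next summary

Finished:
    IE.Quit
End Sub
```

The className test only succeeds in the inner loop, because the outer loop only ever hands you the "summary" elements themselves.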
 
What you said definitely works, sir. If I followed your instruction properly, this is what you meant:

Code:
Sub fetch_specific_text()
    Dim IE As New InternetExplorer, html As New HTMLDocument
    Dim post As Object

    With IE
        .Visible = True
        .navigate "https://stackoverflow.com/questions/"
        ' Wait until the page has loaded; DoEvents keeps the host application responsive
        Do While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Loop
        Set html = .document
    End With

    For Each post In html.getElementsByClassName("summary")(0).getElementsByClassName("excerpt")
        If InStr(post.className, "excerpt") > 0 Then Debug.Print post.innerText: Exit For
    Next post
    IE.Quit
End Sub

But my question is: isn't the class "excerpt", along with the other classes and tags, within the class "summary"?

Just consider this example. All of them are within class "summary":

Code:
<div class="summary">
        <h3><a href="/questions/47395189/how-can-i-get-platform-specific-keybind-information" class="question-hyperlink">How can I get platform specific keybind information?</a></h3>
        <div class="excerpt">
            How can I get the information on what's key bound for a specific function such as copying, for example in Windows it is ctrl + c. How can I get that information with script?
        </div>        
        <div class="tags t-python t-cross-platform t-keyboard-shortcuts">
            <a href="/questions/tagged/python" class="post-tag" title="show questions tagged 'python'" rel="tag">python</a> <a href="/questions/tagged/cross-platform" class="post-tag" title="" rel="tag">cross-platform</a> <a href="/questions/tagged/keyboard-shortcuts" class="post-tag" title="" rel="tag">keyboard-shortcuts</a>
        </div>
        <div class="started fr">
            <div class="user-info ">
    <div class="user-action-time">
        asked <span title="2017-11-20 15:13:43Z" class="relativetime">11 mins ago</span>
    </div>
    <div class="user-gravatar32">
        <a href="/users/7032856/nae"><div class="gravatar-wrapper-32"><img src="https://lh6.googleusercontent.com/-v2P-pqI1pww/AAAAAAAAAAI/AAAAAAAAABg/BHOFSOpQ0BM/photo.jpg?sz=32" alt="" width="32" height="32"></div></a>
    </div>
    <div class="user-details">
        <a href="/users/7032856/nae">Nae</a>
        <div class="-flair">
            <span class="reputation-score" title="reputation score " dir="ltr">546</span><span title="1 silver badge"><span class="badge2"></span><span class="badgecount">1</span></span><span title="17 bronze badges"><span class="badge3"></span><span class="badgecount">17</span></span>
        </div>
    </div>
</div>
        </div>
    </div>
 
You have to understand what getElementsByXXX returns.

It always returns an element collection, not a single element. In order to use ".innerText" you need to access a specific element from the collection, either by index or by looping over it.
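
As a rough sketch of that point, the helper below takes an already loaded document and reads one element out of the collection by index. The Sub name and its parameter are hypothetical; you would pass it the "html" object from the code earlier in the thread.

```vba
' Hypothetical helper: pass in the document obtained from IE.document
Sub print_first_excerpt(ByVal html As Object)
    Dim excerpts As Object

    ' This is a collection of elements, not a single element
    Set excerpts = html.getElementsByClassName("excerpt")
    Debug.Print "Matches found: " & excerpts.Length

    ' Pick one element out of the collection before reading its text
    If excerpts.Length > 0 Then
        Debug.Print excerpts.Item(0).innerText
    End If
End Sub
```

Checking ".Length" first avoids an error when the page has no matching element.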
 
I'll bear that in mind, sir. The thing is, I got confused when I saw that I can get all the text within the class "summary" in the way shown below. So I thought at first that what I asked for might be possible:
Code:
For Each post In html.getElementsByClassName("summary")
    Debug.Print post.innerText
Next post
 
Ah, my bad. I should have explained it in the following manner.

When you loop through like you did above, you are looping only through the collection of elements that match the class name "summary".

Try this, and you will see why nothing is printed when you check for "post.className = "excerpt"".

Code:
    For Each post In html.getElementsByClassName("summary")
        Debug.Print post.className
    Next post

You should only see "summary" and no "excerpt" here. Hence the need for a second loop within each "summary".

Edit: Alternatively, access the class name "excerpt" directly, like below...
Code:
For Each post In html.getElementsByClassName("excerpt")
    Debug.Print post.innerText
Next post
 