• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Copy header and description from explorer link and paste it in excel

DurgeshN

New Member
Hi,

I need to copy and paste only header and description from explorer link. Below link extracts all test on explorer.

Expected Results as below-
Kolkata gets best cities award for tackling climate change

Kolkata, along with 10 other cities from across the globe, has been honoured with the best cities of 2016 award in recognition of its programme on solid waste management.

Kolkata, the only Indian city to win the honour, received the award during the C40 Mayors Summit held in Mexico. Other cities which won the award include Copenhagen, Sydney, and Paris.

So, let's discuss some questions related to this post:
Q1. Name of the Indian city honoured with the best cities of 2016 award in recognition of its programme on solid waste management?
Ans1. Kolkata

Code:
Sub Test()
Dim IE As Object

Set IE = CreateObject("InternetExplorer.Application")
With IE
.Visible = True
.Navigate "http://currentaffairs.adda247.com/2016/12/kolkata-gets-best-cities-award-for.html" ' should work for any URL
Do Until .ReadyState = 4: DoEvents: Loop

x = .document.body.innertext
x = Replace(x, Chr(10), Chr(13))
x = Split(x, Chr(13))
Range("A1").Resize(UBound(x)) = Application.Transpose(x)

.Quit
End With

End Sub
▬▬▬▬▬▬▬▬▬ Mod edit : thread moved to appropriate forum !
 
This should get you started. Instead of committing entire content into array. Just access the needed part of IE.Document via TagName, ClassName etc.

However, do note that ClassName are often site specific and even TagName use can vary from time to time. So likely you will need to adjust code for each site.

Code:
Sub Test()
Dim IE As Object, ieDoc As Object
Dim Url As String

Set IE = CreateObject("InternetExplorer.Application")
Url = "http://currentaffairs.adda247.com/2016/12/kolkata-gets-best-cities-award-for.html" ' should work for any URL
With IE
    .Visible = True
    .Navigate Url
   
    Do Until .ReadyState = 4: DoEvents: Loop
   
    Set ieDoc = IE.Document
   
    Debug.Print ieDoc.getElementsByTagName("Title")(0).innerText
    Debug.Print ieDoc.getElementsByClassName("post-body entry-content")(0).innerText


.Quit
End With

End Sub
You'll need to clean the text after as "post-body entry-content" has more than just the texts you want. I'd recommend splitting by "Source-The Hindu".
 
Hi !

Try this starter !​
Code:
Sub Demo1()
    Dim oDoc As Object
With CreateObject("Msxml2.XMLHTTP")
    .Open "GET", "http://currentaffairs.adda247.com/2016/12/kolkata-gets-best-cities-award-for.html", False
    .setRequestHeader "DNT", "1"
        On Error Resume Next
    .send
        On Error GoTo 0
    If .Status <> 200 Then Beep: Debug.Print .Status; " " & .StatusText: Exit Sub
    Set oDoc = CreateObject("htmlfile")
        oDoc.write .responseText
End With
If oDoc.frames.clipboardData.setData("Text", Replace$(oDoc.all("post-body-1907514957913783736").innerText, vbCrLf & vbCrLf, vbCrLf)) Then
    ActiveSheet.UsedRange.Columns(1).Clear
    ActiveSheet.Paste Cells(1)
    oDoc.frames.clipboardData.setData "Text", oDoc.getElementsByTagName("H3")(0).innerText
    ActiveSheet.Paste Cells(1)
    oDoc.frames.clipboardData.clearData "Text"
'    Cells(Rows.Count, 1).End(xlUp).Clear
End If
    Set oDoc = Nothing
End Sub
Do you like it ? So thanks to click on bottom right Like !
 
Back
Top