shahin
Active Member
Hi there! Is it possible to crawl a web page recursively? Using two or three chained requests I can produce lots of links, but that is not what I want. I was thinking of doing it myself, but I don't know how to feed each newly found link back into a function (or something similar) so that it keeps running until every link on a page reaches a dead end. Below is what I wrote to extract the links from a page. I hope somebody can give me an idea of how to make those links roll recursively. Thanks in advance.
Code:
Sub RecursiveCrawling()
    Const url = "https://en.wikipedia.org/wiki/Main_Page"
    Const mainlink = "https://en.wikipedia.org"
    Dim topics As Object, post As Object
    Dim html As Object, x As Long

    ' Fetch the page synchronously and load the response into an HTML document
    With CreateObject("MSXML2.serverXMLHTTP")
        .Open "GET", url, False
        .send
        Set html = CreateObject("htmlfile")
        html.body.innerHTML = .responseText
    End With

    ' Write every anchor's href into column A of the active sheet
    Set topics = html.getElementsByTagName("a")
    For Each post In topics
        x = x + 1
        Cells(x, 1) = post.href
    Next post
    Set topics = Nothing
End Sub
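For anyone reading, here is a rough sketch of what I mean by "recursive" crawling: a helper Sub that calls itself on each link it finds, keeping a `Scripting.Dictionary` of visited URLs and a depth cap so the recursion actually terminates instead of looping forever. The names `CrawlPage`, `StartCrawl`, and `MaxDepth` are just my placeholders, and the relative-link handling with `post.href` may need adjusting depending on how `htmlfile` resolves URLs.
Code:
Option Explicit

Const MainLink As String = "https://en.wikipedia.org"
Const MaxDepth As Long = 2          ' stop recursing past this depth

Dim Visited As Object               ' Scripting.Dictionary of URLs already seen
Dim RowNum As Long                  ' next row to write to

Sub StartCrawl()
    Set Visited = CreateObject("Scripting.Dictionary")
    RowNum = 0
    CrawlPage "https://en.wikipedia.org/wiki/Main_Page", 0
End Sub

Sub CrawlPage(ByVal url As String, ByVal depth As Long)
    Dim html As Object, post As Object

    If depth > MaxDepth Then Exit Sub       ' dead end: went too deep
    If Visited.Exists(url) Then Exit Sub    ' dead end: already crawled
    Visited.Add url, True

    With CreateObject("MSXML2.serverXMLHTTP")
        .Open "GET", url, False
        .send
        Set html = CreateObject("htmlfile")
        html.body.innerHTML = .responseText
    End With

    For Each post In html.getElementsByTagName("a")
        ' Only follow links that stay on the same site
        If InStr(post.href, MainLink) = 1 Then
            RowNum = RowNum + 1
            Cells(RowNum, 1) = post.href
            CrawlPage post.href, depth + 1  ' recurse into the new link
        End If
    Next post
End Sub
Without the `Visited` dictionary and the `MaxDepth` guard, a site like Wikipedia would keep the recursion going effectively forever, since pages link back to each other.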