HTML is different from view source

YasserKhalil

Hello everyone
I am still confused about using the XMLHTTP approach.
For example, in this code I have exported the HTML to a text file:
Code:
Sub Test()
    Dim html As Object
    Dim f As Integer

    ' Request the page synchronously and load the response into a late-bound HTML document
    With CreateObject("MSxml2.XMLHTTP")
        .Open "GET", "https://www.martialartsschoolsdirectory.com/united-states?page=1", False
        .send
        Set html = CreateObject("htmlfile")
        html.body.innerHTML = .responseText
    End With

    ' Write the parsed body markup to a text file next to the workbook
    f = FreeFile()
    Open ThisWorkbook.Path & "\Sample.txt" For Output As #f
    Print #f, html.body.innerHTML
    Close #f
End Sub

But when I navigate to the URL and right-click >> View Source, the HTML I see is different from what is in the text file.
Can you explain why, and how to get the same HTML as View Source shows?
 
Can web sites detect what client or system is accessing the site and adjust what is presented accordingly?
 
I'm not the full bottle on web scraping, so I mostly do things by trial and error or just stumble through.
 
html.body.innerHTML, as the name suggests, only grabs the body portion of the source code. What you need is the .write method of the html document. However, do note that you can't set a reference to the HTML Object Library and still use this method; the document has to be late bound. It's one of the quirks of the library. Have a read of the thread below for an example.

https://chandoo.org/forum/threads/h...ommand-in-xmlhttp-requests.36449/#post-218694
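
As a rough illustration of the point above, here is a minimal sketch that saves two things: the raw .responseText exactly as the server returned it (which should be the closest match to View Source), and the whole document after loading it with .write on a late-bound htmlfile object. The Sub name and the output file names (RawSource.txt, ParsedDocument.txt) are made up for this example, and it assumes the server sends XMLHTTP the same markup it sends the browser, which may not always be true.

Code:
```vba
Sub SaveRawAndParsedSource()
    ' Sketch only: the Sub name and the output file names are hypothetical.
    Dim http As Object
    Dim html As Object
    Dim f As Integer

    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", "https://www.martialartsschoolsdirectory.com/united-states?page=1", False
    http.send

    ' 1) Raw response text: the markup as received, before any parsing
    f = FreeFile()
    Open ThisWorkbook.Path & "\RawSource.txt" For Output As #f
    Print #f, http.responseText
    Close #f

    ' 2) Late-bound htmlfile: .write loads the full document, head included
    Set html = CreateObject("htmlfile")
    html.write http.responseText
    html.Close

    f = FreeFile()
    Open ThisWorkbook.Path & "\ParsedDocument.txt" For Output As #f
    Print #f, html.documentElement.outerHTML
    Close #f
End Sub
```

The second file will probably still differ slightly from View Source, because the parser normalises the markup when it builds the document. Print # also writes ANSI text, so characters outside the system code page may not survive in either file.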

Edit: I also forgot to mention that many sites use JavaScript or other methods to fill in data on the page. That content isn't captured by XMLHTTP, which only grabs the base HTML and does not run any of the scripts contained in it.
 