1. Welcome to Chandoo.org Forums. Short message for you

    Hi Guest,

    Thanks for joining Chandoo.org forums. We are here to make you awesome in Excel. Before you post your first question, please read this short introduction guide. When posting or responding to questions please remember our values at Chandoo.org are: Humility, Passion, Fun, Awesomeness, Simplicity, Sharing Remember that we have people here for whom English is not there first language and we need to allow for this in our dealings.

    Yours,
    Chandoo
  2. Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  3. When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Using post data and xlmlhttp

Discussion in 'VBA Macros' started by YasserKhalil, Feb 12, 2018.

  1. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    Hello everyone
    I am still trying to learn more about scraping and I could devise a code that enables me to get the desired results
    Here's the code
    Code (vb):
    Sub Test()
        Dim e          As Variant
        Dim ie          As Object
        Dim ulElem      As Object
        Dim liElem      As Object
        Dim anchElem    As Object
        Dim dt          As Date
        Dim lDay        As Integer
        Dim lMnth      As Integer
        Dim lYear      As Integer
        Dim r          As Long

        Set ie = CreateObject("InternetExplorer.Application")
        dt = Date - 2
        lDay = Day(dt)
        lMnth = Month(dt)
        lYear = Year(dt)

        With ie
            .Visible = False
            .navigate ("http://www.handelsregisterbekanntmachungen.de/?aktion=suche#Ergebnis")

            Do: DoEvents: Loop Until .readyState = 4

            For Each e In ie.document.getElementsByTagName("select")
                If Len(e.innerText) = 56 Then
                    e.selectedIndex = lDay
                ElseIf Len(e.innerText) = 18 Then
                    e.selectedIndex = lMnth
                ElseIf Left(e.innerText, 8) = "----2000" Then
                    e.selectedIndex = lYear - 1999
                ElseIf InStr(e.innerText, "Alle Bekanntmachungen") > 0 Then
                    e.selectedIndex = 1
                End If
            Next e

            For Each e In ie.document.getElementsByTagName("input")
                If e.Value = "Suche starten" Then e.Click: Exit For
            Next e
            Do: DoEvents: Loop Until .readyState = 4
            Application.Wait Now() + TimeValue("00:00:05")

            If InStr(ie.document.body.innerHTML, "Es wurden 0 Treffer gefunden.") > 0 Then
                MsgBox "No Results Found", vbExclamation: Exit Sub
            Else
                For Each ulElem In ie.document.getElementsByTagName("b")
                    For Each liElem In ulElem.getElementsByTagName("li")
                        Set anchElem = liElem.getElementsByTagName("a")
                        If anchElem.Length > 0 Then
                            r = r + 1
                            Cells(r, 1) = Mid(anchElem.Item(0).innerText, 11)
                        End If
                    Next liElem
                Next ulElem
            End If
        End With
    End Sub
    But as a matter of trying to learn more about xmlhttp I am seeking for a way to get the same results but without using ie .. so I think using xmlhttp will be more efficient specially I could see post data after setting up the desired choices for the search process ..

    Thanks advanced for help
  2. Marc L

    Marc L Excel Ninja

    Messages:
    4,044
  3. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    Can you show me with snapshot please?
  4. Marc L

    Marc L Excel Ninja

    Messages:
    4,044
    Like in previous links …

    F12, Network tab (clear rows if needed) and operate manually
    within webpage : you will see request used.

    If you don't understand the read of previous links,
    at least crystal clear explain what you are trying to do,
    details all manual steps within webpage …
  5. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    Thanks a lot
    Here's a snapshot of the manual steps for search process .. I have already done that using IE
    Show.png
  6. Marc L

    Marc L Excel Ninja

    Messages:
    4,044
    Once you've filled the form, before to click on button,
    activate inner inspector tool (F12), activate Network tab
    (clear rows if needed) then click button :
    once webpage completed with data, you will see within this Network tab
    the request used by the webbrowser (as explained in my links),
    just double click on it to see its details.

    This is the very basics of any request scrapping,
    as it's just about reading and observing …

    A good training, for shahin too ! …
    shahin likes this.
  7. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    Thank you very much
    I already have done those steps
    001.png


    002.png

    and this is the form data content
    Code (vb):
    suchart=uneingeschr&button=Suche+starten&land=&gericht=&gericht_name=&seite=&l=&r=&all=false&vt=11&vm=2&vj=2018&bt=11&bm=2&bj=2018&fname=&fsitz=&rubrik=&az=&gegenstand=1&anzv=10&order=1
    What's the next step?
  8. Marc L

    Marc L Excel Ninja

    Messages:
    4,044

    Next step is to reproduce this request under VBA
    using some details from the inspector tool …​
  9. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    Are there other details except for "Form data" that I will use?
  10. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    I tried that code
    Code (vb):
    Sub Test_XMLHTTP()
        Dim XMLHTTP    As New MSXML2.XMLHTTP60
        Dim myUrl      As String
        Dim postData    As String
       
        myUrl = "http://www.handelsregisterbekanntmachungen.de/?aktion=suche#Ergebnis"
        postData = "suchart=uneingeschr&button=Suche+starten&land=&gericht=&gericht_name=&seite=&l=&r=&all=false&vt=11&vm=2&vj=2018&bt=11&bm=2&bj=2018&fname=&fsitz=&rubrik=&az=&gegenstand=1&anzv=10&order=1"
       
        XMLHTTP.Open "GET", myUrl, False
        XMLHTTP.send postData
       
        Debug.Print XMLHTTP.responseText
    End Sub
    but I expect the responseText to have the string: "Niedersachsen"
    but I didn't find that string within the responseText ...
  11. shahin

    shahin Active Member

    Messages:
    882
    Just tell me the criteria to fill up the form to populate results @yasser.
  12. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    Hello my friend
    It is shown in post #7 in details .. and I got the "Form Data" part ..
  13. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    I have replaced "GET" with "POST" .. and the same
  14. shahin

    shahin Active Member

    Messages:
    882
    Run the code and get the data you are trying to. String manipulation is necessary to split different portions. I'm having lunch. Catch you later.
    Code (vb):

    Sub Fetch_Data()
        Dim HTTP As New XMLHTTP60, HTML As New HTMLDocument
        Dim post As Object
       
        postdata = "suchart=uneingeschr&button=Suche+starten&land=&gericht=&gericht_name=&seite=&l=&r=&all=false&vt=11&vm=2&vj=2018&bt=11&bm=2&bj=2018&fname=&fsitz=&rubrik=&az=&gegenstand=1&anzv=10&order=1"
       
        With HTTP
            .Open "POST", "http://www.handelsregisterbekanntmachungen.de/?aktion=suche", False
            .setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
            .send postdata
            HTML.body.innerHTML = .responseText
        End With
       
        For Each post In HTML.getElementsByTagName("ul")
            Row = Row + 1: Cells(Row, 1) = post.innerText
        Next post
    End Sub
     
    YasserKhalil likes this.
  15. Marc L

    Marc L Excel Ninja

    Messages:
    4,044
    « You didn't find » does not always mean an issue,
    just be sure with using InStr VBA function to check text part within a text …
    YasserKhalil likes this.
  16. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    Thanks a lot my friend
    The code works but this is not the desired HTML page ..the results are different from manual steps ..
    The first result should start with the string "Ginsterbusch" and I didn't find it ..
  17. Marc L

    Marc L Excel Ninja

    Messages:
    4,044

    Post a screenshot of webpage display after button hit (expecting)
    and you actual code, I'll try to take a glance after lunch break …​
  18. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    Thanks alot Mr. MarcL
    I have this code now after Mr. Shain's great solution
    Code (vb):
    Sub Test_XMLHTTP()
        Dim http        As New XMLHTTP60
        Dim html        As New HTMLDocument
        Dim post        As Object
        Dim myUrl      As String
        Dim postData    As String
        Dim r          As Long

        myUrl = "http://www.handelsregisterbekanntmachungen.de/?aktion=suche#Ergebnis"
        postData = "suchart=uneingeschr&button=Suche+starten&land=&gericht=&gericht_name=&seite=&l=&r=&all=false&vt=11&vm=2&vj=2018&bt=11&bm=2&bj=2018&fname=&fsitz=&rubrik=&az=&gegenstand=1&anzv=10&order=1"

        With http
            .Open "POST", myUrl, False
            .setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
            .send postData
            html.body.innerHTML = .responseText
        End With

        Debug.Print html.body.innerHTML
    End Sub
    I manually copied the contents of the immediate window and paste into a text file and examine the string manually so as to make sure this is the correct html page
  19. Marc L

    Marc L Excel Ninja

    Messages:
    4,044

    Bad idea as sometimes content window can't keep all text (rows # limit) ! Directly check responseText with InStr …​
    shahin likes this.
  20. shahin

    shahin Active Member

    Messages:
    882
    To get only the targeted portion, you can try like below:
    Code (vb):

    Sub Fetch_Data()
        Dim HTTP As New XMLHTTP60, HTML As New HTMLDocument
        Dim post As Object, postdta As String
       
        postdata = "suchart=uneingeschr&button=Suche+starten&land=&gericht=&gericht_name=&seite=&l=&r=&all=false&vt=11&vm=2&vj=2018&bt=11&bm=2&bj=2018&fname=&fsitz=&rubrik=&az=&gegenstand=1&anzv=10&order=1"
       
        With HTTP
            .Open "POST", "http://www.handelsregisterbekanntmachungen.de/?aktion=suche", False
            .setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
            .send postdata
            HTML.body.innerHTML = .responseText
        End With

        For Each post In HTML.getElementsByTagName("li")
            With post.getElementsByTagName("ul")
                If .Length Then Row = Row + 1: Cells(Row, 1) = .Item(0).innerText
            End With
        Next post
    End Sub
     
    The result it populates (see the attached image):

    Attached Files:

    YasserKhalil likes this.
  21. shahin

    shahin Active Member

    Messages:
    882
    Thanks again to Marc L to teach me how to avoid "On Error Resume Next" by using the pattern of code I've pasted above.
    Marc L likes this.
  22. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    I found my fault .. after many tries
    It was in the url string .. there is extra "#Ergebnis" so I have not got the desired resulls
    Now I started to get the process
  23. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    @shahin
    Is that line necessary and why
    Code (vb):
    .setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    What make us decide those 'setRequestHeader' lines ...?

    I tried once with that line and once without that line and I got different results .. so it is very important ..
    Can you explain that point please?
  24. shahin

    shahin Active Member

    Messages:
    882
    When you send a post request, you send it like filling a form because there are more or less few options you need to choose from the main page to populate results.
    YasserKhalil likes this.
  25. YasserKhalil

    YasserKhalil Active Member

    Messages:
    968
    Thank you very much for both of you Mr. MarcL and Mr. Sahhin for great support in this issue
    Now I got most of the process .. but training is better than thousands of words

    Here's the last edition of the code (for the followers of the thread)
    Code (vb):
    Sub Post_Search_Form_Data_Using_XMLHTTP_Loop_Through_Li_Tag_Name()
        Dim http        As New XMLHTTP60
        Dim html        As New HTMLDocument
        Dim post        As Object
        Dim myUrl      As String
        Dim postData    As String
        Dim dt          As Date
        Dim lDay        As Integer
        Dim lMnth      As Integer
        Dim lYear      As Integer
        Dim r          As Long

        dt = Date - 2
        lDay = Day(dt): lMnth = Month(dt): lYear = Year(dt)

        myUrl = "http://www.handelsregisterbekanntmachungen.de/?aktion=suche"
        postData = "suchart=uneingeschr&button=Suche+starten&land=&gericht=&gericht_name=&seite=&l=&r=&all=false&vt=" & lDay & "&vm=" & lMnth & "&vj=" & lYear & "&bt=" & lDay & "&bm=" & lMnth & "&bj=" & lYear & "&fname=&fsitz=&rubrik=&az=&gegenstand=1&anzv=10&order=1"

        With http
            .Open "POST", myUrl, False
            .setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
            .send postData
            html.body.innerHTML = .responseText
        End With
       
        For Each post In html.getElementsByTagName("li")
            With post.getElementsByTagName("ul")
                If .Length Then r = r + 1: Cells(r, 1) = .Item(0).innerText
            End With
        Next post
    End Sub

Share This Page