
Web Scraping Help

I am wondering if someone can help me with this. I know very little about this type of VBA coding. I can scrape simple data, but this is a challenge, so many, many thanks in advance! I have a list of numbers in an Excel spreadsheet and want to loop through them to extract data from the website.

Here is the website: https://horrycountytreasurer.qpaybi....aspx?id=652f3c24-8a75-4b71-937a-129102ee9223

Looking at the page source code, the search parameters are in lines 169-231 and the returned data is in lines 233-277.

My search parameters are:
Search Type = Real Estate
Payment Status = All
Tax Year = All
Search By = Map Number
Map Number = data from column A of the worksheet (examples below)

Each record in the spreadsheet should return about 11 rows of records.

What I would like is to have the number from column A of the worksheet placed beside each of the 11 records returned from the website.

Here are some of the numbers (Map Number on the website) in column A of the worksheet
1870105142
1810416023
1920503034
1502101053
1561003063
1552101004
 
Please check the attached workbook: go to the Raw sheet, enter all the Map Numbers there, and then click the button on that sheet.

Try the program.

Thanks
 

Attachments

  • Online Tax Pyament Centre.xlsm
    27.2 KB · Views: 13
Ratan.

Absolutely perfect!

I would have never figured this out. I can scrape static data, but this was beyond me.

I cannot thank you enough.
 
Hi Ratan !

Piloting a web browser is often the worst way (least efficient, slowest) and should be reserved as a last option — see tutorials around the Web on scraping via requests or Power Query, for example, as well as the request samples within threads of this forum … Piloting a web browser also requires some observation of how the webpage works and what happens during and after its events, in order to produce code that is as efficient as possible,
♪ faster, smarter, stronger ♫ !​
Code:
#If VBA7 Then
    Private Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal Milliseconds As Long)
#Else
    Private Declare Sub Sleep Lib "kernel32" (ByVal Milliseconds As Long)
#End If

Sub Demo()
    Const TABL = "ctl00_MainContent_gvSearchResults"    ' id of the results table
    Dim oDoc As Object, oTbl As Object, Rg As Range, D As Date, R&, V
    Set oDoc = CreateObject("htmlfile")                 ' off-screen HTML parser
    With CreateObject("InternetExplorer.Application")
        On Error GoTo Fin
        .Navigate "https://horrycountytreasurer.qpaybill.com"
        .Visible = True
        ActiveSheet.UsedRange.Offset(, 1).Clear         ' keep column A, clear old results
        With ActiveWindow
            .ScrollColumn = 1
            .ScrollRow = 3
            R = .VisibleRange.Rows(.VisibleRange.Rows.Count).Row
        End With
        While .ReadyState < 4:  DoEvents:  Wend         ' wait until the page is loaded
        With .Document.all
            .ctl00_MainContent_radRealEstateButton.Click
            Sleep 600                                   ' the only fixed pause needed
            .ctl00_MainContent_ddlCriteriaList.selectedIndex = 1    ' Search By = Map Number
        End With
        For Each V In Cells(1).CurrentRegion.Columns(1).Value       ' loop through map numbers
            D = Now + 0.00011                           ' ~9.5 s timeout for this search
            With .Document.all
                .ctl00_MainContent_txtCriteriaBox.Value = V
                Set oTbl = .Item(TABL)
                .ctl00_MainContent_btnSearch.Click
            End With
            If oTbl Is Nothing Then                     ' first search: wait until the table exists
                Do
                    If Now > D Then Error 2000
                    DoEvents
                Loop Until IsObject(.Document.all(TABL))
                .Visible = False
            Else                                        ' next searches: wait until the old table is replaced
                While oTbl.Rows.Length
                    If Now > D Then Error 2001
                    DoEvents
                Wend
            End If
            oDoc.body.innerHTML = .Document.body.outerHTML          ' snapshot the page
            If oDoc.frames.clipboardData.setData("Text", oDoc.all(TABL).outerHTML) Then
                Set Rg = Cells(Rows.Count, 3).End(xlUp)(4)          ' next paste position in column C
                ActiveSheet.Paste Rg                                ' paste the HTML table as cells
                With Rg.CurrentRegion.Columns
                    .Item("A:I").WrapText = False
                    .Item("J:K").Clear
                    If .Rows(.Rows.Count).Row >= R Then             ' keep the latest results in view
                        With ActiveWindow
                            .ScrollRow = Rg.Row - 1
                            R = .VisibleRange.Rows(.VisibleRange.Rows.Count).Row
                        End With
                    End If
                End With
            End If
        Next
        oDoc.frames.clipboardData.clearData "Text"
        ActiveWindow.ScrollRow = 1
        ActiveSheet.UsedRange.Columns.AutoFit
Fin:
        Set oDoc = Nothing:  Set oTbl = Nothing:  Set Rg = Nothing
        If Err.Number Then Beep: Debug.Print Err.Number; Err.Description
        .Quit                                           ' release IE from memory
    End With
End Sub
Do you like it? Then thanks for clicking Like at the bottom right!
 
Sir, can you please explain what you are trying to say and what exactly I should learn? :)
 
Please don't quote the entire preceding post …

If you just run my procedure,
don't you see it is ~5 times faster than yours? ~10 s vs ~50 s!
And both pilot the same IE on the same webpage …

It could be even faster using a direct copy from IE
(but that must be authorized in IE's advanced options),
or by creating an array directly from the data table as you did.
(And it would be far faster still using a request instead of IE!)

So piloting IE requires synchronization,
but most of the time it can be achieved by watching webpage elements
(which just needs a basic understanding of the DOM, aka Document Object Model)
rather than inserting a pause like the Wait method every time;
here the only necessary pause is after clicking Real Estate!
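
As a rough sketch of that element-based synchronization — the URL, the element id "SomeElementId", and the 10-second timeout below are all hypothetical placeholders, not taken from any real page:

Code:
' Sketch only: wait for a specific page element instead of polling Busy.
' The URL and "SomeElementId" are hypothetical placeholders.
Sub WaitForElementSketch()
    Dim IE As Object, D As Date
    Set IE = CreateObject("InternetExplorer.Application")
    IE.Navigate "https://example.com"
    While IE.ReadyState < 4:  DoEvents:  Wend   ' wait for the page itself
    D = Now + TimeSerial(0, 0, 10)              ' 10-second timeout
    Do Until IsObject(IE.Document.all("SomeElementId"))
        If Now > D Then Err.Raise 2000          ' give up after the timeout
        DoEvents
    Loop
    ' … read data from the element here …
    IE.Quit                                     ' release IE from memory
End Sub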

If you need a Wait just before or after calling Busy,
that means using Busy is the wrong way —
it is not waiting for the expected event,
and that is what makes your code so terribly slow!

Worst of all: you keep IE in memory by merely hiding it once the data
is grabbed, instead of releasing it via the Quit method …

The efficient way starts with understanding how the webpage works
before writing a single line of code — that is how you get smart code:

♪ work it, do it, make it : better, faster, stronger ♫ !
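
For what it's worth, the request approach mentioned above looks roughly like this minimal sketch. Note this is only the general pattern with a placeholder URL — the actual treasurer site is an ASP.NET WebForms page, so a real request would also have to post back its __VIEWSTATE and related hidden fields:

Code:
' Sketch only: fetch and parse HTML by request, without piloting IE.
' The URL is a hypothetical placeholder.
Sub RequestSketch()
    Dim oReq As Object, oDoc As Object
    Set oReq = CreateObject("MSXML2.XMLHTTP")
    oReq.Open "GET", "https://example.com", False    ' synchronous request
    oReq.Send
    Set oDoc = CreateObject("htmlfile")              ' off-screen HTML parser
    oDoc.body.innerHTML = oReq.responseText
    Debug.Print oDoc.body.innerText                  ' inspect the parsed page
End Sub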
 