• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Extract data from cell on webpage table using VBA [SOLVED]

wicksta

New Member
Hi


I've some, but limited experience, with VBA and macros generally.


I am trying to extract data from pages such as this:


http://idoxpa.westminster.gov.uk/online-applications/applicationDetails.do?activeTab=summary&keyVal=MFZP5PRP01R00


Specifically the value next to the "Status" row.


The page is generated by inserting a reference into the box on:


http://idoxpa.westminster.gov.uk/online-applications/


In this case reference 13/00001


(The 'KEYVAL' in the url of the results page seems to be generated at random so I can't be used to access the file directly).


I have set up a macro that queries the first web page with a reference from Cell A1 (where the references follow the pattern 13/00001, 13/00002, etc), but I can't work out how to then extract the information from the relevant cell in the HTML table on the results page.


"Sub IE_Autiomation()

Dim i As Long

Dim IE As Object

Dim objElement As Object

Dim objCollection As Object

Dim ieTable As Object

Dim clip As Object


' Create InternetExplorer Object

Set IE = CreateObject("InternetExplorer.Application")


' You can uncoment Next line To see form results

IE.Visible = True


' Send the form data To URL As POST binary request

IE.Navigate "http://idoxpa.westminster.gov.uk/online-applications/"


' Wait while IE loading...

Do While IE.Busy

Application.Wait DateAdd("s", 1, Now)

Loop


Set objCollection = IE.document.getElementsByTagName("input")


i = 0

While i < objCollection.Length

If objCollection(i).Name = "searchCriteria.simpleSearchString" Then


' Set text for search

objCollection(i).Value = Range("A1").Value


Else

If objCollection(i).Type = "submit" And _

objCollection(i).Name = "" Then


' "Search" button is found

Set objElement = objCollection(i)


End If

End If

i = i + 1

Wend

objElement.Click ' click button to search


' Wait while IE re-loading...

Do While IE.Busy

Application.Wait DateAdd("s", 1, Now)

Loop


'Set ieTable = IE.all.Item("simpleDetailsTable")


result = IE.document.body.innerHTML


'copy the tables html to the clipboard and paste to teh sheet

'If Not ieTable Is Nothing Then


' clip.SetText "<html>" & ieTable.outerHTML & "</html>"

' clip.PutInClipboard

' Sheet1.Select

' Sheet1.Range("A1").Select

' Sheet1.PasteSpecial "Unicode Text"

'End If


'Sheet1.Range("A2") = IE.all.Item("simpleDetailsTable")


' Show IE

IE.Visible = True


' Clean up

Set IE = Nothing

Set objElement = Nothing

Set objCollection = Nothing


Application.StatusBar = ""


End Sub


You can see from the commented lines at the end that I have stumbled around a bit unsuccesfully.


I think, although I could be wrong, that I can't use the webquery function because I don't know the url I need from the outset, and as I only need one or two cells from the webpage I thought trying to extract them directly might be easier.


Any help would be very gratefully received, or pointers to similar posts.


Many thanks.
 
Hi, wicksta!


First of all welcome to Chandoo's website Excel forums. Thank you for your joining us and glad to have you here.


As a starting point I'd recommend you to read the green sticky topics at this forums main page. There you'll find general guidelines about how this site and community operates (introducing yourself, posting files, netiquette rules, and so on).


Among them you're prompted to perform searches within this site before posting, because maybe your question had been answered yet.


Feel free to play with different keywords so as to be led thru a wide variety of articles and posts, and if you don't find anything that solves your problem or guides you towards a solution, you'll always be welcome back here. Tell us what you've done, consider uploading a sample file as recommended, and somebody surely will read your post and help you.


And about questions in general...


If you haven't performed yet the search herein, try going to the topmost right zone of this page (Custom Search), type the keywords used in Tags field when creating the topic or other proper words and press Search button. You'd retrieve many links from this website, like the following one(s) -if any posted below-, maybe you find useful information and even the solution. If not please advise so as people who read it could get back to you as soon as possible.


And about this question in particular...


Maybe you find something useful in these links:

http://chandoo.org/forums/topic/macros-web-query-huge-headache-can-anyone-help-me

http://chandoo.org/forums/topic/specific-data-and-html-beyond-excels-capabilities


You could use the in-source-digging technique of last link or pilot IE browser or object browser. AFAICS you'd try getElementByID (tagName DIV, ID article-1 or article-content or idox; tagName TABLE, ID simpleDetailsTable) and check for a line like "Status *" into innerText, or getElementsByTagName (tagName TH), innerText = "Status" and next line will be the actual value. Since there are all TH/TD/TR labels it's difficult to retrieve them directly as if they were fields.


Regards!
 
Hi


Thanks very much for the pointers and the encouragement that I was on the right line.


The index technique referred to here:


http://stackoverflow.com/questions/16141394/excel-vba-get-content-from-online-html-table


was very helpful and I think has solved the problem.


Thanks!
 
Hi, wicksta!

Glad you solved it. Thanks for your feedback and for your kind words too. And welcome back whenever needed or wanted.

Regards!

PS: Thanks for sharing that webpage too :)
 
@wicksta

Do you think you could post your final product? I'm looking at pulling from a simple table from a live html and it looks to be easier than what you were trying to do. Thanks!
 
Back
Top