• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

How to get data from specific columns out of a table?

shahin

Active Member
Perhaps this question has been asked before in different threads within different forums. It is sometimes hard to find any exact match of any search. So, apology in advance if it is identical to any thread.

I've created a script in vba to parse tabular data from a webpage. My script can do it's job flawlessly. The thing is: I'm only after certain columns. So I wish to scrape 4 columns out of 8.

The eight column headers look exactly like:

Code:
Player From To Pos Ht Wt Birth_Date College

I wish to get the content under:

Code:
To Pos Ht Wt

This is the script which can get all the content under 8 columns from that table (once provided by sir chihiro):
Code:
Sub TableData()
    Dim HTTP As New ServerXMLHTTP60, html As New HTMLDocument
    Dim posts As Object, elem As Object, trow As Object

    With HTTP
        .Open "GET", "http://www.basketball-reference.com/players/a/", False
        .send
        html.body.innerHTML = .responseText
    End With
  
    Set posts = html.getElementsByTagName("table")(0)
  
    For Each elem In posts.Rows
        For Each trow In elem.Cells
            c = c + 1: Cells(r + 1, c) = trow.innerText
        Next trow
        c = 0
        r = r + 1
    Next elem
End Sub
 
I tried like below as well but this time I'm having nothing. No error either.
Code:
    Set posts = html.getElementsByTagName("table")
   
    For i = 2 To posts.Length - 1
        For Each elem In posts(i).Rows
            For Each trow In elem.Cells
                c = c + 1: Cells(r + 1, c) = trow.innerText
            Next trow
            c = 0
            r = r + 1
        Next elem
    Next i
 
May be
Code:
Sub TableData()
    Dim http        As New ServerXMLHTTP60
    Dim html        As New HTMLDocument
    Dim posts      As Object
    Dim elem        As Object
    Dim c          As Long
    Dim r          As Long
    Dim x          As Long

    With http
        .Open "GET", "http://www.basketball-reference.com/players/a/", False
        .send
        html.body.innerHTML = .responseText
    End With

    Set posts = html.getElementsByTagName("table")(0)

    For Each elem In posts.Rows
        For x = 0 To elem.Cells.Length - 1
            If x >= 2 And x <= 5 Then
                c = c + 1: Cells(r + 1, c) = elem.Cells(x).innerText
            End If
        Next x
        c = 0
        r = r + 1
    Next elem
End Sub
 
This is a very nice way to get data from that table shortening the width. Changing my script the way YasserKhalil showed, I can get the data out of that table from column 3. What If I wish to get the data from the same column out of that table shortening the length as well?

For scraping data from column 3 I did like below and I'm getting the total data from that column:

Code:
    For Each elem In posts.Rows
        For i = 0 To elem.Cells.Length - 1
            If i = 2 Then
                c = c + 1: Cells(r + 1, c) = elem.Cells(i).innerText
            End If
        Next i
        c = 0
        r = r + 1
    Next elem

If I can shorten the length of that column then the final output will more like be:

Code:
To
1995
1978
1989
2001
2003
 
Do you mean like that
Code:
Sub TableData()
    Dim http        As New ServerXMLHTTP60
    Dim html        As New HTMLDocument
    Dim posts      As Object
    Dim r          As Long
    Dim c          As Long
    Dim i          As Long
    Dim j          As Long

    With http
        .Open "GET", "http://www.basketball-reference.com/players/a/", False
        .send
        html.body.innerHTML = .responseText
    End With

    Set posts = html.getElementsByTagName("table")(0)

    For i = 0 To posts.Rows.Length - 1
        If i > 5 Then Exit For
        For j = 0 To posts.Rows(i).Cells.Length - 1
            If j = 2 Then
                c = c + 1: Cells(r + 1, c) = posts.Rows(i).Cells(j).innerText
            End If
        Next j
        c = 0
        r = r + 1
    Next i
End Sub
 
Back
Top