• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Ignoring Empty Cells in an array-based formula (Excel 2013)

S_Eden

New Member
I'm doing some data analysis and extraction on a large number of CSV files (read: several hundred). Because I don't have the time or knowledge to build an access database, here is what I've done so far:

1. I built a template spreadsheet, with all the formulas I need to use pre-populated, and this way I can open the each CSV file, copy the data, and paste it in. This data runs out to column G, and an initial calculation is performed, with the results going into column H.

2. This is where I run into trouble. I need to extract the top and bottom ten values from column H, put them in a column (in order), and also add the data from columns A and F in adjacent columns. I am using the LARGE and SMALL functions to grab the min/max values from column H, and INDEX/MATCH to grab the same-row values from columns A and F:

=LARGE/SMALL(H2:H3500,1....10)

=INDEX($A$1:$A$3500,MATCH(M3,$H$1:$H$3500,0))

=INDEX($F$1:$F$3500,MATCH(M3,$H$1:$H$3500,0))


My problem is that the CSV files I'm pulling this data from vary widely in the number of rows, anywhere from 300 to more than 2000. Because I don't know exactly how many rows I'm dealing with, I ran the column H calculations all the way down to row 3500, because I do know for a fact that these data will not exceed that many rows. However, there will always be cells in column H that have no basis for calculation, resulting in a #DIV/0 error, which carries over to the two LARGE/SMALL tables. Help?

(Bonus points if you can show me how to filter out all values less than a threshold number)
 
Hey S_eden,

You could do a few things. You could insert-->table. then paste your data in the table. The formulas you have in the table will automatically adjust down to the last row.
or
you could modify your formula that is giving you errors to =iferror(yourformulahere,average($A$1:A1)). this would put the average number instead of a error, and since your formula only grabs the top and bottom 10 values, these average values will not pop up in your result set.
or
You could make a dynamic named range which you could use in your index formula.

Also if you plan on going through 100 CSV files. You could build a macro that opens each file and pastes the data into a mastersheet, which you could then use for your formulas.

or post a sample file and hui can help u
 
Hi ,

To eliminate consideration of the error values in getting your largest / smallest 10 values , use the following array formulae , entered using CTRL SHIFT ENTER :

=IFERROR(LARGE(IF(ISNUMBER($H$2:$H$3500),$H$2:$H$3500),ROW(A1)),"")

=IFERROR(SMALL(IF(ISNUMBER($H$2:$H$3500),$H$2:$H$3500),ROW(A1)),"")

Enter them in any cell , and copy down.

Narayan
 
Back
Top