Excel Array Formulas have subtle beauty and raw power. Today I want to share how you can use Excel Array Formulas to find-out the 2nd largest sale amount corresponding to a particular product.
Lets say you have sales data like this:
Now, we all know that, to get 2nd largest sale amount, we can use the LARGE() formula. Like this:
But, how do we know what is the 2nd largest sale amount for the product “bricks”?
Now, that is tricky? Isn’t it?
This is when an array formula can help you.
First the formula:
Type below formula in a cell
and press CTRL+SHIFT+ENTER.
Now, the explanation:
Instead of directly giving the SALES_RANGE to LARGE formula, we are asking excel to include empty spaces “” wherever product <> “Bricks” thru the IF() formula inside LARGE().
Now, when you press CTRL+SHIFT+Enter, excel treats the formula as array formula and processes the lists of values inside nicely.
Download Excel File with Array Formula Examples:
I have put together 4 simple examples to help you understand the power of array formulas. Please download the file from here.
How do you use array formulas?
While array formulas are powerful, they are also difficult to write and explain. I try to solve problems using non-array formula methods as much as possible. But in some cases, like above, I end up writing array formulas.
What about you? Do you use them? How differently would you solve the above problem? Please share using comments.
29 Responses to “Finding 2nd Largest Item in a List that meets a Criteria [Excel Array Formulas]”
Thanks Chanddoo.... never thought of this use of the formula 🙂
Define cAmt = IF(PRODUCTS_RANGE="Bricks",SALES_RANGE)
=Large(cAmt,2) no need of CSE
@Sam.. interesting idea. I did not know that named ranges would eliminate the need for CSE.
All named formulas (named ranges included) are ARRAY formulas.
I delve into this in great detail in my new Excel Hero Academy, which starts in a few days!
For example, if the second value is same (t0001-100, t0003-100) for 2 Transaction IDs, based on this large formula, i will get the first transaction id only 2nd one it will not come.
In that what should do??? pl suggest.
Did you get any suggestion to your problem?
I am interesting in that because i have the same problem.
If first and second value is the same it will show only the first, but what about the second???
One option is to add a column with the row # divided by 10000 added to the significant value, then have the array formula target that column instead of the significant value.
So if you want to return the top 10 values of column A, have column F be +row()/10000. Then instead of telling it to rank the top 10 values of column A, have it rank the top 10 values of column F. That way if there's a tie in A it will be different values in F because 1/10000th of the row number is added.
a couple of things:
>what the heck does "CSE" stand for... maybe "Control/Shift/Enter"?
>every month or so, Chandoo comes up with another simple, sweet, elegant "communication" of Excel features. This is yet another 15 carrot diamond in a long string of diamonds. Thank you...
>i use indirect references a lot (as opposed to named ranges). my indirect reference is to a cell with a formula that extracts the sheet name and the cell range of the array through simple beg and end cell references. for me this works better than named references as the named range definition is once removed from view and named ranges are harder to house-keep (when you need hundreds of them).
>the elegant formula that Chandoo has presented works for just about any formula i have tried it on. this must be the underlying formula structure for COUNTIFS, SUMIFS and AVERAGEIFS. wow!
>this formula works even if the filters ("blocks", "bricks") are in independent reference cells which makes it very versatile for active dashboards and graphs.
>not only does Chandoo's formula work for SMALL and LARGE but also MIN, MAX, MEDIAN, STDEV, CORREL, QUARTILE, VAR, SUMSQ, etc. etc. again wow!
>i have not been able to get this concept to work with LINEST, anyone know why?
>i have not tried it yet with the matrix or financial formulas.
CSE is as you suggested Ctrl Shift Enter
i am having trouble getting the following to work...
if i add a final column to the example dataset called PAID_RANGE and add either "paid" or "unpaid" to each row in that column then why won't this formula work (for finding the largest block sale that has been paid):
@Bill: You can add additional conditions like this:
Both conditions go inside the AND() formula.
To solve this an avoid an array (CSE) formula I would use SUMPRODUCT:
I know this is a couple of days off from Oli's post and I apologize for my tardiness, but, in his SUMPRODUCT example above, what is "E2"?
@Chandoo: with regard to the example with an added criteria ("paid" vs "unpaid"), I tried rearranging the terms as you have suggested...
This works, sort of. If you examine each element of the array created within the formula, you find that when both filters are TRUE than you get the value from the SALES_RANGE array (this is as expected) but if one or more of the filters are FALSE then you get an error (#VALUE) assignment instead of the empty cell assignment (""). So the result of this formula is always #VALUE even when entered in CSE fashion.
This is a problem since almost all of the formulas (SMALL, LARGE, MIN, MAX, MEDIAN, STDEV, CORREL, QUARTILE, VAR, SUMSQ, etc.) work fine with an empty cell "" in the array but not with a #VALUE in the array. Also, note that for many of these listed formulas, zero can be a valid value so, I think, that the multiplicative (and additive approach for OR) would not work because zeros would be evaluated as array elements (for these example formulas).
I haven't really tested it, but might work in that instance? CSE entered
>your suggestion works. you do have to use CSE. i do not get exactly why it works (yet), but it does. i tried it for two conditions. each element in the formula array evaluated to the cell value in SALES_RANGE or to a blank cell "" when both filer conditions were not met. >note i changed the FALSE argument to "" although it seems that the FALSE word gets treated the same as an empty cell for the LARGE, MIN, MAX, MEDIAN types of formulas).
>i use the following formula to view the array value by value (this is the IF statement with an INDEX formula substituted for the LARGE formula overwrap):
where n=row number in array
Thank you, this is very useful. I am curious if there's an efficient way using just formulae (no VBA/macros/addins) to return the transaction ID instead of the amount?
To expand on the example in the article, imagine the list of transactions is ~200,000 rows. I want to list the top 10 transactions based on quantity of Bricks sold. I don't want the formula to return the #1 largest quantity of bricks sold, I want it to return the transaction ID that corresponds to the #1 quantity of bricks sold.
Obviously I can put the source data into a table and use filtering & sorting to see this, but in addition to Bricks and Blocks I also sell 30 other things, so it's a big time saver for me if Excel will list my top 10 transaction IDs for each item sold rather than me having to work the filters and extract the data myself.
I hope that makes sense.
Hi, I'm facing the same problem too here. Is there anyone who can solve this? thank you.
You can try =large(C2:C6 * (B2:B10 = "Bricks"),2) and Ctrl+Enter
I needed to thank you for this fantastic read!!
I definitely enjoyed every little bit of it. I've got you book marked to look at new things
There is one issue with this formula that it is not working properly when i drag this to consecutive rows for large data.
Kindly guide on this.
Can you please post your file or formula
I am trying to use this formula in a vba Code but it is not working. My code is something like: ActiveSheet.WorksheetFunction.FormulaArray = "=LARGE(IF(F2:F9000="Bricks",G2:G9000),10)"
ActiveSheet.WorksheetFunction.FormulaArray = "=LARGE(IF(F2:F9000=""Bricks"",G2:G9000),10)"
Bricks in double quotes
I have downloaded your example spreadsheet and for some reason after I clicked on the cells that you have the formulas in(G6 & G7), it turns in to "#Value." Could you please let me know what that could mean? I think there is something wrong with my setting. Thank you.
Thanks for the info above. But what if say using the same data set, I wanted to know what product gives the highest sales amount and the total of the highest sales product amount?
Or say there is an additional column of a Month. In which month the sales is highest and which product?
To simplify, on which month, which product produces the highest sales amount?
Is this possible?
You done a great job. Really very useful informative post about excel.
thanks for posting, can someone please help with this question. see list below for example with excel equation, what am I doing wrong? I want to use the large function so it's sorting column B by largest to smallest but excluding Rookie in the sort, thanks in advance.
=LARGE(IF(A1:A6,""&"Rookie",B1:B6,1) this is the equation but it's not working.
Column A Column B
Varsity 1 100,000
Varsity 4 120,000
Varsity 2 60,000
Varsity 3 200,000
thanks in advance.