Formula Forensics No. 003 – Lukes Reward
Hello everyone. My name is Luke M, and I’ve been coming to Chandoo’s site for about a year now.
I love solving problems, and helping people out on the forums.
Due partly to Hui’s challenge about submitting articles, and 3G’s comment the other day about this particular formula, I decided to try my hand at writing an article.
I Hope it helps explain some cool Excel tricks that I like to use. 🙂
The Problem
Often, I see a request from an individual asking for a formula that will be able to generate a list based off of some criteria, with no spaces/blanks.
As Chandoo & Hui have often pointed out, there’s not much Excel can’t do if you know the right functions to use.
Today, we’ll take a look at how we can accomplish this particular task.
Suppose I want to be able to generate a list of all the Vegetables.
In cell E2, I put this array formula:
=IF(COUNTIF(A:A,$D$2) < ROWS($E$2:E2), “”, INDEX(B:B, SMALL( IF($A$2:$A$10 =$D$2, ROW( $A$2:$A$10)), ROW(A1))))
Remember, array formulas need to be confirmed using Ctrl+Shift+Enter, and will have curly brackets { } around the formula if done correctly.
I then copy the cell downward as far as I think will ever be necessary to display all the records (i.e. E7).
Lets Look Inside
Let’s take a closer look at how the formula works.
Front Half
First, let’s look at the IF function’s logic check.
=IF(COUNTIF(A:A,$D$2)<ROWS($E$2:E2),””, INDEX(B:B, SMALL( IF($A$2:$A$10 = $D$2, ROW($A$2:$A$10)), ROW(A1))))
We’re using the COUNTIF function to determine the total number of records that meet our criteria. We’re then comparing this to a ROWS function. The ROWS function simply returns the number of rows given in the argument. Note that the first part of the range callout uses an absolute reference and will not change, while the latter part is relative and will change as the formula is copied down. Thus, in the first cell, the ROWS function evaluates to 1. The next cell, it will evaluate to 2, then 3, and so on. So, the IF statement is checking to see if the number of records returned so far (i.e., formula used) is greater than the total number of possible records. If this is true, return a blank (i.e., “”).
Back Half
The latter half of the formula is where things get tricky.
For this part, let’s work our way from the inside out.
We start off with another IF function:
INDEX(B:B,SMALL(IF($A$2:$A$10=$D$2, ROW($B$2:$B$10)), ROW(A1))))
This section compares A2:A10 with our criteria given in cell D2. So, the array if A2:A10 starts off looking like this:
{Fruit, Fruit, Vegetable, Vegetable, Fruit, Vegetable, “”, “”, “”}
When we compare it with the criteria, it becomes this:
{False, False, True, True, False, True, False, False, False}
Looking at the return values in our IF function, we see that only a True result is stated, the ROW.
INDEX(B:B,SMALL(IF($A$2:$A$10=$D$2, ROW($A$2:$A$10)), ROW(A1))))
So, each True value from the array above will be replaced with the corresponding Row value.
This causes the array to become this:
{False, False, 4, 5, False, 7, False, False, False}
Now that we have a nice array with some numbers in it, this gets fed into the SMALL function.
INDEX(B:B,SMALL(IF($A$2:$A$10=$D$2, ROW($A$2:$A$10)), ROW(A1))))
The ROW function at the end will serve as a type of counter.
In E2, where we initially place the formula, this will evaluate to 1, thus telling the SMALL function to return the 1^{st} smallest number.
In E3, it will evaluate to 2, and the SMALL function will return the 2^{nd} smallest number, and so.
So, taking the 1^{st} smallest number from our array, we get the number 4.
We then take this to the INDEX function
INDEX(B:B,SMALL(IF($A$2:$A$10=$D$2, ROW($A$2:$A$10)), ROW(A1))))
Note that we need to callout the entire column, since we are plugging in row numbers.
The 4^{th} row in column B leads us to the value “Broccoli”.
The next formula will return the 5^{th} row, “Spinach”.
The 3^{rd} formula will return the 7^{th} row, “Peas”.
This method can be adapted for use with multiple criteria. We would just need to expand the IF function logic checks so that only the correct rows are returned.
PS. If it gets too confusing, the first part of the formula can be omitted.
=IF(COUNTIF(A:A,$D$2)<ROWS($E$2:E2),””, INDEX(B:B, SMALL( IF($A$2:$A$10 = $D$2, ROW($A$2:$A$10)), ROW(A1))))
It is just there to hide any unwanted #NUM errors after all the pertinent records have been displayed.
EXTENSION
To see how this technique can be extended to use multiple criteria please read the follow up post at:
http://chandoo.org/wp/2014/11/10/formulaforensicsno003blukesrewardpartii/
DOWNLOAD
You can download a copy of the above file and follow along, Download Here.
OTHER POSTS IN THIS SERIES
You can learn more about how to pull Excel Formulas apart in the following posts
Formula Forensic 001 – Tarun’s Problem
Formula Forensic 002 – Joyce’s Question
THANKYOU and a CHALLENGE
Firstly a Congratulations to Luke M on taking up the challenge and on your First Post at Chandoo.org.
Thankyou for explaining to us all how this formula, which has appeared a number of times on the Chandoo.org Forums, works.
The contents of the Post are published as Luke submitted it with only minor formatting changes.
My Challenge to you is this:
If you have a neat formula that you would like to share and explain, try putting pen to paper and draft up a Post as Luke Did above.
If you have a formula that you would like explained but don’t want to write a post also send it in to Chandoo or Hui.
 

Leave a Reply
Making Small Multiples in Excel [Charting Technique]  Add Data to Charts with Copy Paste [Quick Tip] 
60 Responses to “Formula Forensics No. 003 – Lukes Reward”
Hi Luke M , as always great post, this would also work using a drop down menu for the criteria rather than typing in the criteria in cell D2, of course the Type would also have to be changed to 1,2,3 etc depending on the criteria ( 1 for Vegetable, 2 for Fruit etc ), but the formula is sound .... I don't use array formulas much as most folk I work with as basic to intermediate users ,,, but my internal online community is growing and more people are learning about how powerful Excel can be and how much of a useful tool it really is.
Hello Luke,
I have encountered the same kind of problem.
My approach was to create a helper column which concatenate the name and the number of its occurrence.
In your example in C2 I will use : =A2&"_"&Countif(A$2:A2,A2) => giving Fruit_1
then copy it all the way down.
The formula in E2 will become : =INDEX(B:B,MATCH(D2&"_"&ROW(A1)1,C:C,0))
Nb :
For ease of use I will number the line in column F, so in F2 => =F1+1
and replace in E2 "ROW(A1)1" with F2
=> no matter which line you start at the first line will alway be numbered 1
Edit: By Hui
Kudos for understanding and using such a formula.
Personally I'd keep it simple and use a Pivot Table.
Thanks everyone!
Great technique and very nicley explained! I will definitely use these ideas.
I think you have one small typo. The line that begins "In E2, it will evaluate to 2..." should actually read "In E3, it will evaluate to 2...", if I'm not mistaken. If I am, apologies * 10^3.
@F106dart
Thanx & Fixed
Cell E2 and E7
These formulas in the download example do not show up as an Array the rest of them in Column E do... is this a typo?
Hi, In the download example, I see the first vegetable is Apple! Apple is a fruit!
@ Red Shift Ensign, Eric
The Download file is fixed
I had entered the formulas in E2 & E7 as opposed to Array Entering them
This formula is very interesting. Would it be possible to enter a second variable. Say we added in column "C" the color, could we generate the same type of search, with Criteria 1 being Vegetable and Criteria 2 being Green?
Can we add a second argument in the formula?
Thanks
Thanks for the post. I've watched numerous videos on youtube explaining similar data extraction formulas, and I never quite "got it". This article was very helpful! Thanks for your effort.
Hi,
The formula given above is really very nice & interesting, but i think there is an easier solution for the same. That is useing filters. However I can say that it as "the filter formula".
@Mahendra
In Excel there are usually a number of ways to do anything
As part of a system Filters require input or VBA code
This solution negates either and hence it's use in dashboards and other reports is neat and tidy
thank Luke for great post. I've tweeked for 2007 to use IFERROR(...) and make it easier for me:
=IFERROR(INDEX(B:B,SMALL(IF($A$2:$A$10=$D$2,ROW($A$2:$A$10)),ROW(A1))),"")
BTW to do multiple filters you need to adopt a slightly IF(...) criteria:
=IFERROR(INDEX(B:B,SMALL(IF(($A$2:$A$10=$D$2)*($B$2:$B$10="Broccoli"),ROW($A$2:$A$10)),ROW(A1))),"")
in this case I have used ($A$2:$A$10=$D$2)*($B$2:$B$10="Broccoli") to bring back "vegitables" called "broccoli". Small works if you have an array of 0s and 1s and seams to work in this case perfectly.
I tried this example for completeness:
=IFERROR(INDEX(B:B,SMALL(IF(($A$2:$A$10=$D$2)*(LEFT($B$2:$B$10,1)"B"),ROW($A$2:$A$10)),ROW(A1))),"")
@Réal A
Yes, you can do multiple criteria. The trick is multiplying the criteria against each other, as Stephen showed in his comments.
@Stephen,
Thanks for the 2007 tweak. The IFERROR is one of few things I like about 2007.
Have a question about the "front half" explanation  it mentions testing to see whether the number of returns is "greater than" the total possible. It looks like "less than" in your formula. Please confirm 
I appreciate these forensic formulae SO MUCH. Keep up your good work.
Great practical situation and well explained Luke!
well, Luke inspired me to take this to the next level. I've been looking for a nonVBA solution to something that is easily done in google aps. {=sort(filter(...))}
=IFERROR(INDEX(B:B,MATCH(SMALL(IF(($A$2:$A$10=$E$2),COUNTIF($B$2:$B$10,$H$2&$B$2:$B$10)),ROW($A$1:$A$10)),COUNTIF($B$2:$B$10,$H$2&$B$2:$B$10),0)+1),"")
cell H2 has either "=" based on a drop down list elsewhere.
Took a while to work it out though.
p.s thanks to Chandoo on this http://chandoo.org/wp/2008/10/22/sortingtextinexcelusingformulas/
@All.. you can download Stephen's example here:
http://img.chandoo.org/playground/filterandsortstephen.xls
hooooorayyy!!!!!
I'm going to print this out and take it with me everywhere. 🙂 it really helps clarify the formula. very well written Luke!!
Thanks for picking my challenge Hui! Luke, this formula has solved so many of my reporting questions...I cannot thank you enough!!
kudos to you all!
That is so clever. Thanks Luke for explaining it so well 🙂
Guys, this is awesome. Very much appreciated effort!
Last few weeks, I am working on understanding the excel formulas. But, I need your guys’ opinion on how and from where to start with. I really appreciate all your valuable thought.
[...] Formula Forensics 003 – Lukes Reward [...]
[...] you would like to share and explain, try putting pen to paper and draft up a Post as Luke did in Formula Forensics 003 or like [...]
@Muru
Start with constructing formulas that you understand, probably based on whatever business your into, or look for existing models on the net similarly based around the business your in.
It is important to start on models of the business you are in, because at least you will understand the logic behind what the formula is trying to model.
Then unwrap the models from the outside in
Use F9 on partial blocks of formulas as demonstrated above.
Keep an eye out for Named Ranges/Formulas, formulas can be hidden away inside those which simplifies the model but makes reading them slightly harder.
Another technique is to work through the formula with known values, put values into the formula that you know, and follow the formula through step by step. You may have to iteratively work backwards through the model if large number of helper cells are used.
Keep reading the Formula Forensics series here: http://chandoo.org/wp/tag/formulaforensics/
I am sure Hui, your thought will definitely help me to move one step up on the formula ladder.
Hi Luke,
Thanks for the giving this tip. truely cool tip.
I applied this code to my sheet and it works just fine....One problem i am facing is, i dont want to repeat the result. i.e:
If same vegetable are coming in B row the result coloumn should show only one time...any tip for that?
@Ashwin,
We'll need to add some additional criteria to the IF function in the back half. Also, since we're always taking a new, unique value, we need to change the SMALL function to MIN. How's this for a start:
=INDEX(B:B, MIN( IF(($A$2:$A$10 =$D$2)*(ISNA(MATCH($B$2:$B$10,E$1:E1,0))), ROW( $A$2:$A$10))))
Note that it's still an array formula. Unneeded cells will display a 0.
Hi Luke, this is what I was after as well, thanks alot! Really well written and explained.
Is there a way to list the data horizontally (in the next column along) rather than vertically with the above formula (so no duplicate returns)?
Its been doing my head in and Hui kindly directed me here from the forums.
Thanks,
Mike
@Mike
in G2 try:
=IF(COUNTIF($A:$A,$D$2) < COLUMNS($G$2:G2),"", INDEX($B:$B, SMALL(IF($A$2:$A$10=$D$2, ROW($A$2:$A$10)), COLUMNS($G$2:G2))))
Ctrl+Shift+EnterThen copy across
Hi Hui, I'm having trouble adapting this for use with structured references in an excel table
For example if the column A is tbl_Produce[Type] and column B is tbl_Produce[Name] I tried
={IF(COUNTIF(tbl_Produce[Type],$D$2) < COLUMNS($G$2:G2),"", INDEX(tbl_Produce[Name], SMALL(IF(tbl_Produce[Type]=$D$2, ROW(tbl_Produce[Type])), COLUMNS($G$2:G2))))}
...but without success. Would appreciate a hint. Many thanks
Hi Luke,
I'm just wondering  could this be modified so that the output in E2 would be the sum of Brocolli, Spinach and Peas? I'm working with numerical data and need to sum across the rows so that I have only one row per Criteria. I'm a complete newcomer to Excel so am really struggling to figure out how that could be done, if at all.
Thanks!
@Steph.. you can use SUMIFS formula for this. See:
http://chandoo.org/wp/2010/04/20/introductiontoexcelsumifsformula/
Hi, I have some troubles with this formula. If I apply it to some basic numbers, it appears to work fine, but when I refer to some cells on a different sheet which are not just numbers but values based on a vlookup or other formulas, it stops working. What could the problem?
Any help would be appreciated!
@Roqy
Any chance of posting a file ?
Refer: http://chandoo.org/forums/topic/postingasampleworkbook
I think there is a simple formula to work around:
=IF(SEARCH($E$2,$A$2:$A$10),$B$2:$B$10)
Here we will have the same results.
Waiting for your comments.
Use it as an array Formula.
For Blank values:
=IFERROR(INDEX($A$2:$B$10,LARGE(IFERROR(IF(SEARCH($E$2,$A$2:$A$10),MATCH($B$2:$B$10,$B$2:$B$10,0)),""),ROW($A$1:$A$10)),2),"")
@ALL, Waiting for your comments!
{=IFERROR(INDEX($A$2:$B$10,LARGE(IFERROR(IF(SEARCH($E$2,$A$2:$A$10),
MATCH($B$2:$B$10,$B$2:$B$10,0)),""),
ROW($A$1:$A$10)),2),"")}
I'm using the above fuction, which works perfectly for what I am doing, I was wondering if there was a way to copy the format of the Name cells along with values by adding onto this function or is this only possible with a macro?
I am very excited to see that there is a way to do this. I have been looking for just such a solution, although my goal is a bit more complex. I am a General Ledger Accountant and have to reconcile and post all of our companies revenue. I am uploading a copy of the above example workbook, if I can figure out how to do it. This is my first post on Chandoo. My copy of Luke's workbook has a couple of extra tabs for example purposes. These tabs include some data that needs to be broken out across a number of separate worksheets to make the reconciliation process simpler.
I would love a formula that I could use on each worksheet, that would look at the master list ("My Data" Worksheet), and based on the GL code in column C, pull all six columns of data into the appropriate worksheet (see sheet names), without a bunch of blank rows.
Right now I have individual sheets for each GL account, 35 in all, pull all data from the "My Data" tab, and then just filter by the GL code column. Because my data set can be well over a thousand rows, the workbook is quite large, not to mention I have report worksheets for each GL code as well, giving me over 70 worksheets in all, because I have to reconcile each account independently.
I would love any assistance I can get on creating formulas, or even a macro to get this done. The formulas included above are a bit out of my depth, so I can't say I really understand them yet.
Thank you!
Gary
what if i wanted to get all three matching values in one cell..i mean if i press f9 it should all 3 in one cell
@Chaitanya
Can you post the question in the Chandoo.org Forums
http://forum.chandoo.org/
Please attach a sample file as I can't visualize what you require
this question is realted to the question above..so it wont make sense to post it in some forum...so is there any solution to this???
@Chaitanya
Except that "Some Forum" is part of Chandoo.org and is a vibrant community visited by many people and is used by lots of clever people to assist others solving there problems, unlike this Post which is now 4 years old and unless somebody reads the post and then bothers to read the comments may remain unseen
Attaching a sample file may well help you and the post above (which remains unanswered after 1+ years)
You may also be interested in my follow up post to this post at:
http://chandoo.org/wp/2014/11/10/formulaforensicsno003blukesrewardpartii/
Where I deal with applying this with multiple criteria
thank you! will look into it..
I still think my question is not yet answered...Can someone answer this? to get all the matching values in one cell...
@CHAITANYA
I wouldn't even try to think about doing that or how to do it
If your really serious about having this A,B,C,D etc in one cell
I'd use the technique here in a helper column, Then append the results together
@ Hui
actually it can be done by sorting the value to be matched in an alphabetical order initially and then use a offset function to get the range of matched values in one cell...but I do not want to sort it everytime...i was actually trying to get it using a single formula...
Ok, this helps a lot but I have a question, what should I do if I want the output column to display everything BUT vegetables? I know this table only shows fruits and vegetables, but if it showed fruits, vegetables, meats, grains, etc and I want the output table to show everything that is not a Vegetable, how should I do?
@Boricua
To filter not matching a field use
E2: =IF(COUNTIF(A:A,"<>"&$D$2)<ROWS($E$2:E2),"" ,INDEX(B:B, SMALL(IF($A$2:$A$10 <>$D$2,ROW($A$2:$A$10)),ROW(A1)))) Ctrl+Shift+Enter
You just made my day! Thanks a lot!
This formula is very useful. But my job is not done yet as I need to get a list of unique values. In other words this formula shows all the values, even if the appear more than once. I need a single value to show up just once. Any ideas how to get that?
Many thanks
WOW!!
I have been looking for this solution for over a month. I still have no clue how to understand the functions used in this magically created formula, but when I copied and paste it into Google, MODIFY THE QUOTATION MARKS to be google quotation marks (As in, reenter them ""), IT WORKED!
I entered the formula in each column changing the Index(B:B to the column that had the data I wanted displayed and it work like a charm!
In above example
@Ryan
This is one of the best formulas i've ever seen and regularly on my to use list
You may also like a version where multiple criteria are included:
Refer: http://chandoo.org/wp/2014/11/10/formulaforensicsno003blukesrewardpartii/
I could do it using the below formula:
=IFERROR(INDEX($B$2:$B$8, MATCH(0,IF($D$2=$A$2:$A$8, COUNTIF($E$1:E1, $B$2:$B$8)),0)),"")
Chandoo,
Is there a way to add in to the formula something that specifies WHICH column of criteria to use? For example, in your formula: =IF(COUNTIF(A:A,$D$2) < ROWS($E$2:E2), “”, INDEX(B:B, SMALL( IF($A$2:$A$10 =$D$2, ROW( $A$2:$A$10)), ROW(A1)))) ....
Imagine the columns "Name", "Criteria", and "Output" are shifted over to D, F, and G respectively, and other criteria columns, such as "Color" and "Shape" are added in columns B and C. I want to be able to choose column A, B, or C ("Type", "Color", or "Shape) based on a dropdown, for example. The formula should then adjust the ranges below in your original formula based on the what the user has selected in the dropdown:
COUNTIF(A:A
SMALL( IF($A$2:$A$10
ROW( $A$2:$A$10
So for example, if the user wants the criteria to be based on Column B, a specific "Color", all of the above A ranges would turn to B ranges.
Is there a way to do this!?
@Nico
Any range can be specified as an Offset from another range
eg: If the formula above $A$2:$A$10
it can be replaced with Offset($A$2:$A$10,,Z1)
where Z1 is a Column Offset number and can be chosen from a Drop down etc
The issue is that you also have to change the criteria
so:
=IF(COUNTIF(A:A,$D$2) < ROWS($E$2:E2), “”, INDEX(B:B, SMALL( IF($A$2:$A$10 =$D$2, ROW( $A$2:$A$10)), ROW(A1)))) might become: =IF(COUNTIF(offset(A:A,,z1),offset($D$2,,z1)) < ROWS($E$2:E2), “”, INDEX(B:B, SMALL( IF(offset($A$2:$A$10,,z1) =offset($D$2,,z1), ROW( $A$2:$A$10)), ROW(A1))))