Formula Forensics No. 003 – Lukes Reward

Posted on November 18th, 2011 in Formula Forensics , Learn Excel , Luke , Posts by Luke - 54 comments

Hello everyone. My name is Luke M, and I’ve been coming to Chandoo’s site for about a year now.

I love solving problems, and helping people out on the forums.

Due partly to Hui’s challenge about submitting articles, and 3G’s comment the other day about this particular formula, I decided to try my hand at writing an article.

I Hope it helps explain some cool Excel tricks that I like to use. 🙂

 

The Problem

Often, I see a request from an individual asking for a formula that will be able to generate a list based off of some criteria, with no spaces/blanks.

As Chandoo & Hui have often pointed out, there’s not much Excel can’t do if you know the right functions to use.

Today, we’ll take a look at how we can accomplish this particular task.

Suppose I want to be able to generate a list of all the Vegetables.

In cell E2, I put this array formula:

=IF(COUNTIF(A:A,$D$2) < ROWS($E$2:E2), “”,  INDEX(B:B,  SMALL( IF($A$2:$A$10 =$D$2, ROW( $A$2:$A$10)), ROW(A1))))

Remember, array formulas need to be confirmed using Ctrl+Shift+Enter, and will have curly brackets { } around the formula if done correctly.

I then copy the cell downward as far as I think will ever be necessary to display all the records (i.e. E7).

Lets Look Inside

Let’s take a closer look at how the formula works.

Front Half

First, let’s look at the IF function’s logic check.

=IF(COUNTIF(A:A,$D$2)<ROWS($E$2:E2),””, INDEX(B:B, SMALL( IF($A$2:$A$10 = $D$2, ROW($A$2:$A$10)), ROW(A1))))

We’re using the COUNTIF function to determine the total number of records that meet our criteria. We’re then comparing this to a ROWS function. The ROWS function simply returns the number of rows given in the argument. Note that the first part of the range callout uses an absolute reference and will not change, while the latter part is relative and will change as the formula is copied down. Thus, in the first cell, the ROWS function evaluates to 1. The next cell, it will evaluate to 2, then 3, and so on. So, the IF statement is checking to see if the number of records returned so far (i.e., formula used) is greater than the total number of possible records. If this is true, return a blank (i.e., “”).

Back Half

The latter half of the formula is where things get tricky.

For this part, let’s work our way from the inside out.

We start off with another IF function:

INDEX(B:B,SMALL(IF($A$2:$A$10=$D$2, ROW($B$2:$B$10)), ROW(A1))))

This section compares A2:A10 with our criteria given in cell D2. So, the array if A2:A10 starts off looking like this:

{Fruit, Fruit, Vegetable, Vegetable, Fruit, Vegetable, “”, “”, “”}

When we compare it with the criteria, it becomes this:

{False, False, True, True, False, True, False, False, False}

Looking at the return values in our IF function, we see that only a True result is stated, the ROW.

 

INDEX(B:B,SMALL(IF($A$2:$A$10=$D$2, ROW($A$2:$A$10)), ROW(A1))))

So, each True value from the array above will be replaced with the corresponding Row value.

This causes the array to become this:

{False, False, 4, 5, False, 7, False, False, False}

Now that we have a nice array with some numbers in it, this gets fed into the SMALL function.

INDEX(B:B,SMALL(IF($A$2:$A$10=$D$2, ROW($A$2:$A$10)), ROW(A1))))

 

The ROW function at the end will serve as a type of counter.

In E2, where we initially place the formula, this will evaluate to 1, thus telling the SMALL function to return the 1st smallest number.

In E3, it will evaluate to 2, and the SMALL function will return the 2nd smallest number, and so.

So, taking the 1st smallest number from our array, we get the number 4.

We then take this to the INDEX function

INDEX(B:B,SMALL(IF($A$2:$A$10=$D$2, ROW($A$2:$A$10)), ROW(A1))))

Note that we need to callout the entire column, since we are plugging in row numbers.

The 4th row in column B leads us to the value “Broccoli”.

The next formula will return the 5th row, “Spinach”.

The 3rd formula will return the 7th row, “Peas”.

 

This method can be adapted for use with multiple criteria. We would just need to expand the IF function logic checks so that only the correct rows are returned.

 

PS. If it gets too confusing, the first part of the formula can be omitted.

=IF(COUNTIF(A:A,$D$2)<ROWS($E$2:E2),””, INDEX(B:B, SMALL( IF($A$2:$A$10 = $D$2, ROW($A$2:$A$10)), ROW(A1))))

It is just there to hide any unwanted #NUM errors after all the pertinent records have been displayed.

 

EXTENSION

To see how this technique can be extended to use multiple criteria please read the follow up post at:

http://chandoo.org/wp/2014/11/10/formula-forensics-no-003b-lukes-reward-part-ii/

 

DOWNLOAD

You can download a copy of the above file and follow along, Download Here.

 

OTHER POSTS IN THIS SERIES

You can learn more about how to pull Excel Formulas apart in the following posts

Formula Forensic 001 – Tarun’s Problem

Formula Forensic 002 – Joyce’s Question

 

THANK-YOU and a CHALLENGE

Firstly a Congratulations to Luke M on taking up the challenge and on your First Post at Chandoo.org.

Thank-you for explaining to us all how this formula, which has appeared a number of times on the Chandoo.org Forums, works.

The contents of the Post are published as Luke submitted it with only minor formatting changes.

 

My Challenge to you is this:

If you have a neat formula that you would like to share and explain, try putting pen to paper and draft up a Post as Luke Did above.

If you have a formula that you would like explained but don’t want to write a post also send it in to Chandoo or Hui.

Send it to Chandoo or Hui.

Written by Hui...
Tags: , , , , , , ,
Home: Chandoo.org Main Page
? Doubt: Ask an Excel Question

54 Responses to “Formula Forensics No. 003 – Lukes Reward”

  1. Busymanjohn says:

    Hi Luke M , as always great post, this would also work using a drop down menu for the criteria rather than typing in the criteria in cell D2, of course the Type would also have to be changed to 1,2,3 etc depending on the criteria ( 1 for Vegetable, 2 for Fruit etc ), but the formula is sound .... I don't use array formulas much as most folk I work with as basic to intermediate users ,,, but my internal online community is growing and more people are learning about how powerful Excel can be and how much of a useful tool it really is.

  2. Matth78 says:

    Hello Luke,

    I have encountered the same kind of problem.
    My approach was to create a helper column which concatenate the name and the number of its occurrence.

    In your example in C2 I will use : =A2&"_"&Countif(A$2:A2,A2) => giving Fruit_1
    then copy it all the way down.
    The formula in E2 will become : =INDEX(B:B,MATCH(D2&"_"&ROW(A1)-1,C:C,0))

    Nb :
    For ease of use I will number the line in column F, so in F2 => =F1+1
    and replace in E2 "ROW(A1)-1" with F2
    => no matter which line you start at the first line will alway be numbered 1

    Edit: By Hui

  3. Yard says:

    Kudos for understanding and using such a formula.

    Personally I'd keep it simple and use a Pivot Table.

  4. Luke M says:

    Thanks everyone!

  5. F106dart says:

    Great technique and very nicley explained! I will definitely use these ideas.

    I think you have one small typo. The line that begins "In E2, it will evaluate to 2..." should actually read "In E3, it will evaluate to 2...", if I'm not mistaken. If I am, apologies * 10^3.

  6. Hui... says:

    @F106dart
    Thanx & Fixed

  7. red Shirt Ensign says:

    Cell E2 and E7
    These formulas in the download example do not show up as an Array the rest of them in Column E do... is this a typo?

  8. Eric says:

    Hi, In the download example, I see the first vegetable is Apple! Apple is a fruit!

  9. Hui... says:

    @ Red Shift Ensign, Eric
    The Download file is fixed
    I had entered the formulas in E2 & E7 as opposed to Array Entering them

  10. Réal A says:

    This formula is very interesting. Would it be possible to enter a second variable. Say we added in column "C" the color, could we generate the same type of search, with Criteria 1 being Vegetable and Criteria 2 being Green?

    Can we add a second argument in the formula?

    Thanks

  11. James says:

    Thanks for the post. I've watched numerous videos on youtube explaining similar data extraction formulas, and I never quite "got it". This article was very helpful! Thanks for your effort.

  12. mahendra says:

    Hi,
    The formula given above is really very nice & interesting, but i think there is an easier solution for the same. That is useing filters. However I can say that it as "the filter formula".

  13. Hui... says:

    @Mahendra
    In Excel there are usually a number of ways to do anything
    As part of a system Filters require input or VBA code
    This solution negates either and hence it's use in dashboards and other reports is neat and tidy

  14. Stephen says:

    thank Luke for great post. I've tweeked for 2007 to use IFERROR(...) and make it easier for me:
    =IFERROR(INDEX(B:B,SMALL(IF($A$2:$A$10=$D$2,ROW($A$2:$A$10)),ROW(A1))),"")

  15. Stephen says:

    BTW to do multiple filters you need to adopt a slightly IF(...) criteria:
    =IFERROR(INDEX(B:B,SMALL(IF(($A$2:$A$10=$D$2)*($B$2:$B$10="Broccoli"),ROW($A$2:$A$10)),ROW(A1))),"")

    in this case I have used ($A$2:$A$10=$D$2)*($B$2:$B$10="Broccoli") to bring back "vegitables" called "broccoli". Small works if you have an array of 0s and 1s and seams to work in this case perfectly.

    I tried this example for completeness:
    =IFERROR(INDEX(B:B,SMALL(IF(($A$2:$A$10=$D$2)*(LEFT($B$2:$B$10,1)"B"),ROW($A$2:$A$10)),ROW(A1))),"")

  16. Luke M says:

    @Réal A
    Yes, you can do multiple criteria. The trick is multiplying the criteria against each other, as Stephen showed in his comments.

    @Stephen,
    Thanks for the 2007 tweak. The IFERROR is one of few things I like about 2007.

  17. Isaac says:

    Have a question about the "front half" explanation -- it mentions testing to see whether the number of returns is "greater than" the total possible. It looks like "less than" in your formula. Please confirm --

    I appreciate these forensic formulae SO MUCH. Keep up your good work.

  18. john@reddybay says:

    Great practical situation and well explained Luke!

  19. Stephen says:

    well, Luke inspired me to take this to the next level. I've been looking for a non-VBA solution to something that is easily done in google aps. {=sort(filter(...))}

    =IFERROR(INDEX(B:B,MATCH(SMALL(IF(($A$2:$A$10=$E$2),COUNTIF($B$2:$B$10,$H$2&$B$2:$B$10)),ROW($A$1:$A$10)),COUNTIF($B$2:$B$10,$H$2&$B$2:$B$10),0)+1),"")

    cell H2 has either "=" based on a drop down list elsewhere.
    Took a while to work it out though.
    p.s thanks to Chandoo on this http://chandoo.org/wp/2008/10/22/sorting-text-in-excel-using-formulas/

  20. 3G says:

    hooooorayyy!!!!!

    I'm going to print this out and take it with me everywhere. 🙂 it really helps clarify the formula. very well written Luke!!

    Thanks for picking my challenge Hui! Luke, this formula has solved so many of my reporting questions...I cannot thank you enough!!

    kudos to you all!

  21. Kim says:

    That is so clever. Thanks Luke for explaining it so well 🙂

  22. Muru says:

    Guys, this is awesome. Very much appreciated effort!
    Last few weeks, I am working on understanding the excel formulas. But, I need your guys’ opinion on how and from where to start with. I really appreciate all your valuable thought.

  23. [...] you would like to share and explain, try putting pen to paper and draft up a Post as Luke did in Formula Forensics 003 or like [...]

  24. Hui... says:

    @Muru
    Start with constructing formulas that you understand, probably based on whatever business your into, or look for existing models on the net similarly based around the business your in.
    It is important to start on models of the business you are in, because at least you will understand the logic behind what the formula is trying to model.
    Then unwrap the models from the outside in
    Use F9 on partial blocks of formulas as demonstrated above.
    Keep an eye out for Named Ranges/Formulas, formulas can be hidden away inside those which simplifies the model but makes reading them slightly harder.
    Another technique is to work through the formula with known values, put values into the formula that you know, and follow the formula through step by step. You may have to iteratively work backwards through the model if large number of helper cells are used.
    Keep reading the Formula Forensics series here: http://chandoo.org/wp/tag/formula-forensics/

  25. Muru says:

    I am sure Hui, your thought will definitely help me to move one step up on the formula ladder.

  26. Ashwin says:

    Hi Luke,
    Thanks for the giving this tip. truely cool tip.
    I applied this code to my sheet and it works just fine....One problem i am facing is, i dont want to repeat the result. i.e:
    If same vegetable are coming in B row the result coloumn should show only one time...any tip for that?

    • Luke M says:

      @Ashwin,

      We'll need to add some additional criteria to the IF function in the back half. Also, since we're always taking a new, unique value, we need to change the SMALL function to MIN. How's this for a start:
      =INDEX(B:B,  MIN( IF(($A$2:$A$10 =$D$2)*(ISNA(MATCH($B$2:$B$10,E$1:E1,0))), ROW( $A$2:$A$10)))) 

      Note that it's still an array formula. Un-needed cells will display a 0. 

      • Michael says:

        Hi Luke, this is what I was after as well, thanks alot! Really well written and explained.

        Is there a way to list the data horizontally (in the next column along) rather than vertically with the above formula (so no duplicate returns)?  

        Its been doing my head in and Hui kindly directed me here from the forums.

        Thanks,
        Mike 

        • Hui says:

          @Mike
           
          in G2 try: =IF(COUNTIF($A:$A,$D$2) < COLUMNS($G$2:G2),"", INDEX($B:$B, SMALL(IF($A$2:$A$10=$D$2, ROW($A$2:$A$10)), COLUMNS($G$2:G2))))  Ctrl+Shift+Enter
          Then copy across
           

  27. Steph says:

    Hi Luke,
    I'm just wondering - could this be modified so that the output in E2 would be the sum of Brocolli, Spinach and Peas? I'm working with numerical data and need to sum across the rows so that I have only one row per Criteria. I'm a complete new-comer to Excel so am really struggling to figure out how that could be done, if at all.
    Thanks!

  28. Roqy says:

    Hi, I have some troubles with this formula. If I apply it to some basic numbers, it appears to work fine, but when I refer to some cells on a different sheet which are not just numbers but values based on a vlookup or other formulas, it stops working. What could the problem?

    Any help would be appreciated!

  29. Vishal Srivastava says:

    I think there is a simple formula to work around:

    =IF(SEARCH($E$2,$A$2:$A$10),$B$2:$B$10)

    Here we will have the same results.

    Waiting for your comments.

  30. For Blank values:
    =IFERROR(INDEX($A$2:$B$10,LARGE(IFERROR(IF(SEARCH($E$2,$A$2:$A$10),MATCH($B$2:$B$10,$B$2:$B$10,0)),""),ROW($A$1:$A$10)),2),"")

    @ALL, Waiting for your comments!

  31. {=IFERROR(INDEX($A$2:$B$10,LARGE(IFERROR(IF(SEARCH($E$2,$A$2:$A$10),
    MATCH($B$2:$B$10,$B$2:$B$10,0)),""),
    ROW($A$1:$A$10)),2),"")}

  32. Jor says:

    I'm using the above fuction, which works perfectly for what I am doing, I was wondering if there was a way to copy the format of the Name cells along with values by adding onto this function or is this only possible with a macro?

  33. Gary Lundblad says:

    I am very excited to see that there is a way to do this. I have been looking for just such a solution, although my goal is a bit more complex. I am a General Ledger Accountant and have to reconcile and post all of our companies revenue. I am uploading a copy of the above example workbook, if I can figure out how to do it. This is my first post on Chandoo. My copy of Luke's workbook has a couple of extra tabs for example purposes. These tabs include some data that needs to be broken out across a number of separate worksheets to make the reconciliation process simpler.

    I would love a formula that I could use on each worksheet, that would look at the master list ("My Data" Worksheet), and based on the GL code in column C, pull all six columns of data into the appropriate worksheet (see sheet names), without a bunch of blank rows.

    Right now I have individual sheets for each GL account, 35 in all, pull all data from the "My Data" tab, and then just filter by the GL code column. Because my data set can be well over a thousand rows, the workbook is quite large, not to mention I have report worksheets for each GL code as well, giving me over 70 worksheets in all, because I have to reconcile each account independently.

    I would love any assistance I can get on creating formulas, or even a macro to get this done. The formulas included above are a bit out of my depth, so I can't say I really understand them yet.

    Thank you!

    Gary

  34. CHAITANYA says:

    what if i wanted to get all three matching values in one cell..i mean if i press f9 it should all 3 in one cell

  35. CHAITANYA says:

    this question is realted to the question above..so it wont make sense to post it in some forum...so is there any solution to this???

  36. CHAITANYA says:

    I still think my question is not yet answered...Can someone answer this? to get all the matching values in one cell...

    • Hui... says:

      @CHAITANYA
      I wouldn't even try to think about doing that or how to do it
      If your really serious about having this A,B,C,D etc in one cell
      I'd use the technique here in a helper column, Then append the results together

  37. CHAITANYA says:

    @ Hui
    actually it can be done by sorting the value to be matched in an alphabetical order initially and then use a offset function to get the range of matched values in one cell...but I do not want to sort it everytime...i was actually trying to get it using a single formula...

  38. Boricua says:

    Ok, this helps a lot but I have a question, what should I do if I want the output column to display everything BUT vegetables? I know this table only shows fruits and vegetables, but if it showed fruits, vegetables, meats, grains, etc and I want the output table to show everything that is not a Vegetable, how should I do?

  39. stefano says:

    This formula is very useful. But my job is not done yet as I need to get a list of unique values. In other words this formula shows all the values, even if the appear more than once. I need a single value to show up just once. Any ideas how to get that?
    Many thanks

Leave a Reply