• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Lookup top 10 values in a table with conditions

MBegg

New Member
Hi all,


I'm dealing with a large data table, and want to be able to lookup the rows with the top 10 values (in one column) that match a given condition (in a different column). So far I've done it with some intermediate steps, but am hoping that I can cut these out as it's starting to get quite slow as the table gets larger. So my current method is:


1. Pull out entries that match the condition.

{=INDEX([row name column],MATCH(1,(COUNTIF([already looked up?])=0)*IF([Condition],1,0),0))}


2. Lookup values for these entries

=INDEX([values column],MATCH([row name],[row name column],0))


3. Rank the entries extracted

=RANK([row name],[list of names looked up])


4. Remove any repeat rankings to give a discrete list

=[row rank]+COUNTIF([other rows already given the same rank])


5. Reorder to give the top 10 in order

=INDEX([rankings],MATCH({1-10},[non-repeat rankings]))

=INDEX([row names],MATCH({1-10},[non-repeat rankings]))

=INDEX([values],MATCH({1-10},[non-repeat rankings]))


I've uploaded an example file here:


http://dl.dropbox.com/u/7313338/Lookup%20top%2010%20with%20condition.xlsx


Any suggestions as to how I could get the same result more efficiently?
 
Hi ,


Use the following formula in E4 and copy it downwards :


=IFERROR(LARGE(IF($D$4:$D$16="yes",$C$4:$C$16,""),ROW(A1)),"")


This is to be entered as an array formula ( using CTRL SHIFT ENTER ).


This will work only with Excel 2007 and later.


Narayan
 
Hi,

I've download your workbook and made some adjustment to Narayank991 post

=LARGE(IF($D$4:$D$16="yes";$C$4:$C$16;"");H4) as array formula in D36 (last table)

and

=INDEX($B$4:$B$16;MATCH(J4;$C$4:$C$16;0)) in C36


and copy it downwards.
 
Hi guys, thanks for your responses. You guys have both proposed the following solution:


1. Lookup the top 10 largest values in the values column

=LARGE(IF([conditions column]="yes",[values column],""),{1:10})


2. Lookup the row name from the values

=INDEX([names column],MATCH([ranked value],[values column],0))


This works great, apart from when there is more than one row with the same value. Though you get the top 10 values correctly, if more than one is the same when you perform the index / match it simply finds the same name entry. However, this made me realise that you can eliminate this issue using the same countif construction I had to do the name lookup in my original. This then cuts out all of the intermediate steps. Method as follows:


1. Lookup the top 10 largest values in the values column

=IFERROR(LARGE(IF($D$4:$D$16="yes",$C$4:$C$16,""),ROW()-ROW($D$19)),"")


2. Lookup the row name from the value, not allowing repeats

=IFERROR(INDEX($B$4:$B$16,MATCH(1,IF(D20=$C$4:$C$16,1,0)*(COUNTIF($C$19:C19,$B$4:$B$16)=0),0)),"")


This then gives exactly the same result as my original example - see updated file linked below.


*HOWEVER*


I've realised that this isn't quite what I need to do! (it's a while since I wrote the original so had forgotten quite what I was doing...)


I'm actually not looking for the row with the highest value, but the parameter. So the actual process I'm trying to replicate is as follows:


1. Pull out unique parameter values that have entries which match the condition.

{=INDEX([row name column],MATCH(1,(COUNTIF([already looked up?])=0)*IF([Condition],1,0),0))}


2. Lookup the sum of values for these parameters where the entries match the condition

=SUMPRODUCT(([name]=[names column])*([Condition]="yes"),[values column])


3. Rank the totals extracted

=RANK([row name],[list of names looked up])


4. Remove any repeat rankings to give a discrete list

=[row rank]+COUNTIF([other rows already given the same rank])


5. Reorder to give the top 10 in order

=INDEX([rankings],MATCH({1-10},[non-repeat rankings]))

=INDEX([row names],MATCH({1-10},[non-repeat rankings]))

=INDEX([values],MATCH({1-10},[non-repeat rankings]))


I've made an example version of this in a new version of the file - see sheet titled "More difficult...":


http://dl.dropbox.com/u/7313338/Lookup%20top%2010%20with%20condition%20-%20summed.xlsx
 
Solved it myself! Used a pivot table with "top 10" filter applied, all total / header rows removed to fit in with the rest of the sheet, then wrote a macro to refresh when the state of the conditions are changed.


The actual application of this is a lot more complicated than the example version I uploaded, multiple filters that can be turned on / off to give an output for a dashboard report. Unfortunately can't share as it's a work thing, but rest assured it's pretty cool - also much zippier with the pivots than with the two formulae I wrote originally:


1. Get the names


=IFERROR(INDEX(Table_Data[Task],MATCH(1,IF(FilterBy_Region?,Table_Data[Person Region]=Filter_Region,1)*IF(FilterBy_Country?,Table_Data[Client Country]=Filter_Country,1)*IF(FilterBy_Role?,Table_Data[Person Role]=Filter_Role,1)*IF(FilterBy_Person?,Table_Data[Person Name]=Filter_Person,1)*IF(FilterBy_Client?,Table_Data[Client Name]=Filter_Client,1)*IF(FilterBy_Project?,Table_Data[Project Name]=Filter_Project,1)*IF(FilterBy_Task_Type?,Table_Data[Task Type]=Filter_Task_Type,1)*(0=COUNTIF($O$28:O28,Table_Data[Task])),0)),"")


2. Get the totals


=SUMPRODUCT((Table_Data[Task]=O29)*IF(FilterBy_Region?,Table_Data[Person Region]=Filter_Region,1)*IF(FilterBy_Country?,Table_Data[Client Country]=Filter_Country,1)*IF(FilterBy_Role?,Table_Data[Person Role]=Filter_Role,1)*IF(FilterBy_Person?,Table_Data[Person Name]=Filter_Person,1)*IF(FilterBy_Client?,Table_Data[Client Name]=Filter_Client,1)*IF(FilterBy_Project?,Table_Data[Project Name]=Filter_Project,1)*IF(FilterBy_Task_Type?,Table_Data[Task Type]=Filter_Task_Type,1),Table_Data[Weeks])
 
Back
Top