Top 5 with above average – Power Pivot Trick
Welcome to Power Mondays. Every Monday, learn all about Power BI, Power Query & Power Pivot in full length examples, videos or tips. In this installment, learn how to get top 5 list with a twist.
Let’s say you are analyzing sales data and you want to know who are your top 5 with above average sales persons?
Of course, this is simple, you just create a pivot to see total sales by person and then sort the pivot. First five rows have the answer you need. You can even apply a value filter > top 5 to show only their data.
But what if there is a twist in the story?
You need to show top 5 sales people (by order quantity) with above average orders, like below:
Now things get interesting. We couldn’t create a regular pivot to get this answer. So what next?
Calculating Top 5 with Above Average sales in Power Pivot
We can use Power Pivot to answer such questions elegantly. Read this 3 step tutorial to learn the process.
1. Arrange the data and send it to model
Let’s say your sales data looks like this. First column is the name of sales person and second column is order quantity. Convert your data to a table.
Now insert a pivot from this table. Make sure you enabled the “Add this data to data model” option so we can build measures thru Power Pivot.
2. Create measures to find out “Above average orders”
Go to Pivot table field list and right click on the table name. You will see “Add measure” option. Click on it.
We will create 3 measures.
- Order Count:= COUNTROWS(data)
- Average Order Size := CALCULATE(AVERAGE(data[Orders]), ALL(data))
- Order Count > Avg. Order Size := CALCULATE([Order Count], FILTER(data, data[Orders] > [Average Order Size]))
Let’s understand these measures:
- Order Count: This simply counts the rows in data table. As each row is one order, you get order count.
- Average Order Size: This tells us the average order size for ALL orders ; hence the ALL(data)
- Order Count > Avg. Order Size: This calculates [Order Count] (ie row count) by first filtering orders that are > [Average Order Size]. So we end up counting orders where the order size is more than the total average.
3. Set up the pivot and apply top 5 filter
Now that we calculated all the necessary bits, let’s put this in to action.
- Add sales person to row labels and add Order count > Avg. order size to values area
- Now go to filter on “Sales Person” > Value Filters > Top 10…
- Set up the top 5 value filter as shown below.
You are done. You will get top 5 sales persons by above average order count. Feel free to reverse the top 5 to bottom 5 or calculate some other measures.
Download top 5 with a twist – Example workbook
Click here to download top 5 with a twist example workbook. Examine the measures by going to Data > Manage Data Model or use them in a pivot. Go ahead and create some other measure and test it.
More DAX for you…
If you are learning Power Pivot, check out these examples too:
- How to get percentage of something calculations in Excel Pivot Tables
- Distinct count and blanks with Power Pivot
- Calculating top n, bottom n, others as measures [powerpivotpro]
- And of course, introduction to DAX if you are new.
Thanks to @mjm8686’s tweet which inspired this post.
Hello Awesome...
My name is Chandoo. Thanks for dropping by. My mission is to make you awesome in Excel & your work. I live in Wellington, New Zealand. When I am not F9ing my formulas, I cycle, cook or play lego with my kids. Know more about me.
I hope you enjoyed this article. Visit Excel for Beginner or Advanced Excel pages to learn more or join my online video class to master Excel.
Thank you and see you around.
Related articles:
|
Leave a Reply
« Announcing Excel School v2.0 – Quick intro and details inside | Make funky and creative hand-drawn chart in Excel – Quick tutorial » |
9 Responses to “Top 5 with above average – Power Pivot Trick”
Very good, as usual, Chandoo.
I know what you will say but I suggest you to consider this alternative to your calculation of the average, using the MEDIAN
=CALCULATE(median(Data[Orders]), ALL(Data))
I suggest this because whilst your data appear normal, there are some outliers in there and you know they will skew the mean.
You will tell me this was a DAX exercise and not a statistics exercise but ... be ready!
That is a nice tweak. I would certainly use MEDIAN or other types of benchmarks too (say a target value).
Hi Chandoo -
Awesome post, and I'm glad to see you incorporate more Power Query and Power Pivot material into your blog!
I know this post is more aimed at DAX beginners, but I want to caution the final measure that you supplied. On large datasets, this will run very slowly.
The reason is you're using FILTER() over a fact table, AND you're referring to a measure in the 2nd parameter, Since FILTER() iterates over every row in the table, the [Average Order Size] measure will be evaluated every time.
A more optimized version is this:
Order Count > Avg. Order Size :=
VAR AverageOrder = [Average Order Size]
RETURN
CALCULATE (
[Order Count],
FILTER (
data,
data[Orders] > AverageOrder
)
)
In this case we save the [Average Order Size] value into a variable, and then compare the data[Orders] column to the variable each time. Now the Average Order Size measure is only evaluated once, and this will save a lot of time.
I know it's a hard balance to make content easy to understand yet still robust, but I think there's an opportunity to instill best practices with DAX in beginner-level posts.
Thank you Chris for the insightful comments. I must admit that I was a bit sloppy with the measure. Although I use vars in Power BI all the time, somehow when I write measures in Excel, I forget them. Thanks for sharing the sample. I suggest all DAX practitioners to take note of the key idea - "doing operations on every row of fact table is a NO NO".
Very nice and good post I like your post thanks for sharing.
Hi Chandoo
I really like Dax Queries when returning TOP n values. I thought I'd share my solution.
Btw, your new power mondays series rocks
Top 5 Salesman Orders Above AVG =VAR AVGTotalSales = CALCULATE ( AVERAGE ( data[Orders] ), ALL ( data ) )RETURN TOPN ( 5, ADDCOLUMNS ( VALUES ( data[Salesman] ), "Orders Above Average", COUNTROWS ( CALCULATETABLE ( data, data[Orders] > AVGTotalSales ) ) ), [Orders Above Average] )
Thank you Livio for this example. I always shied away from using TOPN, may be because when I first learned it (2013?) I found it very hard. Can you explain why this works and what the output of this measure will be?
PS: Thanks for the love. I am enjoying the Power Mondays series too. There is so much to share.
Hi Chandoo,
apologies for my late reply.
TOPN is not hard to use, actually would be a no brainer for you.
The first argument is the top N rows you want to return,
The second argument is the table on which ( per each row ) the expression at argument 3 is evaluated. In the other arguments you can specify the sorting type, ASC or DESC and other expressions that can be used to sort by more than 1 expression. TOPN is an iteration so it creates a row context.
TOP 5 Above AVG =
VAR AvgOrderAll =
AVERAGE ( data[Orders] ) -- calculate the average for all salesmen
RETURN
TOPN (
5,
-- top N results to return
-- table grouped by salesman and with a count of order above average
SUMMARIZECOLUMNS (
data[Salesman],
"OrderAboveAVG", COUNTROWS ( CALCULATETABLE ( data, data[Orders] > AvgOrderAll ) )
),
[OrderAboveAVG], -- sorting done by number of orders above average
DESC
)
The output will be 2 columns one with Salesman ID and the other with Orders above average count. Actually by using this method you can see that the results returned are 7 and not 5 because you have 5 salesmen who have a count of 16, whereas this is not shown by the pivot table which is excluding 2 salesmen even though they also have a count of 16.
Hello Mr.Candoo, being excel enthusiast. I am a big fan of you, I have a 2 Queries, may be you can resolve it.
1.Is there a way or option tick box at pivot (without calculate d field) where I can average my values ignoring the Bank, empty, zero cell/results. It is observed that, there's result variation at Normal table average total vs pivot average, try data with blank cells, need to be ignored
2. Many a time conditional formatting or the color coding rules at templates are designed by the template creators. However these rules needs to be explained to user in varfying the correct results. Eg. IF I set a conditional formatting that if cell results "above 5" will turn out to Red, when I handover the file, user is unaware why the cell has turn out to red, unless I explain him about what condition I had set. To avoid this situation, I suggest while setting conditional rules, there should be a additional field, to write the narrative, that is why, How the formula is set., which should pop-up like validation remark when a conditional cell is selected.