“Start with a joke,” my boss used to say whenever I was nervous about an upcoming presentation. Although I am not nervous about posting this article, I think a joke always helps.
So here it goes:

[originally posted on 5th May 2008]
Now to more serious matters.

VLOOKUP (and other lookup formulas) are very powerful and quite practical. They can fetch you the information you are looking for from a heap of data.
Now that we have seen the power of VLOOKUP through several posts this week, I want to test your understanding of these formulas by presenting 3 challenges.
Download the Excel workbook with these challenges
Click here to download the Excel workbook with all the data for these challenges.
Challenge # 1: Price After Discount
We come across this problem quite often. You have a list of discount codes and applicable quantity thresholds. For example, you may sell an item at $50, but if I buy more than 1 item, you will give a 10% discount. The discount goes up as I purchase more.
Now, given a list of item quantities, how do you calculate the amount payable using lookup formulas? That is our first challenge.
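If it helps to write the logic down before building it in Excel, here is a minimal sketch in Python of one way to think about it. The discount table is hypothetical (only the $50 price and the 10% discount for more than 1 item come from the example above), and the names discount_for and amount_payable are made up for illustration.

```python
from bisect import bisect_right

# Hypothetical discount table: tier i starts at THRESHOLDS[i] items.
# Only the $50 price and the 10% tier come from the post; the rest is assumed.
UNIT_PRICE = 50.0
THRESHOLDS = [1, 2, 4]          # must be sorted in ascending order
DISCOUNTS = [0.00, 0.10, 0.15]


def discount_for(quantity):
    """Return the discount of the largest threshold that is <= quantity."""
    position = bisect_right(THRESHOLDS, quantity) - 1
    if position < 0:
        raise ValueError("quantity is below the smallest threshold")
    return DISCOUNTS[position]


def amount_payable(quantity):
    """Apply the same discount rate to every item in the order."""
    return quantity * UNIT_PRICE * (1 - discount_for(quantity))
```

This mirrors what an approximate-match VLOOKUP (fourth argument TRUE) does on a threshold column sorted in ascending order: it settles on the largest threshold that does not exceed the quantity.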

Challenge # 2: Price after accumulated quantity discount
This is essentially the same as the challenge above, but the discounts apply to the accumulated quantities bought so far. For example, I will get the first item at 0% discount, the 2nd and 3rd items at 10% discount, the 4th item at 15% discount … the 26th item at 50% discount, etc.
Now, given a list of customer names and quantities they bought (in the same order), how do you calculate the amount payable for each transaction?
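Here is a rough sketch of the logic in Python, under two assumptions of mine: the running total is kept per customer, and each item is priced by its position in that customer's running total. Only the first three tiers from the example are included, because the tiers between the 4th and the 26th item are not spelled out here, so the sketch is only meaningful for small quantities. The names unit_discount and amounts_payable are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict

UNIT_PRICE = 50.0               # price from the example in challenge 1
THRESHOLDS = [1, 2, 4]          # tiers start at the 1st, 2nd and 4th item
DISCOUNTS = [0.00, 0.10, 0.15]  # higher tiers are elided in the post, so omitted


def unit_discount(position):
    """Discount for the item at a given position in a customer's running total."""
    return DISCOUNTS[bisect_right(THRESHOLDS, position) - 1]


def amounts_payable(transactions):
    """transactions: list of (customer, quantity) pairs in purchase order."""
    bought_so_far = defaultdict(int)
    amounts = []
    for customer, quantity in transactions:
        start = bought_so_far[customer]
        # Price each item separately, by its accumulated position.
        total = sum(
            UNIT_PRICE * (1 - unit_discount(position))
            for position in range(start + 1, start + quantity + 1)
        )
        amounts.append(total)
        bought_so_far[customer] = start + quantity
    return amounts
```

In a worksheet, the bought_so_far bookkeeping is the kind of thing a helper column with a running SUMIF per customer could hold.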

Challenge # 3: Closest price based on the quantity purchased
This is an interesting challenge. The price after discount is determined by the quantity bought. For example, the discount thresholds are 1, 3, 5, 10, 25, etc. Now, given a quantity of items bought, we determine the price by finding the threshold closest to it. So, a quantity of 7 will get the price from threshold 5 rather than 10.
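The "closest threshold" rule can be sketched in Python as below. The thresholds are the ones listed above; the prices are hypothetical placeholders, and breaking ties towards the lower threshold (for example, a quantity of 4 sits equally far from 3 and 5) is my assumption, since the challenge does not say what should happen there.

```python
# Thresholds from the post; the prices are hypothetical placeholders.
PRICE_BY_THRESHOLD = {
    1: 50.0,
    3: 45.0,
    5: 42.5,
    10: 40.0,
    25: 25.0,
}


def closest_threshold(quantity):
    """Return the threshold with the smallest absolute distance to quantity.

    Thresholds are scanned in ascending order and min() keeps the first
    minimum, so a tie goes to the lower threshold (an assumption).
    """
    return min(sorted(PRICE_BY_THRESHOLD), key=lambda t: abs(t - quantity))


def price_for(quantity):
    """Look up the price attached to the closest threshold."""
    return PRICE_BY_THRESHOLD[closest_threshold(quantity)]
```

With these definitions, closest_threshold(7) gives 5, matching the example above, because 7 is 2 away from 5 but 3 away from 10.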

A few guidelines on solving these challenges:
Although the above problems might appear simple, the solutions are not so straightforward.
- Use a variety of formulas: Do not just rely on VLOOKUP. Instead, experiment with formulas like SUMIF, COUNTIF, INDEX, MATCH etc. to get results
- Use helper columns: Break down the problem into several steps and use helper columns to get the results
- Use pen & paper: Write down the logic first, then simulate it in Excel using formulas. It clears your mind fast.
- Many solutions exist: Each problem can be solved in several different ways. So once you find a solution, feel free to explore other options
- Share your solutions: Use the comments box to share your solutions with us. I am always looking for new ways to solve problems. So teach me…
Solution to the Challenges:
Here is a workbook with one set of solutions for the problems. As I said, many other solutions do exist. So use this workbook as an indication of what is possible.
Click here to download excel workbook with all the data for these challenges.
One Link to More VLOOKUP Awesomeness:
Debra at Contextures has chipped in with some interesting videos on VLOOKUP formulas. Check them out here.
The 2nd Joke:
It is quite difficult to set an expectation and then meet it. More so with jokes. But did you know that Chandoo.org’s 404 pages show Excel error messages? For example, go to http://chandoo.org/wp/missing_file/. Refresh the page to see a different message. 🙂
It is Diwali (the festival of lights) in India this weekend. So I am going to spend time with family, light some fireworks and relax. I wish you a happy Diwali if you celebrate it. Even otherwise, I wish you a lot of light and warmth in your life this year.

8 Responses to “Pivot Tables from large data-sets – 5 examples”
Do you have links to any sites that can provide free, large, test data sets. Both large in diversity and large in total number of rows.
Good question, Ron. I suggest checking out kaggle.com or data.world, or creating your own with RANDBETWEEN(). You can also get a complex business data-set from the Microsoft Power BI website: the Contoso retail data.
Hi Chandoo,
I work with large data sets all the time (80-200MB files with 100Ks of rows and 20-40 columns) and I've taken a few steps to reduce the size (to 20-60MB) so they can be shared more easily and work more quickly. These steps include: creating custom calculations in the pivot instead of having additional data columns, deleting the data tab and saving as an xlsb. I've even tried INDEX/MATCH instead of VLOOKUP, although I'm not sure that saved much. Are there any other tricks to further reduce the file size? Thanks, Steve
Hi Steve,
Good tips on how to reduce the file size and / or processing time. Another thing I would definitely try is to use the Data Model to load the data rather than keep it in the file. You would:
1. connect to the source data file through Power Query
2. filter away any columns / rows that are not needed
3. load the data to model
4. make pivots from it
This would reduce the file size while providing all the answers you need.
Give it a try. See this video for some help - https://www.youtube.com/watch?v=5u7bpysO3FQ
Normally when Excel processes data it utilizes all four cores on a processor. Is it true that Excel drops to using only two cores when calculating tables? And if only two cores were present, would it drop to one for a table?
I ask because I have personally noticed that when I use tables, the data is much slower than if I had filtered it. I like tables for obvious reasons when working with datasets. Is this true?
John:
I don't know if it is true that Excel Table processing only uses 2 threads/cores, but it is entirely possible. The program has to be enabled to handle multiple parallel threads. Excel Lists/Tables were added long ago, at a time when 2 processes was a reasonable upper limit. And, it could be that there simply is no way to program table processing to use more than 2 threads at a time...
When I've got a large data set, I will set my Excel priority to High through Task Manager to allow it to use more of the available processing power. Never use RealTime priority or you're completely locked up until Excel finishes.
That is a good tip Jen...