The use of the Sumproduct function for multiple-criteria Sum Ifs is possibly one of the greatest extensions of an Excel function beyond what it was primarily designed for. Maybe it was actually designed with that in mind?
However, Sumproduct can be extended even further through the use of 2D ranges together with carefully constructed queries.
The examples below are included in the Example File, Excel 2003 Example File.
Scenario 1: Lookup a value within a 2D Range matching 2 criteria
You have a table of Dates, the Fruit Sold and the Number Sold each Day.
How many Bananas did I sell on the 4th May?
In the above I have set up 3 Named Ranges.
Named ranges are used because they make the forthcoming formulas easier to read.
Fruit: C2:H2
Dates: B3:B12
FruitData: C3:H12
So, How many Bananas did I sell on the 4th May?
Using the equation =SUMPRODUCT((Fruit=D16)*(Dates=D15)*FruitData)
Returns the correct answer 31
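For readers who prefer to see the logic spelled out in code, the lookup can be sketched in Python. The table values here are made up for illustration (they are not the figures from the Example File, only chosen so the answer is also 31), and `sumproduct_2d` is a hypothetical helper name, not an Excel function.

```python
# Made-up data standing in for the named ranges Fruit, Dates and FruitData.
fruit = ["Apples", "Bananas", "Cherries"]   # header row, like Fruit
dates = ["3-May", "4-May", "5-May"]         # first column, like Dates
fruit_data = [
    [10, 12, 7],   # 3-May
    [8, 31, 5],    # 4-May
    [11, 9, 6],    # 5-May
]

def sumproduct_2d(row_flags, col_flags, data):
    """Sum row_flags[r] * col_flags[c] * data[r][c] over every cell."""
    return sum(
        row_flags[r] * col_flags[c] * data[r][c]
        for r in range(len(data))
        for c in range(len(data[r]))
    )

# (Dates=D15) and (Fruit=D16) become lists of 1s and 0s.
date_flags = [int(d == "4-May") for d in dates]
fruit_flags = [int(f == "Bananas") for f in fruit]

bananas_on_4_may = sumproduct_2d(date_flags, fruit_flags, fruit_data)
print(bananas_on_4_may)  # 31
```

Only the cell where both flags are 1 survives the multiplication, so the sum is just that one value.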
Related: Doing 2way lookups in Excel
Scenario 2: Sum all values within a 2D Range matching 2 criteria
You have a table of Dates, the Cars Sold and the Number Sold each Day. There are multiple entries for various days, possibly from various salesmen.
How many Holden cars did I sell on the 3rd May?
So, How many Holden cars did I sell on the 3rd May?
Using the equation =SUMPRODUCT((Dates=D17)*(Cars=D18)*CarData)
Returns the correct answer 9 = (1 + 5 + 3)
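The same idea can be sketched in Python to show why repeated dates are summed rather than looked up. The rows below are invented for illustration (only the three Holden values for 3-May mirror the 1 + 5 + 3 above), and `sum_matching` is a hypothetical helper name.

```python
# Made-up rows standing in for Dates, Cars and CarData; 3-May appears three times.
cars = ["Ford", "Holden", "Toyota"]
dates = ["2-May", "3-May", "3-May", "3-May", "4-May"]
car_data = [
    [2, 4, 1],
    [0, 1, 2],
    [3, 5, 0],
    [1, 3, 4],
    [2, 2, 2],
]

def sum_matching(date_wanted, car_wanted):
    """Mimic (Dates=...)*(Cars=...)*CarData summed over every cell."""
    total = 0
    for r, d in enumerate(dates):
        for c, name in enumerate(cars):
            # True/False behave as 1/0 when multiplied, as in the Excel formula
            total += (d == date_wanted) * (name == car_wanted) * car_data[r][c]
    return total

holden_on_3_may = sum_matching("3-May", "Holden")
print(holden_on_3_may)  # 9 = 1 + 5 + 3
```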
Scenario 3: Sum values within a 2D Range matching multiple unordered criteria
You have a table of Dates, the Cars Sold and the Number Sold each Day. There are multiple entries for various days.
How many Ford and Suzuki cars did I sell on the 10th May?
So, How many Ford and Suzuki cars did I sell on the 10th May?
Using the equation =SUMPRODUCT((Dates=D24)*((Cars=D25)+ (Cars=E25))*CarData)
Returns the correct answer 13 = (4 + 5 + 3 + 1)
Note that this can be extended with additional queries, so that the Car Type can be entered in any cell in the range D25:H25:
=SUMPRODUCT((Dates=D24)*((Cars=D25)+ (Cars=E25) + (Cars=F25) + (Cars=G25) + (Cars=H25))*CarData)
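A rough Python sketch of the unordered version follows. The data is invented (only the four matching values mirror the 4 + 5 + 3 + 1 above), `sum_any_of` is a hypothetical helper, and the criteria list plays the part of the cells D25:H25.

```python
# Made-up rows standing in for Dates, Cars and CarData.
cars = ["Ford", "Holden", "Suzuki", "Toyota"]
dates = ["9-May", "10-May", "10-May", "11-May"]
car_data = [
    [2, 1, 0, 3],
    [4, 2, 5, 6],
    [3, 0, 1, 6],
    [1, 1, 1, 1],
]

def sum_any_of(date_wanted, car_list):
    """Mimic (Dates=...)*((Cars=a)+(Cars=b)+...)*CarData."""
    total = 0
    for r, d in enumerate(dates):
        for c, name in enumerate(cars):
            # One comparison per criteria cell, added together
            car_flag = sum(name == wanted for wanted in car_list)
            total += (d == date_wanted) * car_flag * car_data[r][c]
    return total

ford_and_suzuki = sum_any_of("10-May", ["Ford", "Suzuki"])
print(ford_and_suzuki)  # 13 = 4 + 5 + 3 + 1
```

Because each criterion is compared against every header cell, the order in which the car types are entered makes no difference.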
Scenario 4: Sum values within a 2D Range matching multiple ordered criteria
You have a table of Dates, the Cars Sold and the Number Sold each Day. There are multiple entries for various days.
How many Toyota and Holden cars did I sell on the 10th May?
Using the equation =SUMPRODUCT((Dates=D30)*(Cars=D31:H31)*CarData)
Returns the correct answer 21 = (3 + 6 + 6 + 6)
Note that this can be extended to allow additional queries, but each Car Type must be entered in the same position as in the Header Row.
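The position requirement is easier to see in a small Python sketch. The data is invented (only the matching values mirror the 3 + 6 + 6 + 6 above), the header is four cells wide rather than five to keep it short, and `sum_ordered` is a hypothetical helper.

```python
# Made-up rows standing in for Dates, Cars and CarData.
cars = ["Ford", "Holden", "Suzuki", "Toyota"]
dates = ["9-May", "10-May", "10-May", "11-May"]
car_data = [
    [2, 1, 0, 3],
    [4, 3, 5, 6],
    [3, 6, 1, 6],
    [1, 1, 1, 1],
]

def sum_ordered(date_wanted, criteria_row):
    """Mimic (Dates=...)*(Cars=criteria_row)*CarData."""
    # Header and criteria are compared cell by cell, position by position
    col_flags = [int(name == wanted) for name, wanted in zip(cars, criteria_row)]
    return sum(
        (d == date_wanted) * col_flags[c] * car_data[r][c]
        for r, d in enumerate(dates)
        for c in range(len(cars))
    )

aligned = sum_ordered("10-May", ["", "Holden", "", "Toyota"])
misaligned = sum_ordered("10-May", ["Holden", "Toyota", "", ""])
print(aligned)     # 21 = 3 + 6 + 6 + 6
print(misaligned)  # 0, because no criteria cell sits under its own header
```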
How Does This Work?
The above technique uses array arithmetic to set up a conjunctive truth table within the Sumproduct formula.
Using =SUMPRODUCT((B4:B6=D10)*(C3:E3=D9)*(C4:E6))
The conjunctive truth table logic (B4:B6=D10)*(C3:E3=D9) is simply saying: make a matrix of elements that are true when both conditions are met and false otherwise.
Sumproduct then multiplies this by the data values and accumulates the results to get the sum of the matching values.
It is important to note that the width of the Criteria Row and the height of the Criteria Column must match the width and height of the data area, or an error such as #VALUE! is returned.
The Maths
To understand and explain how this works I will use a simple model with 3 rows and 3 columns, see below.
The formula =SUMPRODUCT((B4:B6=D10)*(C3:E3=D9)*(C4:E6)), shown above, consists of 3 areas:
(B4:B6=D10) is a 3 Rows x 1 Column range
(C3:E3=D9) is a 1 Row x 3 Columns range
(C4:E6) is a 3 Rows x 3 Columns range
Breaking the formula into components
=SUMPRODUCT((B4:B6=D10)*(C3:E3=D9)*(C4:E6))
(B4:B6=D10)*(C3:E3=D9) is the same as multiplying 2 arrays, representing the 2 areas as shown below
You can see that I have put a 1 where the components are True and a 0 where they are False.
Where the Date was 3-May Excel evaluates this to 1, and similarly where the Fruit was a Banana Excel evaluates this to 1.
Where a criterion isn’t met, Excel evaluates this to 0.
Multiplying a 3 x 1 array by a 1 x 3 array gives a 3 x 3 array, representing the (B4:B6=D10)*(C3:E3=D9) part of the equation.
Next this is multiplied by the data area
=SUMPRODUCT((B4:B6=D10)*(C3:E3=D9)*(C4:E6))
This is the same as multiplying two 3 x 3 arrays element by element (not as a matrix product), which produces a 3 x 3 array, below:
Sumproduct then adds up all the array components to get the final answer of 3.
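The steps above can be traced in a short Python sketch. The dates, fruit names and data values are assumptions made for illustration (the real figures are in the Example File); they are arranged so that the single matching cell holds 3, as in the walkthrough.

```python
# Assumed values standing in for B4:B6, C3:E3 and C4:E6.
dates = ["1-May", "2-May", "3-May"]      # like B4:B6
fruit = ["Apple", "Banana", "Cherry"]    # like C3:E3
data = [                                  # like C4:E6
    [1, 2, 3],
    [4, 5, 6],
    [7, 3, 9],
]

row_flags = [int(d == "3-May") for d in dates]     # 3 rows x 1 column
col_flags = [int(f == "Banana") for f in fruit]    # 1 row x 3 columns

# 3 x 1 times 1 x 3: every row flag meets every column flag, giving 3 x 3
truth_table = [[r * c for c in col_flags] for r in row_flags]

# 3 x 3 times 3 x 3, cell by cell
products = [
    [truth_table[i][j] * data[i][j] for j in range(3)]
    for i in range(3)
]

# Sumproduct's final step: add up every cell
total = sum(sum(row) for row in products)

print(truth_table)  # [[0, 0, 0], [0, 0, 0], [0, 1, 0]]
print(products)     # [[0, 0, 0], [0, 0, 0], [0, 3, 0]]
print(total)        # 3
```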
Modifications
The Data Area can be included in the Truth Table Logic or as a separate component of Sumproduct.
=SUMPRODUCT((B4:B6=D10)*(C3:E3=D9)*(C4:E6)) and =SUMPRODUCT((B4:B6=D10)*(C3:E3=D9), (C4:E6)) both return the same result.
Multiple “OR” criteria can be added by use of the + operator within the criteria.
In Scenario 3 above, we sum the number of Ford or Suzuki cars sold on the 10th May.
=SUMPRODUCT((Dates=D24)*((Cars=D25) + (Cars=E25) + (Cars=F25) + (Cars=G25) + (Cars=H25))*CarData)
The Or logic is added to the criteria by use of the + operator above within the criteria for Cars.
The And logic is added by use of the * between the Dates and Cars criteria.
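A tiny Python sketch shows why * behaves as And and + behaves as Or once True and False are treated as 1 and 0. The `flag` helper and the sample values are hypothetical. It also illustrates one caveat worth keeping in mind: + counts matches, so the same car type entered in two criteria cells would be counted twice.

```python
def flag(condition):
    """Turn True/False into 1/0, as Excel does inside the multiplication."""
    return int(condition)

car, date = "Ford", "10-May"

# And: both flags must be 1 for the product to be 1
and_hit = flag(date == "10-May") * flag(car == "Ford")       # 1
and_miss = flag(date == "10-May") * flag(car == "Holden")    # 0

# Or: either flag being 1 makes the sum 1
or_hit = flag(car == "Ford") + flag(car == "Suzuki")         # 1
or_miss = flag(car == "Holden") + flag(car == "Suzuki")      # 0

# Caveat: a repeated criterion gives 2, which would double the value
repeated = flag(car == "Ford") + flag(car == "Ford")         # 2
capped = min(repeated, 1)                                     # 1

print(and_hit, and_miss, or_hit, or_miss, repeated, capped)
```

In Excel, one way to get the same capping is to wrap the added criteria in a >0 test before multiplying by the data, assuming duplicate criteria are a possibility in your sheet.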
Other Logic Elements
You can add Greater Than (>), Less Than (<) and other logic elements to the queries to suit your requirements.
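As an illustration, a formula shaped like =SUMPRODUCT((Dates>=D24)*(Cars=D25)*CarData) would sum a car type from a given date onwards. That formula, the data, the year and the `sum_from` helper in the Python sketch below are all assumptions for the example, not taken from the Example File.

```python
from datetime import date

# Made-up rows standing in for Dates, Cars and CarData (the year is arbitrary).
cars = ["Ford", "Holden", "Toyota"]
dates = [date(2011, 5, 8), date(2011, 5, 9), date(2011, 5, 10), date(2011, 5, 11)]
car_data = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [2, 2, 2],
]

def sum_from(start, car_wanted):
    """Mimic (Dates>=start)*(Cars=car_wanted)*CarData."""
    return sum(
        (d >= start) * (name == car_wanted) * car_data[r][c]
        for r, d in enumerate(dates)
        for c, name in enumerate(cars)
    )

holden_from_10_may = sum_from(date(2011, 5, 10), "Holden")
print(holden_from_10_may)  # 10 = 8 + 2
```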
Sample File
The examples above are included in the Example File, Excel 2003 Example File.
What do you think of the above technique ?
Let us know in the comments below.

8 Responses to “Pivot Tables from large data-sets – 5 examples”
Do you have links to any sites that can provide free, large, test data sets. Both large in diversity and large in total number of rows.
Good question Ron. I suggest checking out kaggle.com, data.world or create your own with randbetween(). You can also get a complex business data-set from Microsoft Power BI website. It is contoso retail data.
Hi Chandoo,
I work with large data sets all the time (80-200MB files with 100Ks of rows and 20-40 columns) and I've taken a few steps to reduce the size (to 20-60MB) so they can be shared more easily and work more quickly. These steps include: creating custom calculations in the pivot instead of having additional data columns, deleting the data tab and saving as an xlsb. I've even tried index-match instead of vlookup, although I'm not sure that saved much. Are there any other tricks to further reduce the file size? Thanks, Steve
Hi Steve,
Good tips on how to reduce the file size and / or process time. Another thing I would definitely try is to use the Data Model to load the data rather than keep it in the file. You would:
1. connect to source data file thru Power Query
2. filter away any columns / rows that are not needed
3. load the data to model
4. make pivots from it
This would reduce the file size while providing all the answers you need.
Give it a try. See this video for some help - https://www.youtube.com/watch?v=5u7bpysO3FQ
Normally when Excel processes data it utilizes all four cores on a processor. Is it true that Excel drops to using only two cores when calculating tables? And if only two cores were present, would it drop to one for a table?
I ask because I have personally noticed that when I use tables, the data is much slower than if I had filtered it. I like tables for obvious reasons when working with datasets. Is this true?
John:
I don't know if it is true that Excel Table processing only uses 2 threads/cores, but it is entirely possible. The program has to be enabled to handle multiple parallel threads. Excel Lists/Tables were added long ago, at a time when 2 processes was a reasonable upper limit. And, it could be that there simply is no way to program table processing to use more than 2 threads at a time...
When I've got a large data set, I will set my Excel priority to High thru Task Manager to allow it to use more available processing. Never use RealTime priority or you're completely locked up until Excel finishes.
That is a good tip Jen...