Advanced Sumproduct Queries

Using the Sumproduct function for multiple-criteria Sum Ifs is possibly one of the greatest extensions of an Excel function beyond what it was primarily designed for. Maybe it was actually designed with that in mind?

However, Sumproduct can be extended even further through the use of 2D Ranges together with carefully constructed queries.

The examples below are included in the Example File, Excel 2003 Example File.

Scenario 1: Lookup a value within a 2D Range matching 2 criteria

You have a table of Dates, Fruit Sold and Number Sold each Day.

How many Bananas did I sell on the 4th May?

In the above I have set up 3 Named Ranges.

Named ranges are used because they make the forthcoming formulas easier to read.

Fruit: C2:H2

Dates: B3:B12

FruitData: C3:H12

So, how many Bananas did I sell on the 4th May?

Using the equation =SUMPRODUCT((Fruit=D16)*(Dates=D15)*FruitData)

Returns the correct answer 31

Related: Doing 2way lookups in Excel

Scenario 2: Sum all values within a 2D Range matching 2 criteria

You have a table of Dates, Cars Sold and Number Sold each Day. There are multiple entries on various days, possibly from various salesmen.

How many Holden cars did I sell on the 3rd May?

So, how many Holden cars did I sell on the 3rd May?

Using the equation =SUMPRODUCT((Dates=D17)*(Cars=D18)*CarData)

Returns the correct answer 9 = (1 + 5 + 3)

Scenario 3: Sum values within a 2D Range matching multiple unordered criteria

You have a table of Dates, Cars Sold and Number Sold each Day. There are multiple entries on various days.

How many Ford and Suzuki cars did I sell on the 10th May?

So, how many Ford and Suzuki cars did I sell on the 10th May?

Using the equation =SUMPRODUCT((Dates=D24)*((Cars=D25)+(Cars=E25))*CarData)

Returns the correct answer 13 = (4 + 5 + 3 + 1)

Note that this can be extended with additional queries, where the Car Type can be entered in any cell in the Range D25:H25:

=SUMPRODUCT((Dates=D24)*((Cars=D25)+(Cars=E25)+(Cars=F25)+(Cars=G25)+(Cars=H25))*CarData)

Scenario 4: Sum values within a 2D Range matching multiple ordered criteria

You have a table of Dates, Cars Sold and Number Sold each Day. There are multiple entries on various days.

How many Toyota and Holden cars did I sell on the 10th May?

Using the equation =SUMPRODUCT((Dates=D30)*(Cars=D31:H31)*CarData)

Returns the correct answer 21 = (3 + 6 + 6 + 6)

Note that this can be extended to allow additional queries, but each Car Type must be entered in the same position as it appears in the Header Row.
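To see why position matters in Scenario 4, here is a minimal Python sketch of what (Cars=D31:H31) does: it compares each header cell with the criteria cell in the same position. The car names and the helper column_mask are hypothetical and are not taken from the example file. Note also that Python's == is case-sensitive, whereas Excel's = comparison is not.

```python
# Hypothetical header row (plays the role of the Cars named range).
cars = ["Ford", "Holden", "Suzuki", "Toyota", "Mazda"]

def column_mask(header, criteria):
    """Mimic (Cars=D31:H31): compare each header cell with the criteria
    cell in the same position, giving 1 for a match and 0 otherwise."""
    return [1 if h == c else 0 for h, c in zip(header, criteria)]

# Criteria typed directly under their own header columns: both match.
aligned = ["", "Holden", "", "Toyota", ""]
print(column_mask(cars, aligned))     # [0, 1, 0, 1, 0]

# The same two names typed in the first two cells: nothing lines up.
misaligned = ["Holden", "Toyota", "", "", ""]
print(column_mask(cars, misaligned))  # [0, 0, 0, 0, 0]
```

With the misaligned criteria every comparison is False, so the Sumproduct would return 0 even though both car types exist in the table.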

How Does This Work?

The above techniques use matrix arithmetic to set up a conjunctive truth table within the Sumproduct formula.

Using =SUMPRODUCT((B4:B6=D10)*(C3:E3=D9)*(C4:E6))

The conjunctive truth table logic (B4:B6=D10)*(C3:E3=D9) is simply saying: make a matrix of elements that are True where both conditions are met and False otherwise.

Sumproduct then takes this matrix, multiplies it by the data values and accumulates the results to get the sum of the matching values.

It is important to note that the Criteria Column must have the same number of rows as the data area, and the Criteria Row the same number of columns, or an error such as #VALUE! is returned.

The Maths

To understand and explain how this works I will use a simple model with 3 rows and 3 columns (see below).

The formula =SUMPRODUCT((B4:B6=D10)*(C3:E3=D9)*(C4:E6)), shown above, consists of 3 areas:

(B4:B6=D10) is a 3 Rows x 1 Column range

(C3:E3=D9) is a 1 Row x 3 Columns range

(C4:E6) is a 3 Rows x 3 Columns range

Breaking the formula into components

=SUMPRODUCT((B4:B6=D10)*(C3:E3=D9)*(C4:E6))

(B4:B6=D10)*(C3:E3=D9) is the same as multiplying 2 arrays, representing the 2 areas as shown below

You can see that where the components are True I have put a 1, and a 0 where they are False.

Where the Date was 3-May Excel evaluates this to 1, and similarly where the Fruit was a Banana Excel evaluates this to 1.

Where the criteria aren't met Excel evaluates this to 0.

The multiplication of a 3 x 1 array and a 1 x 3 array is a 3 x 3 array, representing the (B4:B6=D10)*(C3:E3=D9) part of the equation.

Next this is multiplied by the data area

=SUMPRODUCT((B4:B6=D10)*(C3:E3=D9)*(C4:E6))

This is the same as multiplying two 3 x 3 arrays element by element, which produces a 3 x 3 array, below:

Sumproduct then adds up all the array components to get the final answer of 3.
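The steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how Excel implements it: the 3 x 3 data values and the function name sumproduct_2d are hypothetical, chosen so that the matching cell holds 3.

```python
# Hypothetical 3 x 3 model; the values are illustrative only.
dates = ["1-May", "2-May", "3-May"]        # plays the role of B4:B6
fruit = ["Apple", "Banana", "Cherry"]      # plays the role of C3:E3
data = [[5, 7, 2],
        [4, 1, 8],
        [6, 3, 9]]                         # plays the role of C4:E6

def sumproduct_2d(rows, row_key, cols, col_key, values):
    # (B4:B6=D10): a 3 x 1 array of 1s and 0s
    row_mask = [1 if r == row_key else 0 for r in rows]
    # (C3:E3=D9): a 1 x 3 array of 1s and 0s
    col_mask = [1 if c == col_key else 0 for c in cols]
    # 3 x 1 times 1 x 3 gives the 3 x 3 truth table
    table = [[rm * cm for cm in col_mask] for rm in row_mask]
    # Multiply the truth table by the data area, element by element
    products = [[t * v for t, v in zip(t_row, v_row)]
                for t_row, v_row in zip(table, values)]
    # Add up every element of the resulting array
    total = sum(sum(row) for row in products)
    return table, products, total

table, products, total = sumproduct_2d(dates, "3-May", fruit, "Banana", data)
print(table)     # [[0, 0, 0], [0, 0, 0], [0, 1, 0]]
print(products)  # [[0, 0, 0], [0, 0, 0], [0, 3, 0]]
print(total)     # 3
```

Only the cell where both criteria are True survives the multiplication, which is why the sum equals that single data value.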

Modifications

The Data Area can be included in the Truth Table Logic or as a separate component of Sumproduct.

=SUMPRODUCT((B4:B6=D10)*(C3:E3=D9)*(C4:E6)) and =SUMPRODUCT((B4:B6=D10)*(C3:E3=D9), (C4:E6)) return the same result.

Multiple "OR" criteria can be added by use of the + operator within the criteria.

In Scenario 3 above, we sum the number of Ford or Suzuki cars sold on the 10th May.

=SUMPRODUCT((Dates=D24)*((Cars=D25)+(Cars=E25)+(Cars=F25)+(Cars=G25)+(Cars=H25))*CarData)

The Or logic is added by use of the + operator within the criteria for Cars.

The And logic is added by use of the * operator between the Dates and Cars criteria.
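The And/Or arithmetic can be sketched in Python as follows. The table below is hypothetical (it is not the data in the example file), and sum_matching is an illustrative helper, not an Excel function.

```python
# Hypothetical table: one entry per data row, several rows per day.
dates = ["9-May", "10-May", "10-May", "10-May"]
cars = ["Ford", "Holden", "Suzuki"]          # header row
car_data = [[2, 1, 4],
            [4, 6, 3],
            [5, 0, 1],
            [0, 2, 0]]

def sum_matching(day, wanted):
    total = 0
    for date, row in zip(dates, car_data):
        date_hit = 1 if date == day else 0            # (Dates=D24)
        for car, sold in zip(cars, row):
            # Or: one comparison per criteria cell, added together
            car_hit = sum(1 if car == w else 0 for w in wanted)
            # And: the two criteria are multiplied
            total += date_hit * car_hit * sold
    return total

print(sum_matching("10-May", ["Ford", "Suzuki"]))  # 13
print(sum_matching("10-May", ["Ford", "Ford"]))    # 18, Ford counted twice
```

The second call shows a side effect of using + for Or: if the same Car Type is entered in two criteria cells, the comparisons add up to 2 and the matching values are counted twice.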

Other Logic Elements

You can add Greater Than (>), Less Than (<) and other comparison operators to the queries to suit your requirements.
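As a rough illustration, a criterion such as (Dates>=D24) in place of (Dates=D24) would sum everything from a given day onwards. The Python sketch below mirrors that idea; the dates, the year, the sales figures and the helper sum_from are all hypothetical.

```python
from datetime import date

# Hypothetical data: one row per day, one column per car.
dates = [date(2024, 5, 1), date(2024, 5, 2), date(2024, 5, 3), date(2024, 5, 4)]
cars = ["Ford", "Holden"]
car_data = [[1, 2],
            [3, 4],
            [5, 6],
            [7, 8]]

def sum_from(start, car):
    """Rough equivalent of =SUMPRODUCT((Dates>=start)*(Cars=car)*CarData)."""
    return sum(
        (1 if d >= start else 0) * (1 if c == car else 0) * sold
        for d, row in zip(dates, car_data)
        for c, sold in zip(cars, row)
    )

print(sum_from(date(2024, 5, 3), "Holden"))  # 6 + 8 = 14
```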

Sample File

The examples above are included in the Example File, Excel 2003 Example File.

What do you think of the above technique?

Let us know in the comments below.
