fbpx
Search
Close this search box.

Formula Forensics No. 010 Count How Many Times a List of Values Occurs in a Range

Share

Facebook
Twitter
LinkedIn

A few weeks ago Gewilson asked on the Chandoo.org Forums, “Can I simplify my formula?

=176 – (SUMIF($B10:$AF10,”PD”,$B11:$AF11) +SUMIF($B10:$AF10,”FA”,$B11:$AF11) +SUMIF($B10:$AF10,”PS”,$B11:$AF11) +SUMIF($B10:$AF10,”PN”,$B11:$AF11) +SUMIF($B10:$AF10,”F1″,$B11:$AF11) +SUMIF($B10:$AF10,”P1″,$B11:$AF11) +SUMIF($B10:$AF10,”F7″,$B11:$AF11))

SirJB7 responded with a nice Sumproduct solution:

=176 – SUMPRODUCT(((B$10:AF$10)=({“PD”;”FA”;”PS”;”PN”;”F1″;”F7″})) *(B$11:AF$11))

So Today we will pull this apart to see what inside, I think what we find may surprise you.

 

SirJB7’s Formula

=176 – SUMPRODUCT(((B$10:AF$10)=({“PD”;”FA”;”PS”;”PN”;”F1″;”F7″})) *(B$11:AF$11))

To Simplify things I am going to use a Truncated set of data and adjust the formula accordingly

We will examine:

=176 – SUMPRODUCT(((B$10:I$10)=({“PD”;”FA”;”PS”;”PN”})) *(B$11:I$11))

This problem has a smaller Range B10:I10 instead of B10:AF10

as well as 2 less possible solutions {“PD”;”FA”;”PS”;”PN”} instead of {“PD”;”FA”;”PS”;”PN”;”F1″;”F7″}

The reason for this will soon become evident.

As usual you can download a Sample File to follow along with. Download Here.

Lets go:

=176 – SUMPRODUCT(((B$10:I$10)=({“PD”;”FA”;”PS”;”PN”})) *(B$11:I$11))

We can see above that the formula is subtracting the result of a Sumproduct from a Fixed Number 176. So we really only need to focus on the Sumproduct part of the formula.

As we saw In Formula Forensics 007 – Sumproduct, Sumproduct adds up the products of the constituent arrays.

In this case

SUMPRODUCT(((B$10:I$10)=({“PD”;”FA”;”PS”;”PN”;})) *(B$11:I$11))

Has only 1 constituent array. The array does consist of 2 components

SUMPRODUCT(((B$10:I$10)=({“PD”;”FA”;”PS”;”PN”})) *(B$11:I$11))

These is a Logic component ((B$10:AF$10)=({“PD”;”FA”;”PS”;”PN”}))

and a Numerical Component (B$11:I$11)

Which are then multiplied together.

 

Looking at the Logical Component first

((B$10:I$10)=({“PD”;”FA”;”PS”;”PN”}))

The formula is checking the Range B10:I10 against an Array of possible solutions {“PD”;”FA”;”PS”;”PN”}

That is, it is checking each value in our list {“PD”;”FA”;”PS”;”PN”}, against each cell in the range B10:I10.

If we type the above equation=((B$10:I$10)=({“PD”;”FA”;”PS”;”PN”})) into a spare cell C14, and press F9

Excel returns ={FALSE,FALSE,FALSE,FALSE,TRUE,FALSE,FALSE,FALSE;FALSE,TRUE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE}

What the …

If we look closely at the above array we will see that it contains a lot of True/Falses separated by ,’s and a few ;’s

={FALSE,FALSE,FALSE,FALSE,TRUE,FALSE,FALSE,FALSE;FALSE,TRUE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE}

Specifically there are 4 blocks of 8 True/Falses separated by ,’s, each block is separated by a ;

In Total 4 x 8 = 32 Values

What this is, is an array representing the multiplication of the 8 cells in the range B10:I10 with each element of the possible solution array

Each row of the Array is separated from the next by a ;

Each element in each row is separated by a ,

This is best displayed like:

You can see why I simplified the size of the original problem.

So we have an Array of True/Falses ={FALSE,FALSE,FALSE,FALSE,TRUE,FALSE,FALSE,FALSE;FALSE,TRUE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE}

Which is now multiplied by the next component of the Sumproduct

({FALSE,FALSE,FALSE,FALSE,TRUE,FALSE,FALSE,FALSE;FALSE,TRUE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE}) *(B$11:I$11)

In a spare cell, say C23 enter

=({FALSE,FALSE,FALSE,FALSE,TRUE,FALSE,FALSE,FALSE;FALSE,TRUE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE;FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE}) *(B$11:I$11) and press F9

Excel returns

={0,0,0,0,10,0,0,0;0,10,0,0,0,0,0,0;0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0}

Note that this array has the same 8 column x 4 Row layout as above, except that all the True have been replaced by the values in the Score cells B11:I11

Sumproduct now kicks in and adds these up

=Sumproduct({0,0,0,0,10,0,0,0;0,10,0,0,0,0,0,0;0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0})

To get 20

Which is subtracted from our original number

=176 – SUMPRODUCT(((B$10:I$10)=({“PD”;”FA”;”PS”;”PN”})) *(B$11:I$11))

= 176 – Sumproduct({0,0,0,0,10,0,0,0;0,10,0,0,0,0,0,0;0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0})

= 176 – 20

= 156

 

Download

You can download a copy of the above file and follow along, Download Here.

 

Other Posts In This Series

You can learn more about how to pull Excel Formulas apart in the following posts

Formula Forensic Series:

 

We Need Your Help

I have received a few more ideas since last week and these will feature in coming weeks.

I do need more ideas though and so I need your help.

If you have a neat formula that you would like to share and explain, try putting pen to paper and draft up a Post like above or;

If you have a formula that you would like explained but don’t want to write a post also send it in to Chandoo or Hui.

 

Facebook
Twitter
LinkedIn

Share this tip with your colleagues

Excel and Power BI tips - Chandoo.org Newsletter

Get FREE Excel + Power BI Tips

Simple, fun and useful emails, once per week.

Learn & be awesome.

Welcome to Chandoo.org

Thank you so much for visiting. My aim is to make you awesome in Excel & Power BI. I do this by sharing videos, tips, examples and downloads on this website. There are more than 1,000 pages with all things Excel, Power BI, Dashboards & VBA here. Go ahead and spend few minutes to be AWESOME.

Read my storyFREE Excel tips book

Excel School made me great at work.
5/5

– Brenda

Excel formula list - 100+ examples and howto guide for you

From simple to complex, there is a formula for every occasion. Check out the list now.

Calendars, invoices, trackers and much more. All free, fun and fantastic.

Advanced Pivot Table tricks

Power Query, Data model, DAX, Filters, Slicers, Conditional formats and beautiful charts. It's all here.

Still on fence about Power BI? In this getting started guide, learn what is Power BI, how to get it and how to create your first report from scratch.

Weighted Average in Excel with Percentage Weights

Weighted Average in Excel [Formulas]

Learn how to calculate weighted averages in excel using formulas. In this article we will learn what a weighted average is and how to Excel’s SUMPRODUCT formula to calculate weighted average / weighted mean.

What is weighted average?

Wikipedia defines weighted average as, “The weighted mean is similar to an arithmetic mean …, where instead of each of the data points contributing equally to the final average, some data points contribute more than others.”

Calculating weighted averages in excel is not straight forward as there is no built-in formula. But we can use SUMPRODUCT formula to easily calculate them. Read on to find out how.

21 Responses to “Formula Forensics No. 010 Count How Many Times a List of Values Occurs in a Range”

  1. Unai says:

    I thought that as we only had to sum one array we could have used SUM instead of SUMPRODUCT, but I have had to enter array formula to get the same result. Why is it?

    • Cameron says:

      Well, if you only had one value to check for, you would use a SUMIF instead of SUM, but the issue is that the problem requires a check for 6 values. What the formula is doing is taking the first array (Your list of existing values) and creating a new array of TRUE/FALSE for each test value. Each of those new arrays is separated by the semicolons.

      What I would truly like to know, is why SUMIFS would not work for this solution instead? I can't see the original forum thread, so perhaps it must be done in 2003.

  2. Luke M says:

    Linky at the top wasn't working for me, I think the "replies=6" portion was messing it up. If anyone else is interested:
    http://chandoo.org/forums/topic/sumif-then-subtract#post-16283

  3. Nick Burns says:

    Here's an ever so slightly shorter formula (in terms of number of characters) - going back to the use of the sumif():

    =176-SUM(SUMIF(B$10:I$10,{"PD";"FA";"PS";"PN";"F1";"F7"},B$11:I$11))

    • SirJB7 says:

      Hi, Nick Burns!
      You're right!
      I wouldn't bet that that with SUMIF would work when it didn't with SUMIFS.
      Very nice approach!
      Regards!

  4. SirJB7 says:

    @Luke M: Thanks for adding the correct link, it didn't worked for me neither.

    @Cameron:
    Hi!
    I firstly tried to solve it using SUMIFS instead of SUMPRODUCT, but it didn't worked, it only did for the first element of the array.
    Give a look at this to check out: http://www.2shared.com/file/cHZEttV0/sumif_then_subtract__for_gewil.html
    SUMIFS hasn't the ability to perform matrix-based operations, so... welcome back to the SUMPRODUCT, and it did the job.
    Hope I helped you.
    Regards!

  5. Hui... says:

    @All, SirJB7, Cameron
    There are slight differences between how Sumproduct and SumIf and SumIfs work
    If you look at the Excel help for each function, specifically the Syntax section

    SUMPRODUCT(array1, [array2], [array3], ...)
    SUMIF(range, criteria, [sum_range])
    SUMIFS(sum_range, criteria_range1, criteria1, [criteria_range2, criteria2], ...)

    You can see that Sumproduct requires Arrays as Inputs
    SumIf/s require ranges

    This is telling us that an Array can be a Range, but that a Range can't be an Array
    It also explains why you don't need to use Ctrl Shift Enter with Sumproduct as it is accepting the input as an Array and is treated as such by default.

  6. Nick Burns says:

    It is nice to know that both solutions (using sumif or sumproduct) are still backwards compatible with earlier versions of Excel.

  7. andy holaday says:

    I think this is cool and it's good to understand the array notation in the formula.
    For my money, I prefer to expose the list {"PD”;”FA”;”PS”;”PN”} plainly in the worksheet, and reference the named range instead of embedding the list in formulae. This ensures consistency, which is especially important if the list is used in multiple places and might ever need to be changed. Implementation is simple:
    Somewhere in the workbook type out this column of values:
    Transactions_To_Include
    PD
    FA
    PS
    PN

    Highlight the range and hit Ctrl+Shift+F3 to create a named range with the default "name in top row" option.

    Use new formula
    =176 - SUMPRODUCT(((B$10:I$10)=(Transactions_To_Include)) *(B$11:I$11))

    I will post another solution that has even better transparency from an audit standpoint.

  8. andy holaday says:

    Assuming helper rows are allowed, I would set up the named range as I previously posted. Add a helper formula in B12:
    =--IFERROR(MATCH(B$10,Transactions_To_Include,0)>0,0)
    and extend to the right.

    Then use this new formula to calculate the conditional totals:
    =176-SUMPRODUCT(B$11:I$11,B$12:I$12)
    or
    =176-SUMIF(B$12:I$12,1,B$11:I$11)

    The reason I recommend this approach is it obvious which columns are being summed in the total.

    It is good to keep in mind that others may need to parse your workbooks, or that you may need to understand your own logic months or years later. Using fancy formulae to do everything in one shot can be fun, but it can also add to the risk that mistakes will be made and makes it more difficult to understand the logic and flow. Breaking down the logic in steps is like "showing your work" -- just as important now as it was back in school.

  9. andy holaday says:

    Final missive... Just to be clear, I'm not trying to bash SirJB7's approach -- like I said, I do think it's cool, and it presented a useful technique that is far superior to the OP's.
    My comments are driven by experience. Over the years I have learned (usually the hard way) that keeping things simple and transparent helps ensure the solution will be accurate and sustainable.
    Using tables to drive business rules is a major point I'm trying to make here. Any programmer confronted with the option to "hard code" a list of values vs. look up from a table knows what I mean.

  10. Nick Burns says:

    @Andy - thanks for that. Your approach makes it easier to update if parameters change. I tried using the table with my SUMIF() approach, but unfortunately it didn't work. So, the SUMPRODUCZ() in conjunction with a Lookup Table is more ideal.

  11. andy holaday says:

    FYI

    In my second post that offered a SUMIF approach, this formula was not rendered correctly by the blog engine:

    =[minus][minus]IFERROR(MATCH(B$10,Transactions_To_Include,0)>0,0)

    where [minus][minus] means type the - symbol twice.

  12. Sajan says:

    Hi,
    Another way to simplify the original formula is as follows:

    Setup the comparison values ("PD”;”FA”;”PS”;”PN”;”F1?;”F7?) somewhere.  Say, A1:A6.

    Then, you can use the following formula:
    =176 - SUMPRODUCT((COUNTIF(A1:A6,B10:AF10)=1)*(B11:AF11))

    One advantage of this approach is that you can edit the values in A1:A6 as needed.  (You could also use a Named Range, ensuring that your formula would not need to be modified even if the comparison values grow or shrink.)

    Cheers,
    Sajan.

  13. Peter Bartholomew says:

    Will a time come when some of these masterpieces need to be revisited with modern Office 365 in mind?

    For example, the array formula
    = 176 - SUM( SUMIFS( values, codes, {"PD","FA","PS","PN"} ) )
    works fine and CSE has largely gone.

    • Hui... says:

      @Peter

      You are of course correct, that a number of new functions will make older solutions obsolete.

      The only problem is that I know a lot of people still use Excel XP or 2007, from 2002/2007 era. As such these new functions don't exist.

  14. Peter Bartholomew says:

    @Hui
    As you say, it will be many years before these solutions truly become obsolete. The challenge is that we now seem to have reached a fork in Excel development in which at least two radically different solutions are available for any given problem, depending on the version of Excel that is in use.

    Of course, there is also the question of which is harder to upgrade, the code or the user (but perhaps that is a point better left unspoken!).

    • Hui... says:

      @Peter

      I think it is much worse than that
      I don't believe Microsoft have any intent of updating Excel 2019 etc to keep up with Excel365

      XLookup comes to mind

      This will create a large divide

Leave a Reply