• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Help with INDEX/MATCH

Status
Not open for further replies.
Hi guys, got stuck with the INDEX/MATCH functions, getting an incorrect result. Need to get a summary by product type by region in the example below (combined number of units of fruit and vegetables for both regions).

When I added =SUMIF($B:$B,"Fruit",((INDEX($C:$H,,MATCH($N2,$C$1:$H$1,0))))) I got 40, which was taking only the results in column C. I need to add up all the columns (C through H).

Please help!
 

Attachments

  • Example 04.26.22.xlsx
    10 KB · Views: 8
Last edited:
I understand that INDEX/MATCH might not be ideal as it's looking for the first column that matches the description. The company frowns upon using macros, so apart from that, what formulas can I use to cover all the columns in the range?
 
Hi and welcome
as you seem to be working for a company, there will probably be more demands for analysis in the future, which, with you original layout, will be very difficult.
The good news : no VBA needed !!
You are mixing up the two elements of data capture and final reporting. The two require quite different treatments.

A lot of people start by designing the form that they expect to see as the final report or which at first glance seems the best way of capturing data, and then wonder why it's so difficult to subsequently analyze and summarize or extract information from it. Yours exhibits all those features.

You should always capture data in a simple two dimensional table and worry about reporting information from it afterwards. Without exception doing this you will always be able to easily obtain management information. Rarely is this the case if you start the other way round.
You will also throw open the whole wonderful world of the powerful Pivot table functionality.

So before you get too far with this I created a simple two dimensional Excel table for you to analyze based on your original table ( my example does NOT contain all your data..) . This way the Pivot Table will adapt automatically to the size of the table ( the table can be made user-friendly with some Data Validation if needed)

After doing that the answer to your requirement would be to use a Pivot Table
 

Attachments

  • Example 04.26.22.xlsx
    14.9 KB · Views: 5
I don't disagree with @pecoflyer, but I should point out that the array methods available within 365 makes the use of normalised data less important. The formula
Code:
= MAKEARRAY(2,2,
    LAMBDA(r,c,
      LET(
         mask, (RegionHdr=INDEX(Region,r))*(TypeHdr=INDEX(Type,c)),
         SUM(IF(mask,Qty))
      )
    )
  )
is similar in approach to @bosco_yip but wrapped within MAKEARRAY to return the result as a single array.
78676
 

Attachments

  • Example 04.26.22.xlsx
    14.1 KB · Views: 11
This is not entirely relevant to the current post but, for those interested in Lambda functions, I have put the calculation into a Lambda function
Code:
SUMIFSλ = LAMBDA(Qty, CritArr1, critVal1, CritArr2, critVal2,
    MAP(IF(critVal2<>"",critVal1), IF(critVal1<>"",critVal2),
      LAMBDA(aVal1,aVal2,
           SUM(IF((CritArr1=aVal1)*(CritArr2=aVal2),Qty))
      )
   )
);
Generalising this, one can also pass the aggregation function as a parameter, so allowing AGGIFSλ
Code:
AGGIFSλ = LAMBDA(Qty, CritArr1, critVal1, CritArr2, critVal2, Fnλ,
    MAP(IF(critVal2<>"",critVal1), IF(critVal1<>"",critVal2),
      LAMBDA(aVal1,aVal2,
           Fnλ(IF((CritArr1=aVal1)*(CritArr2=aVal2),Qty))
      )
   )
);

SUMλ = LAMBDA(x,SUM(x));

COUNTλ = LAMBDA(x,COUNT(x));

MAXλ = LAMBDA(x,MAX(x));
to perform multiple conditional aggregations in two dimensions.
78689
 
Thank you all for your input, it's greatly appreciated!

I will have to go with SUMPRODUCT. I am well versed in Pivot tables (and love those), but the trick was to create something that was not dynamic and could not be refreshed/updated by other users. SUMPRODUCT should work for this example!

I will try other solutions to educate myself on the subject, so thanks again for your help and creativity!
 
This is not entirely relevant to the current post but, for those interested in Lambda functions, I have put the calculation into a Lambda function
Code:
SUMIFSλ = LAMBDA(Qty, CritArr1, critVal1, CritArr2, critVal2,
    MAP(IF(critVal2<>"",critVal1), IF(critVal1<>"",critVal2),
      LAMBDA(aVal1,aVal2,
           SUM(IF((CritArr1=aVal1)*(CritArr2=aVal2),Qty))
      )
   )
);
Generalising this, one can also pass the aggregation function as a parameter, so allowing AGGIFSλ
Code:
AGGIFSλ = LAMBDA(Qty, CritArr1, critVal1, CritArr2, critVal2, Fnλ,
    MAP(IF(critVal2<>"",critVal1), IF(critVal1<>"",critVal2),
      LAMBDA(aVal1,aVal2,
           Fnλ(IF((CritArr1=aVal1)*(CritArr2=aVal2),Qty))
      )
   )
);

SUMλ = LAMBDA(x,SUM(x));

COUNTλ = LAMBDA(x,COUNT(x));

MAXλ = LAMBDA(x,MAX(x));
to perform multiple conditional aggregations in two dimensions.
View attachment 78689
Hi Peter,

I don't have access to some of these functions and therefore it doesn't work. However, in this particular case, the SUMPRODUCT solution is simpler and therefore easily maintainable or am I missing something here (apart from these latest functions :D )?
 
Hi Peter,

I don't have access to some of these functions and therefore it doesn't work. However, in this particular case, the SUMPRODUCT solution is simpler and therefore easily maintainable or am I missing something here (apart from these latest functions :D )?
Simpler, or just more familiar?

If you look at the SUMPRODUCT solution, the question is why SUMPRODUCT and not SUM? Where is the function performing a product? Both formulations create a 2D mask from two 1D Boolean arrays. The mechanism for removing unwanted values is slightly different, the IF resulting in FALSE rather than 0 for eliminated terms. If you are seeking SUMIFS functionality either is fine, if you are interested in COUNTIFS the first is better, though the latter can be reformulated. The answer that SUMPRODUCT is there as a wrapper to perform array calculations rather than implicit intersection based calculation is itself far from simple.

A major plus of the modern methods is that the function SUMIFSλ follows the same syntax as SUMIFS and, once written, can be used elsewhere in the workbook without reimplementation, thus simplifying ongoing development. The entire solution is delivered as one dynamic range, so reducing the points of potential error (e.g. incorrectly placed '$' making the basic formula copy/paste incorrectly (a common cause of error within traditionally developed workbooks).

The final step of introducing an AGGREGATE IFS function and passing the Lambda functions to it as parameters, rather than using 'magic' numbers is just showing off; probably a solution without a use case to justify it!

There are some simplifications of the new methods. I have been able to abandon A1 and R1C1 notations and commit to defined names. I have little need for the complexities of relative referencing versus absolute, I can focus on the array rather than the cell and need not concern myself with array size; I do not make the decision of how far to copy/fill the formulae (in most cases it is clear but there have been some monumental errors caused by missing terms from a formula).

For me, simplicity is achieved by structuring solutions appropriately; basic methods applied in excessive volume is a recipe for complexity.

I owe you an apology for answering at far too great a length, but I felt I at least owe you some explanation for disrupting the forum!
 
Simpler, or just more familiar?

If you look at the SUMPRODUCT solution, the question is why SUMPRODUCT and not SUM? Where is the function performing a product? Both formulations create a 2D mask from two 1D Boolean arrays. The mechanism for removing unwanted values is slightly different, the IF resulting in FALSE rather than 0 for eliminated terms. If you are seeking SUMIFS functionality either is fine, if you are interested in COUNTIFS the first is better, though the latter can be reformulated. The answer that SUMPRODUCT is there as a wrapper to perform array calculations rather than implicit intersection based calculation is itself far from simple.

A major plus of the modern methods is that the function SUMIFSλ follows the same syntax as SUMIFS and, once written, can be used elsewhere in the workbook without reimplementation, thus simplifying ongoing development. The entire solution is delivered as one dynamic range, so reducing the points of potential error (e.g. incorrectly placed '$' making the basic formula copy/paste incorrectly (a common cause of error within traditionally developed workbooks).

The final step of introducing an AGGREGATE IFS function and passing the Lambda functions to it as parameters, rather than using 'magic' numbers is just showing off; probably a solution without a use case to justify it!

There are some simplifications of the new methods. I have been able to abandon A1 and R1C1 notations and commit to defined names. I have little need for the complexities of relative referencing versus absolute, I can focus on the array rather than the cell and need not concern myself with array size; I do not make the decision of how far to copy/fill the formulae (in most cases it is clear but there have been some monumental errors caused by missing terms from a formula).

For me, simplicity is achieved by structuring solutions appropriately; basic methods applied in excessive volume is a recipe for complexity.

I owe you an apology for answering at far too great a length, but I felt I at least owe you some explanation for disrupting the forum!
I have read in your posts about self documenting approach when it comes to writing functions and I don't really disagree with the approach in full. In fact, I had employed the simplest of table and self documenting formula approach for simple interest calculation as a test but the response left me a bit bemused with people responding that it was too difficult to grasp whilst standard approach using ranges was easier(familiar?) on them.

Coming back to your approach, at the very least requires usage of:
1. 6 Functions : MAKEARRAY,LAMBDA,LET,INDEX,SUM,IF
2. Named Ranges
Now comparing these to bosco_yip's which uses 2 Functions (Two conditions inside SUMPRODUCT are an IF in principle) and no names at all. Using your terminology, I am more familiar with this route and I find it simpler. And I am not convinced with the merit of your approach, if we take into account backward compatibility and availability of some of these functions across organizations / cross-sections of users who probably will use it.

To OP and others: my apology for digressing from this topic. @Peter Bartholomew if you think there's further merit in discussing this topic then we can continue in the lounge.
 
I have read in your posts about self documenting approach when it comes to writing functions and I don't really disagree with the approach in full. In fact, I had employed the simplest of table and self documenting formula approach for simple interest calculation as a test but the response left me a bit bemused with people responding that it was too difficult to grasp whilst standard approach using ranges was easier(familiar?) on them.
I think different users have different expectations. The core of Excel users are probably the group that wish to manipulate their numbers, but would deny that they are programming. There is also a major group that uses Excel only as a means of recording information, maybe just shopping lists. Microsoft found that more than half of spreadsheets do not contain any formulas (but, perhaps surprisingly, they are likely to contain merged cells).

I come from a background in which I had access to mainframe computers and would normally have written a computer program to perform calculations, so my expectations of Excel were somewhat different. In the present case my formulas are more complicated by your measures, That said, I could solve the problem using '+' only, if I were prepared to select individual numbers manually. Using the new functionality, a reasonable aim might be to provide a function

= SUMIFSλ(Quantity, RegionArray, RegionValue, TypeArray, TypeValue)

that could respond to changes in the number of regions or product types dynamically and could be used in many places throughout the workbook; say applied to data from different months. I am, in essence, writing a UDF but using spreadsheet formulas rather than VBA.

If you wish to move discussion to the lounge I am open to the idea.
 
@deciog
I have made a change to the aggregation formula that you may like. It uses the beta release functions TEXTBEFORE and TEXTAFTER to extract the criterion values. Some conditional formatting allows the tabular output to expand and contract dynamically as new regions/types are introduced.
Code:
AGGIFSλ = LAMBDA(Qty, CritArr1, critVal1, CritArr2, critVal2, Fnλ,
    MAP(critVal1 &"|"& critVal2,
      LAMBDA(aVal,
           Fnλ(IF((CritArr1&""=TEXTBEFORE(aVal,"|"))*(CritArr2&""=TEXTAFTER(aVal,"|")),Qty))
      )
   )
);
 

Attachments

  • 2D conditional Aggregation.xlsx
    18.5 KB · Views: 4
Moderator note:
Please, try to focus
From Apr 26, 2022
Hi guys, got stuck with the INDEX/MATCH functions, getting an incorrect result. Need to get a summary by product type by region in the example below (combined number of units of fruit and vegetables for both regions).
to #7 reply
This thread is closed now.
 
Status
Not open for further replies.
Back
Top