• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Find the minimum every line, and summarize.

bines53

Active Member
Hello friends,

Challenge light,

To find the amount of 58, as it appears in the formula in cell C13, column D is illustrative.

Without the use of volatile functions.

David
 

Attachments

  • minimum every line.xlsx
    7.9 KB · Views: 77
Hi shrivallabha,

Just look at the function OFFSET ,Makes me sick :(.

By the way there is a good substitute for the function OFFSET , I mentioned several times in the past.


David
 
Last edited:
On Sunday I will give the formula,

Maybe ,one of the members, will give a brief formula without CSE ( Control + Shift + Enter)


David
 
Earlier responses are probably preferable but another option that maybe could be useful if the input ranges were derived from a longer formula:

=-SUM(MMULT(A1:B10,{-1,-1;-1,1}/2)^{1,2}^{1,0.5})

based on MIN(A1:B1)=(A1+B1-ABS(A1-B1))/2.
 
Earlier responses are probably preferable but another option that maybe could be useful if the input ranges were derived from a longer formula:

=-SUM(MMULT(A1:B10,{-1,-1;-1,1}/2)^{1,2}^{1,0.5})

based on MIN(A1:B1)=(A1+B1-ABS(A1-B1))/2.
Hi Lori ,

Is there a scalable version , which can take care of 3 or more columns , without using David's bête noire ?

Narayan
 
Hello friends,

My solution,

=SUMPRODUCT((((A1:A10>B1:B10)+{1,0}=1))*A1:B10)

There is a function built-in, expandable to 4 columns, to download the plugin here
http://www.businessfunctions.com/index.php?pageno=65
=SumCombine(5,A1:A10,B1:B10)

David
I love this formula @bine53. I have understood the concept of the formula. But why should A1:A10>B1:B10)+{1,0}=1 be wrapped around two brackets before being multiplied with A1:B10? The excel screen tip is not helpful in determining the number of brackets. Please help me out. Thanks in advance.
 
Actually, you only need single bracket.
=SUMPRODUCT(((A1:A10>B1:B10)+{1,0}=1)*A1:B10)
Will give you same answer.
 
Hi shrivallabha,

Just look at the function OFFSET ,Makes me sick :(.

By the way there is a good substitute for the function OFFSET , I mentioned several times in the past.


David
What substitute are you referring to Sir
But not easily scalable :( If volatile functions were allowed then formula like below one will give more flexibility:
WARNING (for David): volatile functions aheado_O
=SUMPRODUCT(SUBTOTAL(5,OFFSET(A1:B1,ROW(A1:A10)-1,0)))
Hello ninja, I have struggled to understand how your formula works. I have used evaluate function but stuck where the offset portion evaluates to #VALUE! error. Kindly please unravel the formula for me and others. Thanks.
 
Evaluate option doesn't work for volatile functions. To know what volatile functions are and how they affect you can read:
http://www.decisionmodels.com/calcsecretsi.htm

For SUMPRODUCT and its usage, read first three sticky posts (step by step #1 to #3):
http://www.vbaexpress.com/forum/forumdisplay.php?98-SUMPRODUCT-And-Other-Array-Functions

After reading, it would not be too difficult to guess what's happening.
1. OFFSET function generates A1:B1, A2:B2,.......A10:B10 ranges.
2. SUBTOTAL function gives MIN (first argument set to 5) to each pass given by step 1.
3. SUMPRODUCT just then sums up the evaluated results of SUBTOTAL.

I don't (mostly) like the way I explain things but hope this helps.
 
Evaluate option doesn't work for volatile functions. To know what volatile functions are and how they affect you can read:
http://www.decisionmodels.com/calcsecretsi.htm

For SUMPRODUCT and its usage, read first three sticky posts (step by step #1 to #3):
http://www.vbaexpress.com/forum/forumdisplay.php?98-SUMPRODUCT-And-Other-Array-Functions

After reading, it would not be too difficult to guess what's happening.
1. OFFSET function generates A1:B1, A2:B2,.......A10:B10 ranges.
2. SUBTOTAL function gives MIN (first argument set to 5) to each pass given by step 1.
3. SUMPRODUCT just then sums up the evaluated results of SUBTOTAL.

I don't (mostly) like the way I explain things but hope this helps.
Thanks a lot. It makes sense now. Just wondering if there is a way of using a function which is not as volatile as OFFSET?
 
Thanks a lot. It makes sense now. Just wondering if there is a way of using a function which is not as volatile as OFFSET?
There's no single way to substitute OFFSET all the time but in most cases it can be avoided by using different constructs / approaches e.g. all examples above barring OFFSET one give the same result and do not use it.
 
Hi Narayan,
A scalable and non-volatile version would be quite a lot longer, i think. eg with CSE:

=SUM(LARGE(A1:B10,MOD(LARGE(ROW(A1:B10)*10^6+RANK(A1:B10,A1:B10),(ROW(A1:B10)-MIN(ROW(A1:B10)))*COLUMNS(A1:B10)+1),10^6)))
Thanks for this robust formula @Lori. Please give us a tutorial on how you came up with the formula if you do not mind. It is too complex for beginners like me. Thanks
 
@Ufoo - In the recent follow up challenge to this one, a simplified formula for the particular data setup was suggested. You can see if you can follow that one by stepping through the Evaluate Formula tool and then come back to this formula which is a generalisation of that approach.

But I'm afraid i have similar reservations to Shrivallabha and wish i had the skills of Narayank when it comes to clear explanations - perhaps he would be so kind as to assist if he happens to pass by again...
 
Hi Ufoo ,

I wish I had Lori's brilliance to come up with such a formula in the first place.

I have attached a file with some data so that you can use it to follow along with this explanation.

The principle is brilliant in its simplicity.

The LARGE function treats both uni-dimensional ranges / arrays and multiple row multiple column ranges alike ; knowing this is key to understanding this formula.

In essence , if we can convert the two dimensional range , with multiple rows and multiple columns , into a uni-dimensional range consisting of only one column , then the task can be redefined as follows :

Given a single column range , can we identify which indexes we should look at , so that we get the smallest number in each row ?

To start with , suppose we have a single column range , say A1:A20 ; to find out the smallest value in each row is obvious ; we just take each value itself !

Suppose now we extend the range to two columns , A1:B20. To find out the smallest value in each row , if we convert the two column range to a single column range with double the original number of rows , is not so easy. Lori's formula tells us that we need to take the following values from the 40 element range :

{9;25;25;36;5;27;23;28;34;18;23;38;28;40;36;38;32;34;18;32}

This principle remains the same even when we extend the problem to more than 2 columns.

So how does this work ?

To explain this visually , I have inserted the RANK function in columns J through V ; the RANK function also is as generous as the LARGE function , treating uni-dimensional and two dimensional ranges alike.

In columns L and M , you can see that the values given in the array above have been conditionally formatted ; this makes it clear that what the LARGE function is returning are the maximum ranks in each row.

Don't be misled !

The maximum value given by the RANK function , with the third parameter omitted , corresponds to the smallest value in the range !

So , in effect , we are able to extract the smallest values in the original range , by looking at the largest rank values in the array formed by the RANK function.

But the problem is , it is not enough to look at the largest rank values by themselves ! We need to look at the largest rank values in each row. Hence the use of the ROW function.

If we merely look at the ROW function , and combine it with the RANK function , the two values happen to be in the same region , at least in the data set which I have inserted in the uploaded workbook ; because of this , combining the two will invariably change the overall value to an extent where we will not be able to retrieve the RANK value at a later stage.

Hence the multiplication by 10^6 ; this value of 10^6 has been used because using a multiplier of 10^6 appears to be safe given that the numbers in the original data set are small in value. If the numbers in the data set also happen to be of this magnitude , this number will have to be increased.

Let us take a more detailed look at how this works.

The RANK function , when it is applied to the entire range A1:D20 , returns the following matrix of values ; for the sake of brevity , I am posting the first 3 rows from the matrix :

{23,59,19,49;1,36,24,1;10,62,70,1}

Now , let us combine this with the ROW function as it has been used , as follows :

{1000023,1000059,1000019,1000049;2000001,2000036,2000024,2000001;3000010,3000062,3000070,3000001}

Using the ROW function has automatically segregated the values of each row , in their own slabs i.e. the first four values are in the slab :

1000000 to 1000099 , assuming that the original data set has values between 0 and 99

The next 4 values in the array are in the slab :

2000000 to 2000099

and the next 4 values in the array are in the slab :

3000000 to 3000099

and so on for the remaining rows in the data range.

If we now use the LARGE function on these combined values , it is clear that the 4 values in the last row ( row 20 ) , will be at the top of the resulting array ; these will be followed by the 4 values in row 19 , then the 4 values in row 18 , and so on , till the last 4 elements will be the 4 values in row 1.

But remember , all these values will be ranked in descending order.

Since we have 4 elements in each row , all we have to do is look at every fourth element , starting with the first !

Thus , if we look at the following elements :

1 , 5 , 9 , 13 , 17 ,...., 73 , 77

we will be able to extract the largest rank values in each row ; obviously , these will correspond to the smallest data values in each row !

Thus , all that is remaining is to derive an array of values as follows :

{1;5;9;13;17;21;25;29;33;37;41;45;49;53;57;61;65;69;73;77}

This is exactly what is being done by the following portion of the formula :

(ROW(A1:D20)-MIN(ROW(A1:D20)))*COLUMNS(A1:D20)+1

The portion highlighted in blue returns an array of values from 0 through 19.

Multiplying this by the number of columns , 4 in this case , gives an array of values :

{0;4;8;12;16;20;24;28;32;36;40;44;48;52;56;60;64;68;72;76}

Adding 1 to this results in the array that we wanted.

Thus , to summarize everything we have detailed so far :

1. Use the RANK function to return a matrix of values , ranking the original data in an order where a rank of 1 is assigned to the largest value in the original data set , and 80 is assigned to the smallest value in the original data set.

2. Combine this with the ROW function , multiplied by a large enough number that the RANK values do not change even after the combination. Using the ROW function automatically ensures that the resulting values are grouped row-wise , making it easier to extract the row-wise maximum values.

3. Extract the row-wise maximum values by looking at every nth value in the array , starting from the first , where n is the number of columns in the original data set.

What we now have is an array as follows :

{20000079;19000045;18000052;17000065;16000072;15000072;14000052;13000065;12000072;11000065;10000049;9000072;8000065;7000079;6000072;5000072;4000070;3000070;2000036;1000059}

The values of interest are only the values :

{79;45;52;65;72;72;52;65;72;65;49;72;65;79;72;72;70;70;36;59}

What do the above values represent ?

The value 79 represents the largest value in the array of ranks , whose complete list is available in columns S , T , U and V.

We do not have the value 80 because there are two ranks of 79 , since there are two instances of the value 1 , the smallest value in the original data set. Because the RANK function starts from a rank of 1 ( for the largest data value ) , as it goes down the data set , assigning higher rank values , when it comes to duplicates , it assigns the same rank to all duplicate instances of the same data value , and then skips ranks when assigning the next rank to the next smaller value in the data set.

Extracting the above rank values is simple and uses the MOD function , with the same value which was used earlier as the multiplier ; this magic is akin to the old Arabian puzzle , which involved dividing 19 camels in the ratios 1/2 , 1/4 and 1/5 ! All that needed to be done was to add a camel to make the total 20 , and then , after the division was done , take it back !

All that is left is to use this final array of ranks as the second parameter to the LARGE function , with the original data set as the first parameter ; thus , using :

=LARGE(A1:D20 , {79;45;52;65;72;72;52;65;72;65;49;72;65;79;72;72;70;70;36;59})

will return the values :

{1;11;8;4;2;2;8;4;2;4;10;2;4;1;2;2;3;3;13;7}

which are the minimum values in each row of data , starting with 1 as the smallest value in row 20 , 11 as the smallest value in row 19 , and so on , till we come to 7 as the smallest value in the first row.

Any doubts ?

Narayan
 

Attachments

  • Book42.xlsx
    10.6 KB · Views: 24
Unwieldy for column scalability:
=SUM(IF(A1:A10<=B1:B10,A1:A10,B1:B10))
Edit: John Jairo beat me to it!

A simple function for 3 columns.

=SUM(IF(IF(A1:A10<B1:B10,A1:A10,B1:B10)<C1:C10,IF(A1:A10<B1:B10,A1:A10,B1:B10),C1:C10))

A simple function for 4 columns.

=SUM(IF(IF(A1:A10<B1:B10,A1:A10,B1:B10)<IF(C1:C10<C1:C10,C1:C10,D1:D10),IF(A1:A10<B1:B10,A1:A10,B1:B10),IF(C1:C10<D1:D10,C1:C10,D1:D10)))

Interestingly the nesting requirements are binary. i.e, 8 columns requires only 3 levels of nesting.

with best regards
Arun N
 
Last edited:
Hi Ufoo ,

I wish I had Lori's brilliance to come up with such a formula in the first place.

I have attached a file with some data so that you can use it to follow along with this explanation.

The principle is brilliant in its simplicity.

The LARGE function treats both uni-dimensional ranges / arrays and multiple row multiple column ranges alike ; knowing this is key to understanding this formula.

In essence , if we can convert the two dimensional range , with multiple rows and multiple columns , into a uni-dimensional range consisting of only one column , then the task can be redefined as follows :

Given a single column range , can we identify which indexes we should look at , so that we get the smallest number in each row ?

To start with , suppose we have a single column range , say A1:A20 ; to find out the smallest value in each row is obvious ; we just take each value itself !

Suppose now we extend the range to two columns , A1:B20. To find out the smallest value in each row , if we convert the two column range to a single column range with double the original number of rows , is not so easy. Lori's formula tells us that we need to take the following values from the 40 element range :

{9;25;25;36;5;27;23;28;34;18;23;38;28;40;36;38;32;34;18;32}

This principle remains the same even when we extend the problem to more than 2 columns.

So how does this work ?

To explain this visually , I have inserted the RANK function in columns J through V ; the RANK function also is as generous as the LARGE function , treating uni-dimensional and two dimensional ranges alike.

In columns L and M , you can see that the values given in the array above have been conditionally formatted ; this makes it clear that what the LARGE function is returning are the maximum ranks in each row.

Don't be misled !

The maximum value given by the RANK function , with the third parameter omitted , corresponds to the smallest value in the range !

So , in effect , we are able to extract the smallest values in the original range , by looking at the largest rank values in the array formed by the RANK function.

But the problem is , it is not enough to look at the largest rank values by themselves ! We need to look at the largest rank values in each row. Hence the use of the ROW function.

If we merely look at the ROW function , and combine it with the RANK function , the two values happen to be in the same region , at least in the data set which I have inserted in the uploaded workbook ; because of this , combining the two will invariably change the overall value to an extent where we will not be able to retrieve the RANK value at a later stage.

Hence the multiplication by 10^6 ; this value of 10^6 has been used because using a multiplier of 10^6 appears to be safe given that the numbers in the original data set are small in value. If the numbers in the data set also happen to be of this magnitude , this number will have to be increased.

Let us take a more detailed look at how this works.

The RANK function , when it is applied to the entire range A1:D20 , returns the following matrix of values ; for the sake of brevity , I am posting the first 3 rows from the matrix :

{23,59,19,49;1,36,24,1;10,62,70,1}

Now , let us combine this with the ROW function as it has been used , as follows :

{1000023,1000059,1000019,1000049;2000001,2000036,2000024,2000001;3000010,3000062,3000070,3000001}

Using the ROW function has automatically segregated the values of each row , in their own slabs i.e. the first four values are in the slab :

1000000 to 1000099 , assuming that the original data set has values between 0 and 99

The next 4 values in the array are in the slab :

2000000 to 2000099

and the next 4 values in the array are in the slab :

3000000 to 3000099

and so on for the remaining rows in the data range.

If we now use the LARGE function on these combined values , it is clear that the 4 values in the last row ( row 20 ) , will be at the top of the resulting array ; these will be followed by the 4 values in row 19 , then the 4 values in row 18 , and so on , till the last 4 elements will be the 4 values in row 1.

But remember , all these values will be ranked in descending order.

Since we have 4 elements in each row , all we have to do is look at every fourth element , starting with the first !

Thus , if we look at the following elements :

1 , 5 , 9 , 13 , 17 ,...., 73 , 77

we will be able to extract the largest rank values in each row ; obviously , these will correspond to the smallest data values in each row !

Thus , all that is remaining is to derive an array of values as follows :

{1;5;9;13;17;21;25;29;33;37;41;45;49;53;57;61;65;69;73;77}

This is exactly what is being done by the following portion of the formula :

(ROW(A1:D20)-MIN(ROW(A1:D20)))*COLUMNS(A1:D20)+1

The portion highlighted in blue returns an array of values from 0 through 19.

Multiplying this by the number of columns , 4 in this case , gives an array of values :

{0;4;8;12;16;20;24;28;32;36;40;44;48;52;56;60;64;68;72;76}

Adding 1 to this results in the array that we wanted.

Thus , to summarize everything we have detailed so far :

1. Use the RANK function to return a matrix of values , ranking the original data in an order where a rank of 1 is assigned to the largest value in the original data set , and 80 is assigned to the smallest value in the original data set.

2. Combine this with the ROW function , multiplied by a large enough number that the RANK values do not change even after the combination. Using the ROW function automatically ensures that the resulting values are grouped row-wise , making it easier to extract the row-wise maximum values.

3. Extract the row-wise maximum values by looking at every nth value in the array , starting from the first , where n is the number of columns in the original data set.

What we now have is an array as follows :

{20000079;19000045;18000052;17000065;16000072;15000072;14000052;13000065;12000072;11000065;10000049;9000072;8000065;7000079;6000072;5000072;4000070;3000070;2000036;1000059}

The values of interest are only the values :

{79;45;52;65;72;72;52;65;72;65;49;72;65;79;72;72;70;70;36;59}

What do the above values represent ?

The value 79 represents the largest value in the array of ranks , whose complete list is available in columns S , T , U and V.

We do not have the value 80 because there are two ranks of 79 , since there are two instances of the value 1 , the smallest value in the original data set. Because the RANK function starts from a rank of 1 ( for the largest data value ) , as it goes down the data set , assigning higher rank values , when it comes to duplicates , it assigns the same rank to all duplicate instances of the same data value , and then skips ranks when assigning the next rank to the next smaller value in the data set.

Extracting the above rank values is simple and uses the MOD function , with the same value which was used earlier as the multiplier ; this magic is akin to the old Arabian puzzle , which involved dividing 19 camels in the ratios 1/2 , 1/4 and 1/5 ! All that needed to be done was to add a camel to make the total 20 , and then , after the division was done , take it back !

All that is left is to use this final array of ranks as the second parameter to the LARGE function , with the original data set as the first parameter ; thus , using :

=LARGE(A1:D20 , {79;45;52;65;72;72;52;65;72;65;49;72;65;79;72;72;70;70;36;59})

will return the values :

{1;11;8;4;2;2;8;4;2;4;10;2;4;1;2;2;3;3;13;7}

which are the minimum values in each row of data , starting with 1 as the smallest value in row 20 , 11 as the smallest value in row 19 , and so on , till we come to 7 as the smallest value in the first row.

Any doubts ?

Narayan
Thanks a lot @Narayan.
 
Back
Top