• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

longest consecutive sequence of a value

bines53

Active Member
Hello friends,

I am looking for a formula similar to
{=LARGE(FREQUENCY(IF(A2:A31=D3,ROW(A2:A31)),IF(A2:A31<>D3,ROW(A2:A31))),1)}
I arrived without CSE ,
=LARGE((FREQUENCY((A2:A31=D3)*ROW(A2:A31),(A2:A31<>D3)*ROW(A2:A31))),1)
{0;21;0;2;0;0;2;0;0;0;0;0;0;3;0;0;0;1;0;0;0;0;0;1;0;0;0;0;0;0;0},
The desired outcome, should be 3.
21,Is redundant,He also, number of rows, No. 10 which does not appear, which I examine the sequence.

Thank you !
 

Attachments

  • test.20.xlsx
    8.1 KB · Views: 7
Last edited:
The reason is that, in the array version, the IF statement returns Boolean FALSEs to the arrays being passed to FREQUENCY, which are ignored by that function.

In the non-array, product version, any rows which do not meet your criterion instead result in a 0, not a Boolean FALSE, and so those zeroes are then passed to FREQUENCY. And of course 0 is a perfectly legitimate numerical value to be considered by that function.

So whereas:

=LARGE(FREQUENCY(IF(A2:A31=D3,ROW(A2:A31)),IF(A2:A31<>D3,ROW(A2:A31))),1)

would resolve as, for example:

=LARGE(FREQUENCY({FALSE;FALSE;FALSE;FALSE;6;FALSE;8;9;10;FALSE;FALSE;FALSE;FALSE;15;FALSE;FALSE;18;FALSE;20;21;FALSE;FALSE;FALSE;FALSE;26;FALSE;28;FALSE;30;31},{2;3;4;5;FALSE;7;FALSE;FALSE;FALSE;11;12;13;14;FALSE;16;17;FALSE;19;FALSE;FALSE;22;23;24;25;FALSE;27;FALSE;29;FALSE;FALSE}),1)

the equivalent non-array, non-IF statement version:

=LARGE((FREQUENCY((A2:A31=D3)*ROW(A2:A31),(A2:A31<>D3)*ROW(A2:A31))),1)

would resolve as:

=LARGE((FREQUENCY({0;0;0;0;6;0;8;9;10;0;0;0;0;15;0;0;18;0;20;21;0;0;0;0;26;0;28;0;30;31},{2;3;4;5;0;7;0;0;0;11;12;13;14;0;16;17;0;19;0;0;22;23;24;25;0;27;0;29;0;0})),1)

and, as I mentioned, there is a gulf of difference here between those values being numerical zeroes or Boolean FALSEs.

Regards
 
Hi,

The problem is only first part,IF(A2:A31=D3,ROW(A2:A31)),Interestingly, no other function that can convert 0 to FALSEs ?

Regards
 
Sorry - I'm not sure what you mean? Why do you say that the problem is "only the first part"?

Regards
 
Hi XOR LX,

In the array version ,The formula can be ,
=LARGE((FREQUENCY(IF(A2:A31=H23,ROW(A2:A31)),(A2:A31<>H23)*ROW(A2:A31))),1),That is, once, with a function IF.

Regards
 
Last edited:
I see. Yes, it can, at least for the purpose of this exercise (it should be pointed out that, although the non-zero entries in the resulting arrays are the same, the dimensions of the two, in general, are not).

But that still doesn't help you with your issue re zeroes in the non-array version.

Are you desperate to have a non-CSE solution to this? I can provide one, though it will be quite convoluted, and certainly much less efficient than the equivalent CSE version.

This is often the price you pay for seeking to "avoid" array formulas, I'm afraid. Sometimes there is just no good substitute for CSE!

Regards
 
Hello friends,

I found the solution !

=AGGREGATE(14,6,(FREQUENCY((A2:A32=H23)*ROW(A2:A32),(A2:A32<>H23)*ROW(A2:A32)))/(A2:A32<>H23),1)

Regards
 
Hello friends,

I have a different problem ,
The same data,this array formula,
{=SUM(--(FREQUENCY(IF($A$2:$A$31=10,ROW($A$2:$A$31)),($A$2:$A$31<>10)*ROW($A$2:$A$31))=2))},
Checks, some sequences of 2, there are a number of 10, the answer is 2.
non-CSE solution ,
=AGGREGATE(14,6,(FREQUENCY(($A$2:$A$32=10)*ROW($A$2:$A$32),($A$2:$A$32<>10)*ROW($A$2:$A$32))=2)/($A$2:$A$32<>10),1),
How do I insert the function SUM.


Thank you !
 
Hi again.

This time you should be able to use your original non-array version, with a small amendment to account for the case where the number of entries in the range which are not equal to 10 is also 2, i.e. (non-array):

=SUM(--(FREQUENCY(($A$2:$A$31=10)*ROW($A$2:$A$31),($A$2:$A$31<>10)*ROW($A$2:$A$31))=2))-(COUNTIF($A$2:$A$31,"<>10")=2)

Regards
 
Hi XOR LX,

There is an error in the formula,If you change 2 to 3, the result should be 1.

This version that,
=AGGREGATE(14,6,SIGN(FREQUENCY(($A$2:$A$32=H23)*ROW($A$2:$A$32),($A$2:$A$32<>H23)*ROW($A$2:$A$32))=H24),1),I get
{0;0;0;1;0;0;1;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0}
I need just summarized ?

Regards
 
Hi XOR LX,

I think the last part, -(COUNTIF($A$2:$A$31,"<>10")=2),Is redundant.
=SUM(--(FREQUENCY(($A$2:$A$31=10)*ROW($A$2:$A$31),($A$2:$A$31<>10)*ROW($A$2:$A$31))=2)) ,It is enough ?

I would prefer a solution, with AGGREGATE ,Because, if I set 21 instead of 2, I will get the result of 1, that's not true.

Regards
 
Last edited:
I think the last part, -(COUNTIF($A$2:$A$31,"<>10")=2),Is redundant.
=SUM(--(FREQUENCY(($A$2:$A$31=10)*ROW($A$2:$A$31),($A$2:$A$31<>10)*ROW($A$2:$A$31))=2)) ,It is enough ?

Did you not read my post? What happens, for example, if the range contains precisely 2 cells which are not equal to 10, and all the rest are?

I would prefer a solution, with AGGREGATE ,Because, if I set 21 instead of 2, I will get the result of 1, that's not true.

Sorry - I don't understand what you mean here. Can you clarify?

Regards
 
First of all, when you reference an actual cell as one of the criteria in COUNTIF(S)/SUMIF(S), you need to be careful with your syntax, as it is not the same as when you reference a constant in this parameter. You have:

COUNTIF($A$2:$A$31,"<>D3")

though this should be:

COUNTIF($A$2:$A$31,"<>"&D3)

Secondly, to show you why this additional clause is necessary, let's return to the case where D4 contains 2, not 21 (and D3 still contains 10).

Now make the entries in A2:A31 equal to the following:

Put 10 in A2 and A3.
Leave A4 and A5 empty.
Put 10 in every cell in A6:A31.

What is your expected result here? I imagine it is 1, correct?

But what does this formula give:

=SUM(--(FREQUENCY(($A$2:$A$31=D3)*ROW($A$2:$A$31),($A$2:$A$31<>D3)*ROW($A$2:$A$31))=D4))

?

And what does this one give:

=SUM(--(FREQUENCY(($A$2:$A$31=D3)*ROW($A$2:$A$31),($A$2:$A$31<>D3)*ROW($A$2:$A$31))=D4))-(COUNTIF($A$2:$A$31,"<>"&D3)=D4)

?

Perhaps you should have a think and try to work out why this additional clause is necessary.

Regards
 
Hi XOR LX,

After I set without errors,

=SUM(--(FREQUENCY(($A$2:$A$31=D3)*ROW($A$2:$A$31),($A$2:$A$31<>D3)*ROW($A$2:$A$31))=D4))-(COUNTIF($A$2:$A$31,"<>"&D3)=D4)

I understand that you're absolutely right !

Thank you!

Regards
 

Attachments

  • test.20-1.xlsx
    8.3 KB · Views: 2
I realized I was wrong, the formula ,COUNTIF($A$2:$A$31,"<>D3"),
I deleted the post with the error, after I fixed, we both came at exactly the same second, with recent posts.

Regards,
 
You are ready "to suffer", more questions from me ?

I want to say, you're adding a lot, this wonderful forum, I wish I had the time, to explore better, and contribute to, as I mainly contributed to here.
I want to say something honestly, I would like to see more solutions without volatile, and without array.

Regards

David
 
To me, there can never be any "suffering" involved in discussing the intricacies of Excel functions! On the contrary!

And I must say that it is refreshing to see someone whose questions on that subject suggest an interest which goes beyond the routine I-need-to-know-how-to-do-this-I-don't-really-care-how-it-works-just-tell-me mentality of the average forum poster.

No. Your questions are important, in that they encourage us to debate time-and-tested constructions and to seek out alternative, non-standard solutions. And even if I personally might not entirely agree with the motive behind those questions, that's not to say that I would prefer it if you did not ask them.

I would say this, though. Non-volatility: yes, agreed; something we should always strive to achieve in our formulas (I have recently decided to give up ROW/INDIRECT constructions in favour of the ROW/INDEX:INDEX approach).

But non-array? Not so sure I agree with you that this is something to be avoided. Firstly, "fear" of array formulas is something which seems to have arisen less out of any rational, logical reasoning than out of some sort of internet-perpetuated urban myth that such formulas are somehow "harmful" - evil, malicious things that, unless prevented from doing so, would destroy your entire hard drive given half a chance.

Secondly, people often forget that many functions which do not require CSE - and so which are apparently "acceptable", are still nevertheless array formulas. SUMPRODUCT IS an array formula. LOOKUP IS an array formula. AGGREGATE IS an array formula. INDEX with a zero row or column parameter IS an array formula. As are many more. The fact that they do not require a certain combination of keystrokes in order to commit them does not in any way mean that they are any "less" of an array formula than those that are entered as such.

Regards
 
Hi ,

To add to what you have said , I think the problem is the perceived speed of SUMPRODUCT vs. SUM entered using CTRL SHIFT ENTER , or in general any non-CSE formula vis-a-vis a CSE formula.

I do not know whether any blog or website gives details of this analysis , but logically it seems to me that any function which is going to deal with arrays will take around the same time , unless the actual implementation of the function manages to be more efficient in some miraculous way.

I can understand that SUMIF(S) will be faster or more efficient than SUMPRODUCT when you use whole column references , but there is a reason for it. Is there any real reason why , when we are dealing with arrays , SUMPRODUCT will be faster than SUM entered using CTRL SHIFT ENTER ?

Also , even if we grant that SUMPRODUCT or any other function which does not need CSE is faster than another function which needs to use CSE , is the speed difference between the two so significant ? Are there any statistics to back up this conclusion ? Here we are not discussing the difference that using something like AGGREGATE might make , since this one function is a combination of several functions , and it is possible that the internal implementation of this might make it superior in speed , even apart from its versatility.

Narayan
 
Last edited:
Is there any real reason why , when we are dealing with arrays , SUMPRODUCT will be faster than SUM entered using CTRL SHIFT ENTER ?

I doubt it. And I agree with you that the issue is largely one of perception re the two forms.

Basically, almost almost every function can be made to operate over either a single value or an array of values. Usually, the native form of these functions is that which operates over a single value; to coerce operation over an array uses the same function though entered into the worksheet in a different manner (with CTRL+SHIFT+ENTER).

Microsoft could equally have decided, just as I understand is the case with Google Sheets, to make it so that there exist two separate functions for this purpose, e.g. SUM, ARRAYSUM, etc.

However, there are some native functions which are such that operating over a single value - or single range - would make very little sense. In these cases Microsoft evidently decided that it would be more useful to make it so that the native form of these functions was the one which operated over an array, thus removing the need for the dual CSE/Non-CSE set-up existing with the vast majority of other functions.

MMULT is such an example: given that this function is in any case designed to take the (matrix) product of two arrays, a "non-array" version which operated over a single input would be redundant.

Perhaps Microsoft introduced SUMPRODUCT along lines of similar thought, i.e. that because the array version of SUM was so widely-used they thought it would be of benefit to users to have this functionality without the need for the pressing of a few extra keys. I don't know to be sure. But, whatever the reason, I agree with you that I cannot see why there should be any difference between the performance of the two versions.

Regards

P.S. I should clarify my statement re LOOKUP, INDEX and AGGREGATE, since it is not in every use that I am claiming that these functions are operating as array formulas.

These are not, as far as I'm concerned, array formulas:

=AGGREGATE(14,6,A1:A10,1)

=INDEX(A1:A10,,0)

=LOOKUP(10,A1:A10)


But these are:

=AGGREGATE(14,6,A1:A10/(B1:B10="B"),1)

=MAX(INDEX((B1:B10="B")*A1:A10,,0))


=LOOKUP(1,0/(B1:B10="B"),A1:A10)
 
Hello friends,

Those who already know me here, know, I work with a lot of heavy data, constantly updated, I would recommend,
Index function, everything possible, to combine the index.
Any formula should SUMPRODUCT convert it with functions sum and index.
MMULT-,Very fast, excellent.
AGGREGATE- huge funcion, extremely fast, yet we have not discovered enough the benefits.
I have to say again, without volatile, and without an array.

Regards
 
Back
Top