• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Find a series of numbers that appear in any order

bines53

Active Member
Hello friends ,

Find several times, appears in this series of numbers 1,1,1,2 .

Has a well-known formula, is not flexible enough and convenient in my opinion,

=SUM(--(MMULT(COUNTIF(H1:K1,A1:D20),{1;1;1;1})=10)),

Of course, the challenge is to find another formula.

My formula works only on positive numbers (including 0)

Conditioning is, without volatile functions.

Good luck!

David
 

Attachments

bines53

Active Member
Hello friends ,

EXP, Is a mathematical constant,an irrational number,
Millions of decimal places,in Excel is displayed, only at 15 decimal places.

=SUM(--(MMULT(EXP(A1:D20),{1;1;1;1})=SUM(INDEX(EXP(H1:K1),0))))

For each series of numbers, in an equal number of numbers, each series has its own identity(integers) !

If we take this series of numbers --- 1,1,1,2

SUM(INDEX(EXP(H1:K1),0))=15.5439015843078 ,

This is the exclusive identity of this series !

Can not create the same identity, with 4 other digits(integers).

The only limit, which can be placed up to 700 in each cell, the highest, and minus 620 to low.

I want to be careful, not to put in the same series positive and negative numbers together.

David
 

Lori

Active Member
Hi David,

It's a nice idea in theory based, i believe, on the principle that exp(n) is not a linear combination of any lower powers {exp(1),..,exp(n-1)} i.e. e is a transcendental number which was first shown in 1882 (https://en.wikipedia.org/wiki/Transcendental_number). For example it would not work with sqrt(2) instead of e.

Unfortunately in practice with the double numeric type used by Excel we only have 15 digits of precision which severely limits the combinations that are unique. Try for instance your formula with combinations 36,1,1,2 and 36,2,2,1. It will say they are identical.
 

bines53

Active Member
Hi Lori,

Thank you for your attention,

I see now where Excel main problem is that ,he does not present at all digits

after the point, and therefore forced him to accept it, by dividing,

=SUM(--(MMULT(EXP(1/A1:D20),{1;1;1;1})=SUM(INDEX(EXP(1/K1:N1),0))))


I hope it's all right now,



David
 

bines53

Active Member
Hi Lori,


The dividing in 1 should be so,

=SUM(--(MMULT(1/EXP(A1:D20),{1;1;1;1})=SUM(INDEX(1/EXP(K1:N1),0))))

It work well with zero,

David
 

Lori

Active Member
Hi David, I think the first alternative you give does better than the second for non-zero integers. eg 1, 2, 2, 37 and 1, 2, 2, 36 will be matched in the formula above.

But thanks for the interesting idea. Anyone else interested in the theory behind the formula can follow the proof that e is transcendental from the wikipedia link. Note that being irrational is not enough. For example replacing e with √2, the combinations 1, 1, 5 and 3, 3, 3 would give the same result.

On the other hand i don't really see a problem with using the COUNTIF formula which allows for arbitrary numbers. It can also be modified to an ISNUMBER(MATCH combination if you need array inputs.
 
Last edited:

bines53

Active Member
Hi Lori,


=SUM(--(MMULT(--(ISNUMBER(MATCH(A1:D20,K1:N1,0))),{1;1;1;1})=4))

The result is not accurate,

We can summarize and say,When the data does not include the zero, the

formula can be used ,

=SUM(--(MMULT(EXP(1/A1:D20),{1;1;1;1})=SUM(INDEX(EXP(1/K1:N1),0))))

Anyway, it seems to me that in any situation UDF is better than COUNTIF ,when the criteria are multiple.


David
 

Lori

Active Member
Indeed you're right, that MATCH formula only works reliably with unique criteria. But then COUNTIF is not reliable in general either eg 1, 1, 1, 1 could match criteria 1, 1, 2, 2 with that kind of logic.

Alternatively, a formula that combines the two approaches could work well for any data type (numeric, text or boolean) provided criteria data is not too large (up to around 50 columns):

=SUM(--(MMULT(IFERROR(N(INDEX(EXP(1/MATCH(A1:D20,H1:K1,0)),)),),{1;1;1;1})=SUM(INDEX(EXP(1/MATCH(H1:K1, H1:K1,0)),))))
 

Lori

Active Member
Further testing shows all previous EXP formulas should really be adjusted to a wider tolerance, so instead of:

=SUM(EXP({3,3,3,1}))=SUM(EXP({1,3,3,3}))

which returns FALSE, use something like:

=ABS(SUM(EXP({3,3,3,1})-SUM(EXP({1,3,3,3}))<1E-12
 

bines53

Active Member
Hi Lori,

Do you mean,

=SUM(--(ABS(MMULT(EXP(A1:D20),{1;1;1;1})-SUM(INDEX(EXP(H1:K1),0)))<0.000000000001))

It looks great, if it circumvents the limitations of Excel !


David
 

Lori

Active Member
Hi David,

That works better but I'm not 100% sure about it due to decimal approximations.

It leads me to one final idea that only compares whole numbers...

Replace EXP(n) with the n th prime number and multiply!

=PRODUCT(CHOOSE(H1:K1,1,2,3,5,7))

i.e. 1*1*1*2 and the number of matches in the array is:

=SUM(--(EXP(MMULT(LN(CHOOSE(A1:D20,1,2,3,5,7)),{1;1;1;1})) = 2))

This way could also be adapted using MATCH for general data and you can use CHOOSE(N(IF({1} to avoid CSE.

Lori
 

bines53

Active Member
Hi Lori,

There is also such a solution, but not flexible,

=SUM(--(MMULT(--(A1:D20=1),{1;1;1;1})=3)*(MMULT(--(A1:D20=2),{1;1;1;1})=1))

David
 

Lori

Active Member
Previous suggestion was designed to work efficiently with sample data.

For any numeric data (as well as for text), can try:

=SUM(--(MMULT((COUNTIF(H1:K1,A1:D20)>0)*10^COUNTIF(H1:K1,"<="&A1:D20)-10^COUNTIF(H1:K1,"<="&H1:K1),{1;1;1;1})=0))
 
Last edited:

bines53

Active Member
Hi Lori,

It is great !

Why did you wait until now?:)


By the way ,this formula does not work

=SUM(--(EXP(MMULT(LN(CHOOSE(A1:D20,1,2,3,5,7)),{1;1;1;1})) = 2))

AND ,When 1,1,1,1, then the function,COUNTIF,Shoul be

=SUM(--(MMULT(COUNTIF(H1:K1,A1:D20),{1;1;1;1})=16)),4*4.

Thanks for your cooperation, and I hope there will be more ideas later!

David
 
Last edited:

Lori

Active Member
Hi David,

Thanks, good challenge! A fully reliable solution was found eventually, based on the ideas of yours further up.

Just to clarify what i said before...

- in that first formula with CTRL+SHIFT+ENTER:
replace '2' with 'PRODUCT(CHOOSE(H1:K1,1,2,3,5,7))'

- in that second formula, try with either of these:
H1:K1 = 1,1,2,2 -> =8 and A1:D1=1,1,1,1
H1:K1 = 1,5,3,3 -> =6 and A11:D11=5,5,3,3

Rgds, Lori
 

Peter Bartholomew

Well-Known Member
How about

= SUM( IF( MMULT( 8^table - 8^sequence, {1;1;1;1} ), 0, 1 ) )

The formula = MMULT( 8^sequence, {1;1;1;1} ) is intended to provide a unique octal value for each sequence of the four digits 1-9 (with repetition, otherwise 2^sequence would do).

It strikes me that much of the strife creating such formulas arises from the dumb decisions to optimise spreadsheet technology to avoid anything that smacks of mathematical integrity.
1) Why is cell notation necessary? Worksheet location should make no difference to the significance of the values representing some business object.
2) Why are formulas (and functions) entered into cells rather than being applied to named ranges. Many to many relationships, whether arrays or tables, represent business objects far better than collections of many-to-one relationships.
3) Do the concepts of relative referencing (anchored, row-anchored etc.) provide anything useful? I would suggest that it is only a poor representation of the true multi-cell relationship that works cleanly with arrays and lists.

If spreadsheets technology were built on sounder foundations, then we might have functions that aggregate by row or by column, returning a 1D array. MMULT is fine as far as it goes (it will perform weighted sums, filters and even products if combined with LOG). That leaves a whole raft of aggregations that are not supported e.g. MIN, SMALL, AND, AVERAGE
 

Lori

Active Member
Hi Peter,

Nice simplification for the single digit case. And yes, these kinds of tasks would be much simpler if there were built-in operations for sorting or ranking by row or column - currently require long formulas similar to those found in some prior challenge threads.

The comments relating to cell references would make an interesting topic for a separate thread. Cell references can definitely obfuscate business logic - tables can mitigate this to some extent but have limitations in row referencing.

One way to avoid cell references altogether that I have seen successfully implemented is to work with named arrays. Most of the lookup and aggregation functions allow for arrays as well as range references. On the sheet a hyperlink click could be made to open a new workbook with tables containing arbitrarily large named arrays as editable data, for example by defining:

Code:
Names.Add "Table", Range("Table1").Value2
Names.Add "Sequence", Range("Table2").Value2
 

Peter Bartholomew

Well-Known Member
@Lori I wasn't thinking of anything quite that radical. I did try
Code:
Names.Add "Array", Sheet1.ListObjects("Table1").Range.Value2
I was initially surprised that updating the table made no difference to calculations based upon the named array! Then I looked at the refers to attribute of the newly-defined name.

I have been thinking about coding a demonstrator environment in which the user could access names, as pointers to the lead cell of any range, in order to develop new formulas. The dimensions of the formula range would be determined automatically from the dimensions of the arrays it references. The actual implementation could be based upon an array formula or the usual relative referencing; the user wouldn't need to know since event handlers would reselect the lead cell whenever an attempt is made to select another cell in the range, so decently hiding unstructured content.

I have generated some discussion on Names and Array formulae on LinkedIn, where I moderate a couple of Excel groups, but I wasn't sure that it was appropriate here. I thought a post along the lines of "Why do your spreadsheets look like roadkill?" might get me banned!
[Explanation: all the articulately expressed business modelling ideas are flattened to the point where no life, meaning or structure is discernible, just a sea of cells]
 

Lori

Active Member
a post along the lines of "Why do your spreadsheets look like roadkill?" might get me banned!
..but it would be funny! Poor design can often arise due to the limitations inherent in spreadsheets for working with variable sized data on a fixed two dimensional grid. A number of commercial add-ins (quantlibXL, ManagedXLL, .. ) avoid this to some extent by allowing one to work with data objects (similar to named arrays) referenced through a cell "handle". This at least then allows one to organise data into a contiguous set of model inputs.

Another point relating to the above formula, is that if one normalises data by rows (using COUNTIF or similar) there are only 5 possible outcomes - equal to the number of integer partitions of the number 4. However, this kind of approach can only work up to 15 or so columns due to the numeric precision.
 

Peter Bartholomew

Well-Known Member
Hi Lori
I did get to look up quantlibXL. Approaching the material with very limited knowledge of finance, C++ programming or statistical analysis was not a great start :(
What struck me was that the calculation seems to take place within a compiled code environment which is doubtless far more efficient but leave one wondering whether Excel has any significant role in the process, maybe Get and Transform?
Roadkill - I did put a discussion up on Patrick O'Beirne's group eusprig@yahoogroups.com but as yet I have no idea whether it will be ignored or whether it will me get drummed out of the Brownies!
 
Top