• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Formula Challenge 007 - return distinct/duplicate /unique values as an array

jeffreyweir

Active Member
Formula Challenge 007 - return distinct/duplicate /unique values as an array


Challenge Description

Good evening, Gentlemen. Your mission, should you choose to accept it, is to:

- Write three formula that respectively return the distinct, duplicate, and unique values of a numerical range.

- The output of each formula must be an array that can be used directly by other formula (such as Max, Min, Average etc). That is, we're after a single formula that you can directly incorporate into other formulas - not a formula that has to be entered over a worksheet range in order to work.

- Each formula must return an array of the exact size containing only the distinct, duplicate, and unique values. So as per the sample data and output below:

* The array produced by the Distinct formula will be exactly 13 elements long

* The array produced by the Unique formula will be exactly 9 elements long

* The array produced by the Duplicate formula will be exactly 5 elements long

- Each formula should be completely dynamic, and able to be used on a 1D array of any size.


This message will self-destruct in 10 seconds.

Good Luck


Sample Data


[pre]
Code:
Data  Distinct  Unique  Duplicate
1	0	1	1
2	1	15	2
45	2	22	3
45	3	23	45
49	15	35	49
35	22	37
2	23	39
15	35	41
22	37	99
49	39
39	41
23	49
41	99
3
37
1
3
2
99
0
[/pre]
 
Hello,


Here is my understanding about Distinct, Unique & Duplicate on here.


Distinct : Numbers without repeating.

Unique : Numbers appear only once. COUNTIF =1

Duplicate : Numbers appear more than once. COUNTIF >1


If so, here is an attempt. More elegant ways are coming from experts.


Distinct:


=SMALL(IF(FREQUENCY(A:A,SMALL(A:A,ROW(A$1:INDEX(A:A,COUNT(A:A)))))>0,SMALL(A:A,ROW(A$1:INDEX(A:A,COUNT(A:A))))),ROW(A$1:INDEX(A:A,SUM((FREQUENCY(A:A,A:A)>0)+0))))

Change bolded >0 to =1 & >1 for Unique & Duplicate respectively.


Haseeb
 
Hi Jeff,

Could you check your sample results?


Following Haseeb's interpretation, and following the same theme, here are my versions:


Unique: =SMALL(IF(FREQUENCY(data,data)=1,data), ROW(OFFSET(A$1,,,SUM(N(FREQUENCY(data,data)=1)))))


Distinct: =SMALL(IF(FREQUENCY(data,data)>=1,data), ROW(OFFSET(A$1,,,SUM(N(FREQUENCY(data,data)>=1)))))


Duplicate: =SMALL(IF(FREQUENCY(data,data)>=2,data), ROW(OFFSET(A$1,,,SUM(N(FREQUENCY(data,data)>=2)))))


-Sajan.
 
Well done Haseeb and Sajan. Yes, embarrassingly I posted some incorrect sample results! Sorry about that. Should have been this:

[pre]
Code:
Data    Distnct    Unique    Duplicate
1    0    0    1
2    1    15    2
45    2    22    3
45    3    23    45
49    15    35    49
35    22    37
2    23    39
15    35    41
22    37    99
49    39
39    41
23    45
41    49
3    99
37
1
3
2
99
0
[/pre]
Mine is only slightly longer than Sajans, as I went with a non-volatile INDEX-based version that is also resistant to row insertions:

=SMALL(IF(FREQUENCY(data,data)>0,data),ROW(INDEX(A:A,1):INDEX(A:A,SUM(N(FREQUENCY(data,data)>0)))))
 
Also, I note that I'd forgotten to post Formulas Challenge # 6 in the right category. Would some kind-hearted ninja be able to move it for me?

http://chandoo.org/forums/topic/formula-challenge-006-whats-the-number-you-can-confirm-is-a-prime-or-not


No answers on this one. Note that it's not just about coming up with a prime number checker, but coming up with one that works on the sample data.
 
Good one Sajan & Jeff.


My initial attempt was same as your just FREQUENCY(data,data). But I got #N/A, if there is at least 2 empty or non numeric characters in the range. Could you guys please check it?


Haseeb
 
Hi Haseeb,

You are right... not sure why the error did not get caught due to the dissimilarly sized arrays...

since I am feeling lazy, here are the alternate versions:


Unique: =AGGREGATE(15,6, IF(FREQUENCY(data,data)=1,data), ROW(OFFSET(A$1,,,SUM(N(FREQUENCY(data,data)=1)))))

Distinct: =AGGREGATE(15,6,IF(FREQUENCY(data,data)>=1,data), ROW(OFFSET(A$1,,,SUM(N(FREQUENCY(data,data)>=1)))))

Duplicate:
=AGGREGATE(15,6,IF(FREQUENCY(data,data)>=2,data), ROW(OFFSET(A$1,,,SUM(N(FREQUENCY(data,data)>=2)))))


Cheers,

Sajan.
 
Sajan: are those AGGREGATE ones to handle the case where there is at least 2 empty or non numeric characters in the range? If so, I don't think they work. Try them on this:

1

1

2

2

6

7

3

3

8

9

a

b

12

4

13

4

5

5

14

0
 
Hi Ali ,

These are problems which have crept in during the platform migration , from the earlier BBPress to the present XenForo.

Where ever you find such strange characters , please note that they need to be replaced as follows :

&#60; ----------------- needs to be replaced by the single character < ( the less than symbol )

&#61; ----------------- needs to be replaced by the single character = ( the equal to symbol )

&#62; ----------------- needs to be replaced by the single character > ( the greater than symbol )

&#34; ----------------- needs to be replaced by the single character " ( the double quote symbol )

&#39; ----------------- needs to be replaced by the single character ' ( the single quote symbol )

Please note that if you find 2 sets of characters such as :

&#60;&#61;

together , you will need to replace them in sets , so that these will be equivalent to :

<=

In case you find other strange characters in any post , please let me know , so that we can include it here in this same post / thread.

Narayan
 
Dear Sir.
The formula is not working with the sample data. Can somebody please upload a sample data file with the formula.

Thank you.
Excel Learner :)
 
Hi Ali ,

See the uploaded file , which has Sajan's formulae.

The formulae are all multi-celled array formulae , which means you need to select a large enough range to accommodate all possible entries , and then enter the formula using CTRL SHIFT ENTER.

The formula will be the same in all the cells within the selected range , but Excel will do the correct calculations and put the proper values in their respective cells.

Narayan
 

Attachments

  • Sample.xlsx
    9.9 KB · Views: 17
Hi Ali,

Also see attached file, with single cell array formulas. Note the yellow cells will work for numbers and no blank cells in data, whereas Orange cells will work for Numbers, alphabets, Blanks cells all type. Please also note Orange Cells formulas will work on Excel 2010 & +.

Regards,
 

Attachments

  • Sample (10).xlsx
    13.5 KB · Views: 20
Thankyou so much Sir/Ma'm.
I am having trouble understanding the formula. Could you please explain what is the role of "Data" in name manager and how can i use it in my other data sheets? Snapshot is attached for your reference.
Thankyou for the valuable response.
 

Attachments

  • SNAP.png
    SNAP.png
    38.1 KB · Views: 4
@Asfandyar Ali

Firstly I am male :)

Secondly, Data is dynamic named range use by Narayan Sir.

You can use direct ref. also as shown in Orange cells. Drawback formula will not update automatically if you add more data.

Regards,
 
Oh sorry Sir :)
Yeah i got that. its done. Many thanks.
Can you please guide me if i do not want the unique/distinct values sequentially. for instance; if the first unique/distinct value is returned in "F2" (orange) cell and next one i want it in "F20" or "F30" etc?
 
In that case you have to replace ROWS() function in SMALL function by 1,2 and so on. Say In F2 put 1, than in F20 put 2, in F30 put 3...

Regards,
 
Dear Sir, could you please help to put a formula in the attached sample sheet. i tried but couldn't. Sorry for the mess i created :)
 

Attachments

  • sample(11).xlsx
    8.3 KB · Views: 2
Back
Top