# Formula Challenge 003 - Count unique characters common to two strings

#### Sajan

##### Excel Ninja
Challenge Name

-- Calculate the number of unique characters that are common to two strings

Challenge Description

-- This is a formula challenge. As such, only formula submissions are accepted.

-- For two given strings (that may contain any number of characters -- alphabets, numbers, symbols, etc. -- the goal is to determine the number of unique characters that are common to both strings.

-- Each string could be of variable length.

-- Goal is to calculate the count without using any helper cells, names, etc.

Assumptions

-- You may assume a different location for the data range than cell A1 and B1

-- You may not use any helper cells

-- No difference is attributed to upper case and lower case alphabets. i.e. "A" and "a" are considered equivalent.

-- The characters in each string may appear in any order

Sample data and output:

Code:
``````String1            String2              UniqueCount
Ridiculous    Simple                  3
Incredible    Silly              2
Magic            Fun                  0
Balloon            Children          2
November    December          4
January            August                  2
February    January                  4
United States    India                  4
Australia    United Kingdom        3
Indomitable    submissive          4
Hippopotamus    Elephant          4
Formula            Challenge          2``````
As an example, the string "Ridiculous" and "Simple" have the letters "s", "i" and "l" in common. As such, the expected output will be 3.

Enjoy!

-Sajan.

Last edited by a moderator:

#### jeffreyweir

##### Active Member
That is just NASTY!

Code:
``=SUM(--(FREQUENCY(IFERROR(SEARCH(MID(A2,ROW(OFFSET(A\$1,,,LEN(A2))),1),B2),LEN(B2)+1),ROW(OFFSET(A\$1,,,LEN(B2))))&#62;0))-1``

Took me two hours!

#### Sajan

##### Excel Ninja
Hi Jeff,

Nice approach!

However, you would want to tweak it a bit... run the formula against "yyy" and "xy" to see why.

-Sajan.

#### NARAYANK991

##### Excel Ninja
Hi Sajan ,

Sorry , but this seemed the easiest way !

http://www.mrexcel.com/archive/Formulas/17498.html

In my defence , let me mention John Walkenbach's rejoinder in this link :

http://datapigtechnologies.com/blog/index.php/the-impenetrable-hiring-test

Narayan

#### Sajan

##### Excel Ninja
Hi Narayan,

Very funny! You are the Google master! In my defence, I did try a few Google searches before settling on the idea for the challenge... but I guess my search skills need work!

In any case, Jeff's formula (with a small tweak) is better than what was posted on the MrExcel site.

-Sajan.

#### NARAYANK991

##### Excel Ninja
Hi Sajan ,

May be , but I find the formula very intuitive , and so easy to understand ; just check which characters from one string occur in the constant array of the alphabets , do the same for the other string ; the FREQUENCY function automatically takes care of duplicates , and array multiplication automatically finds what is in common.

For me , that formula is beautiful , since it does so much , so simply.

Narayan

#### Sajan

##### Excel Ninja
Hi Narayan,

I agree that the approach in Aladin's original formula is elegant. In fact, Jeff's formula follows the same theme. Jeff's formula then extends it by checking for uniqueness. However, unlike the approach described in Mark's formula (also from the same MrExcel link posted above), Jeff identified that there is no need to do a full check against the second string, but a check against the full length of the second string would be sufficient.

I have applied the small tweak to Jeff's CSE formula:

=SUM(--(FREQUENCY(IFERROR(SEARCH(MID(A2,ROW(OFFSET(A\$1,,,LEN(A2))),1),B2),FALSE),ROW(OFFSET(A\$1,,,LEN(B2))))>0))

To me, that is beautiful!!

-Sajan.

Last edited by a moderator:

#### jeffreyweir

##### Active Member
Hi Sajan. Thanks! Partly inspired by your great post at http://chandoo.org/forums/topic/counting-unique-items-in-same-cell

Will have a think about the tweak needed when I get a chance. Also have another good formula challenge up my sleeve...just got to do some stuff first.

I love this series.

 - I see you made the tweak already. Sweet. Also proud to announce that I came up with this puppy independently of any Google search, although as per above inspiration from Sajan was essential.

#### Chris van der Berg

##### Member
Hi Narayan,

I agree that the approach in Aladin's original formula is elegant. In fact, Jeff's formula follows the same theme. Jeff's formula then extends it by checking for uniqueness. However, unlike the approach described in Mark's formula (also from the same MrExcel link posted above), Jeff identified that there is no need to do a full check against the second string, but a check against the full length of the second string would be sufficient.

I have applied the small tweak to Jeff's CSE formula:

=SUM(--(FREQUENCY(IFERROR(SEARCH(MID(A2,ROW(OFFSET(A\$1,,,LEN(A2))),1),B2),FALSE),ROW(OFFSET(A\$1,,,LEN(B2))))&#62;0))

To me, that is beautiful!!

-Sajan.

#### Chris van der Berg

##### Member
Hi Sajan,

I am going through the Formula Challenge series and are stuck on this formula.
Can you please explain to me the meaning of "&#62" in your formula
=SUM(--(FREQUENCY(IFERROR(SEARCH(MID(A2,ROW(OFFSET(A\$1,,,LEN(A2))),1),B2),FALSE),ROW(OFFSET(A\$1,,,LEN(B2))))&#62;0))

Regards

Chris

#### shrivallabha

##### Excel Ninja
Hi Sajan,

I am going through the Formula Challenge series and are stuck on this formula.
Can you please explain to me the meaning of "&#62" in your formula
=SUM(--(FREQUENCY(IFERROR(SEARCH(MID(A2,ROW(OFFSET(A\$1,,,LEN(A2))),1),B2),FALSE),ROW(OFFSET(A\$1,,,LEN(B2))))&#62;0))

Regards

Chris
Hi Chris,

Looks like you joined recently. Welcome to Chandoo.org forums.

This forum migrated few months back. During migration, as it happens, some issues cropped up and have not been addressed as of now. You can see a discussion regarding such issues in the lounge.

In your case also backticks have caused small problem. Red marked part [do not mistake the ; for argument separator as you can see rest of the arguments have been separated by , (comma) and not ;(semicolon)]

=SUM(--(FREQUENCY(IFERROR(SEARCH(MID(A2,ROW(OFFSET(A\$1,,,LEN(A2))),1),B2),FALSE),ROW(OFFSET(A\$1,,,LEN(B2))))&#62;0))

becomes

=SUM(--(FREQUENCY(IFERROR(SEARCH(MID(A2,ROW(OFFSET(A\$1,,,LEN(A2))),1),B2),FALSE),ROW(OFFSET(A\$1,,,LEN(B2))))>0))

Quick key to finding out is using numerical part i.e. in a blank cell put following formula
=CHAR(62)

Hope this helps your way around!

#### Chris van der Berg

##### Member
Hi Shrivallabha,

Thanks for your quick response, and explanation!
Yes I joined recently and enjoying the Formula Challenge forum.
It is rather difficult to make sense out of the formula's most of the time as you guys are light years ahead of me, but I am trying.

Regards

Chris

#### DBExcel

##### New Member
Solution in PowerQuery - see video here
Super-fast to develop, very flexible (can return common characters as well as count) and less cryptic than cramming it all into a one-line formula.

#### siva97237

##### New Member
=SUM(IF(FREQUENCY(IFERROR(SEARCH(MID(B1&" ",ROW(INDIRECT("1:"&LEN(B1))),1),A1),0),IFERROR(SEARCH(MID(B1&" ",ROW(INDIRECT("1:"&LEN(B1))),1),A1),0)),1))-1
try this array formula

nice one, Siva

#### Haseeb A

##### Active Member
Hello Challengers,

Here is another CSE compatible with all versions.

=COUNT(MATCH(ROW(INDIRECT("1:"&LEN(A1))),SEARCH(MID(B1,ROW(INDIRECT("1:"&LEN(B1))),1),A1),0))

Note LEN: First one is LEN(A1), 2nd one is LEN(B1)

• Thomas Kuriakose