Using an Array Formula to Find and Count the Maximum Text Occurrences in a Range

Posted on October 31st, 2011 in Excel Howtos , excel links , Formula Forensics , Huis , Learn Excel , Posts by Hui - 55 comments

A week ago Tarun asked a question on the Chandoo.org Forums.

“I have got multiple names in each row and would like to have what name is repeated maximum number of times and how many times?

Eg. Ram, Amita, Obama, Ram, Willi, Ram, Amita, Chandoo, Ram, Willi

Ans: Ram (4 times)”

(The list and answers are edited)

Chandoo responded with a neat Array Formula:

=INDEX(B2:K2,MATCH(MAX(COUNTIF(B2:K2,B2:K2)), COUNTIF(B2:K2,B2:K2),0))  &

” (“&MAX(COUNTIF(B2:K2,B2:K2))&” times)”

Lets take a look inside this and see how it works

 

THE EXAMINATION

The formula has two parts separated by a &

=INDEX(B2:K2,MATCH(MAX(COUNTIF(B2:K2,B2:K2)), COUNTIF(B2:K2,B2:K2),0))

and

&

and

” (“&MAX(COUNTIF(B2:K2,B2:K2))&” times)”

Each part is separate and can be used independently, the & character simply joins the two parts together to make a single string which answers Tarun’s question, Ram (4 times).

Now, lets look at each part.

You can follow along with this forensic examination by downloading the Sample Data File.

 

=INDEX(B2:K2,MATCH(MAX(COUNTIF(B2:K2,B2:K2)), COUNTIF(B2:K2,B2:K2),0))

This is a single Index Function with 2 components, being:

a Range B2:K2 and

a Count  MATCH(MAX(COUNTIF(B2:K2,B2:K2)), COUNTIF(B2:K2,B2:K2),0)

Typically an Index Function uses 3 components

=Index(Array, Row Number,[Column Number])

In this example the Range is a single Row, B2:K2

And so using the Counter in the Row spot has the effect of counting down the first Column and then continuing at the top of the second Column etc

So the formula used:

=INDEX(B2:K2,MATCH(MAX(COUNTIF(B2:K2,B2:K2)), COUNTIF(B2:K2,B2:K2),0))

Is equivalent to:

=INDEX(B2:K2,1,MATCH(MAX(COUNTIF(B2:K2,B2:K2)), COUNTIF(B2:K2,B2:K2),0))

 

Now lets jump ahead to the COUNTIF(B2:K2,B2:K2) bit

If you copy =COUNTIF(B2:K2,B2:K2) to a cell, Press F2 and then evaluate the Formula using F9

You will see that it returns an array. The array is highlighted by the squiggly brackets {  } ‘s

={4,2,1,4,2,4,2,1,4,2}

This is the heart of the solution.

What this is showing us is that for each position in the range B2:K2, the count of how many times that cells value occurs in the range B2:K2

So the formula

=INDEX(B2:K2,MATCH(MAX(COUNTIF(B2:K2,B2:K2)), COUNTIF(B2:K2,B2:K2),0))

Is equivalent to

=INDEX(B2:K2,MATCH(MAX({4,2,1,4,2,4,2,1,4,2}), {4,2,1,4,2,4,2,1,4,2},0))

Looking at the MAX({4,2,1,4,2,4,2,1,4,2}) part, this simplifies to 4, the Maximum value of the array (Remember this line, we’ll come back to it later).

So our simplified formula is now: =INDEX(B2:K2,MATCH(4, {4,2,1,4,2,4,2,1,4,2},0))

Now looking at the MATCH(4, {4,2,1,4,2,4,2,1,4,2},0) part of the equation

You can see that Match is looking for the value 4, in the array {4,2,1,4,2,4,2,1,4,2}, which is the First value , Position 1, the 0 requesting that an exact match is found.

So that MATCH(4, {4,2,1,4,2,4,2,1,4,2},0) is equivalent to 1

So our equation =INDEX(B2:K2,MATCH(4, {4,2,1,4,2,4,2,1,4,2},0))

Is now simplified even more to =INDEX(B2:K2, 1)

Index will then look in B2:K2 and will return the first cell or “Ram” in this example.

 

& “(” & MAX(COUNTIF(B2:K2,B2:K2)) & ” times)”

The second part of the equation is responsible for counting the number of Times Ram occurs and displaying it with some text.

& “(” & MAX(COUNTIF(B2:K2,B2:K2)) & ” times)”

The parts displayed in Red above add the text ( and times) to the Count

Remember the section MAX(COUNTIF(B2:K2,B2:K2)) which was explained above and evaluates to 4 in this case

So the & “(” & MAX(COUNTIF(B2:K2,B2:K2)) & ” times)”

Part evaluates to: ( 4 times)

With the initial & adding it to the text of the first part Ram for the final result – Ram ( 4 times)

 

LEARN MORE ABOUT ARRAY FORMULAS

You can learn more about Array Formulas at the following links:

http://www.cpearson.com/excel/ArrayFormulas.aspx

http://www.databison.com/index.php/excel-array-formulas-excel-array-formula-syntax-array-constants/

http://office.microsoft.com/en-us/excel-help/introducing-array-formulas-in-excel-HA001087290.aspx

 

Chandoo.org has several articles on Array Formulas

http://chandoo.org/wp/tag/array-formulas/

 

FORENSIC FORMULAS

Would you like to see more “Forensic” examination of complex formulas ?

Let us know in the comments below and it may become a regular section at Chandoo.org.

 

Written by Hui...
Tags: , , , , , , , , ,
Home: Chandoo.org Main Page
? Doubt: Ask an Excel Question

55 Responses to “Using an Array Formula to Find and Count the Maximum Text Occurrences in a Range”

  1. Indian says:

    Amazing.............what a simple and superb explanation......and one more thing I learned a new feature of evaluaiton as well..........

  2. tamoghna9 says:

    I just love HUIs articles!!!

    Thank you Hui.

    We want more such examples.

  3. Michael Azer says:

    Wow, I've learnt a lot today! 🙂 Thanks Hui!
    I know the purpose of this post was to learn tricks using Array Formulas, but I was wondering what would you do if you had a tie?

  4. Sawan says:

    Very nice neat explanation,

    Thanks Hui

  5. OleMiss2010 says:

    Brilliant explanation. Thanks Hui for walking through each of those components.

  6. OleMiss2010 says:

    I'm having some trouble implementing this formula:
    1) The Max(CountIf()) portion returns a "3" whenever it should return a "4" in the array I've selected. "3" is appropriate for the item in the first cell, but the item in the third cell occurs 4 times. The problem seems to correct itself if the first cell contains the most frequent value.
    2) When I add the Match() portion I get an "#N/A" error. This occurs whether or not I have the most common value in the first cell.

  7. OleMiss2010 says:

    Ignore my previous comment. I forgot to make it an array formula. Problem easily solved.

  8. ikkeman says:

    Is there a way to capture a tied first place? as in, if ram and chandoo both occurred 4 times, what does the formula return?

  9. oldchippy says:

    @ikkeman

    It returns whoever is first to appear in the list

  10. Brian Holt says:

    Yes, I would like to see more forensic explanations of complex formulas. The best way for me to learn how something works is to take it apart. I learn something just about every day from your site, so keep up the great work!

  11. If the list does not have blanks:
    .

    {=INDEX(B2:K2,MODE(MATCH(B2:K2,B2:K2,0)))&" ("&MAX(COUNTIF(B2:K2,B2:K2))&" times)"}

    .
    which is 29 fewer characters and 2 function calls less.

  12. mandar pande says:

    Great...!!!! i would like to know more about Forensic Accounting formulas..

  13. Siddique Sayeed says:

    these insights in Excel are really very helpful

  14. Danièle says:

    These posts are invaluable because they go through and make you see straight away "sense", but also from the comments, novel formulas too.
    Thanks Hui for a very detailed explanation and Daniel Ferry for the simplified formula; a neat use of MODE.
    More forensic excel can only be more fun.
    Bring it on!

  15. Amit Dani says:

    simply Awesome.. loved ur way of explanation. looking forward for more posts.

  16. Sailepaty says:

    To return the last occurrence

    {=LOOKUP(MAX(COUNTIF(B2:K2,B2:K2)),COUNTIF(B2:K2,B2:K2),B2:K2)&" ("&MAX(COUNTIF(B2:K2,B2:K2))&" times)"}

    Regards

  17. Tanner says:

    i would like to know more about Forensic Accounting formulas!

    am still learning about excel, but I'm wondering - if you had just used the COUNTIF instead of MAX & COUNTIF, would it not have returned the same answer 4?

  18. andreas.wpv says:

    Great post, really helpful. I have some strange thing happening though when trying to rework this.

    In a cell i type "=countif(B2:K2,B2:K2)". Results in '2'. Then I hit f2, then f9 to see the array - and the value changes to 4.

    Can you replicate (I can, works every time)? How does this work?

  19. Hui... says:

    @Andreas
    When using
    =countif(B2:K2,B2:K2)
    It must be array entered ie: with Ctrl Shift Enter

  20. Chandoo says:

    @Daniel... Very clever way to use MODE and MATCH... thanks for sharing it with us.

  21. Ilona says:

    Great explanation!!! it would be nice to have more of these type of examples!!! Simple and great!!thank you Hui!

  22. vietkuong says:

    How about if the data is in multiple rows? I tried your formula but not success.

  23. Muntz says:

    One problem: this does not seem to take into account instances where more than one name is represented the maximum number of times...

  24. Joyce says:

    I'm wondering if there's a way to count the number of occurences of words when they're all in a cell? Like this:
    A1: "Windows NT, Networking, Firewalls, Security, TL, Training"
    A2: "Networking, Networking, Training, Security, TL, Training"
    A3: "Security, TL, Firewalls, Security, Networking, Windows NT"

    Thanks for any help. 🙂
    Joyce

  25. Hui... says:

    @Joyce
    Give the following a go:
    Array enter
    =SUM(LEN(A1:A3)-LEN(SUBSTITUTE(A1:A3,B1,"")))/LEN(B1) Ctrl Shift Enter
    B1: the word you want to count eg: Security

  26. Joyce says:

    WOW! thanks a lot Hui, works great!

  27. Vaibhav Garg says:

    Hi,
    one trick that I want to share with you all, which looks pretty obvious in hindsight, is that you can use white-space in long formulas to improve readability, for eg:
    =SUMPRODUCT(
    --(MONTH(OUTDATE_VG)=MONTH($J11)),
    --(YEAR(OUTDATE_VG)=YEAR($J11)),
    --(N$2<=OUTTIME_VG),
    --(OUTTIME_VG<O$2)
    )/$V11

    works and is much more intuitive than a single line function.

  28. Vaibhav Garg says:

    It is double negative in the above formula, by the way.

  29. K_Mac says:

    Spreading a formula over several lines is a great way to format code in programming text but is it possible in Excel or is your suggestion above only for debugging/illustrative purposes for this forum. Is white space the same as ASCII 32 or another character which leaves a space.causes a carriage return/line feed but is ignored by ythe Excel engine when interpreting the function.

    Another useful topic for Chandoo to explore may be Evaluate Formula as this can be a useful tool for debugging extended formulas. Will it work withs CSE's

  30. john@reddybay says:

    Hui fantastic explanation so yes I would like to see more of these forensic formulas - these are real world problems we all face from time to time.
    john

  31. James says:

    Very helpful! Please keep them coming. Thanks for your time. Your website is excellent; one of the best I've been too.

  32. Raiz says:

    The step by step explanation is amazing and absolutely wonderful !

    I have seen a MAMMOTH formula for converting numbers to words - I wish you do a step by step killing of that MAMMOTH soon in this section.

    I can share you the link to the Mammoth if you are interested.

  33. Vaibhav Garg says:

    @K_mac
    The line break works in excel, it's not just for illustration.
    The formula illustrated above is a CSE, so i guess it does work. You'll have to try it to believe it, just press Alt+Return, where ever you want a line break. HTH

  34. Jeremiah Minifield says:

    Forensics is an excellent way to learn application of the various formulas. I'd like to see it on a regular basis.

  35. Anar says:

    Hi,

    Question:
    What if there are words with equal number of occurrences? For example, 3 "Ram"s and 3 "Amita"s.

  36. Koenan says:

    The article implies that it found the number of times that "RAM" appears in the list but it just backs into that number.

    *** It finds the MAX occurrence of ANY name. ***

    This formula has nothing to do with the number of times each unique name appears.

    It only deals with the max times ANY name appears. Pretty worthless IMO.

  37. Daniel Ferry says:

    Did you get up on the wrong side of the bed, Koenan?

    The formula does EXACTLY what Tarun's question asked. If returns which name is the most frequent in the list, and how many times that name appears.

    Before crustily deriding the article, the readers here would be better served if you took the time to understand the point first.

  38. Egan says:

    "Using an Array Formula to Find and Count the Maximum Text Occurrences in a Range"

    Precisely what I needed --- thank you!

  39. ASHOK NILLAY says:

    Sir,
    Pls suggest me how to count the occurrence of text in excel, for EX in each cell (Say A1) there is data like shown below

    Qtr No 132, ashok
    Qtr N0 135, shyam
    Plot No 138, ashok
    and so on

    how to count the occurrence of ashok

  40. iPrabhakar says:

    Chandoo,

    This is excellent & mind blowing formula....

  41. Neel says:

    Hello

    Is there a way, to get count of the max. no. of occurences of text/data in 1 column relative to the data in the adjacent column i.e. return "guava" as the result as it relates to criteria "abc" in Col B etc!
    Col A Col B Col C
    apple abc = guava
    guava abc
    guava abc
    guava abc
    banana xyz
    banana xyz
    banana xyz
    banana xyz
    banana xyz
    grapefruit xyz
    grapefruit xyz
    grapefruit xyz
    grapefruit xyz
    grapefruit xyz
    grapefruit xyz
    grapefruit xyz
    orange efg
    orange efg
    orange efg
    orange efg

    Thanks!

    • Chandoo says:

      Hi Neel.. welcome chandoo.org & thanks for your question.

      You can use array formulas to do this.

      Assuming colA data is named list1, colB data is named list2 and first row of data is in A1:B1, in C1 write below formula.

      =INDEX(list1,MATCH(MAX(IF(list2=B1,COUNTIFS(list2,list2,list1,list1))),COUNTIFS(list2,list2,list1,list1),0))

      and press CTRL+Shift+Enter to get the result. Drag down to get other values.

      • Neel says:

        Chandoo....Many thanks for the speedy response, somehow i seem to be getting an error i.e. #VALUE! when i try to appy the formula. I did define the "Names" as list1 and list2 as suggested by you, so not sure i'm doing wrong here?.

        Best

        Neel

        • Neel says:

          Ok, so i think i figured out what i was doing wrong, list1 started out from A2 instead of A1. Now, the formula seems to work fine except that towards the bottom where it should return "orange" (relative to "efg" in ColB)....for some odd reason it returns "guava"??

          Any help is greatly appreciated. Thanks again!

  42. Swati says:

    Hi,

    Thanks for the formula.
    But if i need to try the same formula for alternate coulmns.
    For eg : Max repeated text from alternate columns.

    Ram 0 Ram 125 Sita 12 Pooja 33 Ram 12 Laxman 56

    Now ram is repeated 3 times which is max. so how can i find using the index formula for the alternate columns.

  43. Carlos says:

    Hello,

    I was wondering: is there a way to modify the formula used in this example in order to identify the maximum number in a data set based on a certain variable? Specifically, I was hoping to use the formula to find out what the maximum number in a large range (F7:EY7) is only if there are at least 5 data points that are equal to that number.

    e.g., (45,50,50,40,25,55,50,45,50,60,55,50) in this data set, although 60 is the max number, I would not want 60 returned as the max, I would want 50 to be the number identified as the max. Basically, I am attempting to avoid pulling outliers that mean nothing to my data set.

    Thanks,
    - C

  44. Kapish says:

    2 questions regarding this one.

    1. Is the formula valid across a range? (Not a single row like B2:K2, but across multiple rows and columns). I tried but was not working, so can it be modified such that it works across multiple rows and columns?

    2. What if you want to find out the word appearing the 2nd, 3rd maximum number of time. Can any alternate be used to MAX function. I tried the "LARGE" function, but did not work

    Any help will be much appreciated

  45. greenwold eliezer says:

    I have a column with 3000 rows of phrases on the products, and I want a formula to create index of cells that have a similar word batir example attachment. Thank you very much

Leave a Reply