A week ago Tarun asked a question on the Chandoo.org Forums.
“I have got multiple names in each row and would like to have what name is repeated maximum number of times and how many times?
Eg. Ram, Amita, Obama, Ram, Willi, Ram, Amita, Chandoo, Ram, Willi
Ans: Ram (4 times)”
(The list and answers are edited)
Chandoo responded with a neat Array Formula:
=INDEX(B2:K2,MATCH(MAX(COUNTIF(B2:K2,B2:K2)), COUNTIF(B2:K2,B2:K2),0)) &
” (“&MAX(COUNTIF(B2:K2,B2:K2))&” times)”
Lets take a look inside this and see how it works
THE EXAMINATION
The formula has two parts separated by a &
=INDEX(B2:K2,MATCH(MAX(COUNTIF(B2:K2,B2:K2)), COUNTIF(B2:K2,B2:K2),0))
and
&
and
” (“&MAX(COUNTIF(B2:K2,B2:K2))&” times)”
Each part is separate and can be used independently, the & character simply joins the two parts together to make a single string which answers Tarun’s question, Ram (4 times).
Now, lets look at each part.
You can follow along with this forensic examination by downloading the Sample Data File.
=INDEX(B2:K2,MATCH(MAX(COUNTIF(B2:K2,B2:K2)), COUNTIF(B2:K2,B2:K2),0))
This is a single Index Function with 2 components, being:
a Range B2:K2 and
a Count MATCH(MAX(COUNTIF(B2:K2,B2:K2)), COUNTIF(B2:K2,B2:K2),0)
Typically an Index Function uses 3 components
=Index(Array, Row Number,[Column Number])
In this example the Range is a single Row, B2:K2
And so using the Counter in the Row spot has the effect of counting down the first Column and then continuing at the top of the second Column etc
So the formula used:
=INDEX(B2:K2,MATCH(MAX(COUNTIF(B2:K2,B2:K2)), COUNTIF(B2:K2,B2:K2),0))
Is equivalent to:
=INDEX(B2:K2,1,MATCH(MAX(COUNTIF(B2:K2,B2:K2)), COUNTIF(B2:K2,B2:K2),0))
Now lets jump ahead to the COUNTIF(B2:K2,B2:K2) bit
If you copy =COUNTIF(B2:K2,B2:K2) to a cell, Press F2 and then evaluate the Formula using F9
You will see that it returns an array. The array is highlighted by the squiggly brackets { } ‘s
={4,2,1,4,2,4,2,1,4,2}
This is the heart of the solution.
What this is showing us is that for each position in the range B2:K2, the count of how many times that cells value occurs in the range B2:K2
So the formula
=INDEX(B2:K2,MATCH(MAX(COUNTIF(B2:K2,B2:K2)), COUNTIF(B2:K2,B2:K2),0))
Is equivalent to
=INDEX(B2:K2,MATCH(MAX({4,2,1,4,2,4,2,1,4,2}), {4,2,1,4,2,4,2,1,4,2},0))
Looking at the MAX({4,2,1,4,2,4,2,1,4,2}) part, this simplifies to 4, the Maximum value of the array (Remember this line, we’ll come back to it later).
So our simplified formula is now: =INDEX(B2:K2,MATCH(4, {4,2,1,4,2,4,2,1,4,2},0))
Now looking at the MATCH(4, {4,2,1,4,2,4,2,1,4,2},0) part of the equation
You can see that Match is looking for the value 4, in the array {4,2,1,4,2,4,2,1,4,2}, which is the First value , Position 1, the 0 requesting that an exact match is found.
So that MATCH(4, {4,2,1,4,2,4,2,1,4,2},0) is equivalent to 1
So our equation =INDEX(B2:K2,MATCH(4, {4,2,1,4,2,4,2,1,4,2},0))
Is now simplified even more to =INDEX(B2:K2, 1)
Index will then look in B2:K2 and will return the first cell or “Ram” in this example.
& “(” & MAX(COUNTIF(B2:K2,B2:K2)) & ” times)”
The second part of the equation is responsible for counting the number of Times Ram occurs and displaying it with some text.
& “(” & MAX(COUNTIF(B2:K2,B2:K2)) & ” times)”
The parts displayed in Red above add the text ( and times) to the Count
Remember the section MAX(COUNTIF(B2:K2,B2:K2)) which was explained above and evaluates to 4 in this case
So the & “(” & MAX(COUNTIF(B2:K2,B2:K2)) & ” times)”
Part evaluates to: ( 4 times)
With the initial & adding it to the text of the first part Ram for the final result – Ram ( 4 times)
LEARN MORE ABOUT ARRAY FORMULAS
You can learn more about Array Formulas at the following links:
http://www.cpearson.com/excel/ArrayFormulas.aspx
http://www.databison.com/index.php/excel-array-formulas-excel-array-formula-syntax-array-constants/
http://office.microsoft.com/en-us/excel-help/introducing-array-formulas-in-excel-HA001087290.aspx
Chandoo.org has several articles on Array Formulas
http://chandoo.org/wp/tag/array-formulas/
FORENSIC FORMULAS
Would you like to see more “Forensic” examination of complex formulas ?
Let us know in the comments below and it may become a regular section at Chandoo.org.
57 Responses to “Using an Array Formula to Find and Count the Maximum Text Occurrences in a Range”
Amazing.............what a simple and superb explanation......and one more thing I learned a new feature of evaluaiton as well..........
I just love HUIs articles!!!
Thank you Hui.
We want more such examples.
Wow, I've learnt a lot today! 🙂 Thanks Hui!
I know the purpose of this post was to learn tricks using Array Formulas, but I was wondering what would you do if you had a tie?
Very nice neat explanation,
Thanks Hui
Brilliant explanation. Thanks Hui for walking through each of those components.
I'm having some trouble implementing this formula:
1) The Max(CountIf()) portion returns a "3" whenever it should return a "4" in the array I've selected. "3" is appropriate for the item in the first cell, but the item in the third cell occurs 4 times. The problem seems to correct itself if the first cell contains the most frequent value.
2) When I add the Match() portion I get an "#N/A" error. This occurs whether or not I have the most common value in the first cell.
Ignore my previous comment. I forgot to make it an array formula. Problem easily solved.
Is there a way to capture a tied first place? as in, if ram and chandoo both occurred 4 times, what does the formula return?
@ikkeman
It returns whoever is first to appear in the list
Yes, I would like to see more forensic explanations of complex formulas. The best way for me to learn how something works is to take it apart. I learn something just about every day from your site, so keep up the great work!
If the list does not have blanks:
.
{=INDEX(B2:K2,MODE(MATCH(B2:K2,B2:K2,0)))&" ("&MAX(COUNTIF(B2:K2,B2:K2))&" times)"}
.
which is 29 fewer characters and 2 function calls less.
Great...!!!! i would like to know more about Forensic Accounting formulas..
these insights in Excel are really very helpful
These posts are invaluable because they go through and make you see straight away "sense", but also from the comments, novel formulas too.
Thanks Hui for a very detailed explanation and Daniel Ferry for the simplified formula; a neat use of MODE.
More forensic excel can only be more fun.
Bring it on!
simply Awesome.. loved ur way of explanation. looking forward for more posts.
To return the last occurrence
{=LOOKUP(MAX(COUNTIF(B2:K2,B2:K2)),COUNTIF(B2:K2,B2:K2),B2:K2)&" ("&MAX(COUNTIF(B2:K2,B2:K2))&" times)"}
Regards
i would like to know more about Forensic Accounting formulas!
am still learning about excel, but I'm wondering - if you had just used the COUNTIF instead of MAX & COUNTIF, would it not have returned the same answer 4?
Great post, really helpful. I have some strange thing happening though when trying to rework this.
In a cell i type "=countif(B2:K2,B2:K2)". Results in '2'. Then I hit f2, then f9 to see the array - and the value changes to 4.
Can you replicate (I can, works every time)? How does this work?
@Andreas
When using
=countif(B2:K2,B2:K2)
It must be array entered ie: with Ctrl Shift Enter
@Daniel... Very clever way to use MODE and MATCH... thanks for sharing it with us.
Great explanation!!! it would be nice to have more of these type of examples!!! Simple and great!!thank you Hui!
How about if the data is in multiple rows? I tried your formula but not success.
One problem: this does not seem to take into account instances where more than one name is represented the maximum number of times...
I'm wondering if there's a way to count the number of occurences of words when they're all in a cell? Like this:
A1: "Windows NT, Networking, Firewalls, Security, TL, Training"
A2: "Networking, Networking, Training, Security, TL, Training"
A3: "Security, TL, Firewalls, Security, Networking, Windows NT"
Thanks for any help. 🙂
Joyce
@Joyce
Give the following a go:
Array enter
=SUM(LEN(A1:A3)-LEN(SUBSTITUTE(A1:A3,B1,"")))/LEN(B1) Ctrl Shift Enter
B1: the word you want to count eg: Security
WOW! thanks a lot Hui, works great!
Hi,
one trick that I want to share with you all, which looks pretty obvious in hindsight, is that you can use white-space in long formulas to improve readability, for eg:
=SUMPRODUCT(
--(MONTH(OUTDATE_VG)=MONTH($J11)),
--(YEAR(OUTDATE_VG)=YEAR($J11)),
--(N$2<=OUTTIME_VG),
--(OUTTIME_VG<O$2)
)/$V11
works and is much more intuitive than a single line function.
It is double negative in the above formula, by the way.
Spreading a formula over several lines is a great way to format code in programming text but is it possible in Excel or is your suggestion above only for debugging/illustrative purposes for this forum. Is white space the same as ASCII 32 or another character which leaves a space.causes a carriage return/line feed but is ignored by ythe Excel engine when interpreting the function.
Another useful topic for Chandoo to explore may be Evaluate Formula as this can be a useful tool for debugging extended formulas. Will it work withs CSE's
Hui fantastic explanation so yes I would like to see more of these forensic formulas - these are real world problems we all face from time to time.
john
[...] week Joyce asked a question on the Chandoo.org, Comment [...]
Very helpful! Please keep them coming. Thanks for your time. Your website is excellent; one of the best I've been too.
The step by step explanation is amazing and absolutely wonderful !
I have seen a MAMMOTH formula for converting numbers to words - I wish you do a step by step killing of that MAMMOTH soon in this section.
I can share you the link to the Mammoth if you are interested.
@K_mac
The line break works in excel, it's not just for illustration.
The formula illustrated above is a CSE, so i guess it does work. You'll have to try it to believe it, just press Alt+Return, where ever you want a line break. HTH
[...] Formula Forensic 001 – Tarun’s Problem [...]
Forensics is an excellent way to learn application of the various formulas. I'd like to see it on a regular basis.
[...] Formula Forensics 001 – Tarun’s Problem [...]
Hi,
Question:
What if there are words with equal number of occurrences? For example, 3 "Ram"s and 3 "Amita"s.
[...] Formula Forensic Series: [...]
The article implies that it found the number of times that "RAM" appears in the list but it just backs into that number.
*** It finds the MAX occurrence of ANY name. ***
This formula has nothing to do with the number of times each unique name appears.
It only deals with the max times ANY name appears. Pretty worthless IMO.
Did you get up on the wrong side of the bed, Koenan?
The formula does EXACTLY what Tarun's question asked. If returns which name is the most frequent in the list, and how many times that name appears.
Before crustily deriding the article, the readers here would be better served if you took the time to understand the point first.
"Using an Array Formula to Find and Count the Maximum Text Occurrences in a Range"
Precisely what I needed --- thank you!
Sir,
Pls suggest me how to count the occurrence of text in excel, for EX in each cell (Say A1) there is data like shown below
Qtr No 132, ashok
Qtr N0 135, shyam
Plot No 138, ashok
and so on
how to count the occurrence of ashok
Chandoo,
This is excellent & mind blowing formula....
Hello
Is there a way, to get count of the max. no. of occurences of text/data in 1 column relative to the data in the adjacent column i.e. return "guava" as the result as it relates to criteria "abc" in Col B etc!
Col A Col B Col C
apple abc = guava
guava abc
guava abc
guava abc
banana xyz
banana xyz
banana xyz
banana xyz
banana xyz
grapefruit xyz
grapefruit xyz
grapefruit xyz
grapefruit xyz
grapefruit xyz
grapefruit xyz
grapefruit xyz
orange efg
orange efg
orange efg
orange efg
Thanks!
Hi Neel.. welcome chandoo.org & thanks for your question.
You can use array formulas to do this.
Assuming colA data is named list1, colB data is named list2 and first row of data is in A1:B1, in C1 write below formula.
=INDEX(list1,MATCH(MAX(IF(list2=B1,COUNTIFS(list2,list2,list1,list1))),COUNTIFS(list2,list2,list1,list1),0))
and press CTRL+Shift+Enter to get the result. Drag down to get other values.
Chandoo....Many thanks for the speedy response, somehow i seem to be getting an error i.e. #VALUE! when i try to appy the formula. I did define the "Names" as list1 and list2 as suggested by you, so not sure i'm doing wrong here?.
Best
Neel
Ok, so i think i figured out what i was doing wrong, list1 started out from A2 instead of A1. Now, the formula seems to work fine except that towards the bottom where it should return "orange" (relative to "efg" in ColB)....for some odd reason it returns "guava"??
Any help is greatly appreciated. Thanks again!
Hi,
Thanks for the formula.
But if i need to try the same formula for alternate coulmns.
For eg : Max repeated text from alternate columns.
Ram 0 Ram 125 Sita 12 Pooja 33 Ram 12 Laxman 56
Now ram is repeated 3 times which is max. so how can i find using the index formula for the alternate columns.
Hello,
I was wondering: is there a way to modify the formula used in this example in order to identify the maximum number in a data set based on a certain variable? Specifically, I was hoping to use the formula to find out what the maximum number in a large range (F7:EY7) is only if there are at least 5 data points that are equal to that number.
e.g., (45,50,50,40,25,55,50,45,50,60,55,50) in this data set, although 60 is the max number, I would not want 60 returned as the max, I would want 50 to be the number identified as the max. Basically, I am attempting to avoid pulling outliers that mean nothing to my data set.
Thanks,
- C
@Carlos,
Why wouldn't 55 be included as the Max, it occurs twice
@Carlos
But for a formula you can use:
=MAX(IF(COUNTIF($F$7:$EY$7,F7:EY7)>=5,F7:EY7)) Ctrl+Shift+Enter
@Hui
This formula is absolutely perfect! Thanks so much for your efforts and help.
Carlos
2 questions regarding this one.
1. Is the formula valid across a range? (Not a single row like B2:K2, but across multiple rows and columns). I tried but was not working, so can it be modified such that it works across multiple rows and columns?
2. What if you want to find out the word appearing the 2nd, 3rd maximum number of time. Can any alternate be used to MAX function. I tried the "LARGE" function, but did not work
Any help will be much appreciated
I have a column with 3000 rows of phrases on the products, and I want a formula to create index of cells that have a similar word batir example attachment. Thank you very much
I am using arrey formula with COUNTIF, but value returned as result is not what a normal countif function will do. Please help
I have the same question that has been asked twice before, but with no answer. What do you do if you have two names that appear the same number of times?
For example I have a row of survey data that has results from Very Satisfied to Not Satisfied. Right now using this formula it highlights Very Satisfied (4 times), however, Moderately Satisfied also has 4 responses. How do I get it to show both? So that it shows both Very Satisfied (4 times and Moderately Satisfied (4 times)?