Imagine you work at MI5 as a HR officer. You want to find all agents who have license to kill (licence 7). Your data looks like this:
How would you go about it?
If you filter the list or use FIND() or SEARCH() formulas, you will end up with agents who also have licenses 77, 17 or not7. So how would you solve this problem?
Of course, you do what any smart person does. You summon Excel and ask it nicely by using some wicked pattern matching logic.
Finding all cells with 7 in them
Assuming the licenses are in column [licenses], you can use below formula to check if it has 7:
Step 1: Make an extra column, say [Two more commas] and use this formula.
=", "&[@licenses]&", "
Step 2: Now use below formula to find if a license has 7 in it:
=COUNTIFS([@[Two more commas]] , "* 7,*") > 0
This formula returns TRUE if [@licenses] has 7.
So how does it work?
There are three cases for licenses with 7 in them, as shown below.
Once we prefix & suffix COMMASPACE to this, we end up with a text that has the pattern:
<COMMA SPACE number><COMMA SPACE number>...
Now, we simply look for the pattern SPACE7, in this by using the * wildcard along with COUNTIFS.
=COUNTIFS([@[Two more commas]], "* 7,*")
We add a check to see if the count is greater than ZERO (ie did we find the pattern?)
So there you go. Now you can find the agent who can nab the targets.
Related: Using wildcards * ? in Excel VLOOKUP & other functions | Introduction to SUMIFS formula
Adding a few plot twists
Now, your MI5 career would be awfully boring, if there are no plot twists. So Q calls you in to her office and says, “We need a list of all agents who have any of the licenses 7, 65 or 63. Oh, while you are at it, tell me which agents have all three licenses.”
Damn you Q
the evil is you
for making me do
work I don’t want to
Damn you Q.
Added later: Okay, My James Bond knowledge is not very good. M is the boss of MI5, not Q. So let’s assume M calls you in to her office and gives you this task. As usual, you go:
Why do this M?
Everything is ho hum
and then you come
to tell my work is not yumm
Why do this M?
So you are back to your desk. Now the licenses to find are in a named range called list.
You can use COUNTIFS() pattern find logic to get the answer.
For sake of simplicity, let’s assume that you have a new column in your data table called as [Two more commas]
Does the agent have any of the licenses in list?
=SUMPRODUCT(COUNTIFS([@[Two more commas]],"* "&list&",*"))>0
The internal COUNTIFS returns an array of values, which the SUMPRODUCT simply adds up.
Does the agent have all of the licenses in list?
=SUMPRODUCT(COUNTIFS([@[Two more commas]],"* "&list&",*")) = COUNTA(list)
Now, let’s hope Q doesn’t add more plot twists. And if she does, you can always post them in the comments so internet can solve them.
Related: Introduction to Excel SUMPRODUCT function
How would you find license to kill?
I am sure COUNTIFS is not the only way to do this. So what would you do in this case? Will you use formulas / VBA or Power Query? Or something else altogether? Go ahead and share your approach in the comments section.
Note: Thanks to Brian who emailed me this problem.
21 Responses to “Finding if a cell has 7 in it… [Pattern matching in Excel]”
Hi Chandoo,
Re the solutions beginning:
=COUNTIFS(", "&[@licenses]&", ",…
for which version of Excel is this? Such amendments to the criteria ranges are prohibited in 2010 and earlier. Has this changed in a more recent version?
Regards
Hello XOR LX
This construction is still not allowed in Excel 2016.
You are right. Sorry, I didn't test the formula properly. In my test workbook, I used [Two more commas] helper column. I have changed the write up to correct this mistake. Thanks for pointing this out.
Hi Chandoo,
The license problem works with *,7* and not *7,*. The former will only return true if 7 is isolated, while the latter returns true as long as 7 is the last digit of the value.
SEARCH and FIND can be used for the first requirement like below:
=SUMPRODUCT(ISNUMBER(SEARCH(", 7,",", "&Table1[Licenses]&","))*1)
Following CSE entry construct will work for one or more criteria:
=SUM((MMULT(--ISNUMBER(SEARCH(", "&TRANSPOSE(Table2[List])&",",", "&Table1[Licenses]&",",1)),{1;1;1})=3)+0)
Based on number of criteria array {1;1;1} and count value can be changed like for at least one from the list:
=SUM((MMULT(--ISNUMBER(SEARCH(", "&TRANSPOSE(Table2[List])&",",", "&Table1[Licenses]&",",1)),{1;1;1})>0)+0)
The head of MI5 is M, not Q. Q is the R&D guy with the gadgets.
I have nothing to contribute about the formulas yet, but I'll work on it.
I have been doing this for years... many years. I used to program in a language called M (nothing to do with James Bond lol) and it had a function $Piece which, given a string of delimited data, you specified the string, the delimiter and which item in the delimited list you wanted.
So I wrote a VBA function which does exactly that. It saves a shedload of time doing things like this.
I like the story, but it doesn't help to tell that you HAVE such a function if you don't share it. Care to post the code?
Can you please share your VBA function? I just had my computer replaced. My old PC had a custom Excel "piece" function that I used all the time. It is now lost to the scrapyard. I'd love to have it back. Thanks.
After doing some digging, I found some VBA code here:
https://social.msdn.microsoft.com/Forums/en-US/6ac0fbfb-c191-4209-9b51-30187514bf09/piece-function?forum=isvvba
Code follows:
Function piece(strInput As String, Delim As String, nPart As Long) As String
If Len(Trim(strInput)) = 0 Then piece = "NULL": Exit Function
If Len(Trim(Delim)) = 0 Then piece = "NULL": Exit Function
piece = strInput
If InStr(1, strInput, Delim, vbTextCompare) Then
Dim MyArray() As String
MyArray = Split(strInput, Delim)
If nPart > (UBound(MyArray) + 1) Then piece = "NULL": Exit Function
piece = MyArray(nPart - 1)
ElseIf nPart > 1 Then
piece = "NULL"
End If
End Function
Find all cells with a lone 7:
=SUMPRODUCT(--COUNTIF(rng,{"7,*","*, 7","*, 7,*"}))
Here's one that doesn't use a helper column or wildcards. Checks to see if the character before and after the "7" is not a number.
Where B column is the License column.
=IF(ISERROR(SEARCH("7",B2)),FALSE,IF(ISERROR(MID(B2,SEARCH("7",B2)-1,1)*1),IF(ISERROR(MID(B2,SEARCH("7",B2)+1,1)*1),TRUE,FALSE)))
Chaosfiend,
I can't get your formula to return accurate results.
=OR(LEFT(SUBSTITUTE(B2," ",""),SEARCH(",",SUBSTITUTE(B2," ",""))-1)="7",IFERROR(SEARCH(",7,",B2),FALSE),IFERROR(RIGHT(B2,LEN(TRIM(B2))-SEARCH(",7",TRIM(B2))="7"),FALSE))
=OR(LEFT(SUBSTITUTE(TRIM(B10)," ",""),SEARCH(",",SUBSTITUTE(TRIM(B10)," ",""))-1)="7",IFERROR(SEARCH(",7,",TRIM(B10)),FALSE),IFERROR(RIGHT(TRIM(B10),LEN(TRIM(B10))-SEARCH(",7",TRIM(B10))="7"),FALSE))
If data set is large enough or procedure is done repeatedly, I'd use PowerQuery.
1. Duplicate [Licenses]
2. Split using "," as delimiter
3. Unpivot split columns
4. Change unpivoted column to Whole Number type
5. Replace error with out of range number (ex. 99999)
6. Filter unpivoted column = 7
7. Remove unnecessary columns
Complete M: (order of procedure is bit different as I removed columns in between steps)
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Name", type text}, {"Licenses", type text}}),
#"Duplicated Column" = Table.DuplicateColumn(#"Changed Type", "Licenses", "Licenses - Copy"),
#"Split Column by Delimiter" = Table.SplitColumn(#"Duplicated Column","Licenses - Copy",Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv),{"Licenses - Copy.1", "Licenses - Copy.2", "Licenses - Copy.3", "Licenses - Copy.4", "Licenses - Copy.5", "Licenses - Copy.6"}),
#"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Split Column by Delimiter", {"Name", "Licenses"}, "Attribute", "Value"),
#"Removed Columns" = Table.RemoveColumns(#"Unpivoted Columns",{"Attribute"}),
#"Changed Type2" = Table.TransformColumnTypes(#"Removed Columns",{{"Value", Int64.Type}}),
#"Replaced Errors" = Table.ReplaceErrorValues(#"Changed Type2", {{"Value", 99999}}),
#"Filtered Rows" = Table.SelectRows(#"Replaced Errors", each ([Value] = 7)),
#"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows",{"Value"})
in
#"Removed Columns1"
How about;
=IFERROR(FIND(" "&MatchCell&" "," "&SUBSTITUTE(TRIM(ListCell),","," ")&" ")>0,FALSE)
Returns TRUE or FALSE if MatchCell value is found within comma-separated list. Works with text and numbers. Allows null items and leading and trailing spaces.
Regards, Mark.
I'm unable to understand,, @Two Commas,, is it from which version, I've searched in Sheet Kraft also!!
@Rajesh
It is the name of the New Column
Licenses Range -->B2:B11
=COUNT(INDEX(FIND(",7,",","&B2:B11&","),))
=IF(MID(B2,1,2)="7,",1,IF(RIGHT(B2,2)=" 7",LEN(B2),FIND(" 7,",B2)))