Formula Forensic No. 021 – Find the 4th Slash !
Last week at the Chandoo.org Forums, Senthilkumar_rm posed the question:
“I Have file name as
D:\Data\Personal\sramasam\desktop\Exceldata.xls
I want to find the position of the 4^{th} slash “\”
What is a simple way ? ”
I proposed 3 answers being:
=FIND("\",A1,FIND("\",A1,FIND("\",A1,FIND("\",A1)+1)+1)+1)
=SEARCH("\",A1,FIND("\",A1,FIND("\",A1,FIND("\",A1)+1)+1)+1)
=FIND(CHAR(135),SUBSTITUTE(G5,"\",CHAR(135),4))
and Faseeh proposed a 4^{th} answer:
=(LARGE(((MID($A1,ROW(INDIRECT("1:"&1024)),1))="\")*ROW(INDIRECT("1:"&1024)),2))
So today at Formula Forensics we will examine all these and see what and why they all work.
As usual at Formula Forensics you can follow along by downloading a Sample File.
Using the Find() function
The Initial formula I proposed was:
=FIND("\",A1,FIND("\",A1,FIND("\",A1,FIND("\",A1)+1)+1)+1)
This uses the Excel Find() function repeatedly inside itself
The Excel Find() function uses the following syntax
The second solution I proposed was the same as the Find solution except that it used the Excel Search( ) function.
You can see above that Find and Search have exactly the same syntax.
Why have 2 functions that are effectively similar ?
Search is Case Insensitive.
eg: Chandoo = cHanDoo
Find is Case Sensitive.
eg: Chandoo <> cHanDoo
We will discuss the solution using:
=FIND("\",A1,FIND("\",A1,FIND("\",A1,FIND("\",A1)+1)+1)+1)
But the solution and description are equally applicable to the search based solution:
=SEARCH("\",A1,FIND("\",A1,FIND("\",A1,FIND("\",A1)+1)+1)+1)
So Find has the syntax: =Find( Text, Within Text, [Start No.])
But our formula has 4 Find() functions, where do we start?
Start in the Middle.
In our example: =FIND(“\”,A1,FIND(“\”,A1,FIND(“\”,A1,FIND(“\”,A1)+1)+1)+1)
FIND("\",A1)
Text: “\”
Within Text: A1
Start No: Optional = 0
This will find the first \ in out text string and return the value 3.
Checkout Cell D13.
Stepping out one find =FIND(“\”,A1,FIND(“\”,A1,FIND(“\”,A1,3+1)+1)+1)
FIND("\",A1,3+1)
Text: “\”
Within Text: A1
Start No: 3 + 1
This will find the Second \ in out text string by starting at position 3 + 1 and hence return the value 8.
Checkout Cell D15.
Stepping out one more find =FIND(“\”,A1,FIND(“\”,A1,8+1)+1)
FIND("\",A1,8+1)
Text: “\”
Within Text: A1
Start No: 8 + 1
This will find the Third \ in out text string by starting at position 8 + 1 and hence return the value 17.
Checkout Cell D17.
Finally we arrive at the outer Find, =FIND(“\”,A1,17+1)
=FIND("\",A1,17+1)
Text: “\”
Within Text: A1
Start No: 17 + 1
This will find the fourth \ in out text string by starting at position 17 + 1 and hence return the answer of 26.
Checkout Cell D19.
Advantages:
Relatively simple formula
Disadvantage:
This formula must be manually remade if you want to find either the 3^{rd}, 5^{th} or another occurrences.
Using the Find() & Substitute() functions
=FIND(CHAR(135),SUBSTITUTE(A1,"\",CHAR(135),4))
Using the Find() & Substitute() formula shown above take a different approach to solving the problem to the pure Find() based solution.
This solution works by using a feature of the substitute function that allows for the substitution of the Nth chosen character with another character.
The Excel Substitute() function has the following syntax :
In our example:
=FIND(
CHAR(135)
,SUBSTITUTE(
A1
,
"\"
,
CHAR(135)
,
4
))
Text: A1
Old_Text: "\"
New_Text: Char(135)
Instance_Num:
So substitute will replace the 4^{th} Slash with the character Ascii code 135. Char 135 or a ‡ character was chosen as it is unlikely to be used in normal text. If it is used another character code should be chosen.
The Find Function will then look for Char(135) in the Substituted text and return the position number of it.
Replacing the Char(135) with a Char(5) characters reduces this formula to 44 characters!
Advantages:
This is a very clear formula to understand
You can easily look for the 3^{rd} or 5^{th} character without changing the formula
Disadvantage:
If the Text String already contains the Char(135) character then another must be chosen or the formula will be wrong.
Using Faseeh’s Array Formula.
={(LARGE(((MID($A1,ROW(INDIRECT("1:"&1024)),1))="\")*ROW(INDIRECT("1:"&1024)),2))}
Faseeh’s Formula is based around the
Large() function
=(LARGE(((MID($A1,ROW(INDIRECT("1:"&1024)),1))="\")*ROW(INDIRECT("1:"&1024)),2))
Ctrl Shift Enter
The Excel
Large() function has the following syntax:
In Faseeh’s Formula we see
=LARGE(((MID($A1,ROW(INDIRECT("1:"&1024)),1))="\")*ROW(INDIRECT("1:"&1024)),2)
Array:
((MID($A1,ROW(INDIRECT("1:"&1024)),1))="\")*ROW(INDIRECT("1:"&1024))
K: 2
So we can see that the Large() function is looking for the Second Largest value (
K=2) in the array.
Lets pull the Array apart
The Array consists of two parts separated by a Multiplication sign.
((MID($A1,ROW(INDIRECT("1:"&1024)),1))="\")
*ROW(INDIRECT("1:"&1024))
We can look at each part separately and then combine them at the end.
The first part MID($A1,ROW(INDIRECT("1:"&1024)),1)="\"
Consists of MID($A1,ROW(INDIRECT("1:"&1024)),1)
and an
= "\"
In a Blank cell Say E25 enter =MID($A1,ROW(INDIRECT("1:"&1024)),1)
and press F9 Not Enter
Excel responds with an Array:
=
{"D";":";"\";"D";"a";"t";"a";"\";"P";"e";"r";"s";"o";"n";"a";"l";"\";"s";"r";"a";"m";"a";"s";"a";"m";"\";"d";"e";"s";"k";"t";"o";"p";"\";"E";"x";"c";"e";"l";"d";"a";"t";"a";".";"x";"l";"s";""; … ;"" ;""}
We can see here that Excel has taken the value of cell A1 and broken it up as text and put each character into an element in the array.
So the whole line: MID($A1,ROW(INDIRECT("1:"&1024)),1)="\"
Is evaluating this array against a "\"
and should return an array of True/False values
In a Blank cell E27 put the following: =MID($A1,ROW(INDIRECT(“1:”&1024)),1)=”\” and press F9 Not Enter
Excel responds with a an array of 1024 Falses
={FALSE;FALSE;FALSE; …. ;FALSE;FALSE}
That’s not quite what we expected ?
The Function is =MID($A1,ROW(INDIRECT(“1:”&1024)),1)=”\”
Which is a Boolean way of saying if the Middle 1 Character of Cell A1 from position 1 to 1024 is = “\” return an Array of the values.
What is going on here?
If we modify the formula slightly to =(MID($A1,ROW(INDIRECT(“1:”&1024)),1)=”\”)*1
And evaluate (F9) that in cell E28
Excel returns an Array ={0;0;1;0;0;0;0;1;0;0;0;0;0;0;0;0;1;0;0;0;0;0;0;0;0;1;0;0;0;0;0;0;0;1;0; … ;0;0}
The array consists of 1024 0’s except for the positions where the Array = “\” where it has 1’s.
Lets go inside this part of the formula and see what is happening.
=(MID($A1,ROW(INDIRECT(“1:”&1024)),1)
INDIRECT(“1:”&1024) returns a Reference of 1:1024 as Text which Indirect converts to a Range from row 1 to row 1024
=(MID($A1,ROW(INDIRECT(“1:”&1024)),1)
Row() returns the row number of the array
=(MID($A1,ROW(INDIRECT(“1:”&1024)),1)
Mid takes the Character from A1 starting at Position Row(INDIRECT(“1:”&1024)) and returns 1 character.
This effectively allows the array to retrieve each character from the Text of a1.
The Second Part
((MID($A1,ROW(INDIRECT("1:"&1024)),1))="\")
*
ROW(INDIRECT("1:"&1024))
We have seen that the first part of the equation
((MID($A1,ROW(INDIRECT("1:"&1024)),1))="\")
Returns an Array of 1’s and 0’s where the formula matched the "\"
character.
The second part of the equation: ROW(INDIRECT("1:"&1024))
is used to return the position of the characters.
In a blank cell E30 type
=ROW(INDIRECT("1:"&1024))
and press F9 Not Enter
Excel will respond with ={1;2;3;4;5;6;7;8; … ;1023;1024}
This is a list of the Rows from the Range 1:1024
We can now return to the original function
((MID($A1,ROW(INDIRECT(“1:”&1024)),1))=”\”)*ROW(INDIRECT(“1:”&1024))
In a blank cell say E32 type: =((MID($A1,ROW(INDIRECT(“1:”&1024)),1))=”\”)*ROW(INDIRECT(“1:”&1024)) and press F9 Not Enter
Excel will respond with:
={0;0;3;0;0;0;0;8;0;0;0;0;0;0;0;0;17;0;0;0;0;0;0;0;0;26;0;0;0;0;0;0;0;34;0; … ;0;0;0;0}
This array is the combination of the two arrays discussed above.
That is it is the 1’s and 0’s where the \’s are multiplied by the Row Numbers.
We can see that the positions where the \’s are have the position numbers listed
={0;0;3;0;0;0;0;8;0;0;0;0;0;0;0;0;17;0;0;0;0;0;0;0;0;26;0;0;0;0;0;0;0;34;0; … ;0;0;0;0}
The Large Function in the original Formula:
=LARGE(((MID($A1,ROW(INDIRECT("1:"&1024)),1))="\")*ROW(INDIRECT("1:"&1024)),2)
Then extracts the second largest number, which in this case is the value 26
Advantages:
To assign the position number you need to know how many \’s are in the formula to start with
Disadvantages:
This is an Array Formula which some people struggle with
How the formula works is difficult to workout or explain.
Modified Formula
With a small modification you can introduce a constant to allow a variable n’th character to be retrieved without knowing how many \’s there were originally.
=LARGE(((MID($A1,ROW(INDIRECT("1:"&LEN(A1))),1))="\")*ROW(INDIRECT("1:"&LEN(A1))),LEN(A1)LEN(SUBSTITUTE(A1,"\",""))4+1) Ctrl Shift Enter
Or
=LARGE(((MID($A1,ROW(INDIRECT("1:"&LEN(A1))),1))="\")*ROW(INDIRECT("1:"&LEN(A1))),LEN(A1)LEN(SUBSTITUTE(A1,"\",""))B1+1) Ctrl Shift Enter
Where cell B1 contains 4
Where 4 is the 4^{th} character you want to find the location for.
You can examine how the replacement of 2 with LEN(A1)LEN(SUBSTITUTE(A1,"\",""))4+1 works
How Else Can You Solve Senthilkumar_rm’s Problem?
Your task is to find the location of the 4^{th} Slash "\"
in the text string: D:\Data\Personal\sramasam\desktop\Exceldata.xls
Can you solve Senthilkumar_rm’s problem another way?
Can you beat my 44 characters ?
Let us know in the comments below:
Download
You can download a copy of the above file and follow along, Download Here.
Formula Forensics “The Series”
This is the 21^{st} post in the Formula Forensics series.
You can learn more about how to pull Excel Formulas apart in the following posts
Formula Forensics Needs Your Help
I need more ideas for future Formula Forensics posts and so I need your help.
If you have a neat formula that you would like to share and explain, try putting pen to paper and draft up a Post like above or;
If you have a formula that you would like explained, but don’t want to write a post, send it to Hui or Chandoo.
 

Leave a Reply
Please Enroll in our Excel & Dashboards Masterclass – Melbourne  Catchup with Chandoo in Perth 
38 Responses to “Formula Forensic No. 021 – Find the 4th Slash !”
I really like your substitute() method. Bearing in mind in the specific case here we know we're looking in a file path, you could replace the char(135) part with something that's not allowed in filenames.
Something like =FIND("?",SUBSTITUTE(A1,"\","?",4))
Ok, it's nothing new and prone to breaking if used in a different situation but yay, shorter! 35 chars.
FIND("/",SUBSTITUTE(A1,"\","/",4))
I also really like the SUBSTITUTE() option. I've used the first option in the past, the SUBSTITUTE() option simplifies the formula nicely.
I'm not a fan of arrays due to their complexity for other users to pick up. Certainly to be avoided when there is a much simpler solution.
Think Sunil just won the prize for the shortest solution!
I personally liked Hui's Substitute() Option, but i believe that we should learn and understand every possible solution, might the concept apparently difficult be helpful somewhere else.
Regards,
Faseeh
Faseeh,
I absolutely agree that it is always worth trying (and sharing) as many different options as possible as you never know when they might come in handy for something else.
My comment was not supposed to be a criticism (& apologies if it came across that way), I just wanted to raise the point that arrays should be used with caution.
Myles
Myles,
Hey, no need for apologies! Happy to Help, 😀 Yes these formula should be used with caution.
Faseeh
Hi
Personally I really like the find/substitute method as it seems a logical and simple solution.
Out of interest, I was wondering how this information/read out might be applied.
Having got the answer 26 what can I do with it or to but it another why would I need to know what the character is at position 26?
Apologies if Senthilkumar_rm expanded on this in his original post.
Keep up the good work.
Regards
Mark
haha, beaten to it
=FIND("*",SUBSTITUTE(A1,"/","*",4))
if you want the last "/" rather than the 4th then use
=FIND("*",SUBSTITUTE(A1,"/","*",LEN(A1)LEN(SUBSTITUTE(A1,"/",""))))
Have fun
Ian
Hi Mark,
The position of a specific character is normally used to extract a specific piece of text from a text string.
For eg. with D:\Data\Personal\sramasam\desktop\Exceldata.xls in cell A1
Use Hui's formula to pull out the position of the 4th "/" as 26
Use the Len() function to find the full string length =Len(A1) = 47
Finally use the Right() function to extract the text after the 4th "/" =RIGHT(A1,4726) = desktop\Exceldata.xls
Myles
Another way to extract a portion of a string is to create a giant gap, and then trim the excess:
=TRIM(MID(SUBSTITUTE(A1,"\",REPT(" ",999),4),999,999))
@ Myles,
Thanks for your message.
I'm actually pleased that I came up with a scenario similar to the one you mention using the =RIGHT function. I thought about perhaps using the =MID function to extract some or all of the text. For example:
=MID(A1,26,9)
=\desktop\
Thanks again,
Mark
You could use the Small() function instead of Large(). That way finding the 4th smallest number is pretty straight forward.
Small would pull in the 0s returned in the array before returning the matches
A little late on this one, but I liked what you were going for.
It bugged me that Faseed found the second to last \ not really the fourth. Since the zeroes interfere with your suggestion to use the SMALL function, I tried finding the largest reciprocal like so:
=1/(LARGE(((MID($A1,ROW(INDIRECT("1:"&1024)),1))="\")*(1/(ROW(INDIRECT("1:"&1024)))),4))
Only 88 characters, vs Chandoo's modification that used 121.
Why to use a normal character as "?" or "/" to substitute "\"?
I prefer to use a special character, the newline or line break or endofline or EOL or LF or CHAR(10), inserted into a cell when you press Alt+Enter
The formula with only 33 visible chars:
<code>=FIND("
",SUBSTITUTE(A1,"\","
",4))</code>
Excel still counts Alt+Enter as a character though. Try putting a ' before the formula and using LEN(). Yours, mine and Sunil's versions all come out as 35 chars.
Jamie, I said that 33 chars are visible but, as you noted, really are 35 chars into my formula. The advantage is that the EOL character not often found among the chars to replace. Observe what happens if the "?" or any other substituting "\" is included in the string.
Write the last formula without <code> tags.
You can wrap the array portion of the LARGE function in Faseeh's formula in an INDEX function to remove the CSE requirement
so LARGE(INDEX(MID(.........),0,1),....)
I like the substitute method since it can be modified to:
=IFERROR(FIND("/",SUBSTITUTE($G$5,$G$6,"/",$H$6)),"Not that many in the string!")
and then you enter the desired character in cell G6 with the repetition in cell H6 which allow the search without changing the formula.
If I want to find first "/" from the right wether there are 4 "/" or 3 "/" or 8 "/" char available in a string. What should we use. I mean that is there any formula exist which start searching from the right.
@Rahul
This is the same as finding the last \ from the left.
The formula
LEN(G5)LEN(SUBSTITUTE(G5,"\",""))
Calculates the number of occurrences of the Character \ in the string.
=FIND("*",SUBSTITUTE(G5,"\","*",LEN(G5)LEN(SUBSTITUTE(G5,"\",""))))
Will do what you want.
Thanks.
Here is what I came up with:
"Character to Find" is in cell C1  \
"Occurrence" to Find is in cell C2
"Max Occurrences" is in cell C3  formula (below)
"String to Search" is in cell C4
=INDEX(SMALL(INDEX(LARGE(SplitChar*SplitArray,kArray),,1),kArray),C2)
Names:
SplitChar
=MID(Sheet1!$C$4,SplitArray,1)=Sheet1!$C$1
SplitArray
=ROW(INDIRECT("1:"&LEN(Sheet1!$C$4)))
kArray
=ROW(INDIRECT("1:"&Sheet1!$C$3))
and in cell C3 for max occurrences enter
=SUMPRODUCT(SplitCHar)
It is basically the same as Faseeh except 3 differences
1/I use the INDEX function to return arrays for the LARGE and SMALL functions, so the formula does not need to be CSE entered.
2/ I used Named Formulas to bypass CSE and for clarity
3/I added a SMALL function to invert the results of the LARGE function, so when you want to find the 1,2,3,n occurrences you would search with 1,2,3,n, not n,3,2,1
=SUMPRODUCT( SplitCHar) is the double negative not a single dash
My question is the same as was asked earlier... why do you want to find the 4th slash. If it is so you can use it as a starting point for pulling out the 5th delimited field, the I think this miniblog article of my would be of interest...
Get Field from Delimited Text String
@Rick
That's a neat formula
I may well plagiarize it soon.
Hui...
Hi Hui,
I am glad you liked the formula... feel free to plagiarize it anytime
If you haven't already done so, check out my other articles here....
Rick Rothstein's Corner
and perhaps you will find some other things to plariagize as well.
Rick
Lovely formula indeed. Bookmarked and will be using as needed 🙂
@Chandoo,
And I am glad you liked the formula as well. In case you or Hui (or anyone else reading this thread) missed the link at the bottom of my article, I just wanted to draw your attention to my related article title "Get "Reversed" Field from Delimited Text String" which allows you to easily get, say, the next to last field in a delimited string without needing to know how many total items are in the string. NOTE... be sure to read the whole thread as one of the responders offered a far better (more concise) formula than the one I posted in my article.
Rick
Ohhh....these just make my head hurt, Rick. How the heck do they work? Got me completely baffled.
I have a much longer formula that like yours also returns the nth delimiter. It's like a lightyear long:
=MID(input,FIND("",SUBSTITUTE(Delimiter&input&Delimiter,Delimiter,"",ROW(INDEX(A:A,Fieldnumber):INDEX(A:A,Fieldnumber+1)))),FIND("",SUBSTITUTE(Delimiter&input&Delimiter,Delimiter,"",ROW(INDEX(A:A,Fieldnumber):INDEX(A:A,Fieldnumber+1))+1))1FIND("",SUBSTITUTE(Delimiter&input&Delimiter,Delimiter,"",ROW(INDEX(A:A,Fieldnumber):INDEX(A:A,Fieldnumber+1)))))
I didn't understand the reason of using char(135).
Please ellaborate
@Manish
It is just a character that is unlikely to appear in "normal" Text
You could just as easily have used an "e" if your text isn't going to contain an "e"
Got it .. thanks
[...] Of course there are many methods find where the last is. You can find a very excellent summary of these techniques in our formula forensics #21 – finding the 4th slash. [...]
I have made a small VBA function which finds the last "char" in a text string. It can be easily adapted to give the nth location within the text string.
Function findLastChar(text As String, char As String)
For i = 1 To Len(text)
If Mid(text, i, 1) = char Then
findLastChar = i
End If
Next i
End Function
well I believe the goal is to find the name of the file ...
=MID(CELL("filename"),SEARCH("[",CELL("filename"))+1, SEARCH("]",CELL("filename"))SEARCH("[",CELL("filename"))1)
How high do you want to go? For the 2nd you can use
=FIND("",A1,FIND("",A1)+1)
but that would get longer for 3rd or 4th
For 3rd you could use this
=FIND("^^",SUBSTITUTE(A1,"","^^",3))
change 3 to whatever instance you need
from http://www.mrexcel.com/forum/excelquestions/473844findseconddashcell.html