This is CRAZY!!! I stumbled on a weird use for FILTERXML() while reading a forum post earlier today, so I couldn’t wait to test it. I am happy to share the results.
Say you have some text (a sentence, phrase, keyword etc.) in a cell and you want to extract the nth word. Unfortunately, Excel doesn’t have a SPLIT() function. So we end up writing obscenely long array formulas or using a gazillion helper columns.
Here is the super sneaky trick. Use FILTERXML() instead.
See this example
How to extract words from sentence with FILTERXML
Say you have a long sentence (or keyword phrase) in cell C3.
Step 1: Convert this to valid XML
This sounds more complicated than it is. All you need to do is add an opening tag at the start, swap each space for a closing-and-opening tag pair, and add a closing tag at the end. Like this:
="<DATA><A>"& SUBSTITUTE(C3, " ", "</A><A>") & "</A></DATA>"
This will turn C3 into a valid XML block with each word as an <A> node.
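If you are curious what that formula actually produces, here is a rough Python sketch of the same string surgery. It is only an illustration, not something Excel runs: the function name sentence_to_xml is made up for this example, and the escape() call is an extra precaution that the Excel formula above does not take (characters such as & or < are not valid inside XML text, so a sentence containing them would likely need cleaning first).

```python
from xml.sax.saxutils import escape


def sentence_to_xml(sentence: str) -> str:
    """Mimic ="<DATA><A>" & SUBSTITUTE(C3, " ", "</A><A>") & "</A></DATA>"."""
    # Assumption: escape &, < and > first; the Excel formula skips this step.
    safe = escape(sentence)
    # Every space becomes "close the current word, open the next one".
    return "<DATA><A>" + safe.replace(" ", "</A><A>") + "</A></DATA>"


print(sentence_to_xml("Mary had a little lamb"))
# <DATA><A>Mary</A><A>had</A><A>a</A><A>little</A><A>lamb</A></DATA>
```

Each word ends up wrapped in its own <A>...</A> pair, which is what lets an XPath position such as [3] pick it out later.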
Step 2: Use FILTERXML on this to extract words
Now that we have valid XML (in cell C5 in this example), you can write =FILTERXML(C5, "/DATA/A[3]")
to extract the 3rd word from our XML-converted sentence.
Step 3: There are no more steps. Enjoy FILTERXML.
Bonus trick: Use [last()] to get the last word. For example, =FILTERXML(C5, "/DATA/A[last()]") will get you the last word from the sentence.
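The two XPath expressions above can also be tried outside Excel. Below is a minimal Python sketch, assuming the standard library's xml.etree.ElementTree, which understands a small subset of XPath including [3] and [last()]. The helper name word_at is hypothetical, and the escaping step is an addition that the Excel formulas do not perform.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def word_at(sentence, n=None):
    """Return the nth word (1-based), or the last word when n is None."""
    # Same wrapping idea as Step 1: each space-separated word becomes an <A> node.
    xml_text = "<DATA><A>" + escape(sentence).replace(" ", "</A><A>") + "</A></DATA>"
    root = ET.fromstring(xml_text)  # root is the <DATA> element
    position = "last()" if n is None else str(n)
    # "./A[3]" here plays the role of "/DATA/A[3]" in FILTERXML.
    node = root.find(f"./A[{position}]")
    return None if node is None else node.text


sentence = "Mary had a little lamb, its fleece was white as snow."
print(word_at(sentence, 3))  # a
print(word_at(sentence, 5))  # lamb,
print(word_at(sentence))     # snow.
```

Note that punctuation stays attached to its word, because only spaces are treated as separators.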
Watch this – how to extract words with FILTERXML() in Excel
I was so excited to learn about this that I recorded a video in my robe. Rated A (for awesomeness). Do check it out below or on my YouTube page.
Download example workbook
Click here to download the example file for this tip. Play with FILTERXML to learn more.
Learn more about XML & XPath
If you want to learn how XML and XPath work, check out these pages.
- XPath cheatsheet – devhints
- XPath examples – W3Schools
What if you can’t use FILTERXML?
FILTERXML works in Excel 2013 or above. But if you are using an older version of Excel or Excel for Mac, you can’t rely on this method. Check out the two examples below to learn other ways to split and extract words from sentences.
Do you use FILTERXML for splitting or something else …?
I almost never use FILTERXML unless I am calculating distance between points or calling a webservice. But this use of splitting text is fun. Big thanks to GraH for posting this in the forum.
What about you? Have you used FILTERXML for any other out-of-the-box situations? Please share in the comments. Pretty please with sprinkles of conditional formatting icons on top 🙂
6 Responses to “#awesome trick – Extract word by position using FILTERXML()”
Hey Chandoo,
Nice of you to say thanks, but honestly all the credit goes to Dave2018, as I only remembered his earlier challenge, where I picked up the trick. So kudos to him 🙂
https://chandoo.org/forum/threads/parse-cell-without-array-formula.40551/
Cheers
G.
Very good and you have made it very simple to use.
I downloaded your file and played with it. I put my own sentence in cell C3 ... Mary had a little lamb, its fleece was white as snow.
Here is my comment ... not a criticism ... notice the punctuation in my sentence? When I ask for word 5, I get lamb, and word 11 returns snow.
Of course, removing all punctuation is an option. Ignoring the punctuation in the output is another option.
Just saying!
The FILTERXML trick was also used in an earlier challenge from Jan 2017 to return unique strings from a comma separated list (unsorted).
https://chandoo.org/forum/threads/create-a-unique-delimited-string-from-a-delimited-string-–-excel-formula-method-by-david-hager.32630/
and here using Bill Jelen's cool trick to concatenate a 3D range:
https://excelxor.com/2016/04/08/advanced-formula-challenge-13-single-array-containing-all-entries-from-a-given-range-in-multiple-worksheets/#comment-2684
Thank you so much for sharing these links Lori. Welcome back to comments section. 🙂
Hi Chandoo,
I came across this beautiful solution some time back. It works as beautifully as yours.
Thanks for your technique.
=TRIM(MID(SUBSTITUTE(C3," ",REPT(" ",LEN(C3))), (F8-1)*LEN(C3)+1, LEN(C3)))
C3 = sentence
F8 = Nth word input