• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Extract whole words of some length

Good afternoon, dear friends. I want to know how I can do, either with formula, macro or UDF, to get words from a cell that have a certain number of characters. Attached file with examples.
Thank you in advance for the attention. Greetings from Peru
Jorge
 

Attachments

  • Cons_Chandoo_040917.xlsx
    9.6 KB · Views: 11
@callao2908 Interesting question. I am sure you can use formulas to get the answer, but I couldn't think of one quickly. So I wrote a UDF. I have tested it a bit and it seems to do the job. Give it a try.

Code:
Function getNthWord(fromThis As String, n As Integer, Optional length As Integer, Optional delim As String) As String
    Dim words, retval As String, i As Integer, j As Integer
   
    delim = IIf(Len(delim) = 0, " ", delim)
   
    words = Split(fromThis, delim)
   
    If length > 0 Then
        ' we need to return the word with given length
        For i = LBound(words) To UBound(words)
            If Len(words(i)) = length Then
                j = j + 1
                If j = n Then retval = words(i)
            End If
        Next i
        getNthWord = retval
    Else
        getNthWord = words(n - 1)
    End If
End Function
 
r2c2, thanks for your attention. Please check the attached file, you can see cells in yellow, which are well obtained results, and in blue results that give error, or that does not find, is the case of a second or third word that may be in the text of the cell .
Please check the UDF, give some examples of application, and explain the function of each parameter
Thanks for the attention, greetings from Peru
 

Attachments

  • Cons_Chandoo_040917_Rev.1.xlsm
    16.4 KB · Views: 7
r2c2, thanks for your attention. Please check the attached file, you can see cells in yellow, which are well obtained results, and in blue results that give error, or that does not find, is the case of a second or third word that may be in the text of the cell .
Please check the UDF, give some examples of application, and explain the function of each parameter
Thanks for the attention, greetings from Peru


You are welcome. You should use the function like below.

getNthWord(yourtext, n, length of the word)

For example, to get the 2nd 3 letter word in A1, use:
=getNthWord(A1, 2,3)

Notice the order of parameters.
 
r2c2, thanks for your attention. Good morning here in Peru, a query, the parameter "delim", what it serves, how it is used give some examples of application.
With the explanation that you made me arrive if the other occurrences are obtained, excellent.
While this UDF is very useful, it has the drawback that I have to be analyzing the contents of the cell, eg A1, to know how many occurrences of each type I have in it, for example how many 2 characters I have, and so on more characters, there is some way that to say in a single application I throw the result of all occurrences. Please see the file that I sent you in the initial post
Thank you
 

Attachments

  • 2017-09-05_9-24-42.png
    2017-09-05_9-24-42.png
    25.1 KB · Views: 6
John Jairo V, thanks for your attention, I will verify results with the original data, and I am communicating with you.

Welcome will be any other suggestion. Thank you very much

Jorge from Peru
 
John Jairo V, I have had the opportunity to do the corresponding tests and it works correctly, very good job. I would be grateful if you please kindly explain why the use of each of the formulas used and how they have been built nesting with each other, as well as the formula used for the {rng], which incidentally is curious and interesting.
I hope your answer, thank you very much, greetings from Peru
Jorge
P.D .: translation made with Google Translate
 
Good afternoon, excuse the insistence, if it is within the reach of someone would be good to explain the applied formulas, and because "rng" is defined as
Unless I propose another solution
Thank you,greetings from Peru
Jorge
P.D .: translation made with Google Translate
 
Hi ,

I will try to explain , but the explanation will span several posts , since there are many elements to the formula.

First , let us consider the named range rng. The formula for this is :

=TRIM(MID(SUBSTITUTE(LOOKUP("zzz",Hoja1!$A$1:$A2)," ",REPT(" ",99)),1+99*(ROW(Hoja1!$1:$20)-1),99))

There are 4 components of this formula , using the functions TRIM , MID , SUBSTITUTE and LOOKUP. The REPT and the ROW functions are a part of the SUBSTITUTE and MID functions respectively.

We can straightaway eliminate the TRIM function because it is the simplest of them all ; it is removing the unwanted spaces which were introduced by the REPT function.

Since the MID and SUBSTITUTE functions are working in tandem , we will consider them later , after we have seen how the LOOKUP function works.

To understand how the LOOKUP function works , do the following :

In a blank worksheet , enter the values 1 , 4 , 7 and 10 in cells A1 , A6 , A11 and A16 respectively i.e. cell A1 has 1 in it , cell A6 has 4 in it , A11 has 7 in it and A16 has 10 in it.

Now , in cell C1 enter the formula :

=LOOKUP(99,A$1:A1)

and drag it down till cell C20.

You will see that the cells between C1 and C20 have been filled with data ; there are no blank cells unlike in column A. The cells between C1 and C5 are filled with the data in A1 , the cells between C6 and C10 are filled with the data in A6 ,and so on.

Two points must be noted in this formula :

1. The 99 in the LOOKUP function is because the data is numeric. This will not work if the data is alphabetic or alphanumeric.

2. Even if the data is numeric , the 99 works in this case because it is bigger than the biggest data value ; for this reason , you might sometimes see the same formula using a value such as 9.9999E+307 , which is the biggest number possible to be entered in a cell in Excel. Obviously , using such numbers is not necessary ; all that is required is that the number used be bigger than any number within the data range.

To further understand this formula , change the data in A1 from 1 to a ; change the data in A11 from 7 to b.

The results of the formula are now an error value in the cells C1 to C5 , and the value 7 in the cells C11 through C20.

The reason for the error value is that the LOOKUP function is no longer able to recognize the numeric value of the data in A1 , the value in C1 is an error value , and the remaining cells from C2 to C5 are also filled with the same error value.

The reason for the cells C11 through C20 to have 7 in them despite the intervening character b in cell A16 is because the LOOKUP function is looking for a numeric value (since the first parameter , 99 , is a number) , and ignores the alphabetic character b.

Any doubts ?

Narayan
 
Narayank, thank you very much for your kind attention, it shows your generosity and professionalism.
Until another opportunity, they can give the subject by closed
Jorge from Perú
P.D .: translation made with Google Translate
 
Another option (for use in Excel 2013+) to return an array containing text elements of a given length:

=FILTERXML("<x><a>"&SUBSTITUTE(TRIM(A2)," ","</a><a>")&"</a></x>","//*[string-length(text()) = 2]")

eg: A2 = "a ss ddd ffff ggg hh j" -> {"ss";"hh"}
 
Last edited:
Lori, thank you very much for your kind attention.
I have Excel16 in Spanish, and the translation of your formula is:

=XMLFILTRO("<x><a>"&SUSTITUIR(ESPACIOS(A2);" ";"</a><a>")&"</a></x>";"//*[string-length(text()) = 2]")

I can not find translation for what is red

I'll be grateful to you how you do to get the result

A2 = "a ss ddd ffff ggg hh j" -> {"ss"; "hh"}

to my only gives me the result "ss", I do not throw the second or third event, for any case. In our case I do not give the result "hh"
Thanks again
Regards,
Jorge from Perú
P.D .: translation made with Google Translate
 
Narayank, thank you very much for the links, I could know more than I knew

Lori: I'll be grateful to tell me how you do to get the result

A2 = "a ss ddd ffff ggg hh j" -> {"ss"; "hh"}

to me it only gives me the result "ss", it does not give me the second or third event, when there is. In our case I can not find the result "hh"
Thanks again
Regards,
Jorge from Peru
P.D .: translation made with Google Translate
 
Hi @callao2908,
As the formula returns more than one result you can use INDEX to get the other values:

=INDEX(FILTERXML("<x><a>"&SUBSTITUTE(A2," ","</a><a>")&"</a></x>","//*[string-length(text()) = 2]"),2)

Not sure what that would be in Spanish?
 
Back
Top