• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Separate text from numbers

I have a column which has data in the following format


10000200000001XX Abc-Def _old


1000020XX Abc-Def _new


10000200000001XXAbc-Def _old


I want to separate out the text from each of the cells and place them in a different cell. I have tried using "Text to columns" but what happens is, in some cases it breaks up the text as well. I want to know if a formula can give me the desired output?


The number only appears at the start of the cell. Everything after the number has to be placed in another cell. So for the first cell the text will be "XX Abc-Def _old", the 2nd cell "XX Abc-Def _new" and the last cell "XXAbc-Def _old". The number of digits in the start number also vary
 
Hi ,


Try this :


=RIGHT(A1,LEN(A1)-MATCH(TRUE,(MID(A1,ROW(INDIRECT("1:"&LEN(A1))),1)>"9"),0)+1)


entered as an array formula , using CTRL SHIFT ENTER.


Your text is assumed to be in A1.


Narayan
 
wow NARAYANK991 that was really fast! u are a genius. how did u manage to come up with it so fast? Can u please explain it as well? thanks
 
Tackling it from the inside out.

Whole formula:

=RIGHT(A1,LEN(A1)-MATCH(TRUE,(MID(A1,ROW(INDIRECT("1:"&LEN(A1))),1)>"9"),0)+1)


Start here: INDIRECT("1:"&LEN(A1))

We want to create a range or rows equal in length to the number of characters in the word in A1. The LEN function will gives us the number of characters. Concatenating this with "1:" and putting it in the INDIRECT function will give us a range reference. For now, let's use the first word you posted, "10000200000001XX Abc-Def _old", which has a length of 29. Thus, the INDIRECT function will return the range 1:29. I'll bold the value being passed in each following paragraph.


MID(A1,ROW(1:29),1)>"9")

The ROW function is now going to take our range and convert it into an array of numbers like {1,2,3,...,28,29}. The MID function is then going to take the word in cell A1, and for each number in our array, take 1 letter at that position, thus creating an array of each character in the word, eg {1,0,0,0,0,2,...,o,l,d}. The last part of this section then compares it with the text "9" (why not the number? Because the MID function returns everything as text) and see if it is "larger". Alphabetically, alpha characters are sorted after numbers, so they count as "larger". This then turns our array into a boolean array containing true/false {false,false,false,false,false,...,true,true,true}


MATCH(TRUE,{false,false,false,false,false,...,true,true,true},0)

The MATCH function is now going to look through the array and return the position of the first instance of True. In our example, the first true (which represents the first non-number) is found at location 15.


=RIGHT(A1,LEN(A1)-15
+1)

We've now arrived at the end. We know we're going to be taking stuff from the RIGHT side of the word in A1, but we need to know how many characters to take, counting from the left. To get this number, we take total length (29) and subtract the number we previously found (15) to get a value of 14. However, because we want to include the starting point (think of it as counting 0) we then add 1 at the end, to get a value of 15. Thus, the formula returns the last 15 characters from the word in A1, giving us:

XX Abc-Def _old


*PS: We had to confirm the formula as an array so that the ROW function mentioned above would generate an array, and not just a single number.
 
Hi ,


What we are doing within the MID function is , we are taking one character at a time ; this is the reason for the following construct :


=ROW(INDIRECT("1:"&LEN(A1)))


I'll start with an explanation of the ROW function , which I had posted earlier in answer to another question.


ROW(1:52) , which creates an array of numbers from the starting number to the ending number ; here the start number is 1 , while the end number is 52. To check out how this works , create a named range , say temp , and in the Refers To box , put in the formula =ROW($1:$52)


In any unused cell , put in the formula =temp ; press F2 and F9 ; you should see the array as follows :

{1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20;21;22;23;24;25;26;27;28;29;30;31;32;33;34;35;36;37;38;39;40;41;42;43;44;45;46;47;48;49;50;51;52}


If you change the Refers To formula by including the TRANSPOSE function ( =TRANSPOSE(ROW($1:$52)) ) , you should see the following array :

{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52}


The first array is an array of 1 column and 52 rows ; the second one has 1 row and 52 columns.


This syntax of ROW is possible when you know the start and end points ; in the above example , the start point was 1 while the end point was 52. However , if you want a variable end point , then the following syntax is helpful :


=ROW(INDIRECT("1:"&LEN(A1)))


Suppose A1 contains the word "Excel" ( without the double quotes ) ; the LEN(A1) function returns 5 as the length of the text Excel.


The ROW(INDIRECT("1:5")) function returns the array of numbers from 1 to 5 thus : {1;2;3;4;5}


The MID function returns a string from another string ; if A1 contains the text Chandoo , then the formula :


=MID(A1,3,3) will return the text "and" ( without the double quotes ).


Here the first parameter A1 is the text to be used as the input ; the second parameter 3 is the place within the input from where to start ; the third parameter 3 is the number of characters to be returned.


If you had used the formula =MID(A1,2,4) , you would have got the text "hand" ( without the double quotes ).


Now suppose A1 contains Chandoo , if you use only =MID(A1,1,1) , you will always get the letter C.


If you want to look at each character , one by one , you need to retrieve "C" , "h" , "a" , "n" , "d" , "o" , "o".


This can be done by using =MID(A1,ROW(INDIRECT("1:"&LEN(A1))),1) , which returns the array :


{"C";"h";"a";"n";"d";"o";"o"}


Narayan
 
Back
Top