Formula Challenge 001 – Return everything from a string after the first block of numbers (Part 2.)

Welcome back. Get a good night’s sleep? Great.

Hopefully you had freshly-picked brain-function-enhancing blueberries for breakfast, and a red bull or five, because you’re gonna need it. Why? Because today, my friends, we peek inside the multi-dimensional mind of Sajan the Excel Magician.

As you’ll keenly recall from yesterday’s gripping episode of “When good formulas go GREAT”, Sajan claimed Bronze in our inaugural formula challenge at Formula Challenge 001 – Return everything in string after first block of numbers

By way of a quick refresher, this formula challenge calls for a formula to return a substring from a longer string. Those strings look something like this:

String: Required Substring:
Monaco7190Australia1484 Australia1484
Liechtenstein42Austria128590 Austria128590
Malta130612Belarus8 Belarus8

You can download the challenge and full dataset here: Formula Challenge 1.2 (Excel 2007+ only, because the formulas have too many nested formulas for earlier Excel versions to handle)

Specifically, we need a formula to return only the second country and associated number …in the case of Monaco7190Australia1484 it’s that bit in bold i.e. Australia1484

Sajan the Magician split those strings with this award-winning beauty:

=MID(A1,MODE(MMULT((N(ISNUMBER(-MID(A1,ROW(INDIRECT(“1:”&LEN(A1))),1)))={1,0})*(ROW(INDIRECT(“1:”&LEN(A1)))-{0,1}),{1;1}))+1,LEN(A1))

How does this wizardly wonder work? Let’s find out…

1. Sajan splits apart the source string into it’s characters

This is as I did in the last post, albeit using a slightly different formula than I did:

=MID(A1,ROW(INDIRECT(“1:”&LEN(A1))),1)

=MID(A1,{1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20;21;22;23},1)

={“M”;”o”;”n”;”a”;”c”;”o”;”7″;”1″;”9″;”0″;”A”;”u”;”s”;”t”;”r”;”a”;”l”;”i”;”a”;”1″;”4″;”8″;”4″}

2. Then he adds zero to the array

This coerces any numbers stored as text into numbers, while causing the text bits to throw errors

={“M”;”o”;”n”;”a”;”c”;”o”;”7″;”1″;”9″;”0″;”A”;”u”;”s”;”t”;”r”;”a”;”l”;”i”;”a”;”1″;”4″;”8″;”4″} +0

={#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;7;1;9;0; #VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;1;4;8;4}

3. Wrap an ISNUMBER around this unsightly spawn of Satan,

This turns numbers to True, and turns those hideously unnatural errors of nature to better-behaved booleans: =ISNUMBER({#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;7;1;9;0;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;1;4;8;4})

={FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;TRUE;TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;TRUE;TRUE;TRUE;TRUE}

4. Turn this array into Ones and Zeros, and checks if it equals either One or Zero.

I know…me too. But let’s give him the benefit of the doubt for now, and see what rabbit hole he’s leading us down…we might yet pop out of a magicians’ hat:

=N({FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;TRUE;TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;TRUE;TRUE;TRUE;TRUE})={1,0}

={0;0;0;0;0;0;1;1;1;1;0;0;0;0;0;0;0;0;0;1;1;1;1}={1,0}

={FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;TRUE,FALSE;TRUE,FALSE;TRUE,FALSE;TRUE,FALSE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;TRUE,FALSE;TRUE,FALSE;TRUE,FALSE;TRUE,FALSE}

What this has done in effect is create a 2D array – with one column of the array being the inverse of the other. So one column records whether something is a number, and the other records whether something isn’t a number. I know…me too. But maybe it would help if we saw what this would look like if entered over an Excel range, with our original string down the side so we can try and work out what spell this Wizard is whipping together:

FC021

 5. He then creates a second 2d array with this bit:

=(ROW(INDIRECT(“1:”&LEN(A16)))-{0,1})

={1,0;2,1;3,2;4,3;5,4;6,5;7,6;8,7;9,8;10,9;11,10;12,11;13,12;14,13;15,14;16,15;17,16;18,17;19,18;20,19;21,20;22,21;23,22}

…which if we were to enter over an Excel range would look like this:

FC022

6. He then throws this array into his wizard’s cauldron together with the array in 5 – along with generous portions of Eye of newt, toe of frog, wool of bat and tongue of dog. (This simply means Multiply the two previous Arrays)

This gives us the following:

={FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;TRUE,FALSE;TRUE,FALSE;TRUE,FALSE;TRUE,FALSE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;FALSE,TRUE;TRUE,FALSE;TRUE,FALSE;TRUE,FALSE;TRUE,FALSE}*{1,0;2,1;3,2;4,3;5,4;6,5;7,6;8,7;9,8;10,9;11,10;12,11;13,12;14,13;15,14;16,15;17,16;18,17;19,18;20,19;21,20;22,21;23,22}

={0,0;0,1;0,2;0,3;0,4;0,5;7,0;8,0;9,0;10,0;0,10;0,11;0,12;0,13;0,14;0,15;0,16;0,17;0,18;20,0;21,0;22,0;23,0}

…which again would look like this if entered over an Excel range (with our original string down the side by way of reference):

FC023

Well mess up my hair, and call me Einstein. Because now I see what all that was about. The only time that the same number in that first array also appears in that second array is that magical place where the first block of numbers ends, and the second block of letters begins. That is some serious sorcery!

7. Multiply the arrays

Next he conjures this 2D array into 1D, with the help of MMULT.

Microsoft tells us that MMULT “…returns the matrix product of two arrays. The result is an array with the same number of rows as array1 and the same number of columns as array2.”

If that’s got you scratching your head, then try this alternate explanation: The M in MMULT stands for Magic.  😉

The way Sajan is using MMULT is to basically add across each row to find out what the total is for each row. Or another way of thinking about it is that because there’s ether a zero or a number in each row, he’s zipping up these two lists into one to just get the numbers and to ignore the zeros:
=MMULT({0,0;0,1;0,2;0,3;0,4;0,5;7,0;8,0;9,0;10,0;0,10;0,11;0,12;0,13;0,14;0,15;0,16;0,17;0,18;20,0;21,0;22,0;23,0},{1;1})

= {0;1;2;3;4;5;7;8;9;10;10;11;12;13;14;15;16;17;18;20;21;22;23}

8. Work out the most common number in that array and add one to it

Now he’s got the start position of our desired substring.

=MODE({0;1;2;3;4;5;7;8;9;10;10;11;12;13;14;15;16;17;18;20;21;22;23})+1

=11

9. Lastly, he uses this trickery to tease out our answer string

=MID(A16,MODE({0;1;2;3;4;5;7;8;9;10;10;11;12;13;14;15;16;17;18;20;21;22;23})+1,LEN(A16))

=MID(A16,11,LEN(A16))

=Australia1484

Austrailia1484. Crikey dick. Strewth, mate!

That bonza beaut has more bounce than a Boomer! Makes me feel a couple of tinnies short of a slab.

And that’s the formula that got the lowly Bronze! Tune in tomorrow, and we’ll see what the Silver-medal winner, Haseeb, has to offer. If you dare…..

 

About the Author.

Jeff Weir – a local of Galactic North up there in Windy Wellington, New Zealand – is more volatile than INDIRECT and more random than RAND. In fact, his state of mind can be pretty much summed up by this:

=NOT(EVEN(PROPER(OR(RIGHT(TODAY())))))

That’s right, pure #VALUE!

Find out more at http:www.heavydutydecisions.co.nz

Facebook
Twitter
LinkedIn

Share this tip with your colleagues

Excel and Power BI tips - Chandoo.org Newsletter

Get FREE Excel + Power BI Tips

Simple, fun and useful emails, once per week.

Learn & be awesome.

Welcome to Chandoo.org

Thank you so much for visiting. My aim is to make you awesome in Excel & Power BI. I do this by sharing videos, tips, examples and downloads on this website. There are more than 1,000 pages with all things Excel, Power BI, Dashboards & VBA here. Go ahead and spend few minutes to be AWESOME.

Read my storyFREE Excel tips book

Overall I learned a lot and I thought you did a great job of explaining how to do things. This will definitely elevate my reporting in the future.
Rebekah S
Reporting Analyst
Excel formula list - 100+ examples and howto guide for you

From simple to complex, there is a formula for every occasion. Check out the list now.

Calendars, invoices, trackers and much more. All free, fun and fantastic.

Advanced Pivot Table tricks

Power Query, Data model, DAX, Filters, Slicers, Conditional formats and beautiful charts. It's all here.

Still on fence about Power BI? In this getting started guide, learn what is Power BI, how to get it and how to create your first report from scratch.

46 Responses

  1. How about? Am I too late to win 🙁

    =MID(A1,MAX(IFERROR(FIND((ROW(1:10)-1)&CHAR(COLUMN(BM:CL)),UPPER(A1)),0))+1,255)

    1. Also note that you don’t need the UPPER, although you do need to change ROW(1:10) to absolute references in order to copy the formula down. And you could replace that hard-coded 255 with say 9^9 to handle any length of string (although we only need a number of no more than 32767 on the end, so 9^9 is slightly overkill, albeit with no ill effect).

      By my account that makes it 76 characters. Well done.

    2. What if the next character after the first numbers is not between A-Z? Like Monaco7190@Australia1484

      Regards

      1. In retrospect, the formula can be further simplified by making the CHAR array more comprehensive (i.e.
        CHAR(COLUMN(BF:DS))). Then, there would be no reason to convert A1 to UPPER(A1).

      2. Elias – that’s another challenge entirely. In this one – based on some question I answered in a forum somewhere recently – the sample data always has the letter A-Z after a number.

        1. I think this one handles the scenario I’m talking about, and still short.

          =MID(A4,MATCH(TRUE,(MID(A4,ROW(A:A),1)*ISERR(-MID(A4,ROW(A:A)+1,1)))>0,)+1,9^9)

          Regards

        1. David…you’ve got to have the numbers first. Plus I changed 255 for 9^9 to handle strings longer than 255 characters:
          =MID(A1,MAX(IFERROR(FIND((ROW(1:10)-1)&CHAR(COLUMN(BF:DR)),A1),0))+1,9^9)

          1. @Rajesh
            CHAR(COLUMN(BF:DR))
            is the same as =CHAR(COLUMN(58:122))
            So it is just extracting an array of the Characters between Char 58 and 122
            ie: between characters : and z

  2. Ha…you spoiled my punch line, David. At the end of this series (a couple of posts of which are still to come) I was going to challenge readers to find a shorter solution than the respondents had so far posted, with a slightly longer version of yours in mind. Specifically =MID(A1,MIN(IFERROR(FIND(CHAR(COLUMN($AW:$BF))&CHAR(ROW($65:$90)),A1),””))+1,LEN(A1))

    Great skills. Be sure to check out the other challenges, and still check in to see the other innovative solutions that I’ll be covering re this challenge.

    1. Whoops, the above comments are out of order. Also, with my slight revision it runs at 79 characters, not 76 as I said below. But that’s still a winner!

  3. Man your formula is insane ! All those Matrix operations !!

    Here my very simple solution

    ={MID(B12;MATCH(TRUE;ISNUMBER(–MID(B12;ROW(INDIRECT(“1:”&LEN(B12)));1));0)+4;100)}

    Regards.

    1. Hi ExcelStrategy. The formula challenge sets out that your formula should handle any length of string, and any length of numbers – not just the lengths shown above/below.

      Your formula is hard-coded to 4 digits, but as per the challenge we can’t rely on the numbers always being four digits.

      1. Now I understand why you have isolated the only sequence of FALSE;TRUE (0;1) in the initial array! it signals the change between the numbers and text in the two blocks of strings !

        I have based my formula on the data wich was inside the example workbook that has a pattern of 4 numbers every time 🙂

  4. Can’t you use the CODE function to get ASCII number greater than 64 related to second sequential number greater than 64 via vlookup or nested if or search or find functions?

    1. Simon: here’s what CODE returns on the whole string:
      ={77;111;110;97;99;111;55;49;57;48;65;117;115;116;114;97;108;105;97;49;52;56;52}
      …and here’s the ones that are greater than 64:
      ={TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE}
      How does that help?

      1. Sorry, I should elaborate further. You can use this:
        =CODE(MID(A1,ROW(A$1:INDEX(A:A,LEN(A1))),1))>64
        …to return this:
        ={TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE}

        And then we can search that array for the first FALSE with this:
        =MATCH(FALSE,CODE(MID(A1,ROW(A$1:INDEX(A:A,LEN(A1))),1))>64,0)
        …but then we’d need to find where the next TRUE falls relative to the length of the original string, so that we could split that original string. And that requires quite a bit more manipulation, which really blows out the length of our formula.

        The next post discusses a formula that does something similar to your suggested approach. But there’s not a MATCH or VLOOKUP in sight. Stay tuned ?

  5. Pair of parentheses not needed, final formula.

    =MID(A1,MAX(IFERROR(FIND(ROW(1:10)-1&CHAR(COLUMN(BF:DR)),A1),0))+1,9^9)

    1. David,
      Very clever technique! I love its simplicity! Sheer genius!

      By the way, I recently discovered your old EEE posts. Amazing stuff! I hope you resurrect those posts since there are very few posts out there of such quality!

      Regards,
      Sajan.

    2. Still need those $ in the Row, though.
      =MID(A1,MAX(IFERROR(FIND(ROW($1:$10)-1&CHAR(COLUMN(BF:DR)),A1),0))+1,9^9)
      Otherwise you have to array-enter your formula individually for each item in our data list.

      1. David, you can shave another character off like this:
        =MID(A1,MAX(IFERROR(FIND(COLUMN(A:J)-1&CHAR(ROW($65:$90)),A1),0))+1,9^9)

        THis works, because (A:J) is shorter than (1:10)

  6. @David – Brilliant…Sheer Genius….
    @Jeff Reminds me of the gem to search a exact word in a string from the academy from you and Craig.

  7. Am new to the Site. Spending hours together to understand the work – breaking my head.

    Could somebody help me to understand why we use “-” in formula (before 2nd MID formula and “-{0,1}) and what {1;1} does.

    MID(A6,MODE(MMULT((N(ISNUMBER(-MID(A6,ROW(INDIRECT(“1:”&LEN(A6))),1)))={1,0})*(ROW(INDIRECT(“1:”&LEN(A6)))-{0,1}),{1;1}))+1,LEN(A6))

    Thanks!
    Valli

  8. Hi All,

    Just came across this………Thought I’d have a crack at it before reading anyone else’s responses, just to see what method my brain would come up with!

    This entered as an array formula (Ctrl + Shift +Enter) would do the trick, changing A1 for whatever cell needed

    =RIGHT(A1,LEN(A1)-MIN(IFERROR(SEARCH({“A”,”B”,”C”,”D”,”E”,”F”,”G”,”H”,”I”,”J”,”K”,”L”,”M”,”N”,”O”,”P”,”Q”,”R”,”S”,”T”,”U”,”V”,”W”,”X”,”Y”,”Z”},A1,MIN(IFERROR((SEARCH({1,2,3,4,5,6,7,8,9},A1,1)),””)))-1,””)))

  9. @Jeff

    Woah psychotic is most probably an understatement. Let me say this english is my first language and I have read and understood your brilliant explanation. I have even taken the time to test bits of your explanation of see what some of the functions are doing.

    What still befuddles me is how a number of you even start to approach this type of problem right from the beginning. Which functions to use and in what order to get to the solutions you are all showing.

    I used to just write very simple formulas that did one or two things and with this site I have come on leaps and bounds by studying the answers and explanations from lots of contributors. A few names always rise to the top (that includes your forum posts Jeff and those of the Forum Ninjas) I just want to get to the point where I can begin to approach a problem like this by understanding which of those functions I should use and how to nest them. Rather than just copying an answer and moving on. I know that will take time, practice and lots more learning and I am sure it will slowly come. Keep up the great explanations and remember us less gifted lovers of excel.

    Bring on episode three Jeff………………….. I can take it 😉

    1. Thanks for your kind words. You raise a good point…how to even start formulating approaches such as this. I’ll probably write something up in due course along those lines. I can certainly say that there’s no way I’d ever have put together the approaches that some of the other respondents put together. That’s the beauty of a challenge like this…you get to see many expert cat-skinners in action.

      1. Jeff a great idea

        “…..you raise a good point…how to even start formulating approaches such as this. I’ll probably write something up in due course along those lines………”

        Looking forward to that. Book me a front row seat.

        Maybe a series of posts from your good self and some of the Ninjas to start with????????????????

  10. I already posted this but look like it was unseen. This one can handle any character after the numbers and not only letters A-Z.

    =MID(A4,MATCH(TRUE,(MID(A4,ROW(A:A),1)*ISERR(-MID(A4,ROW(A:A)+1,1)))>0,)+1,9^9)

    Regards

  11. I can reduce it to 76 characters length. It also does not get mess if users insert/delete a row/column.

    =MID(A4,MATCH(TRUE,MID(A4,ROW(A:A),1)*ISERR(-MID(A4,ROW(A:A)+1,1))>0,)+1,9^9)

    Regards

  12. Hi,

    My 2 cents (76 chars), Excel 2003:
    =MID(A2;MATCH(2;MMULT(N(ISERR(-MID(A2;ROW($1:$99)+{0\1};1)));{1;2});0)+1;99)

    (French conventions:
    formula separator “;”
    {0\1}=column vector 0,1
    {1;2}=row vector 1,2)

  13. It is strange to see how complex data manipulation is with Excel when it is so simple with Word:

    1°) Copy the herebelow lines in Word:

    Monaco7190Australia1484
    Denver11222Chicago4257
    Beijin42Sofia666877
    London11998Paris75006

    2°) FIND & REPLACE
    Find what : (*[0-9]{1;10})(*)(^13)
    Options : Use wildcards
    Replace with : \2\3
    (REPLACE ALL)

    3°) That’s it!

    I wish Excel have these wildcard capabilities!

  14. =MID(A1,MODE(MMULT((N(ISNUMBER(-MID(A1,ROW(INDIRECT(“1:”&LEN(A1))),1)))={1,0})*(ROW(INDIRECT(“1:”&LEN(A1)))-{0,1}),{1;1}))+1,LEN(A1))

    I can’t get this to work on A1= Monaco7190Australia1484

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.