• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Formula breaks is string has unexpected characters?

belinea2010

New Member
Window 7 64 bit
Excel 2016

Formula to extract data from strings work well except when string is not in expected format?

I have formula’s that play an explicit roll in pulling specific data from a complicated string in Column A on an Excel work sheet named “Data Import” and then splitting the data into various Columns on a different worksheet. The strings are made up of letters (lower and upper caps), numbers and special characters.

Here are some dummy examples of the strings:
M!xjeio_827Jnx_-_Mhwsush82_09Z_[Mhgdh_L$_1qt]_(K2_19R7t)_48_kHz_Fc_9it_g8v
2h7TD_-_jn0n3r_kksjd7_[882n6s]_(je_73%96)_44.1_kHz_he_aa3_W7v

The constant characters in the string that will not change are “_-_” , “[“ , “]” , “(“ , “)” and “kHz”.
I have placed them in bold in the examples so that you may see them better.

The character “_” is used in place of a space “ “
I use these constant characters as delimiters in the formulas to tell the formulas where to extract the data I need and it works well.

In my testing of the workbook I have discovered that if the string contains extra’s of the constant characters, “_-_” , “[“ , “]” , “(“ , “)” and “kHz” it breaks one of my formulas which was kindly supplied by a member here in answer to a different question but when I asked the question I did not know about this new problem – I will explain below.

Using my examples the formula is:
Code:
=IF(Data_Import!A1<>"",LOOKUP(9^9,0+MID(Data_Import!A1,FIND(")_",Data_Import!A1)+2,ROW($1:$85)))&" kHz","")

This formula looks at the string and finds the characters ")" and "kHz" , removes the underscore "_" replaces it with a space " " then returns only the text/characters BETWEEN the "(" and "kHz" but including “kHz” but not “)”.

Using my examples the result would be “48 kHz” and “44.1 kHz”.

This works fine as long as the strings are all in the same format but I have found some of the strings have extra data inside another 2nd set of “(“ , “)” which breaks the formula returning “#N/A”.

This is an example of a string that would break the formula:
M!xjeio_827Jnx_-_Mhwsu_(I’m_Extra)_sh82_09Z_[Mhgdh_L$_1qt]_(K2_19R7t)_48_kHz_Fc_9it_g8v

Notice to the right hand side there is now “(I’m_Extra)” – In bold so you can see it better.
It is these extra "(" and ")" that are breaking the formula.

Unless you guys can make a better suggestion I think the formula would need to look at the string starting at the right hand side and the use the first “(“ and “)” it finds on the right hand side. That way it would ignore and other “(“ and “)” that may appear in the string?

What do you suggest?
 
Try if below works.
Code:
=FILTERXML("<a><b>"&SUBSTITUTE(A1,"_","</b><b>")&"</b></a>","//b["&MATCH("kHz",FILTERXML("<a><b>"&SUBSTITUTE(A1,"_","</b><b>")&"</b></a>","//b"),0)-1&"]") & " kHz"
 
Changed your formula to this >>

In B1, copied down :

=IF(Data_Import!A1<>"",LOOKUP(9^9,0+RIGHT(LEFT(Data_Import!A1,FIND("_kHz",Data_Import!A1)-1),ROW($1:$99)))&" kHz","")

68826

Or,

To use FILTERXML function of which available in Excel 2013, Excel 2016, Excel 2019 and Office 365

In B1, copied down :

=FILTERXML("<a><b>"&SUBSTITUTE(SUBSTITUTE(Data_Import!A1,"_kHz"," kHz<r/>"),"_","</b><b>")&"</b></a>","//b[r]")

68831
 
Last edited:
Changed your formula to this >>

In B1, copied down :

=IF(Data_Import!A1<>"",LOOKUP(9^9,0+RIGHT(LEFT(Data_Import!A1,FIND("_kHz",Data_Import!A1)-1),ROW($1:$99)))&" kHz","")

View attachment 68826

Or,

To use FILTERXML function of which available in Excel 2013, Excel 2016, Excel 2019 and Office 365

In B1, copied down :

=FILTERXML("<a><b>"&SUBSTITUTE(SUBSTITUTE(Data_Import!A1,"_kHz"," kHz<r/>"),"_","</b><b>")&"</b></a>","//b[r]")

View attachment 68831
Outstanding, that worked perfectly. Thank you
 
Back
Top