• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Extract numbers after specific text in a text string

RDaga

New Member
Hi Friends,

Need a small help in excel. In a cell I have multiple rows of data which includes text and numbers. I am trying to extract numbers after a specific text for eg :

Data in a cell where I need to run the search say in cell A1:

SUMMARY= AInitial#11
NOTES=
AInitial#11
RCA#=2
================================================================
ID= PIW0
WORKDETAIL_SUBMIT_DATE= 2020-02-20 01:06:27 -0600
TYPE= General Information
Volume#222
SUBMITTER= tp
SUMMARY= AVolume#11
NOTES=
AVolume#12

Search Result
in Cell B1 RCA = 2
in Cell C1 AVolume = 11
in Cell D1 AVolume = 12
in Cell E1 Volume = 222

As per above example i want to extract any numbers after "RCA", “AVolume” “Volume” as shown above. The numbers can be vary in digits i.e. it can be 1 digit numbers or 4 or 6. Also, at times there is some other text in between (like # & =).

Thanking all in advance
 
Sure, have created a sample file and uploaded the same. Sorry as this is my first post :)
 

Attachments

  • Book1.xlsx
    9.9 KB · Views: 10
Try this formula with TEXTJOIN+FILTERXML function

In B2, array (Ctrl+Shift+Enter) formula copied across and down :

=TEXTJOIN(", ",,SUBSTITUTE(FILTERXML("<a><b>"&SUBSTITUTE(SUBSTITUTE($A2,"SUMMARY= ",CHAR(10)),CHAR(10),"</b><b>")&"</b></a>","//b[starts-with(., '"&B$1&"')]"),B$1,""))

70860
 

Attachments

  • Filterxml(BY).xlsx
    10.2 KB · Views: 8
In the attached, your file with a user-defined function ExtractNumbers. It is case sensitive.
 

Attachments

  • Chandoo44882.xlsm
    18.5 KB · Views: 8
Since we are offering alternatives , a small PQ never hurts somebody
Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    AsList = Table.AddColumn(#"Changed Type", "SplitTxt", each Text.Split([Column1], "#(lf)")),
    SelectSpecificTexts = Table.AddColumn(AsList, "SpecificText", each List.Select([SplitTxt]
  , each  List.AllTrue(
      { Text.Contains(_ , "#") , not Text.Contains(_, "Summary", Comparer.OrdinalIgnoreCase) } ) ) ) ,
    AsRecords = Table.AddColumn(SelectSpecificTexts, "ListOfRecords", each List.Transform([SpecificText], each
[Label = Text.Split(_, "#"){0}
, Value =  Text.Select(Text.Split(_, "#"){1}, {"0".."9"}) ] )),
    AsTable = Table.AddColumn(AsRecords, "AsTable", each Table.PromoteHeaders(Table.Transpose(Table.Combine(List.Transform([ListOfRecords], each Table.FromRecords({_}, {"Label","Value"})))),[PromoteAllScalars=true])),
    #"Expanded AsTable" = Table.ExpandTableColumn(AsTable, "AsTable", {"AInitial", "AVolume", "Volume", "RCA", "Volume_1"}, {"AInitial", "AVolume", "Volume", "RCA", "Volume_1"}),
    RemoveHelpers = Table.RemoveColumns(#"Expanded AsTable",{"SplitTxt", "SpecificText", "ListOfRecords", "AInitial"})
in
    RemoveHelpers
 
Hi Friends, thankyou for all the help. Just wanted to check if it is possible to get a similar thing done for text as well.

INSTANCE:ERR:
PRMTR:EVNTMGT_ERROR
Node(s):IHCCT

Text after "PRMTR:"

I tried =TRIM(MID(I2,SEARCH($M$1,I2)+LEN($M$1),255)) but this giving me all the text after PRMTR: but needed on "EVNTMGT_ERROR" as the output.
 
Hi Friends, thankyou for all the help. Just wanted to check if it is possible to get a similar thing done for text as well.

INSTANCE:ERR:
PRMTR:EVNTMGT_ERROR
Node(s):IHCCT

Text after "PRMTR:"

I tried =TRIM(MID(I2,SEARCH($M$1,I2)+LEN($M$1),255)) but this giving me all the text after PRMTR: but needed on "EVNTMGT_ERROR" as the output.

New question new post

Please open a new thread for your new question.

Regards
Bosco
 
Try this formula with TEXTJOIN+FILTERXML function

In B2, array (Ctrl+Shift+Enter) formula copied across and down :

=TEXTJOIN(", ",,SUBSTITUTE(FILTERXML("<a><b>"&SUBSTITUTE(SUBSTITUTE($A2,"SUMMARY= ",CHAR(10)),CHAR(10),"</b><b>")&"</b></a>","//b[starts-with(., '"&B$1&"')]"),B$1,""))

View attachment 70860
Hi @bosco_yip, sorry for the delayed reply. I have been trying to use this file you shared and it has been really helpful but I observed that for some of the cells it is throwing an error.

I was not sure if the number of character in the cell was an issue. Hence, I added a new col to check the same. Please help me out.

70965
 
Please try at B2

=TEXTJOIN(", ",,TRIM(LEFT(SUBSTITUTE(FILTERXML("<n><b>"&SUBSTITUTE(SUBSTITUTE(SUBSTITUTE($A2,CHAR(10)," ")," "&B$1,"</m><m>"),"/m","/b",1)&"</m></n>","//m")," ",REPT(" ",50)),50)))
 
Please try at B2

=TEXTJOIN(", ",,TRIM(LEFT(SUBSTITUTE(FILTERXML("<n><b>"&SUBSTITUTE(SUBSTITUTE(SUBSTITUTE($A2,CHAR(10)," ")," "&B$1,"</m><m>"),"/m","/b",1)&"</m></n>","//m")," ",REPT(" ",50)),50)))
It is giving me the same result as above. The file shared by p45cal is giving the value for most of the rows.
 
Since the formula return errors, you may want to try the PQ solution, given you do have an Excel version with PQ onboard, do you? 2010 or higher would be needed.
 
Back
Top