Extract currency amounts from text – Power Query Tutorial

Posted on July 27th, 2017 in Power Query - 10 comments

Let’s say you have some text values and want to extract the amounts from them. Something like this:

extract-amounts-from-text

Note: Thanks to Monty for posting this problem on our forum.

How to go about it?

We could use a variety of techniques to extract the values.

Extracting amounts from text items – Power Query Tutorial

  1. Specify a list of currency codes. Create a table in Excel and list the codes, something like below. Name this table codes.
    currency-codes
  2. Load the text data into Power Query (use Power Query > From Table or Data > From Table). We get this:
    text-values-loaded-into-pq
  3. Now, let’s bring in currency codes as a cross join. To do this, just insert a new custom column. In the formula section, type
    = Excel.CurrentWorkbook(){[Name="codes"]}[Content]

    addin-codes-table-to-text-data

  4. This adds a new column that holds the full currency codes table in every row. Expand the column to cross join both tables. See the demo below.
    expand-corss-join-two-tables
  5. Now, add a conditional column to check which currency code is present in the text data. You can use the settings below:

    conditional-column-on-codes

  6. At this stage, our PQ data looks like this:

    currency-codes-found

  7. Now, let’s filter away any nulls in the Found? column.
  8. Split each row by the currency code in the next column. This is the tricky part. Text.Split() splits a text value by a delimiter, but the result is a list, and we want just the last item of that list (the text after the final occurrence of the code). Simple: pass the result of Text.Split() to List.Last() to get it. Use the formula below:
    = List.Last(Text.Split([String], [Custom.Codes]))

    column-formula-to-extract-amount-from-text

  9. We get this:

    output-after-extracting-amounts-almost

  10. Now, convert the Amount column to decimal number. This will throw errors for values that are not valid numbers, like .530.680268. Simply remove the rows with these errors.

    amounts-after-converting-to-number

  11. Tidy up by removing unnecessary columns and renaming the rest. Load into Excel. Here is a snapshot of the cleaned data.

    final-output-amounts-from-text
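
If you prefer the Advanced Editor over clicking through the steps, the whole query could look roughly like the sketch below. This is not the exact query from the sample workbook. It assumes the text table is named data with a text column String, and the codes table is named codes with a column Codes (which is why the expanded column ends up as Custom.Codes). The step names and the final Currency column name are made up for illustration. It also folds steps 5 to 7 into a single filter instead of building a Found? column first.

```powerquery
let
    // Assumption: text table is named "data" and has a text column "String"
    Source = Excel.CurrentWorkbook(){[Name="data"]}[Content],
    // Assumption: codes table is named "codes" and has a column "Codes"
    CodesTable = Excel.CurrentWorkbook(){[Name="codes"]}[Content],

    // Step 3: put the full codes table into every row
    AddedCodes = Table.AddColumn(Source, "Custom", each CodesTable),

    // Step 4: expanding the nested table gives the cross join
    Expanded = Table.ExpandTableColumn(AddedCodes, "Custom", {"Codes"}, {"Custom.Codes"}),

    // Steps 5 to 7: keep only rows whose text contains the code
    // (Text.Contains is case sensitive by default)
    Matched = Table.SelectRows(Expanded, each Text.Contains([String], [Custom.Codes])),

    // Step 8: take the text after the last occurrence of the code
    AddedAmount = Table.AddColumn(Matched, "Amount",
        each List.Last(Text.Split([String], [Custom.Codes]))),

    // Step 10: convert to number, then drop rows that fail the conversion
    Typed = Table.TransformColumnTypes(AddedAmount, {{"Amount", type number}}),
    NoErrors = Table.RemoveRowsWithErrors(Typed, {"Amount"}),

    // Step 11: keep the useful columns and rename
    Kept = Table.SelectColumns(NoErrors, {"String", "Custom.Codes", "Amount"}),
    Result = Table.RenameColumns(Kept, {{"Custom.Codes", "Currency"}})
in
    Result
```

In Power Query versions that have Text.AfterDelimiter, step 8 could probably be written as Text.AfterDelimiter([String], [Custom.Codes], {0, RelativePosition.FromEnd}) instead, which also returns the text after the last occurrence of the code.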

Download Example Workbook

Click here to download the sample workbook. Try cleaning the data in the first tab yourself using PQ. You will realize how awesome and simple this approach is compared to either formula or VBA driven methods.

How are you using Power Query?

These days, PQ (or Get & Transform, as it is known in newer versions of Excel) has become my go-to tool for most data polishing, cleanup and reshaping problems. What about you? Are you addicted to PQ yet? Please share your experiences and wins in the comments section.


Written by Chandoo

10 Responses to “Extract currency amounts from text – Power Query Tutorial”

  1. Chris H says:

    Very cool technique! I love learning new ways to do things, and didn't realize Text.Split() returned a list.

    You can also use the new Text.AfterDelimiter() function with the currency code as the delimiter, as you describe. You can choose to start from the left or right, and how many instances of the delimiter to skip before you grab text.

  2. Bhavani Seetal Singh says:

    Good Afternoon Sir,

    Really, your hardwork on your website. Helping people across the Globe to learn new Techniques which make our working lives comfortable.

    Thank You,
    Bhavani Seetal Singh.

  3. Peter says:

    Very nice! Thanks for showing us that.

  4. Jeff Weir says:

    Ohh, good challenge.

    I wrote a crazy formula to dynamically extract the nth numerical element without delimiters (including numbers with decimal points in them as well as times, dates etc) some time back at http://dailydoseofexcel.com/archives/2014/11/20/dynamically-extracting-the-nth-numerical-element-without-delimiters/ that could likely be amended to do this, but it would return '2017' from that 5th line, which in this case is not what we want.

    While a formula approach is possible, it ain't the way to go to crack this particular nut.

  5. Jack Wells says:

    There's so much potential for me in the application of this. I really need to look into Power Query more.

    Amazing article as always 🙂

  6. GraH says:

    And Power Query delivers yet again! This is a nice one, Chandoo. Loving the cross join solution.

  7. Mike says:

    You do do an amazing job, Chandoo.
    I read all your posts with great interest and appreciation.

  8. Umakanth Kokkula says:

    This formula worked for me:
    =VLOOKUP(TRIM(RIGHT(TRIM(LEFT(B4,FIND("xxx",B4)-1)),3)),$H$4:$I$9,2,0)

  9. Nicolas ALLANO says:

    For this kind of string manipulation, I always keep a function that enables regular expressions:

    Function RegexExtract(ByVal text As String, _
                          ByVal extract_what As String, _
                          Optional separator As String = ", ") As String

        Dim allMatches As Object
        Dim RE As Object
        Dim i As Long
        Dim result As String

        ' Late-bound VBScript regex engine, so no library reference is needed
        Set RE = CreateObject("vbscript.regexp")
        RE.Pattern = extract_what
        RE.Global = True
        Set allMatches = RE.Execute(text)

        ' Join every match, each one prefixed with the separator
        For i = 0 To allMatches.Count - 1
            result = result & (separator & allMatches.Item(i).Value)
        Next

        ' Strip the leading separator if anything matched
        If Len(result) <> 0 Then
            result = Right$(result, Len(result) - Len(separator))
        End If
        RegexExtract = result
    End Function

    Final formula in the excel file that you provided:
    =IFERROR(VALUE(RegexExtract(data[[#This Row],[String]],"(\d+(\.\d+)+)$")),"")

    Here is how to practice and tailor regular expression: http://regexr.com/3gh92

    Once you master it, there are almost no limits 😉
