Let’s say you have some text values and want to extract the amounts from them. Something like this:
Note: Thanks to Monty for posting this problem on our forum.
How to go about it?
We could use a variety of techniques to extract the values.
- Formulas – not easy given the unstructured nature of the data, but just about possible. See this for an example.
- VBA – possible; read this forum discussion for a few ways to do it.
- Power Query – at first glance it might seem tricky, but PQ makes this all too easy. Read on.
Extracting amounts from text items – Power Query Tutorial
- Specify a list of currency codes. Create a table in Excel that lists the codes, something like below. Let’s name this table codes.
- Load the text data into Power Query (use Power Query > From Table or Data > From Table). We get this:
- Now, let’s bring in the currency codes as a cross join. To do this, insert a new custom column. In the formula section, type
= Excel.CurrentWorkbook(){[Name="codes"]}[Content]
- This adds a new column that holds the whole codes table in every row. Expand the column to cross join both tables. See the demo below.
- Now, add a conditional column to check which currency code is present in the text data. You can use the settings below:
- At this stage, our PQ data looks like this:
- Now, let’s filter away any nulls in the Found? column.
- Split each row by the currency code in the next column. This is the tricky part. Text.Split() splits a text value by a delimiter, but the result is a list, and we want just one item from it. Simple: pass the result of Text.Split() to List.Last() to get the text after the last occurrence of the code. Use the formula below in a custom column (called Amount here):
=List.Last(Text.Split([String],[Custom.Codes]))
- We get this:
- Now, convert the Amount column to the Decimal Number type. Values that are not valid numbers, such as .530.680268, turn into errors. Remove those rows with Remove Errors.
- Tidy up by removing unnecessary columns and renaming the rest. Load the result into Excel. Here is a snapshot of the cleaned data.
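If you want to check the logic of these steps outside Excel, here is a minimal sketch of the same idea in Python. It is not the query itself, only a mirror of its steps: cross join, keep rows where the code is found, take the text after the last occurrence of the code, then drop anything that does not convert to a number. The sample strings, the code list and the function name `extract_amounts` are made up for illustration; the real data is in the sample workbook.

```python
from itertools import product

# Hypothetical inputs, invented for this sketch (not the workbook's data).
codes = ["USD", "EUR", "INR"]
strings = [
    "Invoice 2017 paid USD 1200.50",
    "Refund issued EUR 75",
    "Transfer INR .530.680268",
    "No amount mentioned here",
]


def extract_amounts(strings, codes):
    """Return (text, code, amount) rows, mirroring the query steps."""
    rows = []
    # Cross join: pair every text value with every currency code.
    for text, code in product(strings, codes):
        # Conditional column plus null filter: keep pairs where the code occurs.
        if code not in text:
            continue
        # Like List.Last(Text.Split(...)): text after the last occurrence.
        tail = text.split(code)[-1].strip()
        # Convert to a number; rows that fail are dropped, like Remove Errors.
        try:
            amount = float(tail)
        except ValueError:
            continue
        rows.append((text, code, amount))
    return rows


for row in extract_amounts(strings, codes):
    print(row)
```

With these made-up inputs, the row containing .530.680268 and the row without any code both drop out, just as they would after the filter and error-removal steps above.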
Download Example Workbook
Click here to download the sample workbook. Try cleaning the data in the first tab yourself using PQ. You will realize how awesome and simple this approach is compared to either formula or VBA driven methods.
How are you using Power Query?
These days, PQ (or Get & Transform, as it is known in newer versions of Excel) has become my go-to tool for most data polishing, cleanup and reshaping problems. What about you? Are you addicted to PQ yet? Please share your experiences and wins in the comments section.
More ways to extract, clean and massage data with Power Query:
If this looks interesting, check out the tutorials below to learn more about PQ.
- Introduction to Power Query – Podcast
- SUMPRODUCT Vs. Power Query
- Figuring out Employee Churn with Power Query [HR Analytics]
- Unpivot data quickly with Power Query [tutorial]
11 Responses to “Extract currency amounts from text – Power Query Tutorial”
Very cool technique! I love learning new ways to do things, and didn't realize Text.Split() returned a list.
You can also use the new Text.AfterDelimiter() function with the currency code as the delimiter, as you describe. You can choose to start from the left or right, and how many instances of the delimiter to skip before you grab text.
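To make the "start from the left or right" idea concrete, here is a small Python sketch of what taking the text after a delimiter means. It is only an analogy, not Power Query code; the helper name `after_delimiter`, the `from_end` flag and the choice to return an empty string when the delimiter is missing are all assumptions of this sketch.

```python
def after_delimiter(text, delimiter, from_end=False):
    """Return the text after the first (or last) occurrence of delimiter.

    Returns "" when the delimiter is not found; that is this sketch's
    own choice, made for illustration.
    """
    # find() scans from the left, rfind() from the right.
    index = text.rfind(delimiter) if from_end else text.find(delimiter)
    if index == -1:
        return ""
    return text[index + len(delimiter):]


sample = "USD 10 then USD 20"  # made-up example string
print(after_delimiter(sample, "USD"))
print(after_delimiter(sample, "USD", from_end=True))
```

Scanning from the end picks up the text after the last currency code, which is the same piece that List.Last(Text.Split(...)) returns in the post.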
Good Afternoon Sir,
Really appreciate your hard work on your website, helping people across the globe learn new techniques which make our working lives comfortable.
Thank You,
Bhavani Seetal Singh.
Very nice! Thanks for showing us that.
Ohh, good challenge.
I wrote a crazy formula to dynamically extract the nth numerical element without delimiters (including numbers with decimal points in them as well as times, dates etc) some time back at http://dailydoseofexcel.com/archives/2014/11/20/dynamically-extracting-the-nth-numerical-element-without-delimiters/ that could likely be amended to do this, but it would return '2017' from that 5th line, which in this case is not what we want.
While a formula approach is possible, it ain't the way to go to crack this particular nut.
There's so much potential for me in the application of this. I really need to look into Power Query more.
Amazing article as always 🙂
And Power Query delivers yet again! This is a nice one, Chandoo. Loving the cross join solution.
You do do an amazing job, Chandoo.
I read all your posts with great interest and appreciation.
This formula worked for me:
=VLOOKUP(TRIM(RIGHT(TRIM(LEFT(B4,FIND("xxx",B4)-1)),3)),$H$4:$I$9,2,0)
For this kind of string manipulation, I always keep a function on hand that enables regular expressions:
Function RegexExtract(ByVal text As String, _
                      ByVal extract_what As String, _
                      Optional separator As String = ", ") As String
    ' Returns every match of the pattern extract_what in text, joined by separator.
    Dim allMatches As Object
    Dim RE As Object
    Set RE = CreateObject("vbscript.regexp")   ' late-bound VBScript regex engine
    Dim i As Long
    Dim result As String
    RE.Pattern = extract_what
    RE.Global = True                           ' find all matches, not just the first
    Set allMatches = RE.Execute(text)
    For i = 0 To allMatches.Count - 1
        result = result & (separator & allMatches.Item(i).Value)
    Next
    ' Strip the separator that was added before the first match.
    If Len(result) <> 0 Then
        result = Right$(result, Len(result) - Len(separator))
    End If
    RegexExtract = result
End Function
Final formula in the excel file that you provided:
=IFERROR(VALUE(RegexExtract(data[[#This Row],[String]],"(\d+(\.\d+)+)$")),"")
Here is how to practice and tailor regular expression: http://regexr.com/3gh92
Once you master it, there are almost no limits 😉
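For anyone who wants to try the pattern outside Excel, here is a minimal Python sketch that uses the same regular expression as the formula above. The function name `amount_at_end` and the test strings are made up for illustration, and Python's regex engine stands in for the VBScript one, so treat it as an approximation rather than an exact replica.

```python
import re

# Same pattern as in the formula above: digits followed by one or more
# ".digits" groups, anchored to the end of the string.
AMOUNT_AT_END = re.compile(r"(\d+(\.\d+)+)$")


def amount_at_end(text):
    """Return the trailing decimal amount as a float, or None if absent."""
    match = AMOUNT_AT_END.search(text)
    if match is None:
        return None
    try:
        return float(match.group(1))
    except ValueError:
        # Something like "1.2.3" matches the pattern but is not one number.
        return None


for sample in ["paid USD 1200.50", "Refund EUR 75", "INR .530.680268"]:
    print(sample, "->", amount_at_end(sample))
```

Two things stand out in this sketch. The pattern needs at least one decimal point, so a whole amount such as 75 is not matched. And for a value like .530.680268 it matches the trailing 530.680268, so that row yields a number here instead of being discarded as an error.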
This was very interesting and helpful. Thank you.