Extracting numbers from text in Excel [Case study]

Often we deal with data where numbers are buried inside text and we need to extract them. This morning I had such a task. As you know, we recently ran a survey asking how much salary you make. We have had 1800 responses to it so far. I took the data to Excel to analyze it. And surprise! The numbers are a mess. Here is a sample of the data.

Extract numbers from text in Excel - How to?

Now, how do I extract the salary amounts from this without typing the values?

My first thought was to write a user-defined function to extract the number from text. But I usually shy away from VBA. So I wanted to see if there is a formula-based approach to extract the number from text.

Using formulas to extract number from text

Extracting numbers from text using Excel formulas - process

To extract a number from text, we need to know 2 things:

  1. Starting position of the number in text
  2. Length of the number

For example, in the text US $ 31330.00 the number starts at the 6th character and has a length of 8.

So, if we can write formulas to get 1 & 2, then we can combine them in MID formula to extract the number from text!
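To see the start-plus-length idea outside Excel, here is a minimal Python sketch. The helper name mid is an assumption chosen to mirror Excel's MID (1-based start, fixed length); it is not part of the original workbook.

```python
def mid(text: str, start: int, length: int) -> str:
    """Mimic Excel's MID: 'start' is 1-based, 'length' is a character count."""
    begin = start - 1  # convert the 1-based position to a 0-based index
    return text[begin:begin + length]


sample = "US $ 31330.00"
# Start position 6 and length 8, as in the example above
print(mid(sample, 6, 8))  # 31330.00
```

Everything that follows is about computing those two arguments (6 and 8) automatically.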

Finding the starting position of number in text

To find the starting position, we need to find the first character which is a number (0 to 9). In other words, if we can find the positions of 0 to 9 inside the given text, then the minimum of all such positions would be the starting position.

Sounds complicated?!? Well, in that case look at the formula and then you will understand why this works.

Assuming the text is in A1 and the named range lstNumbers contains the digits 0 to 9, the formula below finds the starting position:

{=MIN(IFERROR(FIND(lstNumbers,A1),""))}

You need to array-enter it (Ctrl+Shift+Enter). Excel adds the curly braces itself; do not type them.

How does this formula work?

FIND(lstNumbers, A1) portion: This part finds where each of the digits 0 to 9 first occurs in the text in A1. If a match is found, its position is returned. Otherwise we get an error. For US $ 31330.00 the values would be,

{10;7;#VALUE!;6;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!}

Meaning, 0 first occurs at the 10th position, 1 at the 7th position, 3 at the 6th position, and the remaining digits (2, 4, 5, 6, 7, 8, 9) do not occur in the text.

IFERROR(…,"") portion: Then we replace the errors with empty strings, which MIN ignores, so that MIN can work its magic.

At this stage, the result would be {10;7;"";6;"";"";"";"";"";""}

Related: IFERROR Formula – syntax & examples

{=MIN(…)} portion: This finds the minimum of {10;7;"";6;"";"";"";"";"";""}, which is 6: the starting position of the number inside the text.

Because FIND is searching for multiple items at once, we need to array-enter the formula to get the correct result.
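The same logic can be cross-checked outside Excel. Below is a minimal Python sketch of the FIND / IFERROR / MIN chain; the names DIGITS, digit_positions and start_position are assumptions made for illustration, and None stands in for the error that FIND returns when a digit is missing.

```python
from typing import List, Optional

DIGITS = "0123456789"  # plays the role of the lstNumbers range


def digit_positions(text: str) -> List[Optional[int]]:
    """1-based position of the first occurrence of each digit, or None if absent."""
    positions = []
    for digit in DIGITS:
        index = text.find(digit)  # str.find returns -1 when there is no match
        positions.append(index + 1 if index != -1 else None)
    return positions


def start_position(text: str) -> Optional[int]:
    """Smallest position among the digits that were actually found."""
    found = [p for p in digit_positions(text) if p is not None]
    return min(found) if found else None


print(digit_positions("US $ 31330.00"))
# [10, 7, None, 6, None, None, None, None, None, None]
print(start_position("US $ 31330.00"))  # 6
```

One difference worth noting: when the text has no digits at all, the sketch returns None, whereas Excel's MIN returns 0 for an array that contains no numbers.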

Finding the length of number

Once we find starting point, next we need to know the length of the number. There are many ways to do this. Depending on the variety in your input data, you can choose a technique that works best.

Approach 1 – counting number of digits in text

My first approach is to count the number of digits in the text and use it as the length. For this, we can break the text into individual characters and then see if each of them is a number or not.

Assuming the text is in A1, the number of digits in it is,

=SUMPRODUCT(--ISNUMBER(MID(A1,ROW($A$1:$A$200),1)+0))

MID(A1,ROW($A$1:$A$200),1) + 0 portion: This breaks the text in A1 into individual characters (assuming the maximum length is 200) and then adds 0 to each of them.

At this stage, you have 200 values: numbers where the character is a digit, errors everywhere else.

ISNUMBER(…) portion: This checks all the 200 values for numbers. After this, we will have 200 true or false values.

--ISNUMBER(…) portion: This converts the TRUE and FALSE values to 1s and 0s (double negation makes Excel convert Boolean values to their numeric equivalents).

SUMPRODUCT(…) portion: This finally sums up all 1s thus giving us the number of digits in the text.
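As a rough cross-check of this digit count, here is a Python sketch that follows the same steps: split into characters, flag the digits, convert the flags to 0s and 1s, and sum. The function name count_digits and the max_len parameter (mirroring the 200-row assumption) are illustrative, not from the workbook.

```python
def count_digits(text: str, max_len: int = 200) -> int:
    """Count the characters 0-9 in the first max_len characters of text."""
    chars = list(text[:max_len])                   # like MID(A1, ROW(...), 1)
    is_digit = [ch in "0123456789" for ch in chars]  # like ISNUMBER(... + 0)
    as_numbers = [int(flag) for flag in is_digit]  # like the double negation
    return sum(as_numbers)                         # like SUMPRODUCT


print(count_digits("US $ 31330.00"))  # 7
```

The result is 7 for the sample text, because the decimal point is not counted.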

Does it work?

While this approach works well for some numbers, it fails in other cases. For example, a text like US $ 31330.00 has a number portion with 8 characters (31330.00), whereas our formula would say the length is 7 (because the decimal point is not a number, so ISNUMBER() gives FALSE for it).

So I had to move on to the next approach.

Approach 2 – counting number of digits, commas & decimal points in text

The next approach is to count not only digits, but also commas and decimal points in the text. For this, I first placed all the digits (0 to 9), the comma and the decimal point in a range named lstDigits.

The formula below counts how many characters of the text in A1 are in lstDigits.

=SUMPRODUCT(COUNTIF(lstDigits,MID(A1,ROW($A$1:$A$200),1)))

COUNTIF(lstDigits, MID(…)) portion: This checks how many times each of the 200 characters appears in lstDigits.

This gives an array of counts. For example, {0;0;0;0;0;1;1;1;1;1;1;1;1;…} for US $ 31330.00, indicating that the first 5 characters are not in lstDigits and the next 8 are.

SUMPRODUCT(…) portion: This just sums all the counts, hence we get a length of 8.
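A small Python sketch of this second approach, assuming a set named LST_DIGITS that stands in for the lstDigits range (digits, comma and decimal point). The function names are made up for illustration.

```python
from typing import List

LST_DIGITS = set("0123456789,.")  # stands in for the lstDigits range


def membership_flags(text: str, max_len: int = 200) -> List[int]:
    """1 for each character found in LST_DIGITS, else 0 (like COUNTIF per character)."""
    return [1 if ch in LST_DIGITS else 0 for ch in text[:max_len]]


def number_length(text: str, max_len: int = 200) -> int:
    """Sum of the flags (like SUMPRODUCT over the COUNTIF results)."""
    return sum(membership_flags(text, max_len))


print(membership_flags("US $ 31330.00"))
# [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]
print(number_length("US $ 31330.00"))  # 8
```

Note that the count covers every comma or period anywhere in the text, not only those inside the number, so stray punctuation elsewhere in the cell would inflate the length.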

Related: SUMPRODUCT Formula – examples & explanation

Extract numbers from text in excel - results explained

Extracting numbers from text

Once we have starting position of number & its length, we can combine them in a MID formula to extract the number. Here is the result for our sample data set.

As you can see, this method works well, but fails in some cases like,

  • European number formats (, for decimal point and . for thousands)
  • Text with multiple numbers

Fortunately, in my data set, we had only a few incidents like these. So I decided to adjust them manually rather than work out an even more complicated formula.
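Putting the two pieces together, the whole method can be sketched in Python as below. This is an illustration of the logic only, not the workbook's formula; extract_number and ALLOWED are hypothetical names, and the second sample text is made up to show the "multiple numbers" failure.

```python
ALLOWED = set("0123456789,.")  # digits plus comma and decimal point


def extract_number(text: str) -> str:
    """Start at the first digit, take as many characters as there are ALLOWED ones."""
    starts = [text.find(d) + 1 for d in "0123456789" if d in text]
    if not starts:
        return ""  # no digit anywhere in the text
    start = min(starts)                                 # 1-based, like MIN(FIND(...))
    length = sum(1 for ch in text if ch in ALLOWED)     # like SUMPRODUCT(COUNTIF(...))
    return text[start - 1:start - 1 + length]           # like MID(text, start, length)


print(extract_number("US $ 31330.00"))       # 31330.00
# A made-up text with two numbers: the length covers both, so the slice is wrong
print(extract_number("20000 to 25000 USD"))  # 20000 to 2
```

The second call shows why texts with multiple numbers need manual adjustment: the length counts the digits of both numbers, but the slice is taken as one contiguous run from the first digit.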

Using Macros to extract numbers from text

As you can guess, we can use a simple macro (or UDF) to extract numbers from a given text. We will learn how to do this next week.

Download Example Workbook

Click here to download example workbook with all these formulas. Examine the formulas to understand how you can extract numbers from text in Excel.

How do you Extract numbers from Text?

Often I deal with data like this. I use a mix of techniques. Apart from the one mentioned above I also use,

  • getNumber() UDF to extract numbers from text (more on this next week)
  • Use SUBSTITUTE to clear formatting (remove the dots, then replace commas with dots, to convert from European format to standard format)
  • Use VALUE to extract the number (works when number is shown as text)
  • Use +0 to force convert numbers from text (works when number is shown as text)
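The European-format conversion from the list above is just two substitutions applied in order. A minimal Python sketch, assuming the text holds only the number itself (the function name and sample value are made up for illustration):

```python
def european_to_standard(number_text: str) -> str:
    """Remove thousands dots first, then turn the decimal comma into a dot."""
    without_thousands = number_text.replace(".", "")  # like SUBSTITUTE(text, ".", "")
    return without_thousands.replace(",", ".")        # like SUBSTITUTE(..., ",", ".")


converted = european_to_standard("31.330,00")
print(converted)         # 31330.00
print(float(converted))  # 31330.0
```

The order matters: replacing the comma first would turn it into a dot that the second step then deletes.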

What about you? How do you extract numbers from text? What are your favorite techniques? Please share using comments.

Tips on cleaning data using Excel

If you use Excel to clean data, go through these articles to learn some powerful techniques.


66 Responses to “Budget vs. Actual Charts – 14 Charting Ideas You can Use”

    • Linwe says:

      Hi there:

      I'm interested in understanding exactly how contestants #'s 1, 8 got their surplus or shortfall to show up at the top of the bar (is this overlapped or stacked somehow) and change colour?  I hope this makes sense.  I've tried to find samples and I can see contestant 8 (cuboo) may have used something called graphomate but I can't use this.  

      I need to create a bar chart that shows budget, and actual variance whether it be a surplus or a shortfall and I would like make it look like option 1 or 8 above but haven't  a clear idea how to do it...any help would be greatly appreciated!

      Regards..Linwe 

  1. [...] as of today, all entries can be viewed on “Pointy Haired Dilbert” and rated until 12 April. If you like my proposal, no. 8, I would be glad [...]

  2. Jon Peltier says:

    #6 is the best here. Simple, no extraneous visual effects.

  3. Kevin Stanford says:

    I was all set to vote for #9...until I noticed its lack of y-axis labels. So I have to go with #6 also.

  4. I think #6,#9 is enough .

  5. Barfly says:

    #9 is my favorite
    Nice data/ink ratio 😉

  6. Tony Rose says:

    I agree with Jon - #6 for me.

  7. Gale says:

    8 & 14

  8. Fabrice says:

    I go for # 9 (simple) and #14 (complete)

  9. fulvioo says:

    I go for cuboo #8
    cheers

  10. Robert says:

    #6 for overview at a glance / top management
    #8 for deeper analysis / those who need more detailed information

  11. Bob Gannon says:

    #14 although I think you only need the bottom panel and I then would stack the Center charts vertically to make Center comparisons easier.

  12. Denise says:

    #10 gets my vote.
    If there is a second place, then #14
    denise

  13. Tin Seong KAM says:

    Hi, if I was not wrong, Samples 3,4 and 5 were created using Tableau software and not Excel. For more information on Tableau you might want to visit http://www.tableausoftware.com/. It was initially designed by Prof. Pat Hanrahan and his PhD students. I am not their salesperson but I thought someone might want to know more about this particular technology.

    • Linwe says:

      Hi Tin Seong Kam:
       
      Thanks - I have looked at Tableau before.  I have also found the means to reproduce something similar to chart 8 without using graphomate, and also chart 7.  I proposed chart 9  as well but the overlap is confusing to some.
      I am really not too concerned about showing actual budget figures but the variance in $ and % is important for my particular use.  That is why I gravitate to the charts that seem to easily tell us that we have a surplus or a shortfall.  
       
      Thanks!
      Linwe
       

  14. Anamika says:

    11, 6, 9 (almost the same)
    7 for clarity

  15. Haki says:

    cuboo #8 is my favorite
    best regards...

  16. la'cruse says:

    8 is fantastic

  17. Stefan Sandauer says:

    I prefer N#8 - N# 1,7 & 8 use the settings of Rolf Hichert...

  18. SANTOSH CHAUBE says:

    6 : The GURU (read "Jon Peltier ") has spoken,
    SOO easy on eyes!

  19. Sumit says:

    Hi Chandoo,

    I liked Cuboo's submission. So #8 gets my vote.

    Regards,
    Sumit

  20. jram says:

    Number 8 by far. Even though it's not part of the data display, the comments feature sells me. Variance explanations are as important as the actual variances.

  21. Cyril Z. says:

    I visually prefer #8, but #3 is really easier to understand, even if it lacks a lot of information (inverting budget/actual), legend, etc...

  22. [...] All in all there are several great entries suggesting a good variety to present budget vs. actual performance. Go check them out. [...]

  23. [...] reshape, zoo by learnr A reader of a Pointy Haired Dilbert blog enquired about best ways to visualise budget vs. actual performance. In response PHD challenged his blog readers to contribute their visualisations made using Excel or [...]

  24. anyone willing to post their xls for these? Some really excellent examples.

  25. PublicSectorPlanner says:

    To avoid the summary execution of the person presenting these to an executive team these charts must handle overspending as well as underspending, be comprehensible in 5 seconds and show the key fact clearly. The key fact isn't budget or actual - it's the magnitude of the gap!

    Therefore:

    #14 for nailing the key fact and being able to handle overspending. The winner therefore.
    #6 for nailing speed-reading and being able to handle overspending, but somewhat obscuring the key fact. Second place.
    #8 for nailing information depth and aesthetics. Third place.

    I really wanted #8 to win, but that's the technician's view not the end-user's.

  26. [...] All the contributions can be seen at the following address: Budget vs. Actual Charts – 14 Options You can Use Posted on April 5th, 2009 http://chandoo.org/wp/2009/04/05/budget-vs-actual-charts/ [...]

  27. Social comments and analytics for this post...

    This post was mentioned on Twitter by NancyJHess: I like to explore fav tweets of those I follow. Here is one from DutchDriver http://twurl.nl/17eiap Creative visual charts: Budget vs Actual...

  28. jon says:

    number 8

    clean, full of info, qualitative as well as quantitative

  29. Virender Singh says:

    Hi,
    I Like 4 chart in above as per the following ratings:-
    no 1# -> 14***
    no 2# -> 7***
    no 3 # -> 8**
    no 4# -> 1.3**

    I will be grateful if someone can send me the process of making all above 4 charts.

    Virender

  30. Shazbot says:

    Does anyone know what type of chart #6 is (chart name?)? Also, how do I create this in Excel 2007?

  31. Hui... says:

    @Shazbot
    I'd call it a Column and Bar chart, but don't get hung up on names

    To make it try this:

    Setup the chart as a Clustered Column Chart
    Change the Series so there is 100% overlap, ie: One column is in front of the other
    Change the Budget series to a line chart
    Set the line color to none
    Set the marker style to a Flat Line
    Change the marker width to make it the same width as the bar
    Change colors and other chart properties to suit

  32. Caroline says:

    Does anyone have an idea on how to create chart #1?
    Thanks

  33. Stefan says:

    Caroline, please see the german page: http://www.hichert.com/de/software/exceldiagramme/55

    there you can find the original example for nr1.
    best regards,
    stefan

  34. Hui... says:

    Caroline
    This is a Clustered Stacked Column Chart
    Which has the column under the Shortfall/Excess colored the same as the Budget
    Have a look here

    http://chandoo.org/forums/topic/question-about-budget-v-actual
    &
    http://peltiertech.com/WordPress/clustered-stacked-column-charts/

  35. Vijay says:

    Hi,
    Is it possible to get the source files like the other visualisation challenge (on sales).
    Thanks,
    Vijay

  36. Vijay Raghavendran says:

    Dear Chandoo,

    I discovered your site by pure chance and I am really thrilled about it and I am learning a lot.
    Is it possible to post the source file for this visualisation challenge?

    Thanks,

    Vijay

  37. Greg says:

    Dear Chandoo,

    How do I create Chart #10 (comparing Budget vs Actual Performaces) by cost center by quarter without the cumulative performance. Do you have an actual example that I could use?

    Thanks,

    Greg

  38. OKI says:

    HI

    Can anyone help me to create chart #7? I'm a beginner in Excel. I started work two weeks ago and my boss asked me to track the budget vs. actual until the end of the year.
    So I really need your help.
    Thanks in advance

    p.s Sorry for my english ( i'm french)

  39. Hui... says:

    @OKI, Greg

    I have made a mockup of #7 and #10
    It is available at:
    http://chandoo.org/wp/wp-content/uploads/2009/04/Bud-Act-visualizaion-challenge-7+10..xlsx

    #10 is a straight, Pivot Chart/Table but the data has been rearranged to get it into the pivot table

    #7 is 2 charts, being a simple Bar Chart and a Scatter Chart with 100% Error Bars
    I have used Named Formulas for the two charts.

  40. OKI says:

    HELLO Hui
    Thank you very much for your help, I really appreciate it

    Have a nice week

  41. Tony says:

    Hi,

    I was wondering how you can replicate chart 1.3? The bars look like they are overlapped on two different axes?

    Tony

  42. BINDU says:

    I think 1 & 3 are good.

  43. Sawan says:

    Hi Chandoo,
    Please can you provide a link of the excel sheet for 1. Chart "3 colors and everything is clear"

    I would like to drill into the spreadsheet and learn the secrets as how the chart was made.

    Many thanks,
    Sawan

  44. Hui... says:

    @Sawan
    It is probably 12 separate charts, I will assume snapped to the underlying cells to ensure they are the same size
    The left 3 Charts have a vertical Axis
    The bottom 4 Charts have a horizontal Axis
    The remainder have no axis
    The remaining text may not be part of the charts but is probably cell content

  45. Juan Carlos Etayo says:

    Greetings,

    How can I download these wonderful examples to study and analyze them? I want to learn to make this kind of chart in Excel.

    Thanks,

  46. Michelle says:

    Dear Chandoo and Hui,

    Please would you help me (step by step if possible) to create Chart #8?

    Many thanks in advance!

  47. Phoebe says:

    Dear Chandoo,

    I think chart #8 is really great. Would really appreciate if you can show basic step to create it.

    Thanks 🙂

  48. Sawan says:

    Hi all,
    Is there any step by step tutorial to recreate the the chart #1 please?
    Would really appreciate if someone could show me how it done.
     
    Regards
    Sawan

  49. ExcelNerd says:

    Can someone tell me how do you create chart number 2? Thanks!

  50. Robert says:

    Am I the only one that can not display any of the images?  Would love to take a look at these.  This is the ONLY page on the whole website I have had this issue with. 🙁

  51. Hassan Mirza says:

    Dear All,
    how can i create chart # 7? is there any link where i can subscribe to your website by paying a certain amount. i want to learn some good excel techniques.
    please let me know.

  52. Carlos says:

    Cant see the images 🙁

  53. Sunil B says:

    Where can I find the link to download some of the above charts? These are extremely useful charts and I would like to use them.
    Waiting for the reply.
    Thanks..

  54. Khaled Mohamed Abdel Aziz says:

    I am interested for # 1,6,7,8,9,10,11 its very exciting for me .

  55. satyapal says:

    Hi,
    Just wanted to check, is there any possibility that pivot table or drop down work in power point?
    Regards
    Satyapal

    • Chandoo says:

      @Satyapal... you can only use static images or slide animations in Power Point. Not features like pivot tables or drop downs. However, you can embed the entire workbook (or sheet) in a presentation. When clicked this will just open Excel so your users can play with the data.

  56. Ramesh N says:

    Is there any instalment kind of facility available for joining the online course of Rs.12000/-.

    Regards

    Ramesh N

  57. Tim says:

    Hi,

    I badly want to replicate #10. Can someone help me.. I've checked google to help but I can't figure out how to add the total 🙁

    Regards,
    Tim
