Often we deal with data where numbers are buried inside text and we need to extract them. This morning I had just such a task. As you know, we recently ran a survey asking how much salary you make. We have had 1800 responses so far. I took the data into Excel to analyze it. And surprise! The numbers are a mess. Here is a sample of the data.

Now, how do I extract the salary amounts from this without retyping the values?
My first thought was to write a user-defined function (UDF) to extract the number from the text. But I usually shy away from VBA, so I wanted to see if there is a formula-based approach instead.
Using formulas to extract numbers from text

To extract a number from text, we need to know 2 things:
- Starting position of the number in text
- Length of the number
For example, in the text US $ 31330.00 the number starts at the 6th character and has a length of 8.
So, if we can write formulas to get the starting position and the length, we can combine them in a MID formula to extract the number from the text!
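To see the idea outside Excel, here is a minimal Python sketch. Python is used only for illustration, and the helper name `mid` is hypothetical: it simply mimics Excel's MID, which takes a 1-based start position and a length.

```python
def mid(text: str, start: int, length: int) -> str:
    """Mimic Excel's MID: 1-based start position, fixed number of characters."""
    return text[start - 1 : start - 1 + length]


sample = "US $ 31330.00"
number_text = mid(sample, 6, 8)  # start at the 6th character, take 8 characters
print(number_text)  # 31330.00
```

Everything that follows is about working out the 6 and the 8 automatically.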
Finding the starting position of number in text
To find the starting position, we need to find the first character which is a digit (0 to 9). In other words, if we can find the positions of 0 to 9 inside the given text, then the minimum of all such positions is the starting position.
Sounds complicated?!? Well, in that case look at the formula and then you will understand why this works.
Assuming the text is in A1 and the range lstNumbers contains 0 to 9, the formula below finds the starting position:
{=MIN(IFERROR(FIND(lstNumbers,A1),""))}
You need to array enter it (CTRL+SHIFT+Enter).
How does this formula work?
FIND(lstNumbers, A1) portion: This part finds where each of the digits 0 to 9 first occurs in the text in A1. If a match is found, its position is returned. Otherwise we get an error. For US $ 31330.00 the values would be,
{10;7;#VALUE!;6;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!}
Meaning, 0 first occurs at the 10th position, 1 at the 7th position, 3 at the 6th position, and the remaining digits (2, 4, 5, 6, 7, 8, 9) do not occur in the text.
IFERROR(…,"") portion: Then we replace the errors with empty text so that MIN can work its magic (MIN ignores text values inside an array).
At this stage, the result would be {10;7;"";6;"";"";"";"";"";""}
Related: IFERROR Formula – syntax & examples
{=MIN(…)} portion: This finds the minimum of {10;7;"";6;"";"";"";"";"";""}, which is 6, the starting position of the number inside the text.
Because FIND is looking for multiple items at once, we need to array enter the formula to get the correct result.
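The same logic can be sketched in Python for readers who want to step through it. This is only an illustration, not the workbook's formula: the function name `first_digit_position` is hypothetical, and `str.find` stands in for FIND (it is 0-based and returns -1 instead of an error).

```python
from typing import Optional


def first_digit_position(text: str) -> Optional[int]:
    """Return the 1-based position of the first digit, or None if there is none."""
    positions = []
    for digit in "0123456789":           # plays the role of lstNumbers
        index = text.find(digit)          # like FIND, but 0-based and -1 when missing
        if index != -1:                   # skip misses, much as IFERROR(...,"") does
            positions.append(index + 1)   # convert to a 1-based position
    return min(positions) if positions else None


print(first_digit_position("US $ 31330.00"))  # 6
```

For the sample text the collected positions are 10 (for 0), 7 (for 1) and 6 (for 3), and the minimum is 6.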
Finding the length of number
Once we have the starting point, we need to know the length of the number. There are many ways to do this. Depending on the variety in your input data, you can choose the technique that works best.
Approach 1 – counting number of digits in text
My first approach was to count the number of digits in the text and use that as the length. For this, we can break the text into individual characters and then check whether each of them is a number or not.
Assuming the text is in A1, the number of digits in it is,
=SUMPRODUCT(--ISNUMBER(MID(A1,ROW($A$1:$A$200),1)+0))
MID(A1,ROW($A$1:$A$200),1)+0 portion: This breaks the text in A1 into individual characters (assuming a maximum length of 200) and then adds 0 to each of them.
At this stage, you have 200 values, some of them numbers and the others errors.
ISNUMBER(…) portion: This checks all 200 values for numbers. After this, we have 200 TRUE or FALSE values.
--ISNUMBER(…) portion: This converts the TRUE and FALSE values to 1s and 0s (double negation makes Excel convert Boolean values to their numeric equivalents).
SUMPRODUCT(…) portion: This finally sums up all the 1s, giving us the number of digits in the text.
Does it work?
While this approach works well for some numbers, it fails in other cases. For example, a text like US $ 31330.00 has a number portion with 8 characters (31330.00), whereas our formula says the length is 7 (because the decimal point . is not a number, so ISNUMBER() gives FALSE for it).
So I had to move on to the next approach.
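A quick Python sketch makes the shortfall easy to see. The function name `count_digits` is hypothetical and the code only mirrors the counting idea; it is not a translation of the SUMPRODUCT formula.

```python
def count_digits(text: str) -> int:
    """Count the characters in text that are one of the digits 0 to 9."""
    return sum(1 for ch in text if ch in "0123456789")


sample = "US $ 31330.00"
length = count_digits(sample)
print(length)                      # 7, the decimal point is not counted
print(sample[6 - 1 : 6 - 1 + length])  # 31330.0, one character short
```

Starting at position 6 and taking only 7 characters drops the final 0.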
Approach 2 – counting number of digits, commas & decimal points in text
The next approach is to count not only digits, but also commas and decimal points in the text. For this, I first placed all the digits (0 to 9) plus the comma and the decimal point in a named range called lstDigits.
The formula below counts how many characters of the text in A1 are in lstDigits.
=SUMPRODUCT(COUNTIF(lstDigits,MID(A1,ROW($A$1:$A$200),1)))
COUNTIF(lstDigits, MID(…)) portion: This checks how many times each of the 200 characters appears in lstDigits.
This gives an array of counts. For example, {0;0;0;0;0;1;1;1;1;1;1;1;1;…} for US $ 31330.00, indicating that the first 5 characters are not in lstDigits and the next 8 are.
SUMPRODUCT(…) portion: This just sums all the counts, hence we get the length as 8.
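Here is the same counting idea as a small Python sketch. The names `LST_DIGITS`, `membership_flags` and `count_number_chars` are made up for this illustration; the set simply plays the role of the lstDigits range.

```python
LST_DIGITS = set("0123456789,.")  # digits plus comma and decimal point


def membership_flags(text: str) -> list:
    """Return 1 for each character found in LST_DIGITS and 0 otherwise."""
    return [1 if ch in LST_DIGITS else 0 for ch in text]


def count_number_chars(text: str) -> int:
    """Sum the flags, much as SUMPRODUCT sums the COUNTIF results."""
    return sum(membership_flags(text))


sample = "US $ 31330.00"
print(membership_flags(sample))    # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]
print(count_number_chars(sample))  # 8
```

Note that this counts every comma and dot anywhere in the text, not just the ones inside the number.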
Related: SUMPRODUCT Formula – examples & explanation

Extracting numbers from text
Once we have the starting position of the number and its length, we can combine them in a MID formula to extract the number. Here is the result for our sample data set.
As you can see, this method works well, but fails in some cases like,
- European number formats (, for decimal point and . for thousands)
- Text with multiple numbers
Fortunately, my data set had only a few cases like these, so I decided to adjust them manually rather than work out an even more complicated formula.
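To see both the method and its limits in one place, here is a hedged Python sketch that combines the two steps. The function name `extract_number_text` and the sample strings other than US $ 31330.00 are invented for illustration; they are not taken from the survey data.

```python
NUMBER_CHARS = "0123456789,."  # same characters as in lstDigits


def extract_number_text(text: str) -> str:
    """Slice from the first digit, taking as many characters as there are
    digits, commas and dots in the whole text (the MID + start + length idea)."""
    digit_indexes = [i for i, ch in enumerate(text) if ch in "0123456789"]
    if not digit_indexes:
        return ""                      # no number to extract
    start = digit_indexes[0]           # 0-based start of the number
    length = sum(1 for ch in text if ch in NUMBER_CHARS)
    return text[start : start + length]


print(extract_number_text("US $ 31330.00"))       # 31330.00
print(extract_number_text("EUR 31.330,00"))       # 31.330,00 (still European format)
print(extract_number_text("30000 to 35000 USD"))  # 30000 to 3 (multiple numbers break it)
```

The second result is extracted cleanly but still needs its separators swapped before it can be read as 31330, and the third shows why text with more than one number needs manual attention.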
Using Macros to extract numbers from text
As you can guess, we can use a simple macro (or UDF) to extract numbers from a given text. We will learn how to do this next week.
Download Example Workbook
Click here to download example workbook with all these formulas. Examine the formulas to understand how you can extract numbers from text in Excel.
How do you Extract numbers from Text?
Often I deal with data like this. I use a mix of techniques. Apart from the one mentioned above I also use,
- getNumber() UDF to extract numbers from text (more on this next week)
- Use SUBSTITUTE to clear formatting (replace dots with nothing and then commas with dots to convert from European format to standard format)
- Use VALUE to extract the number (works when number is shown as text)
- Use +0 to force convert numbers from text (works when number is shown as text)
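The SUBSTITUTE idea from the list above can be sketched in Python like this. The function name `european_to_standard` is hypothetical, and `str.replace` stands in for SUBSTITUTE while `float` stands in for VALUE or +0.

```python
def european_to_standard(number_text: str) -> float:
    """Convert text like '31.330,00' to a number.

    Order matters: remove the thousands dots first, then turn the
    decimal comma into a dot.
    """
    cleaned = number_text.replace(".", "").replace(",", ".")
    return float(cleaned)


print(european_to_standard("31.330,00"))  # 31330.0
```

This only makes sense when you already know the text is in European format; applied to 31330.00 it would strip the decimal point and give the wrong value.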
What about you? How do you extract numbers from text? What are your favorite techniques? Please share using comments.
Tips on cleaning data using Excel
If you use Excel to clean data, go through these articles to learn some powerful techniques.














15 Responses to “Modeling Interest During Construction (IDC) – Excel Project Finance”
Thanks again for a very helpful post.
I had a similar problem when trying to model a balance sheet and profit and loss projection. The problem was that interest expense (in P&L) was dependent on a cash shortfall (in BS) which had to be funded. The cash shortfall depended on how much interest was paid, so the mutual dependency made a circular reference.
I addressed it with a macro that calculated interest outside of the P&L, then pasted the calculated amount into the P&L as a value. The model was out of balance, but by repeating the pasting and calculating loop the imbalance reduced to zero. It was a bit messy, and had to be repeated every time a line changed - but it worked.
If I have to do it again I'll read this article again first and see if it can be done more elegantly.
Hi,
The use of a circular reference can be avoided in this case. Just make use of the geometric sum to calculate the interest required. I’ll walk through the example from the spreadsheet.
First calculate the cash needed each year without the interest expense. So in year 1 you need 55 Mn, in year 2 105 Mn, and in year 3 190 Mn. The total amount to borrow for year 1 is then (50 Mn)/(1-interest_rate) = (50)/(1-0.1). For year 2 and subsequent years the amount borrowed is the cash needed in that year plus the interest_rate times the amount already borrowed. For year 2: (105 + interest_rate * sum(previous debt raised))/(1-interest_rate) = (105+0.1*61.1)/(1-0.1).
This process avoids the need for a circular reference, and makes the calculation more stable.
Thanks,
Tristan
The question is for the year 1 in your case, the amount works out to 45 mn. However in the year 2 you have applied the loan amount as 61.1 mn.
Am I missing something ! Please help !
very helpful information!!!
using circular references and to make model more stable we can use combination of "IF" and "ISERROR" functions. i.e
=if(iserror(formula1),"",(formula1))
this formula will return blank value if there is any error otherwise give the result required.
I usually use this in my models and it makes them very stable......
🙂 🙂 🙂
@Terry: That's right. Exactly the same problem is seen in the interest and cash cycle between the P&L and the cash flow statement as well. In our trainings on financial modeling in Excel, we demonstrate using both circular loops and macros to take care of this problem. Circular loops have their own pitfalls. If the model enters a state of error, the error percolates!
@Tristan: Thanks for pointing out. I agree with you that if circular loops can be avoided, they should be avoided.
@Yogesh: This is one way of avoiding the problem. Although circular loops have another problem that they make your sheet slower. Each time, there is a change in the sheet, all the calculations are redone. So if they can be avoided, they should be avoided.
Please note that this was an example (a large one indeed) and I didn't have space to speak about the pitfalls of this approach! I just wanted to illustrate an approach and am glad that some of you found it useful!
I think while posting, there is an error in the images! The last image should be flipped with the one that is posted in step VII!
I think you can try the following simple solution given by Microsoft itself to make the circular works:
Windows: Excel Options -> Formulas -> Put a tick on "Enable iterative calculation"
Mac: Excel -> Preference -> Calculation ->Put a tick on "Limit iteration"
You can change the maximum number of iterations, as well as the maximum change at which iteration stops for goal seeking or for resolving circular references, based on the number you type in the maximum change box.
Thank you.
Hey All
I heard that we can take care of the circularity with the help of macro for IDC. Can anybody help on the steps to construct the macro for the same.
Regards
Vinay
Hi Vinay,
If you look closely, you are essentially copying the values from the interest calculation to the IDC in project cost.
Basically you can record a macro, that takes the values from interest and pastes special the values in IDC row in project cost.
Then you can run that recorded code in a for loop.
Hope this helps.
Thanks Param for reply.
But before calculating interest, i need to provide for Upfront Equity and Equity, which are essentially part of total project cost. Hence, i need to put in Upfront Equity and Equity to calculate the IDC which is again hitting the total project cost.
I am a bit confused about how to remove this circular reference.
Regards
Vinay
Wow, this was a brilliantly simple post. I was looking online for a while before I found this page. Never seen this been explained so beautifully yet so crisply before. Thanks for saving my ass at work! (i'm relatively new to finance + modeling)
I have been reading your blog since my college days. Today, I'm writing just to say thanks.
We have calculated the financial rate of return of a hydropower project, and the observer has raised an observation regarding the total project cost: Rs. 8616.01 million (PKR) with IDC and Rs. 8352.46 million (PKR) without IDC. Should the financial analysis be calculated on the basis of the cost without IDC or with IDC?
Please help, if it is possible to spare some time.