Clean up Incorrectly Formatted Phone Numbers using Excel
In our Utopian imaginations all the data would have been standardized and shareable across systems and people. But alas, the reality is totally different. We seldom get data in the format / way we desire it to be. In other words, the ingredients are all there, but for us to prepare the dinner, you must pre-process them.
Often this pre-processing or cleaning up the data takes quite an amount of time it self leaving very little to do the actual work. That is when you can use excel’s powerful data cleaning techniques to handle the situations.
One common problem with corporate data is incorrectly formatted phone numbers. Most of us are used to a standard 10 digit phone number format like 123-123-1234 or (123) 123 1234, but when you get that customer data, very few phone numbers in it are formatted like above. Instead you might see phone numbers like 1231231234, 12312 31234, (123)123-1234 etc.
It is not really difficult to clean up the phone numbers if we know before hand how they are formatted. For eg. you can easily convert a phone number like 1231231234 to 123-123-1234 using excel text formatting functions like =TEXT(1231231234,"000-000-0000"). But it is a rare case in which we have control over the incoming format and quickly you will have to use a slew of format / text processing functions to clean up the data.
To simplify the whole thing, I have written a small VBA UDF (User Defined Function) which you can add to your excel add-ins list and use to clean up virtually any phone number format to standard phone number.
Function cleanPhoneNumber(thisNumber As String) As String
' this function aspires to clean any phone number format
' to standard format (+9999) 999-999-9999 or 999-999-9999
' works with almost all phone number formats stored in text
Dim retNumber As String
For i = 1 To Len(thisNumber)
If Asc(Mid(thisNumber, i, 1)) >= Asc("0") And Asc(Mid(thisNumber, i, 1)) <= Asc("9") Then
retNumber = retNumber + Mid(thisNumber, i, 1)
End If
Next
If Len(retNumber) > 10 Then
' format for country code as well
cleanPhoneNumber = Format(retNumber, "(+#) 000-000-0000")
Else
cleanPhoneNumber = Format(retNumber, "000-000-0000")
End If
End Function
The above function is pretty straight forward and simple. It scans the input text for any numeric ASCII codes and saves them to another text field. Once the scanning is complete the function will format the final number to 999-999-9999 format if the number has 10 or less digits, otherwise to (+9999) 999-999-9999 format (with country code).
Like this? Learn these other data cleaning / processing tips:
Handling spelling mistakes in your data
Splitting text using excel formulas
Generating initials from names using excel
Adding a range of cells using Concat()
Comments
RSS feed for comments on this post. TrackBack URI


At Chandoo.org, I have one goal, "to make you awesome in excel and charting". This blog is started in 2007 and today has 350+ articles and tutorials on using excel, making better charts.

Here’s the code I use, but I like yours better since it handles text. My code assumes that it’s only numbers and maybe a few odd characters:
Sub Convert_Phone()
Application.ScreenUpdating = False
‘
‘ first highlight the section you want to work on
‘
With Selection.SpecialCells(xlConstants)
.Replace what:=Chr(160), Replacement:=”", LookAt:=xlPart, _
SearchOrder:=xlByColumns, MatchCase:=True
.Replace what:=Chr(32), Replacement:=”", LookAt:=xlPart, _
SearchOrder:=xlByColumns, MatchCase:=True
.Replace what:=”)”, Replacement:=”", LookAt:=xlPart, _
SearchOrder:=xlByColumns, MatchCase:=True
.Replace what:=”(”, Replacement:=”", LookAt:=xlPart, _
SearchOrder:=xlByColumns, MatchCase:=True
.Replace what:=”-”, Replacement:=”", LookAt:=xlPart, _
SearchOrder:=xlByColumns, MatchCase:=True
.Replace what:=”+”, Replacement:=”", LookAt:=xlPart, _
SearchOrder:=xlByColumns, MatchCase:=True
End With
‘ at this point you could do one of two things:
‘ 1. do a “virtual” format where you just make the cell *appear* to be a
‘ phone number.
‘ Selection.NumberFormat = “(###) ###-####”
‘ 2. We can actually insert the parentheses and dash in the appropriate place.
‘
‘ For each cell in Selection
‘ cell = “(” & Left(cell, 3) & “) ” & Mid(cell, 4, 3) & “-” & Right(cell, 4)
‘ Next cell
‘
‘ uncomment whichever one you want!
‘
‘
Application.ScreenUpdating = True
End Sub
@JP.. that is one reason for writing the above function.. I wanted to comeup with something that can applied to almost all cases… As I mentioned in the article, most of the times we know how the incoming format is.. in those cases your code is suitable.
What I’ll probably do is convert your code to a routine that can be used on a selection (similar to what mine is now), merge some of my code in, and add a few conditions, for example since I deal with US phone numbers only:
If Len(retNumber) < 10 Then
cleanPhoneNumber = “not valid” ‘ show error
or
If Len(retNumber) < 10 Then
cleanPhoneNumber = retNumber ‘ just return the number
This looks great! How can add this add-in to my Excel workbooks? I don’t know anything about VBA and don’t have VBE. Is there a way for me to use this code some other way? Has anyone already packaged it up as an add-in I can download? If so, I would love you forever!
-Sam
@Sam: Thanks for the comments. Welcome
You can install a vba user defined function (UDF) like this very simply. Just follow these steps:
1. In your workbook, right click on the sheet name and select view code
2. in the resulting window, click menu > insert > module
3. You will see a blank module (basically a place where you can write all your macros and udfs), copy paste the code from here to there
4. save by pressing ctrl+s
5. close it and return to your excel sheet
6. save the work book, just in case
7. Now use the cleanPhoneNumber() in your cells like you would use sum(), if() etc.
remember, using this method, the cleanPhoneNumber() will only work in that particular workbook. If you need to use it in all the workbooks, just save the file as “Excel add in” type (menu > save as, and file type).
Now from tools > addins, click browse and point to the saved excel addin file. Once the file is shown, make sure you check it to install that add-in. This way, you can use cleanPhoneNumber() in all the workbooks.
Let me know if you have any difficulties with this method.
Ok – I tried following the above instructions and created the UDF and even added it to my “add-ins”, but in the cell where I am putting the formula, i’m seeing the following result: “#NAME?”. When I step through the logic on the formula, it’s not recognizing the name “cleanphonenumber” in the formula.
What am I missing here?
This was close to what I needed except it didn’t handle extensions, so I expanded on your function below. If it finds the letter “x” anywhere beyond the 7th digit, it will assume the following numbers are an extension and separate it out. (If “x” is found more than once, it uses the right-most location.)
So for example:
Input: (999) 999 9999 Ext. 123
Output: 999-999-9999 x123
Function cleanPhoneNumber(thisNumber As String) As String
‘ this function aspires to clean any phone number format
‘ to standard format (+9999) 999-999-9999 or 999-999-9999
‘ works with almost all phone number formats stored in text
Dim retNumber As String
Dim ext As String
Dim extStart As Integer
extStart = 0 ‘ if an extension is present, it begins at this location
For i = 1 To Len(thisNumber)
If Asc(Mid(thisNumber, i, 1)) >= Asc(”0″) And Asc(Mid(thisNumber, i, 1)) = 7 And LCase(Mid(thisNumber, i, 1)) = “x” Then
extStart = Len(retNumber) + 1
End If
Next
If extStart > 0 Then
‘ if an extention is present, separate it out
ext = Right(retNumber, Len(retNumber) – extStart + 1)
retNumber = Left(retNumber, extStart – 1)
End If
If Len(retNumber) > 10 Then
‘ format for country code as well
cleanPhoneNumber = Format(retNumber, “(+#) 000-000-0000″)
Else
cleanPhoneNumber = Format(retNumber, “000-000-0000″)
End If
If extStart > 0 Then
‘ append the extension if present
cleanPhoneNumber = cleanPhoneNumber + Format(ext, ” x#”)
End If
End Function
Interesting function. Thank you for the function. I wrote a simple replace all based on the same logic as the cleanPhoneNumber function. Here goes. hope you guys will enjoy this one in excel.
Function ReplaceAll(tStr As String, s2Replace As String, replaceWith As String) As String
Dim retStr As String
For i = 1 To Len(tStr)
If Mid(tStr, i, 1) = s2Replace Then
retStr = retStr + replaceWith
Else
retStr = retStr + Mid(tStr, i, 1)
End If
Next
ReplaceAll = retStr
End Function
@Dan and Deepak.. very good extensions. Thank you for sharing.
HI
I have been having the same problem as someone else above. When I try it, it keeps telling me ambiguous name and it puts a “#NAME?” in the cell where the formula was. Can any one help me?