Clean up Incorrectly Formatted Phone Numbers using Excel

Posted on September 30th, 2008 in ideas , Learn Excel - 55 comments

cleanup-phone-numbers-using-excel-howtoIn our Utopian imaginations all the data would have been standardized and shareable across systems and people. But alas, the reality is totally different. We seldom get data in the format / way we desire it to be. In other words, the ingredients are all there, but for us to prepare the dinner, you must pre-process them.

Often this pre-processing or cleaning up the data takes quite an amount of time it self leaving very little to do the actual work. That is when you can use excel’s powerful data cleaning techniques to handle the situations.

One common problem with corporate data is incorrectly formatted phone numbers. Most of us are used to a standard 10 digit phone number format like 123-123-1234 or (123) 123 1234, but when you get that customer data, very few phone numbers in it are formatted like above. Instead you might see phone numbers like 1231231234, 12312 31234, (123)123-1234 etc.

It is not really difficult to clean up the phone numbers if we know before hand how they are formatted. For eg. you can easily convert a phone number like 1231231234 to 123-123-1234 using excel text formatting functions like =TEXT(1231231234,"000-000-0000"). But it is a rare case in which we have control over the incoming format and quickly you will have to use a slew of format / text processing functions to clean up the data.

To simplify the whole thing, I have written a small VBA UDF (User Defined Function) which you can add to your excel add-ins list and use to clean up virtually any phone number format to standard phone number.

Function cleanPhoneNumber(thisNumber As String) As String
    ' this function aspires to clean any phone number format
    ' to standard format (+9999) 999-999-9999 or 999-999-9999
    ' works with almost all phone number formats stored in text

Dim retNumber As String

For i = 1 To Len(thisNumber)
    If Asc(Mid(thisNumber, i, 1)) >= Asc("0") And Asc(Mid(thisNumber, i, 1)) <= Asc("9") Then
        retNumber = retNumber + Mid(thisNumber, i, 1)
    End If
Next
If Len(retNumber) > 10 Then
    ' format for country code as well
    cleanPhoneNumber = Format(retNumber, "(+#) 000-000-0000")
Else
    cleanPhoneNumber = Format(retNumber, "000-000-0000")
End If
End Function

The above function is pretty straight forward and simple. It scans the input text for any numeric ASCII codes and saves them to another text field. Once the scanning is complete the function will format the final number to 999-999-9999 format if the number has 10 or less digits, otherwise to (+9999) 999-999-9999 format (with country code).

Like this? Learn these other data cleaning / processing tips:

Handling spelling mistakes in your data
Splitting text using excel formulas
Generating initials from names using excel
Adding a range of cells using Concat()

Your email address is safe with us. Our policies

Written by Chandoo
Tags: , , , , , , , , , , , , ,
Home: Chandoo.org Main Page
? Doubt: Ask an Excel Question

55 Responses to “Clean up Incorrectly Formatted Phone Numbers using Excel”

  1. JP says:

    Here’s the code I use, but I like yours better since it handles text. My code assumes that it’s only numbers and maybe a few odd characters:

    Sub Convert_Phone()
    Application.ScreenUpdating = False

    ‘ first highlight the section you want to work on

    With Selection.SpecialCells(xlConstants)
    .Replace what:=Chr(160), Replacement:=””, LookAt:=xlPart, _
    SearchOrder:=xlByColumns, MatchCase:=True
    .Replace what:=Chr(32), Replacement:=””, LookAt:=xlPart, _
    SearchOrder:=xlByColumns, MatchCase:=True
    .Replace what:=”)”, Replacement:=””, LookAt:=xlPart, _
    SearchOrder:=xlByColumns, MatchCase:=True
    .Replace what:=”(“, Replacement:=””, LookAt:=xlPart, _
    SearchOrder:=xlByColumns, MatchCase:=True
    .Replace what:=”-“, Replacement:=””, LookAt:=xlPart, _
    SearchOrder:=xlByColumns, MatchCase:=True
    .Replace what:=”+”, Replacement:=””, LookAt:=xlPart, _
    SearchOrder:=xlByColumns, MatchCase:=True
    End With

    ‘ at this point you could do one of two things:
    ‘ 1. do a “virtual” format where you just make the cell *appear* to be a
    ‘ phone number.
    ‘ Selection.NumberFormat = “(###) ###-####”

    ‘ 2. We can actually insert the parentheses and dash in the appropriate place.

    ‘ For each cell in Selection
    ‘ cell = “(” & Left(cell, 3) & “) ” & Mid(cell, 4, 3) & “-” & Right(cell, 4)
    ‘ Next cell

    ‘ uncomment whichever one you want!


    Application.ScreenUpdating = True
    End Sub

  2. Chandoo says:

    @JP.. that is one reason for writing the above function.. I wanted to comeup with something that can applied to almost all cases… As I mentioned in the article, most of the times we know how the incoming format is.. in those cases your code is suitable.

  3. JP says:

    What I’ll probably do is convert your code to a routine that can be used on a selection (similar to what mine is now), merge some of my code in, and add a few conditions, for example since I deal with US phone numbers only:

    If Len(retNumber) < 10 Then
    cleanPhoneNumber = “not valid” ‘ show error

    or

    If Len(retNumber) < 10 Then
    cleanPhoneNumber = retNumber ‘ just return the number

  4. Sam says:

    This looks great! How can add this add-in to my Excel workbooks? I don’t know anything about VBA and don’t have VBE. Is there a way for me to use this code some other way? Has anyone already packaged it up as an add-in I can download? If so, I would love you forever!

    -Sam

  5. Chandoo says:

    @Sam: Thanks for the comments. Welcome :)
    You can install a vba user defined function (UDF) like this very simply. Just follow these steps:

    1. In your workbook, right click on the sheet name and select view code
    2. in the resulting window, click menu > insert > module
    3. You will see a blank module (basically a place where you can write all your macros and udfs), copy paste the code from here to there
    4. save by pressing ctrl+s
    5. close it and return to your excel sheet
    6. save the work book, just in case
    7. Now use the cleanPhoneNumber() in your cells like you would use sum(), if() etc.

    remember, using this method, the cleanPhoneNumber() will only work in that particular workbook. If you need to use it in all the workbooks, just save the file as “Excel add in” type (menu > save as, and file type).
    Now from tools > addins, click browse and point to the saved excel addin file. Once the file is shown, make sure you check it to install that add-in. This way, you can use cleanPhoneNumber() in all the workbooks.

    Let me know if you have any difficulties with this method.

  6. Nelson says:

    Ok – I tried following the above instructions and created the UDF and even added it to my “add-ins”, but in the cell where I am putting the formula, i’m seeing the following result: “#NAME?”. When I step through the logic on the formula, it’s not recognizing the name “cleanphonenumber” in the formula.

    What am I missing here?

  7. Dan says:

    This was close to what I needed except it didn’t handle extensions, so I expanded on your function below. If it finds the letter “x” anywhere beyond the 7th digit, it will assume the following numbers are an extension and separate it out. (If “x” is found more than once, it uses the right-most location.)

    So for example:
    Input: (999) 999 9999 Ext. 123
    Output: 999-999-9999 x123

    Function cleanPhoneNumber(thisNumber As String) As String
    ‘ this function aspires to clean any phone number format
    ‘ to standard format (+9999) 999-999-9999 or 999-999-9999
    ‘ works with almost all phone number formats stored in text

    Dim retNumber As String
    Dim ext As String
    Dim extStart As Integer

    extStart = 0 ‘ if an extension is present, it begins at this location

    For i = 1 To Len(thisNumber)
    If Asc(Mid(thisNumber, i, 1)) >= Asc(“0″) And Asc(Mid(thisNumber, i, 1)) = 7 And LCase(Mid(thisNumber, i, 1)) = “x” Then
    extStart = Len(retNumber) + 1
    End If
    Next

    If extStart > 0 Then
    ‘ if an extention is present, separate it out
    ext = Right(retNumber, Len(retNumber) – extStart + 1)
    retNumber = Left(retNumber, extStart – 1)
    End If

    If Len(retNumber) > 10 Then
    ‘ format for country code as well
    cleanPhoneNumber = Format(retNumber, “(+#) 000-000-0000″)
    Else
    cleanPhoneNumber = Format(retNumber, “000-000-0000″)
    End If

    If extStart > 0 Then
    ‘ append the extension if present
    cleanPhoneNumber = cleanPhoneNumber + Format(ext, ” x#”)
    End If

    End Function

  8. Interesting function. Thank you for the function. I wrote a simple replace all based on the same logic as the cleanPhoneNumber function. Here goes. hope you guys will enjoy this one in excel.

    Function ReplaceAll(tStr As String, s2Replace As String, replaceWith As String) As String
    Dim retStr As String

    For i = 1 To Len(tStr)
    If Mid(tStr, i, 1) = s2Replace Then
    retStr = retStr + replaceWith
    Else
    retStr = retStr + Mid(tStr, i, 1)
    End If
    Next

    ReplaceAll = retStr

    End Function

  9. Chandoo says:

    @Dan and Deepak.. very good extensions. Thank you for sharing. :)

  10. barty says:

    HI

    I have been having the same problem as someone else above. When I try it, it keeps telling me ambiguous name and it puts a “#NAME?” in the cell where the formula was. Can any one help me?

  11. Mohaaron says:

    When copy/pasting the newly formatted phone number the paste is pasting the function name instead of the phone number. How can I copy/paste the resulting value? I need to do this so I can copy/paste thousands of rows with the newly formatted phone number.

  12. Chandoo says:

    @Mohaaron… Welcome to chandoo.org and thanks for comments.

    If you want to keep the formatting alone, just use paste special > formats (keyboard shortcut – ALT+E then s and t )

    If you want to use my function above to format the phone number, then assuming your first phone number is in A1, write =cleanPhoneNumber(a1) and drag it down to other cells.

  13. Cannon says:

    I copy and pasted the text but I don’t think it worked because the phone numbers are entered too many different ways in the document. A lot of them have brackets and I want it to be just straight 000-000-0000 or they just have spaces such as 000 000 0000. Do I need to add something to the code to make it change these formats too…?

  14. Krobb Lattrell says:

    if you’re getting a #NAME error, you need to make sure you’ve followed all of the instructions -they do work! I forgot the Insert -> Module part for an hour or so before fixing that and was stuck on the #NAME error. Thanks for all the help everyone!

  15. Patricia says:

    Just wanted to say thank you! This worked like a charm!

  16. Angela says:

    Thanks for the code! You saved me a ton of time on my data cleansing efforts.

  17. Lisa says:

    I kept getting the #NAME error until I went to Tools –> Macro –> Security and changed it to the medium setting and closed Excel (after saving). Upon re-opening, it asked me if I wanted to enable the macro. I enabled it and the function worked as expected, with no further name errors.

  18. Russ says:

    Important: The cell in which you use the function must be certain formats for the UDF to work. Put it in General format, for example. If you put the cleanPhoneNumber function into a TEXT column, it won’t work.

  19. Anil says:

    Thanks for the cool Utility

  20. RJW says:

    As someone who knows nothing about VBA UDA I had a few initial frustrations using this in Excel 2010
    1) For the “NAME#” problem – go to File/Options/Trust Centre/Macro settings/Enable All Macros
    2) Also must “Save As” a Macro-enabled file type – this was not the default on my file menu A “Doh!!” moment but it took me a while to figure this out:)
    2) Was getting “Ambiguous Term” error. After searching “Help” finally tried changing “cleanPhoneNumber” in the Scope line to “cleanPhoneNo” as in “Function cleanPhoneNo(thisNumber As String) As String – and did this thru out the program – I assume, any other name not containing the term “Number” would also solve the problem.

    After this it worked beautifully! Thank you!! – saved me dozens & dozens of labour hours manually cleaning up my client data bases from 2 businesses

  21. UK people, this is for you.

    I got a task from a UK client to format all his 100,000+ phone numbers to UK standard phone number format (5 space 6, i.e. 07321 987654). Some numbers started with the 44 country code and some didn’t, some came with with a leading zero and some didn’t, etc. Many numbers had various combinations of digits with dash, plus, dot, parenthesis, etc. There were a few international numbers as well (other than UK) which mostly got formatted correctly with the country code in parenthesis.

    The following code which I modified from the post got the vast majority of the numbers to be formatted correctly. I’m happy for being able to share it:

    Function cleanPhoneNumberUK(thisNumber As String) As String
    ‘ this function aspires to clean UK phone number format
    ‘ to UK standard format 09999 999999
    ‘ or otherwise to standard format (+9999) 999-999-9999 for non UK numbers
    ‘ works with almost all UK phone number formats stored in text

    Dim retNumber As String

    For i = 1 To Len(thisNumber)
    If Asc(Mid(thisNumber, i, 1)) >= Asc(“0″) And Asc(Mid(thisNumber, i, 1)) <= Asc("9") Then
    retNumber = retNumber + Mid(thisNumber, i, 1)
    End If
    Next

    If Len(retNumber) = 10 Or Len(retNumber) = 11 Then
    ' format by adding a leading zero if it's missing
    cleanPhoneNumberUK = Format(retNumber, "00000 000000")
    Else
    If Len(retNumber) = 12 And Left(retNumber, 2) = 44 Then
    ' UK country code exist but no leading zero, trim country code and add a leading zero
    cleanPhoneNumberUK = Format(Right(retNumber, 10), "00000 000000")
    Else
    If Len(retNumber) = 13 And Left(retNumber, 2) = 44 Then
    ' UK country code and leading zero exist, format by trimming country code
    cleanPhoneNumberUK = Format(Right(retNumber, 11), "00000 000000")
    Else
    cleanPhoneNumberUK = Format(retNumber, "(+#) 000-000-0000")
    ' format for country code other than UK
    End If
    End If
    End If

    End Function

  22. Sana says:

    Hi,

    I tried using this function it helps.. but problem is in my case I have other numbers(like building number, apt# etc) along with phone numbers.. and when i use this function its adding all these numbers together.. I want to extract only phone numbers of format xxx-xxx-xxxx or (xxx)xxx-xxxx or xxxxxxxxxx or xxx xxx xxxx
    Please help.. thanks

  23. Paul says:

    This function was just what I needed.

    Along with phone number I had country information with my data.

    So I made different formats based on the country. That allowed me to fill in country codes and use regional formatting preferences.

  24. TechGuru says:

    There are commercial tools which can help in Extracting phone & mobile numbers from Excel, Word , PDF multiple files and some of examples are here

    http://www.technocomsolutions.com/file-phone-number-extractor.htm

    Some can even extract phone numbers from Outlook and PST files

    http://www.technocomsolutions.com/outlook-phone-number-extractor.html

  25. Nan says:

    Hi – I would like to use the code to clean up phone numbers that have extensions, but I don’t know what to do with the code. Where do you put it in the Excel sheet? I am at the very basic level, so I need step by step instructions, with nothing left out!

    • facepalm says:

      Nan, I think you might have the same problem I did today.

      I had a pretty big moment of failure trying to implement this code today. While it was not my first time using VB, it was my first time using it within Excel. I got the module built just fine, but when it came time to actually execute the function, I was making a pretty silly error.

      I got hung up on trying to put the function in the cells where I had my phone number data already, and of course kept getting a circular reference error because I was referencing a cell the function itself was in. Case of the Mondays for sure.

      What you need to do, Nan, to use the function (assuming you have successfully created the function already), is have all of your phone number data in a column, say Column K, (just using K as a random example column), and then in Column J (also an example), put the function itself. So for the phone number in Column K, Cell 2 (K2), we would go to Column J, Cell 2 (J2), and write the formula: =cleanPhoneNumber(K2), and then it will give you the corrected phone number in that column J2 cell where you put the function formula.

      Once you do that, you can just Copy the corrected phone number in J2 and then > Paste Special > Paste Values in K2, so that your original cell with your original data, now has the corrected data. You can copy and paste by column, it does not have to be cell-by-cell.

      Maybe that will help you.

  26. JP says:

    THis is sick!! ,  you guys just saved my weekend. Thanks!!!!

  27. Shari says:

    Hi,

    Thanks so much for the formula! It worked beautifully except that my original numbers have a +1 in front of them, which then becomes (+1) using your code. What can I do now to delete that initial (+1)? Is there another line I can add to the code?
    Thanks

  28. Cass says:

    This works fantastically on regular phone numbers, but the extensions versions is just returning a blank cell. Any idea what’s going on there?

    • Cass says:

      I kept getting a blank cell, so I modified Dan’s function slightly:
       
      Function cleanPhoneX(thisNumber As String) As String
      ‘ this function aspires to clean any phone number format
      ‘ to standard format (+9999) 999-999-9999 or 999-999-9999
      ‘ works with almost all phone number formats stored in text

      Dim retNumber As String
      Dim ext As String
      Dim extStart As Integer

      extStart = 0 ‘ if an extension is present, it begins at this location

      For i = 1 To Len(thisNumber)
          If Asc(Mid(thisNumber, i, 1)) >= Asc(“0″) And Asc(Mid(thisNumber, i, 1)) <= Asc(“9″) Then
              retNumber = retNumber + Mid(thisNumber, i, 1)
          End If
          If LCase(Mid(thisNumber, i, 1)) = “x” Then
              extStart = Len(retNumber) + 1
              retNumber = retNumber + Mid(thisNumber, i, 1)
          End If
      Next

      If extStart > 0 Then
          ‘ if an extention is present, separate it out
          ext = Right(retNumber, Len(retNumber) – extStart + 1)
          retNumber = Left(retNumber, extStart – 1)
      End If

      If Len(retNumber) > 10 Then
          ‘ format for country code as well
          cleanPhoneX = Format(retNumber, “(+#) 000-000-0000″)
      Else
          cleanPhoneX = Format(retNumber, “000-000-0000″)
      End If

      If extStart > 0 Then
          ‘ append the extension if present
          cleanPhoneX = cleanPhoneX + Format(ext, ”  x#”)
      End If

      End Function
       
       
      Hope that helps someone. Watch for smart quotes when you copy paste.

  29. Joe C says:

    Thank you thank you thank you! Even the =TEXT(A1,”000-000-0000″) function helped me!!!

  30. Aron says:

    The =TEXT formula accomplished my desired results without dealing with the script.
    Using find and replace I dealt with the () and using text to columns I could separate the leading ‘1’ if applicable.
    Then, left with the unformatted 1231231233 I used the =text command to enter hyphens.
    133-123-1234
    Which allowed me to use the highlight duplicates function. 

  31. David says:

    Hello,

    I am running Excel 2010. I have tried everything and am still getting the “#NAME?” problem .   I know nothing about Macros but went to the Tab “Developer” hit “Macros” and created the cleanPhoneNumber as descibed. I pasted the above code into the window called filename.xlsm – Module1 (Code) and saved. I have gone into the Macro Security and made sure “Enable all macros” is picked. I have closed the program and reopened with the same results. I have made sure the column of phone numbers was formated to “General” like above was recommended.

    When I type in the fx: =cleanPhoneNumber that function shows up. I get the “#NAME?” problem everytime.

    Thanks for any help.     

  32. Hayley says:

    Have I mentioned lately how much I love you????? I didn’t even know user defined functions existed and this was so easy to copy and paste…and then it worked! it saved me HOURS of tedious, mind-numbing-but-necessary work.

  33. Bruce says:

    Thanks You!
    One problem, I’d like to use your function, but format with dots instead of dashes (e.g. 800.555.1212).  Just changing the dashes in your format is not enough.  I’ve tried escaping (with ‘ ) or CHR(46), but nothing seems to work.
    Any ideas, Brain Trust?

  34. Ken K says:

    I just used this to clean 10,000 Contact records which I then loaded into Salesforce.Com.  Wow.  Worked like a charm.  Thank You

  35. John says:

    I’m trying to use the cleanPhoneNumber() function but when I use it on a number like 333.333.3333 it is suggesting a format of 123.1231234

    all of my phone#’s are in this format and I just need a quick way to convert them with a macro. I would appreciate the help.

    John
    Wilkes University

  36. David says:

    Thx Chandoo! This worked perfectly! I’ve really dealt with macros before but the code was perfect and you can even tweak it a bit to get a slightly different result. Thx so much…this is awesome!

  37. David says:

    This worked like a charm. I’m using the modified code from several other users and noted I am getting hung up with a compile time syntax error on the extStart if function:
    If extStart > 0 Then
    ‘ if an extention is present, separate it out
    ext = Right(retNumber, Len(retNumber) – extStart + 1)
    retNumber = Left(retNumber, extStart – 1)

    Everything else in the code works. Any ideas?

  38. Mimi says:

    Boy, would this add-in make my life easier! Unfortunately, I have no clue how to “add it” to Excel. Where do I start?

  39. Lawrence says:

    Chandoo:

    Awesome, thank you!

  40. Lawrence says:

    Mimi: see the instructions from the OP near top of comments.

  41. Bobby Davidson says:

    I have created the function but don’t know how to use it. I have been thrown to the wolves and need to fix thousands of phone numbers. How do I run a function or macro in Excel 2003. I get errors but debug isn’t much help? I am a java programmer but am struggling at this excel stuff.

  42. Denise says:

    Thank you so much for this instruction. It helped me so much.

  43. Daniel says:

    What do you edit to just get the numbers? 55512334567

    • Hui... says:

      @Daniel

      Try:
      =SUMPRODUCT(MID(0&A1, LARGE(INDEX(ISNUMBER(–MID(A1, ROW($1:$25),1)) * ROW($1:$25),0), ROW($1:$25))+1,1) * 10^ROW($1:$25)/10)

  44. Set says:

    Huge! Thank you for this code and your site.

  45. Greg says:

    This thread has been very helpful. Thanks so much. However, I am new to VB and need some help modifying the code to achieve a similar result. The data I am trying to clean and format will need to ultimately need to be separated into two columns. The cleaned number will need to be separated to remove the country code, place the area code in one column, the remaining number in the next column, the extension in the next column and all data should be free of the dashes and parenthesis. Can this be acheived fairly easily? Thanks again

  46. Daniel says:

    Hey folks, this has been extremely helpful, thanks for all those that are still helping maintain this article.

    Question, I’m using this, but I’m needing to figure out a way to keep the leading zeros. The first 4 numbers of my cell are going to be a time stamp that needs to be ignored. The problem is, if the time has a zero in front (02:45), it doesn’t count the zero, so it always puts part of the timestamp as part of the number if I set it to start at the 3rd digit, or misses it if i start at the 4th.

    I’m using this:
    Function cleanPhoneNumber(thisNumber As String) As String
    ‘ this function aspires to clean any phone number format
    ‘ to standard format (+9999) 999-999-9999 or 999-999-9999
    ‘ works with almost all phone number formats stored in text

    Dim retNumber As String

    For i = 1 To Len(thisNumber)
    If Asc(Mid(thisNumber, i, 1)) >= Asc(“0″) And Asc(Mid(thisNumber, i, 1)) 10 Then
    ‘ format for country code as well
    cleanPhoneNumber = Format(retNumber, “0000000000”)
    Else
    cleanPhoneNumber = Format(retNumber, “0000000000”)
    End If
    End Function

    Any idea what I add to allow it to include the 0 in the timestamp?

  47. stephen says:

    This is fantastic. I have no idea what I just did, but it worked!

    Thanks so much!

Leave a Reply