In an ideal world, all data would be standardized and shareable across systems and people. But alas, the reality is quite different. We seldom get data in the format we want. In other words, the ingredients are all there, but before we can prepare the dinner, we must pre-process them.
Often this pre-processing, or cleaning up of the data, takes so much time by itself that very little is left for the actual work. That is when you can use Excel's powerful data cleaning techniques to handle the situation.
One common problem with corporate data is incorrectly formatted phone numbers. Most of us are used to a standard 10-digit phone number format like 123-123-1234 or (123) 123 1234, but when you get that customer data, very few phone numbers in it are formatted like that. Instead you might see phone numbers like 1231231234, 12312 31234, (123)123-1234 etc.
It is not really difficult to clean up the phone numbers if we know beforehand how they are formatted. For example, you can easily convert a phone number like 1231231234 to 123-123-1234 using an Excel text formatting function like =TEXT(1231231234,"000-000-0000"). But we rarely have control over the incoming format, and you will quickly end up using a slew of format and text processing functions to clean up the data.
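For the known-format case, the same idea can be sketched in VBA with the built-in Format function. This is only an illustration: the function name formatKnownTenDigits is hypothetical, and it assumes the input really is a plain 10-digit number.

```vba
' Hypothetical helper (not part of the original post): formats a value
' that is already known to be a plain 10-digit number, e.g. 1231231234.
Function formatKnownTenDigits(ByVal rawNumber As Double) As String
    ' Same format string as the TEXT() worksheet formula
    formatKnownTenDigits = Format(rawNumber, "000-000-0000")
End Function
```

In a worksheet, =formatKnownTenDigits(A1) would then play the same role as =TEXT(A1,"000-000-0000"). It does nothing to help with inputs that already contain spaces, brackets or dashes.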
To simplify the whole thing, I have written a small VBA UDF (User Defined Function) which you can add to your Excel add-ins list and use to clean up virtually any phone number format into a standard phone number.
Function cleanPhoneNumber(thisNumber As String) As String
    ' This function aspires to clean any phone number format
    ' to the standard format (+9999) 999-999-9999 or 999-999-9999.
    ' Works with almost all phone number formats stored as text.
    Dim retNumber As String
    Dim i As Long      ' declared so the code also compiles under Option Explicit
    Dim ch As String

    ' Keep only the characters that are digits 0-9
    For i = 1 To Len(thisNumber)
        ch = Mid(thisNumber, i, 1)
        If Asc(ch) >= Asc("0") And Asc(ch) <= Asc("9") Then
            retNumber = retNumber & ch
        End If
    Next i

    If Len(retNumber) > 10 Then
        ' Format with the country code as well
        cleanPhoneNumber = Format(retNumber, "(+#) 000-000-0000")
    Else
        cleanPhoneNumber = Format(retNumber, "000-000-0000")
    End If
End Function
The above function is pretty straightforward. It scans the input text one character at a time, keeps the characters whose ASCII codes fall in the range for the digits 0 to 9, and appends them to another string. Once the scanning is complete, the function formats the result as 999-999-9999 if it has 10 or fewer digits, otherwise as (+9999) 999-999-9999 (with country code).
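To see the scanning step on its own, here is a minimal sketch that separates it from the formatting. The names digitsOnly and demoDigitsOnly are hypothetical and not part of the function above; the sample inputs are the unformatted numbers mentioned earlier in this post.

```vba
' Hypothetical helper: returns only the digits found in rawText.
Function digitsOnly(ByVal rawText As String) As String
    Dim i As Long
    Dim ch As String
    Dim result As String

    For i = 1 To Len(rawText)
        ch = Mid(rawText, i, 1)
        ' Keep the character only if its code lies between "0" and "9"
        If Asc(ch) >= Asc("0") And Asc(ch) <= Asc("9") Then
            result = result & ch
        End If
    Next i

    digitsOnly = result
End Function

' Hypothetical demo: prints each sample and its digits
' to the Immediate window (Ctrl+G in the VBA editor).
Sub demoDigitsOnly()
    Dim samples As Variant
    Dim sample As Variant

    samples = Array("1231231234", "12312 31234", "(123)123-1234")
    For Each sample In samples
        Debug.Print sample & " -> " & digitsOnly(CStr(sample))
    Next sample
End Sub
```

All three samples reduce to the same digit string, 1231231234, which is why a single format string is enough once the stray spaces, brackets and dashes are gone.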
Like this? Learn these other data cleaning / processing tips:
Handling spelling mistakes in your data
Splitting text using excel formulas
Generating initials from names using excel
Adding a range of cells using Concat()

11 Responses
Ciao Hui,
Collecting Excel tricks under the title “Notable Excel Websites (Non-MVP) Edition” is a brilliant idea…
Thank you in the name of all The FrankensTeam.
On our site there is a box with a picture and text highlighting:
This is a no-MVP site
we think ourselves “bad boys” a bit 🙂
For those who would like to know why our site is a no-MVP site, enough to click on the link:
http://goo.gl/lxDszY
Thank you again!
Thanks a lot
I really enjoyed this (newsletter). I must admit that I rarely read an Excel newsletter (and I subscribe to quite a few) all the way through, but this grabbed my attention and before I realized it, I was engrossed in it. I must also admit that most of this I don’t understand, yet. But it excites me when I do learn something new in Excel. I can’t wait to see how much of this I can implement in my (constantly-evolving) ‘House Budget’ & ‘Family Medical’ worksheets that I have developed over the past few years! I sure hope to see more of this type of newsletter in the future! Thanks!
Thanks for doing this Hui! I appreciate being included.
I like Tom’s tip a lot. I posted about a tool I wrote to automate this at http://yoursumbuddy.com/tables-edit-query-dialog/
EXCELLENT !
Hui, this post is superb! Moreover, I have always been a fan of Roberto’s work and have learnt a lot from him.
Here are some of my recent contributions
1. Customising markers in a chart – http://www.goodly.co.in/customize-markers-in-a-chart/
2. Charting Hacks to work faster – http://www.goodly.co.in/5-charting-hacks-to-help-you-work-faster/
3. 7 Date formulas to make life easy – http://www.goodly.co.in/date-formulas-in-excel/
4. Customised scrollbar using VBA – http://www.goodly.co.in/customized-scroll-bar-in-excel/
5. Adding Direct Legends – http://www.goodly.co.in/customized-scroll-bar-in-excel/
Hope everyone enjoys!
I like the Excel Ninja Menus.
1. Select a cell or range then move till the 4-way cross appears. Right-Click and drag the selection to another place in the worksheet then, like a ninja, a menu full of skills and throwing stars pops up allowing me to do all kinds of awesomeness.
2. When you click the fill box on a Date and right click and drag it down, a lot of amazing Date options pop up.
I also brand my Excel to remind myself that I’m awesome. In my personal macro workbook I place the following code.
Private Sub Workbook_Open()
    Application.Caption = "SuperKrishna's Awesomeness"
End Sub
My favorite tip goes along with #17. If you try to copy subtotaled data (and, in earlier Excel versions, filtered data), all the data displays when you paste it, instead of just the summarized data.
To get around this, select your summarized data, click Find & Select on the Home tab and then select Go To Special. Click Visible cells only and click OK. Now copy and paste, and you will see that only the summarized data has been copied.
You can also go CTRL+G and then click the Special icon at the bottom of the dialog box.
What a great idea, Chandoo! I’d love to be included in your next edition:) Perhaps a VBA exclusive version?
@Ryan
I will review this concept about 6 months out from the original post and be sure to keep your site in mind
Hui…
That sounds great, Hui:) I just realized I gave credit to Chandoo for the idea and I should have attributed it to you.
Sorry about that!