Handling spelling mistakes while searching your data using Excel

Spelling mistakes are a fact of day-to-day corporate life. Most of the data in spreadsheets is entered by people, so it is prone to spelling mistakes and alternate spellings. For example, a person named John could have been entered as Jon. When John calls you back to confirm his reservation and you use search or VLOOKUP to find his information, you find nothing.

Here is one technique that I often use when the data has spelling mistakes or when I need a fuzzy search to fetch items that sound or are spelled alike. Take the 2 texts you want to compare and:

  • Remove all the vowels – AEIOU
  • Replace PH with F, Z & J with G, CK with K, W with V, LL with L, SS with S
  • Remove any Hs
  • Finally compare both texts

To simplify these steps I have written a small VBA UDF (User Defined Function) that takes a text parameter and performs the first 3 of them. You then compare the simplified versions of the 2 texts.


Function SimpleText(ByVal thisTxt As String) As String
    ' This function generates a simple text from the input text that
    ' can be used for fuzzy search.
    ' ByVal keeps the caller's variable unchanged when called from other VBA code.
    thisTxt = LCase(thisTxt)
    ' Step 1: remove all the vowels
    thisTxt = Replace(thisTxt, "a", "")
    thisTxt = Replace(thisTxt, "e", "")
    thisTxt = Replace(thisTxt, "i", "")
    thisTxt = Replace(thisTxt, "o", "")
    thisTxt = Replace(thisTxt, "u", "")
    ' Step 2: replace letters and letter pairs that sound alike
    thisTxt = Replace(thisTxt, "ph", "f")
    thisTxt = Replace(thisTxt, "z", "g")
    thisTxt = Replace(thisTxt, "ck", "k")
    thisTxt = Replace(thisTxt, "w", "v")
    thisTxt = Replace(thisTxt, "j", "g")
    thisTxt = Replace(thisTxt, "ll", "l")
    thisTxt = Replace(thisTxt, "ss", "s")
    ' Step 3: remove any Hs
    thisTxt = Replace(thisTxt, "h", "")
    SimpleText = thisTxt
End Function
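
To tie this back to the John / Jon lookup, here is a minimal sketch of a lookup UDF built on SimpleText. The name FuzzyLookup and its three parameters are hypothetical, chosen only for illustration, and the sketch assumes SimpleText sits in the same module.

```vba
' Hypothetical helper (not part of the original UDF): looks up lookupValue
' in lookupRange by comparing simplified texts and returns the value at the
' same position in resultRange. Assumes SimpleText is in the same module.
Function FuzzyLookup(ByVal lookupValue As String, _
                     ByVal lookupRange As Range, _
                     ByVal resultRange As Range) As Variant
    Dim target As String
    Dim cellValue As Variant
    Dim i As Long

    FuzzyLookup = CVErr(xlErrNA)            ' default: no match, like VLOOKUP
    target = SimpleText(CStr(lookupValue))
    If Len(target) = 0 Then Exit Function   ' nothing left to compare on

    For i = 1 To lookupRange.Cells.Count
        cellValue = lookupRange.Cells(i).Value
        If Not IsError(cellValue) Then      ' skip cells holding #N/A etc.
            If SimpleText(CStr(cellValue)) = target Then
                FuzzyLookup = resultRange.Cells(i).Value
                Exit Function               ' first match wins
            End If
        End If
    Next i
End Function
```

With names in A2:A100 and reservation details in B2:B100, a formula such as =FuzzyLookup("Jon", A2:A100, B2:B100) would find John's row, because both "Jon" and "John" simplify to "gn". Keep in mind that short names collapse to very short keys ("Jan" also becomes "gn"), so it is worth checking the match before acting on it.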

SimpleText can be used to perform fuzzy text searches or searches on unclean data: treat 2 texts as a match when the function returns the same value for both. Of course, the substitution rules above are simply the ones I find good enough. Feel free to define additional rules as per your needs so that your fuzzy searches work even better.
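
If you expect to add rules often, one possible arrangement is to keep them in a single list. The sketch below is a hypothetical variant (the name SimpleTextWithRules is an assumption, as are the extra rules for "y", spaces, hyphens and periods, which are only examples of what you might add). The first three lines of rules reproduce the substitutions described above, in the same order.

```vba
' Hypothetical variant of SimpleText: all substitution rules live in one
' array, so adding a rule means adding one "find", "replace" pair.
Function SimpleTextWithRules(ByVal thisTxt As String) As String
    Dim rules As Variant
    Dim i As Long

    ' Pairs are applied in order: find, replace, find, replace, ...
    rules = Array( _
        "a", "", "e", "", "i", "", "o", "", "u", "", _
        "ph", "f", "z", "g", "ck", "k", "w", "v", "j", "g", _
        "ll", "l", "ss", "s", "h", "", _
        "y", "", " ", "", "-", "", ".", "")   ' example extra rules (assumed)

    thisTxt = LCase$(thisTxt)
    For i = LBound(rules) To UBound(rules) - 1 Step 2
        thisTxt = Replace(thisTxt, rules(i), rules(i + 1))
    Next i
    SimpleTextWithRules = thisTxt
End Function
```

With the extra "y" rule, Smith and Smyth both reduce to "smt", and with the punctuation rules Mary-Ann and Mary Ann both reduce to "mrnn". Order matters: a rule only sees what the earlier rules left behind.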

If you want to generate SOUNDEX codes for Excel strings, you can use this Excel Soundex UDF. Soundex codes are phonetic codes generated for words based on how they sound, so 2 words that sound similar (for example, Robert and Rupert) have the same Soundex code. You can use these codes to perform fuzzy searches.
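
For reference, here is a rough sketch of how a Soundex code can be computed in VBA. It follows the commonly described American Soundex rules and is not the linked UDF. The function name SoundexSketch is an assumption for illustration.

```vba
' Sketch of American Soundex (hypothetical name, not the linked UDF).
' Result: first letter + 3 digits, e.g. "R163". Non-letters are ignored.
Function SoundexSketch(ByVal txt As String) As String
    ' Digit for each letter A..Z; "0" marks letters that get no digit
    Const CODES As String = "01230120022455012623010202"
    Dim result As String, ch As String
    Dim code As String, lastCode As String
    Dim i As Long, a As Long

    txt = UCase$(txt)
    For i = 1 To Len(txt)
        ch = Mid$(txt, i, 1)
        a = Asc(ch)
        If a >= 65 And a <= 90 Then                 ' A..Z only
            code = Mid$(CODES, a - 64, 1)
            If Len(result) = 0 Then
                result = ch                         ' keep the first letter
                lastCode = code
            ElseIf ch = "H" Or ch = "W" Then
                ' H and W are skipped and do not separate equal codes
            Else
                If code <> "0" And code <> lastCode Then
                    result = result & code
                End If
                lastCode = code                     ' a vowel resets this to "0"
            End If
            If Len(result) = 4 Then Exit For
        End If
    Next i

    ' Pad short codes with zeros; text without letters gives ""
    If Len(result) > 0 Then result = Left$(result & "000", 4)
    SoundexSketch = result
End Function
```

Both SoundexSketch("Robert") and SoundexSketch("Rupert") give R163. Because the first letter is kept as is, 2 words only share a code when they start with the same letter.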

11 Responses

  1. Ciao Hui,
    Collecting Excel tricks under the title “Notable Excel Websites (Non-MVP) Edition” is a brilliant idea…
    Thank you in the name of all The FrankensTeam.
    On our site there is a box with a picture and text highlighting:

    This is a no-MVP site
    we think ourselves “bad boys” a bit 🙂
    For those who would like to know why our site is a no-MVP site, enough to click on the link:
    http://goo.gl/lxDszY
    Thank you again!

  2. I really enjoyed this (newsletter). I must admit that I rarely read an Excel newsletter (and I subscribe to quite a few) all the way through, but this grabbed my attention and before I realized it, I was engrossed in it. I must also admit that most of this I don’t understand, yet. But, it excites me when I do learn something new in Excel. I can’t wait to see how much of this I can implement into my (constantly-evolving) ‘House Budget’ & ‘Family Medical’ worksheets that I have developed over the past few years! I sure hope to see more of these types of newsletters in the future! Thanks!

  3. Hui, This post is Superb! More over I have always been a fan of Roberto’s work and have learnt a lot from him.

    Here are some of my recent contributions

    1. Customising markers in a chart – http://www.goodly.co.in/customize-markers-in-a-chart/
    2. Charting Hacks to work faster – http://www.goodly.co.in/5-charting-hacks-to-help-you-work-faster/
    3. 7 Date formulas to make life easy – http://www.goodly.co.in/date-formulas-in-excel/
    4. Customised scrollbar using VBA – http://www.goodly.co.in/customized-scroll-bar-in-excel/
    5. Adding Direct Legends – http://www.goodly.co.in/customized-scroll-bar-in-excel/

    Hope everyone enjoys!

  4. I like the Excel Ninja Menus.
    1. Select a cell or range then move till the 4-way cross appears. Right-Click and drag the selection to another place in the worksheet then, like a ninja, a menu full of skills and throwing stars pops up allowing me to do all kinds of awesomeness.
    2. When you click the fill box on a Date and right click and drag it down, a lot of amazing Date options pop up.
    I also brand my Excel to remind myself that I’m awesome. In my personal macro workbook I place the following code.
    Private Sub Workbook_Open()
    Application.Caption = "SuperKrishna's Awesomeness"
    End Sub

  5. My favorite tip goes along with #17. If you try to copy subtotaled data (and in earlier Excel versions filtered data), when you paste it all the data displays instead of just the summarized data.
    To get around this, select your summarized data, click on Find and Select tab and then select Go to Special. Click Visible cells Only and click OK. Now paste and you will see that only the summarized data has been copied.
    You can also go CTRL+G and then click the Special icon at the bottom of the dialog box.
