Compare 2 sets of data by letter or word & highlight mismatches [vba]

Posted on December 15th, 2014 in VBA Macros - 21 comments

We analysts like to compare. If you ever want to keep an analyst busy, just give her 2-3 options. She won’t return to your desk until the cows come home. My wife uses this trick all the time. Picture this:

[In late 2013]
Me: I want to buy a new phone
She: Do you want Nexus 5 or Galaxy S5 or iPhone 5s?

Its late 2014 and I am not done comparing.

So today, let’s talk about an interesting comparison scenario.

Comparing by letter or word

Imagine you are looking at 2 lists like this and you want to know where items differ. Not which items, but where.

Compare 2 lists of data and highlight mismatched letters or words in Excel - How to?

That means, you want to know which letters or words in each line are different.

VBA to the rescue

Unfortunately, none of the standard features of Excel (formulas, conditional formatting, pivot tables etc.) can help us with this situation. But we don’t have to give up. We can use a simple VBA macro to instantly compare 2 lists and highlight mismatched letters or words.

[Related: How to compare 2 lists in Excel, a quick round up of techniques]

A quick demo of our comparison macro:

Compare two texts in Excel and highlight unmatched letters or words using VBA Macros - Demo

How does this macro work?

When you set out to create macros like this, the first step is to define basic algorithm (logical steps in plain English). To compare 2 sets of data, we need to do below:

  1. For each item in list 1
  2. Get corresponding item in list 2
  3. If they don’t match
  4. For word match
    1. For each word in first text
    2. Get corresponding word in second text
    3. Compare
    4. If not matched, highlight in red color
    5. Repeat for other words
  5. For letter match
    1. Find the first mismatched letter
    2. Highlight all the letters from that point in second text
  6. Repeat for next item in list 1
  7. Done

Once you write down this logic, we simply go ahead and implement it in VBA code.

The exact workings of the macro are somewhat complex. So I made a video explaining how the code works & what it can do. Please watch it below.

Video explaining the comparison macro

[see this video on our YouTube Channel]

Download Example Workbook

Click here to download the comparison macro workbook. Examine the code to understand how it is constructed. Feel free to extend it to suit your work needs.

Do you compare lists like this?

Every now and then, I end up having a situation where I need to compare by letter or word. I find VBA macro based solution to be perfect for this.

What about you? Do you compare lists? Where do you struggle with such comparisons? How would you use this macro? Please share your thoughts & tips in comments.

Become incomparable, learn VBA

While VBA is pretty powerful & awesome, not many venture beyond the basic recorded macros. You can transform the work, career & skills by learning VBA. It is not at all difficult and anyone can learn it. Start with below links.

 

Written by Chandoo
Tags: , , , , , , , , , ,
Home: Chandoo.org Main Page
? Doubt: Ask an Excel Question

21 Responses to “Compare 2 sets of data by letter or word & highlight mismatches [vba]”

  1. Usman says:

    Hi Chandoo,

    It's work perfect. But in real world it's not quite practical to put the words or sentence you have to match on same row number of two list. So, is any way macro can match the words or sentences if the matching words are not on the same row.

    Thank you,

    Usman

  2. Usman says:

    Data matching

  3. Safiuddin Iqbal says:

    Mr. Chandoo, I do this kind of comparisons very often by using the simple if condition (=IF(B4=C4,"True","False")). I did the same on your data and figured out error in your macro which missed out the following mismatch. However your macro technique is good!
    Thank you so much!

    what about me? what about me!

    • Rudra says:

      Safiuddin,
      you can just enter =B4=C4 instead of (=IF(B4=C4,”True”,”False”)) but I don't think both of these formulas can do what Chandoo's macro is doing.

      Regards
      Rudra

  4. Safiuddin Iqbal says:

    I am sorry it did throw error for letter matching! I stand corrected!

  5. Prajwal says:

    Hi Chandoo,

    This looks great! However, most of the time when we need to compare the data, each cell of the 1st column should look for corresponding data in the entire 2nd column instead of the corresponding cell.

  6. Tom says:

    Very nice. I know it would make things MUCH more complicated (and prone to error) but I wonder if this could be adjusted to properly handle situations where whole words were inserted or deleted. For instance:

    A. I like Excel
    B. I really like Excel

    As it stands now, this code would highlight words 2,3, and 4 of item B, when I would prefer to highlight only word 2 ("really").

    Perhaps the logic needs to include a test which, after encountering a difference, searches the remainder of the string for a match (perhaps you could set the threshold for how many consecutive characters have to match in order to avoid false positives). One conservative approach could be to see if it the entire remaining string in item A can be found (in tact) in item B.

    Maybe I'll give this a shot, but if someone else has ideas on how to accomplish this, I'd love to hear.

    Thanks for the post, Chandoo!

    • A says:

      I was thinking on the same lines but could not come up with answer yet. This macro is really a starting point but has a long way to go. I can see most subsequent request will be
      1. Checking all the second column as already mentioned above in comments.
      2. No checking words in sequence as it currently does.

      For the time being I am putting both these two list in Two separate word files and then running Compare function which mostly serves the purpose.

  7. Sandra says:

    How does this (or does this) differ from using the Fuzzy Lookup addin? For my application, the fact that Fuzzy returns a percent of accuracy makes me very happy. I can see if a value in a list matches my lookup column as a percentage (it matches 95% or 75%). And, I don't have to code it!

  8. […] if you want to stick to Excel, you are lucky enough because Chandoo wrote a post (Compare 2 sets of data by letter or word & highlight mismatches [vba]) to show you how to do so in Excel with […]

  9. Navin Sahni says:

    Mr. Chandoo,

    I have a table in xl file (recipients) to whom i want to send personal data via email (example their medical records - from another spread sheet) again in xl file.

    Please remember it is not the same content for every recipient; but the format is the same

    Can you please suggest me quick fix solution for the same

  10. Stephen says:

    Can you adapt this fantastic VBA macro to work on a single column with different spellings on parts of list

    Example for addresses
    Street, start, St
    Ave, Avenue

    It does not work with the same list in two colums any thoughts how to best modify this or another approach?

    Thanks

  11. Madhav says:

    Often I would have Column A with some string and column B containing long strings. Now the column B strings may not really match with column A string as given in Chandoo's example. I would like to colour the whole strings in column A in multiple or long strings in any cell of Column B. How to do this?

  12. Dear Sir,
    Thanks of your best service providing, I learn from you a lot of things,

  13. Francesco says:

    Hi,

    the macro fit exactly my needs, but... it gives an error if the text in the cell contains a "[" or "]" (something like "String not valid", run-time error 93) when the follosing instruction is executed in the "nextWord" function:
    If delim Like "*" & Mid(fromThis, i, 1) & "*" Then Exit For
    If I delete the row the macro works perfectly... Any idea?
    Many thanks

  14. Francesco says:

    I solved swapping pattern and string:
    For example, instead of using:
    startHere = IIf(delim Like "*" & Mid(fromThis, startHere, 1) & "*", startHere + 1, startHere)
    I used:
    startHere = IIf("*" & Mid(fromThis, startHere, 1) & "*" Like delim, startHere + 1, startHere)

    At a first glance, it seems working fine without any side effect.

  15. Meghana Dighe says:

    sir.
    thanks a lot very nice explanation of macro code for comparing data in 2column.
    it will definately help me.
    planning for joining u r online vba prog.

    meghana

  16. Bekki says:

    Hi Chandoo,

    First, this is absolutely amazing. Thank you for sharing this! It's made my life so much easier! 🙂

    I have noticed though that is does not pick up on all items. For example, when the data below is compared:

    100/15-07-070-03W4M/00 100/15-07-070-03W4/00

    only /00 is highlighted. The fact that the M is missing in the second column is not indicated at all. Is there something within the VBA that can be tweaked to solve this? (I'm sorry, 🙁 I'm not familiar enough with VBAs to figure this out on my own.)

    I also have a secondary question. Is it possible to modify the VBA to indicate which section of a string does not match as opposed to having it match words? Just curious 🙂

    Thanks a bunch!

  17. Greg says:

    This is a truly wonderful macro, but might there be a similar tool for those of us without Excel...say, a tool for Google Sheets?

  18. Ric says:

    Hello,

    Would it be possible to export this to an independent Excel file and change the colum? Lets say List 1 is in colum J and List 2 in column K.

    Thanks a lot!

Leave a Reply