We analysts like to compare. If you ever want to keep an analyst busy, just give her 2-3 options. She won’t return to your desk until the cows come home. My wife uses this trick all the time. Picture this:
[In late 2013]
Me: I want to buy a new phone
She: Do you want Nexus 5 or Galaxy S5 or iPhone 5s?
Its late 2014 and I am not done comparing.
So today, let’s talk about an interesting comparison scenario.
Comparing by letter or word
Imagine you are looking at 2 lists like this and you want to know where items differ. Not which items, but where.
That means, you want to know which letters or words in each line are different.
VBA to the rescue
Unfortunately, none of the standard features of Excel (formulas, conditional formatting, pivot tables etc.) can help us with this situation. But we don’t have to give up. We can use a simple VBA macro to instantly compare 2 lists and highlight mismatched letters or words.
[Related: How to compare 2 lists in Excel, a quick round up of techniques]
A quick demo of our comparison macro:
How does this macro work?
When you set out to create macros like this, the first step is to define basic algorithm (logical steps in plain English). To compare 2 sets of data, we need to do below:
- For each item in list 1
- Get corresponding item in list 2
- If they don’t match
- For word match
- For each word in first text
- Get corresponding word in second text
- Compare
- If not matched, highlight in red color
- Repeat for other words
- For letter match
- Find the first mismatched letter
- Highlight all the letters from that point in second text
- Repeat for next item in list 1
- Done
Once you write down this logic, we simply go ahead and implement it in VBA code.
The exact workings of the macro are somewhat complex. So I made a video explaining how the code works & what it can do. Please watch it below.
Video explaining the comparison macro
[see this video on our YouTube Channel]
Download Example Workbook
Click here to download the comparison macro workbook. Examine the code to understand how it is constructed. Feel free to extend it to suit your work needs.
Do you compare lists like this?
Every now and then, I end up having a situation where I need to compare by letter or word. I find VBA macro based solution to be perfect for this.
What about you? Do you compare lists? Where do you struggle with such comparisons? How would you use this macro? Please share your thoughts & tips in comments.
Become incomparable, learn VBA
While VBA is pretty powerful & awesome, not many venture beyond the basic recorded macros. You can transform the work, career & skills by learning VBA. It is not at all difficult and anyone can learn it. Start with below links.
- Introduction to VBA & 5 part crash course
- What is a macro and how to get started with VBA
- 40+ Example VBA macros
- Course: Online VBA Classes from Chandoo
40 Responses to “Compare 2 sets of data by letter or word & highlight mismatches [vba]”
Hi Chandoo,
It's work perfect. But in real world it's not quite practical to put the words or sentence you have to match on same row number of two list. So, is any way macro can match the words or sentences if the matching words are not on the same row.
Thank you,
Usman
Data matching
Mr. Chandoo, I do this kind of comparisons very often by using the simple if condition (=IF(B4=C4,"True","False")). I did the same on your data and figured out error in your macro which missed out the following mismatch. However your macro technique is good!
Thank you so much!
what about me? what about me!
Safiuddin,
you can just enter =B4=C4 instead of (=IF(B4=C4,”True”,”False”)) but I don't think both of these formulas can do what Chandoo's macro is doing.
Regards
Rudra
I am sorry it did throw error for letter matching! I stand corrected!
Hi Chandoo,
This looks great! However, most of the time when we need to compare the data, each cell of the 1st column should look for corresponding data in the entire 2nd column instead of the corresponding cell.
Very nice. I know it would make things MUCH more complicated (and prone to error) but I wonder if this could be adjusted to properly handle situations where whole words were inserted or deleted. For instance:
A. I like Excel
B. I really like Excel
As it stands now, this code would highlight words 2,3, and 4 of item B, when I would prefer to highlight only word 2 ("really").
Perhaps the logic needs to include a test which, after encountering a difference, searches the remainder of the string for a match (perhaps you could set the threshold for how many consecutive characters have to match in order to avoid false positives). One conservative approach could be to see if it the entire remaining string in item A can be found (in tact) in item B.
Maybe I'll give this a shot, but if someone else has ideas on how to accomplish this, I'd love to hear.
Thanks for the post, Chandoo!
I was thinking on the same lines but could not come up with answer yet. This macro is really a starting point but has a long way to go. I can see most subsequent request will be
1. Checking all the second column as already mentioned above in comments.
2. No checking words in sequence as it currently does.
For the time being I am putting both these two list in Two separate word files and then running Compare function which mostly serves the purpose.
How does this (or does this) differ from using the Fuzzy Lookup addin? For my application, the fact that Fuzzy returns a percent of accuracy makes me very happy. I can see if a value in a list matches my lookup column as a percentage (it matches 95% or 75%). And, I don't have to code it!
[…] if you want to stick to Excel, you are lucky enough because Chandoo wrote a post (Compare 2 sets of data by letter or word & highlight mismatches [vba]) to show you how to do so in Excel with […]
Mr. Chandoo,
I have a table in xl file (recipients) to whom i want to send personal data via email (example their medical records - from another spread sheet) again in xl file.
Please remember it is not the same content for every recipient; but the format is the same
Can you please suggest me quick fix solution for the same
Can you adapt this fantastic VBA macro to work on a single column with different spellings on parts of list
Example for addresses
Street, start, St
Ave, Avenue
It does not work with the same list in two colums any thoughts how to best modify this or another approach?
Thanks
Often I would have Column A with some string and column B containing long strings. Now the column B strings may not really match with column A string as given in Chandoo's example. I would like to colour the whole strings in column A in multiple or long strings in any cell of Column B. How to do this?
Dear Sir,
Thanks of your best service providing, I learn from you a lot of things,
Hi,
the macro fit exactly my needs, but... it gives an error if the text in the cell contains a "[" or "]" (something like "String not valid", run-time error 93) when the follosing instruction is executed in the "nextWord" function:
If delim Like "*" & Mid(fromThis, i, 1) & "*" Then Exit For
If I delete the row the macro works perfectly... Any idea?
Many thanks
I solved swapping pattern and string:
For example, instead of using:
startHere = IIf(delim Like "*" & Mid(fromThis, startHere, 1) & "*", startHere + 1, startHere)
I used:
startHere = IIf("*" & Mid(fromThis, startHere, 1) & "*" Like delim, startHere + 1, startHere)
At a first glance, it seems working fine without any side effect.
sir.
thanks a lot very nice explanation of macro code for comparing data in 2column.
it will definately help me.
planning for joining u r online vba prog.
meghana
Hi Chandoo,
First, this is absolutely amazing. Thank you for sharing this! It's made my life so much easier! 🙂
I have noticed though that is does not pick up on all items. For example, when the data below is compared:
100/15-07-070-03W4M/00 100/15-07-070-03W4/00
only /00 is highlighted. The fact that the M is missing in the second column is not indicated at all. Is there something within the VBA that can be tweaked to solve this? (I'm sorry, 🙁 I'm not familiar enough with VBAs to figure this out on my own.)
I also have a secondary question. Is it possible to modify the VBA to indicate which section of a string does not match as opposed to having it match words? Just curious 🙂
Thanks a bunch!
Nevermind! I didn't have a clear understanding of the marco process. I understand what's happening.
Again, thank you!
This is a truly wonderful macro, but might there be a similar tool for those of us without Excel...say, a tool for Google Sheets?
Hello,
Would it be possible to export this to an independent Excel file and change the colum? Lets say List 1 is in colum J and List 2 in column K.
Thanks a lot!
these two sentences are nearly the same:
Once upon a midnight dreary, while I pondered, weak and weary
Once upon a late night dreary, while I pondered, weak and weary
The macro will identify everything from "late" to its end as different, when it isn't really.
How can we identify that it is only "late night" that is in the place of "midnight" ?
I think with C# I would make 2 arrays and then identify which is not in either array, then highlight both text. But I have to stay within excel...
Hi,
It is awesome and works for me.
Can any one help me on below requirement.
I am trying to highlight the as below.
Lets take below example:
List1: working awesome
List2: This chandoo.org is working awesome
and I want to highlighte only the text "This chandoo.org is" in List2
I am assuming the above vide/excel talks about comparing the string from the starting. But I just want to compare entire cells. Can any one help me.
Hi,
I executed this macro in one excel says ExcelOne.
Now I downloaded another excel name ExcelTwo and I want to run the same macro from ExcelOne to ExcelTwo. How can I do this.
Any help!.
Hello ,
This macro is highlighting either complete word or all letters after misspelled letter.
I want that this will highlight only the misspelled lette.
E.g-
If column 1 contains Shahzeb and column 2 contains Shahzab then it should highlight only 'e'.
can anyone help..
First love the Macro, great job, although I'm having trouble if the List1 and List2 have a formula capturing data from other worksheets. I get a Run-time error 13. and the Macro Fails at the follow code: If Not cell1.Characters(j, 1).Text = cell2.Characters(j, 1).text _... If I copy and paste the "text" to another to columns (updating List1 and List 2 locations) the macro runs fine.
Hi,
If I need to find the missing words while comparing 2 ranges. How could I do? Is there any VBA to find missing words?
And thanks for this VBA. Great Job done.
Regards,
Hello,
Can you help me please to create the same script for Google sheets or LibreOffice Calc. Your script doesn't work there and this is exactly what I need for my work
Hi there
Forgive my lack of vba knowledge but how do i extend the list beyond 16 rows to ie 100k lines? Appreciate the help please
Thanks,
Paul
@Paul
Please ask your question in the Chandoo.org Forums
https://chandoo.org/forum/
Please attach a sample file so we can see what you are doing
My issue is in identifying differences in words in 2 different excel files do help me with understanding of simple formulas to apply and get the result.
My problem was to find the list of South African countries placed in a column in 1000 rows in a format: austria;algeria;brazil..
I placed all the South African countries in a cell with a delimiter ";". If you have sort of similar problem then below code will work for you making slight adjustments based on your problem statement.
Sub test()
Dim cnt As Long
Dim cn1 As Long
Dim i As Long
For i = 1 To 10
arr1 = Split(Cells(i, 1), ";")
arr2 = Split(Range("D3"), ";")
For cnt = LBound(arr1) To UBound(arr1)
For cnt1 = LBound(arr2) To UBound(arr2)
If CStr(arr1(cnt)) = CStr(arr2(cnt1)) Then
Range("B" & i).Value = "Match"
GoTo nextlevel
End If
Next cnt1
Next cnt
nextlevel:
Next i
End Sub
Thank you!
A quick demo of our comparison macro: --- this sheet has limitations wherein comparison is done only to data [16 cell] but doesn't go beyond that... could you please let me know how to edit cell range so as to accommodate 100+ cells [in both columns]
Hey Vinay... Just edit the named ranges "list1" and "list2" and point them to the 100+ cell ranges.
Thanks Chandoo... I edited same in formula section and now its working fine, & Not to forget u have done a excellent script work with existing features of Excel.. Keep going with your contributions
I am a new learner for VBA and Macro. I tried to edit this code to compare two set of data by numbers instead of by letters/word. Somewhere gone wrong and I am stuck! Please help.
Sub highlightDiffs()
'
' highlightDiffs Macro
'
Dim cell1 As Range, cell2 As Range, i As Long
Dim j As Long, k As Long, length As Long, word1 As Double, word2 As Double
resetColors
i = 1
For Each cell1 In Range("list1")
Set cell2 = Range("list2").Cells(i)
If Not cell1.Value2 = cell2.Value2 Then
'both cells don't match. Find the first word / character that doesn't match
length = Len(cell1.Value2)
'match letters
For j = 1 To length
If Not cell1.Characters(j, 1).NumberFormat = cell2.Characters(j, 1).NumberFormat _
Then Exit For
Next j
If j <= Len(cell2.Value2) Then
With cell2.Characters(j, Len(cell2.Value2) - j + 1).Font
.Color = -16776961
End With
End If
End If
i = i + 1
Next cell1
End Sub
How to implement the same for all the sheets of entire Workbook using Excel Macros.
Hi Chandoo, how can I extend List 1 and List 2 in your example file? Thanks
Dear,
Very nice Macro. Thanks for posting. However, it seems to fail when certain types of special characters are included. Specifically, it fails on this line:
"If delim Like "*" & Mid(fromThis, i, 1) & "*" Then Exit For".
Do you have a solution for this?