Formula Forensics No. 026 – Highlight Only Duplicate Entries

Posted on August 9th, 2012 in Formula Forensics , Huis , Posts by Hui - 19 comments

Last week I received an email from Chandoo.org reader, Debra.

I have a formula that I think would be a good idea for a future formula forensics post, and also I’ve always wondered how it works. 

I know how to use it and use it often, but I’m not really sure how it actually works.  (I found it years ago on a Microsoft Office website.)

Debras formula is:

=COUNTIF($B$3:$B3, B3)>1

So today at Formula Forensics we will look at this formula and see what makes it tick as well as looking at a small extensions Debra might be interested in using.

As always at Formula Forensics you can follow along using a Worked Example which you can Download here: Excel 97-2013.

 

Debras Formula

Debras formula =COUNTIF($B$3:$B3, B3)>1 uses the Excel Countif() function to count the occurrence of a value in a range.

The Countif() function has the following syntax:

 

So the Countif() function counts the number of occurrences of the Criteria in the Range

In Debra’s example: =COUNTIF($B$3:$B3, B3)>1

Range: $B$3:$B3

Criteria: B3

 

Looking at Debras formula it looks like the formula is looking to see how many times the value in Cell B3 occurs in the Range $B$3:$B3, which of course is one as the range only includes B3

The formula then compares the number of times the value occurs with the number 1 using the >1 logic at the end of Debras formula and returns True if the count is greater than 1 and False if it isn’t

=COUNTIF($B$3:$B3, B3)>1 is equivalent of using =If(COUNTIF($B$3:$B3, B3)>1, True, False)

You can see that Excel does the If and conversion to a Boolean automatically.

 

The clever part of the formula is the use of the Relative/Absolute Range Modifiers, the $ signs.

The $ signs in the formula =COUNTIF($B$3:$B3,B3)>1, serve to lock the start position of the range so that when it is copied down the range increases in size as it is copied down

The original Formula =COUNTIF($B$3:$B3, B3)>1

references cell B3

 

When we copy the cell C3 down to C4, the formula now becomes:

=COUNTIF($B$3:$B4, B4)>1

We can see that the Countif() function is now counting how many times the value in B4 occurs in the extended range B3:B4 and if it is greater than 1 it will return TRUE


When we copy the cell C4 down to C10, the formula now becomes:

=COUNTIF($B$3:$B10, B10)>1

We can see that the Countif() function is now counting how many times the value in B10 occurs in the extended range B3:B10 and if it is greater than 1 it will return TRUE

 

Finally resulting in a table of True/False highlighting the Duplicate Status of each entry:

 

Debras Formula Extended

A good aspect of having a Table of TRUE/FALSE is that you can use it with a number of excel functions to trigger them. One of the more useful features of Excel is Conditional Formatting.

Conditional Formatting can rely on a cell formula returning True/False to trigger whether it displays the cell using the Conditional Formatting or not.

Lets see how:

Select the Duplicate area C3:C12

On the Home Ribbon got Conditional Formatting, New Rule

The following Dialog box appears, we are going to add Two New Rules

Press New Rule

In the New Formatting Rule dialog select Use a Formula to determine which cells to format

In the Format values where this formula is true: type =C3

This will apply the Conditional Formatting shown in the Preview: window when the value in cell C3 is True, in our case when it is a Duplicate

Hint: Don’t worry about the other cells in our Range C3:C12, Excel will adjust the Conditional Formatting accordingly for those cells

Select the Format Button and set a Red & Bold Font (or whatever else you want)

Ok When Done

Excel shows us the first Conditional Formatting rule

We now add a second Conditional Formatting rule using the New Rule button

In the Format values where this formula is true: type =Not(C3)

This will apply the Conditional Formatting shown in the Preview: window when the value in cell C3 is not True ie: is False, in our case when it is not a Duplicate

Select the Format Button and set a White Font (or whatever else you want)

Hint: Because we have used a White Font on a White Cell background color it will appear that the cell is blank, Don’t worry It isn’t

Ok When Done

Excel shows us the two Conditional Formatting rules

Once we press Apply or Ok, Excel will apply the Conditional Formatting rules to our selected cells

You can see that the False cells above have a White Font on top of a White Background?

You don’t believe me, select some of the blank cells and change the background color!

Try changing a few cells in Column B to your own values to check that the Conditional Formatting rules are being applied correctly

 

Download

You can download a copy of the above file and follow along, Download Here – Excel 97-2013.

 

Formula Forensics “The Series”

This is the 26th post in the Formula Forensics series.

You can learn more about how to pull Excel Formulas apart in the following posts

Formula Forensic Series

 

Formula Forensics Needs Your Help

I need more ideas for future Formula Forensics posts and so I need your help.

If you have a neat formula that you would like to share with us all, try putting pen to paper and draft up a Post like above or;

If you have a formula that you would like explained as Debra did above, but don’t want to write a post, send it to Hui or Chandoo.

 

 

 

Written by Hui...
Tags: ,
Home: Chandoo.org Main Page
? Doubt: Ask an Excel Question

19 Responses to “Formula Forensics No. 026 – Highlight Only Duplicate Entries”

  1. Ram Kapoor says:

    Hi Debra,

    Thanks for your post!

    We can get the same results without applying Conditional Formatting also. Here is the formula.

    =IF(COUNTIF($B$3:$B3,B3)>1,COUNTIF($B$3:$B3,B3)>1,"")

    Regards,
    Ram

  2. Ram Kapoor says:

    Hi Debra,

    Thanks for your post!

    I observed, this can be done without applying Conditional Formatting also, here is the formula.

    =IF(COUNTIF($B$3:$B3,B3)>1,COUNTIF($B$3:$B3,B3)>1,"")

    Regards,
    Ram

  3. Manna Das says:

    Dear Ram,

    your answer was helpful, but if we use that we won't be able to identify first occurance of duplicates   

    • tadovn says:

      @Manna, it could be done.
      Named the range of value DIR is RANGE1, ex.
      So the formula will be:
      = IF(Countif(RANGE1,B3)>1,"TRUE","").
       
      It will show all duplicates and more.

  4. Nick aka Trader273 says:

    Always something new to learn.  I personally like going the conditional formatting route, less "helper columns", but I suppose both ways work well.  Also, kind of off topic, but when I enter a formula in the conditional formatting and hit "OK", it seems xl automatically puts quotes around it, resulting in nothing happening. I can go in and delete them out and then the formula works.  Not sure why it's doing that, though not a huge deal.

  5. Felicia says:

    this is helpful but what if you want all the duplicates and the original to be highlighted or selected. if i am trying to remove 'MATCHES" i currently have to sort by account number and remove both rows, how can i do that with a formula?

    IE,, row  3, 6 and 9 all match and i need to delet all 3,, right now i sort by account number and then delete all 3 rows, with this formula, i can delete the TRUE but it leaves the original false....
    thanks

  6. Jonathan Micael says:

    Hoo dude... Long time ago I search this form... Thanks a lot! I really apreciatte your website!

    See ya. 

  7. Ramki says:

    Hi!!,
    Add to this , if you want to identify and remove duplicate values in a range i.e. C1:C10, use =countif(C1:C10,C1)>1. Go for absolute reference if you want to delete existing values also. This formula gives True/False, select True, press delete.
     

  8. md says:

    Hi Chandoo,

    Conditional formatting is always a first option when we think of finding duplicate with clear reporting the data in issue.
    Thanks to Debra for putting this solution on Chandoo.

    Well, I would like to request extension to the extension what you have did.
    Now since we got to know that Duplicate record is found;
    1. I want to see how many times this record appeared
    2. If any colour formatting to source data to identify duplicates as a group? 

  9. Mark says:

    Hi Chandoo,

    I think I am missing something here?

    Why not use the Conditional Formatting to show the duplicate values (in the screenshots its the rule above the one you use in this example). I think it works really well, as it shows you the origoinal and all the values that are duplicated (in this example, the TRUE is only assigned to the any duplicates but not the original).

    Why are you creating the helper column?

    Ramki - there is also a function on the Data Ribbon to remove duplicates.

  10. Ramki says:

    Hi! Mark!!
    Yeah.. I am aware of the tip in excel 2007. But the above tip is handy in excel 2003.

  11. julien says:

    To go a little bit further, is it possible with formula only, when a duplicate is found, to find at which line the other duplicates is ?
     
    Thanks,

  12. SMA Quadri (Pakistan) says:

    I apreciatte your website

  13. David says:

    Can finding duplicates with conditional formatting only be done if all the items are listed in a single column? In other words, how would I find duplicates if I have four lists - four columns in excel 97 and I want to find duplicates within all four lists - where a duplicate of an item in column-1 could be located in any of the other three columns?
    Thanks, Dave

  14. Saju says:

    Dear Chandu,
    through countifs function i found out count of duplicate values in a database.
    how to findout cell reference of those duplicates?

  15. […] How to highlight duplicate entries in a list [Chandoo] […]

  16. […] How to highlight duplicate entries in a list [Chandoo] […]

Leave a Reply