• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

I am at my wits end... Please help with identifying duplicate cells (not as easy as it sounds)

sdak000

New Member
Hello everyone,

First of all let me say thank you in advance for any help you can provide, I have come to my wits end trying to get this to work.

I have a feeling that this will need to be a VB macro but I don't know to code it correctly.

I have an excel data sheet with 5 columns and little over 1 thousands rows however, its littered with duplicate entries and I need to identify those duplicate entries report on them and also clean up the list... let me give you an example below.

Col A Col B Col C Col D Col E
2001 5005 4565 350 1
2002 6548 5479 350 2
2003 5005 4565 350 1
2004 4577 7897 350 33
2005 1247 6571 350 14
2006 4577 7897 350 5
2007 5005 4565 350 77

I need to do the following:

  • Scan Column B & C and identify when duplicate pairs exist. In the example above Row's (1, 3 & 7) Also Row's (4 & 6)
    • NOTE: more than one duplicate can exist of the paired cells from column B & C - see examples noted above.
  • Format the results in a way that will allow me to report on the duplicate entries.
    • NOTE: The data in the adjacent cells is important so if possible it would be great to end up with a second separate table that looks like something below
    • 2nd note: If you have a better idea how to format it, that achieves the sames results please feel free to suggest.

Col A Col B Col C Col D Col E
2001 5005 4565 350 1
2003 5005 4565 350 1
2007 5005 4565 350 77
2004 4577 7897 350 33
2006 4577 7897 350 5


I still technically need to also end up with a table which has the duplicates removed....

So it would look like the following.

Col A Col B Col C Col D Col E
2001 5005 4565 350 1
2002 6548 5479 350 2
2004 4577 7897 350 33
2005 1247 6571 350 14

NOTE: for this table, it doesn't matter which duplicate data we choose to keep, just as long as the duplicate data does not exist.

Let me know if I haven't explained anything clear enough.

If what I am trying to do is too hard... or not possible please leave a comment and let me know, so I don't continue trying it myself haha

Thanks again
 
it can be done, i'm almost certain!!
The real trouble here is understand what you really want and that is not easy :)

For instance... i see that you have a 1-1 relationship on ColB and ColC is that always the case?

For a starter i would create what a DBA calls a 'unique-key'. Something like ColB & ColC & ColD
 
Hello Misra & iferror,

Thanks for your quick reply :)

I have uploaded an excel sheet to better show the results I need, as I require two tables with two different results.

Thanks

EDIT: and your correct iferror, I am only interested in duplicates that exist when both the cells in column B & C are duplicated at the same time, like I have represented in the example data.

Thanks
 

Attachments

Last edited:
Hi sdak000,

I am sure excel ninja here will able to give you more types of solutions. Besides, here I have attached a solution. See if this is as per your requirement. You can use filter to view single and double entry by different fields.

IF(OR((COUNTIF(B$2:B3,B2))>1,(COUNTIF(B2:B$13,B2))>1),"Duplicate","Single")

Note: Change this value "B$13" as per your range.
 

Attachments

Hi sdak000,

Please see the attached file, yellow cells are helper cells, green cells are solution. These are array formula so might slow when you will use it on your original data of 1000+ rows.

Regards,
 

Attachments

You can do this using couple of formulas and filter. Of course, VBA could be used.

1. Insert 2 Columns by selecting Column F.

2. In Cell F2 write following formula:
Code:
=COUNTIFS($B$2:$B$13,B2,$C$2:$C$13,C2)>1

3. In Cell G2 write following formula:
Code:
=COUNTIFS($B$2:B2,B2,$C$2:C2,C2)=1

4. Then filter on these columns

Solution1: Filter = TRUE on Column F gives Table 1

Solution2: Filter = TRUE on Column G gives Table 2
 
Back
Top