• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Duplicate Records

srinidhi

Active Member
I have a list of names in column A & Month Names in column B, some of the names are repeated every month.

Now the challenge is to find out the duplicate records when I filter on a particular month.
 
You could do this pretty quick w/ a PivotTable. Would that work? Otherwise, you could use a helper column with this formula:

=SUMPRODUCT(--(A$2:A2=A2),--(B$2:B2=B2))=1


Filter the helper column for TRUE to show only unique values.
 
Hi Luke M,


I have gone thru all the pages from this site on sumproduct and explaination on double negatives by you in some of the comments. I have understood them.


I am trying to understand how your above formula works, particularly in this portion:

example:

=--({1,0,0,0,1,0}),--({1,0,0,1,0,0,1})

or simply this...

=--({True,False})


I know True return 1, and False returns 0, but I am not able to get what happens when we have combination of trues, and falses!!


I even tried the option of evaluate with your formula, the steps suddenly jumps off to last step either as 1=1 or 2=1 :(


Pls solve my confusion..


Regards,

Prasad
 
I want to write more ... pls dont take me wrong.


after all the permutation and combination I feel the result will always be what is the first argument,

ex;

Var 1 Var 2 Result

0 0 0

0 1 0

1 0 1

1 1 1


note

the result is 0 if var 1 is 0 and

the result is 1 if var 1 is 1


I tried the same with 3 variables:

Var 1 Var 2 Var 3 Result

0 0 0 0

0 0 1 0

0 1 0 0

0 1 1 0

1 0 0 1

1 0 1 1

1 1 0 1

1 1 1 1


but I know it is not working that way.
 
No problem, prasaddn. The key is remembering how the SUMPRODUCT function works. In it's normal use, it multiples each array "across", and then sums up those products.

So, when it gets to this point:

=({1,0,0,0,1,0,1}),({1,0,0,1,0,0,1})

The function multiplies each pair, like this:

=(1*1,0*0,0*0,0*1,1*0,0*0,1*1)

which becomes this:

(1,0,0,0,0,0,1)

Which gets summed up to simply be 2. This tells us that there were 2 rows where both conditions were met.


For your problem, the formula is keeping a running tally going of how many records so far match the current line. If it's more than 1, it's a duplicate, and should be hidden.


further reading on SUMPRODUCT:

http://www.ozgrid.com/Excel/Excel-conditional-summing.htm
 
my 2nd post got trapped as spam i guess. Luke like the other day, can you unspam mine as well..


If I apply Karnaugh Map logic, only the first variable determines whether the result should be 1 or 0.
 
Un-spammed.


If everythings working correctly, only rows where everything evaluates to True/1 should be counted. Thus, it is necessary to Var1 to = 1, but it is not the only requirement.
 
I tried with sample data and found some discrepancy in result (Or may be my bad).


See..

Name Month Result

Agent 2 Jan TRUE

Agent 2 Feb TRUE

Agent 3 Mar TRUE

Agent 4 Jan TRUE

Agent 5 Feb TRUE

Agent 6 Mar TRUE

Agent 10 Jan TRUE

Agent 2 Feb FALSE

Agent 3 Mar FALSE

Agent 4 Jan FALSE

Agent 5 Feb FALSE

Agent 6 Mar FALSE

Agent 1 Jan TRUE

Agent 2 Feb FALSE

Agent 3 Mar FALSE

Agent 4 Jan FALSE

Agent 5 Feb FALSE

Agent 6 Mar FALSE

Agent 1 Jan FALSE

Agent 2 Feb FALSE

Agent 3 Mar FALSE

Agent 4 Jan FALSE

Agent 5 Feb FALSE

Agent 6 Mar FALSE


Now when I filter for Jan, why is Agent 2 and Agent 10 showing as True they are not repeated in the month?
 
True is a good thing. That means it's the first occurance. All the FALSE's indicate a duplicate.
 
Got it... True gives names which are listed only once, and False for names repeating (Duplicates).


Today, I seem to be out of mind :(


My apologies..
 
Back
Top