1. Welcome to Chandoo.org Forums. Short message for you

    Hi Guest,

    Thanks for joining Chandoo.org forums. We are here to make you awesome in Excel. Before you post your first question, please read this short introduction guide. When posting or responding to questions please remember our values at Chandoo.org are: Humility, Passion, Fun, Awesomeness, Simplicity, Sharing Remember that we have people here for whom English is not there first language and we need to allow for this in our dealings.

    Yours,
    Chandoo
  2. Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  3. When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Duplication Check For Specific Unique Data Types

Discussion in 'Ask an Excel Question' started by david gabra, May 15, 2018.

  1. david gabra

    david gabra Member

    Messages:
    119
    Hi
    I am totally stuck.
    I have a file with 10000s of rows and I need to check for duplicates between employee and vendors. Please note I need to exclude employee to employee duplicates.
    I only need to check employee to the supplier duplicates , and I do not know how to do exclude employee to employee duplicates from the analysis - so getting 1000s of false positives.

    Let me give you a sample file and show the outcome required.

    It is really key to be able to know which duplicates go together - so we need to be able when we filter to distinguish each group of duplicates.

    Thank you

    We need to run on the House field

    Type Ref Number House Expected Result
    Employee 1A 23 Positive -A
    Employee 1B 23 Positive -A
    Vendor 1C 23 Positive -A
    Vendor 1C 23 Positive -A
    Employee 1D 24
    Employee 1E 23
    Employee 1D 24
    Employee 1D 24
    Employee 1D 30
    Employee 1D 54 Positive -B
    Vendor 1D 40 Positive -C
    Employee 1D 40 Positive -C
    Employee 1D 30
    Employee 1D 100
    Employee 1D 656
    Employee 1D 65
    Employee 1D 656
    Employee 1D 65
    Supplier 1D 54 Positive -B

    Attached Files:

  2. Chihiro

    Chihiro Excel Ninja

    Messages:
    4,820
    Looking at your sample. It isn't clear why Row 2 to 5 are all identified as Positive-A, but not Row 7.

    Can you clearly demonstrate/explain logic used?
  3. david gabra

    david gabra Member

    Messages:
    119
    Hi

    My mistake - row 7 should be - good catch :). This means that we are on the same page i hope.

    The logic

    1)
    if we have 'Type' from column A which is employee only and we have duplicates in column 'C' - this should be ignored.

    rows 8 and 9 in excel are duplicates but only employees so this should be ignored.


    2) if we have 'Type' from column A which is employee and Vendor and we have duplicates in column 'C' - this should be highlighted as an a potential issue.

    Therefore in excel:
    rows 2 to 5 - have duplicates in column for employee and vendor - these should be highlighted

    row 11 and row 20 are duplicates and these should be shown.

    row 12 and 13 are duplicates and this should be shown.

    upload_2018-5-15_20-9-23.png

    Attached Files:

  4. Chihiro

    Chihiro Excel Ninja

    Messages:
    4,820
    Hmm, not sure about marking A, B, C etc (without helper column).

    But can you use this?
    =IF(COUNTIFS($C$2:$C$20,C2,$A$2:$A$20,"<>"&A2)>0,"Positive - "&C2,"")

    upload_2018-5-15_13-24-30.png
    Thomas Kuriakose likes this.
  5. david gabra

    david gabra Member

    Messages:
    119
    I have no clue how it works and how it excludes same duplicate in employees - but it looks like it work - amazing
    if you care to explain that will be great
  6. Chihiro

    Chihiro Excel Ninja

    Messages:
    4,820
    COUNTIF is used to count any item in list that has same house but different type ("<>" & value). If count is greater than 0, then it means there is duplicate with different type. Then IF is used to return concatenated text "Positive -" & house value.
    Thomas Kuriakose likes this.
  7. david gabra

    david gabra Member

    Messages:
    119
    Hi
    If i wanted to reduce false positives more
    can we add another column to that will help me.
    I should have asked in the beginning - did not want to overwhelm


    upload_2018-5-15_21-31-38.png

    Attached Files:

  8. Chihiro

    Chihiro Excel Ninja

    Messages:
    4,820
    Just add extra condition.
    =IF(COUNTIFS($C$2:$C$20,C2,$D$2:$D$20,D2,$A$2:$A$20,"<>"&A2)>0,"Positive - "&C2,"")
    Thomas Kuriakose likes this.
  9. david gabra

    david gabra Member

    Messages:
    119

Share This Page