• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Removal of duplicate entries using VBA

Kumar Salui

New Member
Hello All,
I am trying to remove the duplicate entries from "Masterdata" tab and copy them to "Duplicatedata" tab of the attached excel file. Let me explain the problem in detail:

I have a 10000+ line items of customer data, where the customers are serviced by multiple partners, some of them are serviced by a single partner. However, I found that the data was entered in such a way that some of the partner data became duplicates for the same customer.

Please refer to the attached excel:

e.g. Row#2 & row#3, the customer is serviced by one partner, but there is a duplicate entry; the same happens for Row #4 & #5.
For Row#6-8, the customer is serviced by two different partners, but for one partner, there is a duplicate entry.
For Row#9-11, customer serviced by one partner, but it is entered 3 times.

The expected outcome:
For row #2-3 - Row# 2 should remain in "Masterdata", row # 3 should be removed from "Masterdata" tab and copy to "Duplicatedata";
For Row#4, #5 - Row# 4 should remain in "Masterdata", row # 5 should be removed from "Masterdata" tab and copy to "Duplicatedata";
For Row#6-8 - Row# 6 & 7 should remain in "Masterdata" (non-duplicate rows), row # 8 should be removed from "Masterdata" tab and copy to "Duplicatedata"; same for
For row#9-11 - Row# 9 should remain in "Masterdata", row # 10-11 should be removed from "Masterdata" tab and copy to "Duplicatedata";


Just to mention, I found some of the data were duplicated many times (as high as 6 times).

Thank you all in advance.
Kind regards,
Kumar
 

Attachments

  • test.xlsx
    11.5 KB · Views: 2
Hello, according to your attachment a VBA demonstration as a beginner starter :​
Code:
Sub Demo1()
      Const S = "Duplicatedata"
        Dim L&, R&
        Application.ScreenUpdating = False
    With [Masterdata!A1].CurrentRegion.Columns
            L = .Rows.Count
           .Item(4).Value2 = .Parent.Evaluate("A1:A" & L & "&B1:B" & L)
           .Item(5).Formula = "=COUNTIF(D$1:D1,D1)>1"
            R = Application.CountIf(.Item(5), True)
            If R Then .Resize(, 5).Sort .Item(5), 1
           .Item("D:E").Clear
        If R Then
           .Rows(1).Copy Sheets(S).[A1]
           .Rows(L - R + 1 & ":" & L).Cut Sheets(S).[A2]
            Sheets(S).UsedRange.Columns.AutoFit
        End If
    End With
        Application.ScreenUpdating = True
End Sub
Do you like it ? So thanks to click on bottom right Like !​
 
In the attached, 2 macros:
blah
Assumes the source table is at cell A1 in the MasterData sheet.

blah1
Written to try to accommodate if the table isn't at cell A1.

Both macros clear the Duplicatedata sheet each time, so don't run them twice in succession otherwise you'll lose the first run's duplicates.

They both use conditional formatting and autofilter.
 

Attachments

  • Chandoo46850test-2.xlsm
    27.3 KB · Views: 3
This is a heck of a lot more straightforward in Get & Transform Data (aka Power Query), you just have to 'Refresh All' in the in the Queries & Connections section of the Data tab of the ribbon.
This leaves your original data untouched and gives 2 more tables.
See attached.
 

Attachments

  • Chandoo46850test-3.xlsx
    22.8 KB · Views: 4
Back
Top