1. Welcome to Chandoo.org Forums. Short message for you

    Hi Guest,

    Thanks for joining Chandoo.org forums. We are here to make you awesome in Excel. Before you post your first question, please read this short introduction guide. When posting or responding to questions please remember our values at Chandoo.org are: Humility, Passion, Fun, Awesomeness, Simplicity, Sharing Remember that we have people here for whom English is not there first language and we need to allow for this in our dealings.

    Yours,
    Chandoo
  2. Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  3. When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Can't shake off duplicates efficiently using macro

Discussion in 'VBA Macros' started by shahin, Aug 12, 2017 at 3:19 PM.

  1. shahin

    shahin Active Member

    Messages:
    389
    I've written a macro to remove duplicate values by deleting the entire row. The macro is supposed to compare the value of Range A1 to any of the Range underneath within the same column to be sure that they are not the same. If it is then it will delete the entire row in which duplicate value is found. My macro can do that but I need to run the macro twice to get the whole thing done because after first run i can still see some duplicates. However, when i run it twice or thrice, it can shake off all duplicate values. One more thing: it takes almost 3 minutes to shake off 2500 duplicates out of 6000. That means it works slowly as well. What change should i make in my macro to do the whole stuff in a single run and within fewer times possible?

    Here is what I was trying with:

    Code (vb):

        Sub Dup_removal()
            Application.ScreenUpdating = False
            Dim i As Long
            Dim j As Long
            i = 2
            Do Until Cells(i, 1).Value = ""
                j = i + 1
                Do Until Cells(j, 1).Value = ""
                    If Cells(i, 1).Value = Cells(j, 1).Value Then
                        Cells(j, 1).Select
                        Selection.EntireRow.Delete
                    End If
                    j = j + 1
                Loop
                i = i + 1
            Loop
            Application.ScreenUpdating = True
        End Sub
     
  2. NARAYANK991

    NARAYANK991 Excel Ninja

    Messages:
    15,034
    Hi ,

    The fastest way to weed out duplicates would be to use a dictionary to store the keys , in this case the values in column A.

    For each value , use the Exists method to check whether the key already exists in the dictionary ; if it does , then delete that particular row and move on to the next.

    In deleting rows , start from the last row of data and work your way upwards towards the first row of data.

    Another option is to first sort the entire data on the key column ; this way all duplicates will be clustered together , and you can do a wholesale delete for each set of duplicates.

    Narayan
    Last edited: Aug 12, 2017 at 4:12 PM
    shahin likes this.
  3. Hui

    Hui Excel Ninja Staff Member

    Messages:
    10,524
    It is slow because you are working directly off the Worksheet Cells

    Load the whole area into an array in VBA

    Then check for duplicates as Narayan mentioned using a dictionary

    Super fast
    shahin likes this.
  4. NARAYANK991

    NARAYANK991 Excel Ninja

    Messages:
    15,034
    Hi ,

    Try this :
    Code (vb):

    Public Sub RemoveDuplicates()
              Debug.Print Time
              Dim dict As Scripting.Dictionary
              Dim cell As Range
              Dim currval As Variant
           
              Set dict = New Scripting.Dictionary
              lastrow = Me.Range("A" & Rows.Count).End(xlUp).Row
              j = 1
           
              For i = lastrow To 2 Step -1
                  With Me.Cells(i, 1)
                        currval = .Value
               
                        If dict.Exists(currval) Then
                          .EntireRow.Delete
                        Else
                          dict.Add currval, j
                          j = j + 1
                        End If
                    End With
              Next
              Debug.Print Time
    End Sub
     
    Narayan
  5. shahin

    shahin Active Member

    Messages:
    389
    You are awesome, Narayan. It's a great way to accomplish this type of thing. Thanks a lot.
  6. shahin

    shahin Active Member

    Messages:
    389
    @Narayan, It took 3 seconds only to do the whole operation. Btw, why "Me" keyword was putting a barrier? I ran the macro after taking it out, though! Forgive my ignorance.
    Last edited: Aug 12, 2017 at 5:33 PM
  7. NARAYANK991

    NARAYANK991 Excel Ninja

    Messages:
    15,034
    Hi ,

    You might have copied the code into a Module ; the Me keyword works in the Worksheet section , where it refers to the worksheet which has the code ; in a Module , it will generate an error.

    Narayan

Share This Page