• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Filter and sum based on duplicate values in column

realhajiev

New Member
Dear friends,

I need your help to deal with the following problem:

I have a list of information on B7:E15. This bottom-end (E15) is not fixed as new entries are being made daily (1000k-2000k entries). What I need to do is to look for duplicate values based on the information on column C and, if found and if its date is earlier than date on cell H3, to sum corresponding values in Value1 and Value2 columns, and delete the duplicate row. For instance, if the value on cell C8 is found in another in Cell 12 and if its date on B12 earlier than date on H3, then add D12 to D8, and E12 to E8 and then delete row 12.

Please see the sample file in the attachment.

Thank you very much in advance for your help.
 

Attachments

  • VBA question.xlsx
    9.7 KB · Views: 10
Hi Narayan,

Yes, it matters. If the date in B8 is later than the date in H3, then this line should be left intact; nothing needs to be done for this one.

Thanks,
Real
 
Hi ,

Let us then put down the procedure as follows :

For every item in column C , see the corresponding date in column B.

If the date in column B is greater than the date in H3 , then skip this item and go to the next item in column C.

If the date in column B is less than the date in H3 , then see if this item in column C has occurred before.

If it has not occurred before , skip this item and go to the next item in column C.

if it has occurred before , then add the values in columns D and E to the earlier values , and delete the current row. Go to the next item in column C.

With reference to your uploaded data , we have duplicates in C8 and C12.

There are 4 possibilities :

1. The dates in both B8 and C8 can be less than the date in H3.

2. The date in B8 can be less than the date in H3 , while the date in C8 is greater than the date in H3.

3. The date in B8 can be greater than the date in H3 , while the date in C8 is less than the date in H3.

4. The dates in both B8 and C8 can be greater than the date in H3.

Can you confirm what is to be done in each of these cases , assuming that the occurrence in row 8 is the first occurrence , and the occurrence in row 12 is the duplicate ?

Narayan
 
Hi

I was looking at this last night and it got too late. I saw Narayan's response last night which he would not have been able to put together in NZ as it would have been too late even for him :)

I looked at the problem a little more simply. If there was a duplicate and in the second, third instance and if that duplicate date was less than the date in a given cell then consolidate and remove. Here is the coding and I have attached a file to show my view of the problem.
Code:
Option Explicit
 
Sub DelDups()
Dim ar As Variant
Dim i As Long
Dim j As Long
Dim n As Long
  
ar = Sheet1.[a1].CurrentRegion.Value
n = 1
With CreateObject("Scripting.Dictionary")
For i = 2 To UBound(ar, 1)
    If Not .exists(ar(i, 2)) Then
        n = n + 1: .Item(ar(i, 2)) = n
        For j = 1 To UBound(ar, 2)
            ar(n, j) = ar(i, j)
        Next
    Else
        If ar(i, 1) < [G1] Then
            ar(.Item(ar(i, 2)), 3) = ar(.Item(ar(i, 2)), 3) + ar(i, 3)
            ar(.Item(ar(i, 2)), 4) = ar(.Item(ar(i, 2)), 4) + ar(i, 4)
            For j = 1 To UBound(ar, 2)
                ar(i, j) = ""
            Next
            n = n + 1
        End If
    End If
    Next
End With
With Sheets.Add().Cells(1).Resize(UBound(ar, 1), UBound(ar, 2))
    .Value = ar
    .Columns(1).SpecialCells(4).EntireRow.Delete
End With
End Sub


Take care

Smallman
 

Attachments

  • SumDupsWithDict3.xlsm
    19.6 KB · Views: 1
Hi Narayan,

Sorry for delayed reply and thank you for your kind assistance!

You have put the general flow correctly. The only difference would be the fact that I am looking for duplicates not before, but after the current row. However, I don't think it would make a difference, because all rows will ultimately be affected by the macro.

Regarding the scenarios, I can put them as follows:

1. The dates in both B8 and B12 be less than the date in H3 - ideal match! Needs to be summed.

2. The date in B8 can be less than the date in H3, while the date in C12 is greater than the date in H3 - in that case, we don't need to sum them.

3. The date in B8 can be greater than the date in H3, while the date in B12 is less than the date in H3 - this scenarios is highly unlikely, because in the original document the values are added in chronological order. But, if it happens, then again, we don't need to sum them.

4. The date in both B8 and B12 can be greater than the date in H3 - no need to sum them.

As you see, the only scenario that we need to add them is when both dates are less than the date on C3.

Thank you again for your help!

Real
 
Hi Smallman,

Thank you very much for your help!

Your code is superb. The only comment is that I want the result to be in the same place, not in another sheet. It is a complex workbook where all sheets have its place. For that reason, it is very preferable to have the result in the same worksheet.

And also, I could not decipher your code and therefore don't know whether the code will do same operation till the last row in the worksheet. The tables there are non-contiguous, but have the same structure. Each has more than 1000 records.

Thanks a lot again for your help.

Kind regards,
Real
 
Hi realhajiev

The file and coding I provided was just an example. Such a small amount of it needs to change to get what you require. It was designed to handle any amount of data not be static on the dataset you provided.

I triplicated your list and ran the code over your original file. Seems to work well. I would put the data in a fresh sheet with the express purpose of that sheet to produce this procedure. It gives you the opportunity to check the expected results rather than accepting the code will be your saviour.

Anyways here is the adapted file and coding.

Code:
Option Explicit
Sub DelDups()
Dim ar As Variant
Dim i As Long
Dim j As Long
Dim n As Long

ar = Sheet1.[C8].CurrentRegion.Value
n = 1
With CreateObject("Scripting.Dictionary")
For i = 2 To UBound(ar, 1)
    If Not .exists(ar(i, 2)) Then
        n = n + 1: .Item(ar(i, 2)) = n
        For j = 1 To UBound(ar, 2)
            ar(n, j) = ar(i, j)
        Next
    Else
        If ar(i, 1) < [H3] Then
            ar(.Item(ar(i, 2)), 3) = ar(.Item(ar(i, 2)), 3) + ar(i, 3)
            ar(.Item(ar(i, 2)), 4) = ar(.Item(ar(i, 2)), 4) + ar(i, 4)
            For j = 1 To UBound(ar, 2)
                ar(i, j) = ""
            Next
            n = n + 1
        End If
    End If
    Next
End With
[C8].CurrentRegion.ClearContents
With Sheet1.[C8].Resize(UBound(ar, 1), UBound(ar, 2))
    .Value = ar
    Range("c8", Range("c65536").End(xlUp)).SpecialCells(4).EntireRow.Delete
End With
End Sub

If you want me to explain how the coding works just sing out. As far as scripting dictionaries go this one is pretty straight forward. THey can get pretty hairy.

Take care

Smallman
 

Attachments

  • VBA questionA.xlsm
    25.4 KB · Views: 5
Hi Smallman,

The code absolutely suits the purpose.

But you have guessed it correctly: it is very difficult for me to understand it. I would be very grateful if you add some comments to the code or show links to references etc.

My original intention was to make some small changes in your code (like column number) and to adjust it for our own operations, but this task seems impossible now :) I did not expect such a difficult code.

But anyways, I am almost there and it fully suits our needs.

Thanks!

Real
 
Ok

So you want to understand how to manipulate this coding. Start here.

Code:
With Sheet1.[C8].Resize(UBound(ar, 1), UBound(ar, 2) - 1)

Notice the -1 at the end. The result will give you Column 1 to 3 not Column 1 to 4.

If you want other columns excluded as in if you have not presented a like dataset in your example there may need to be further manipulatino within the coding. Another loop at the end of the procedure which will only extract the data you want.

I think if you post a workbook with a real life example and an output I will be able to shed more light.

Take care

Smallman
 
Hi Smallman,

Thank you again for your help.

I uploaded the new file that is almost same with the real life example.

In the file, based on the values on C column, if their Settlement date (F column) is less than the date on R2, then we need to sum the corresponding values on Quantity (I column) and Cost (J column) columns and then delete the duplicate column.

Looking forward to your response!

Thank you,
Real
 

Attachments

  • Test 2.xlsx
    10.7 KB · Views: 2
Hi realhajiev

I will have a look at this later tonight as I am heading out to play tennis tonight.

I noticed your headings were split over two lines. My solution will remove line 4. As a general rule headers should always be on one row. It makes managing the data easier and creating quick and dirty pivot tables becomes a breaze.

Take care

Smallman
 
Hi Realhajiev

Your file did not have any duplicates or any dates later than the date listed in R2?

I have fixed your file so there is an example to check the coding with. The following will work. I highlighted the lines I have changed form the above.

I added an error trap in case your file has not duplications at all.

It should be easier for your to make changes to this coding in future based on the slight changes I made to it.

Code:
Option Explicit

Sub DelDupsA()
Dim ar As Variant
Dim i As Long
Dim j As Long
Dim n As Long

ar = Sheet1.[B3].CurrentRegion.Value 'Changed C8 to B3
n = 1
With CreateObject("Scripting.Dictionary")
For i = 2 To UBound(ar, 1)
    If Not .exists(ar(i, 2)) Then
        n = n + 1: .Item(ar(i, 2)) = n
        For j = 1 To UBound(ar, 2)
            ar(n, j) = ar(i, j)
        Next
    Else
        If ar(i, 5) < [R2] Then 'Change H3 to R2, Changed ar(i, 2) to ar(i, 5)
            ar(.Item(ar(i, 2)), 8) = ar(.Item(ar(i, 2)), 8) + ar(i, 8) 'Changed 3 to Col 8
            ar(.Item(ar(i, 2)), 9) = ar(.Item(ar(i, 2)), 9) + ar(i, 9) 'Changed 4 to Col 9
            For j = 1 To UBound(ar, 2)
                ar(i, j) = ""
            Next
            n = n + 1
        End If
    End If
    Next
End With
[B3].CurrentRegion.ClearContents
With Sheet1.[B3].Resize(UBound(ar, 1), UBound(ar, 2)) 'Change C8 to B3
    .Value = ar
    On Error Resume Next 'in case no blanks
    Range("B3", Range("c65536").End(xlUp)).SpecialCells(4).EntireRow.Delete 'Changed C8 to B3
    On Error GoTo 0
End With
End Sub

It is really late here in Oz so I will be off now. Will check on this in the morning.

Take care

Smallman
 

Attachments

  • Test 3.xlsm
    22.9 KB · Views: 6
Hi Smallman,

Thank you again for your help!

Something goes wrong in my original sheet and I am trying to figure out why. In particular, the code cleared the contents of the range where I ran the code. It did not happen in your sheet.

I'll send you a final worksheet with all 'commentaries' very soon.

Thank you and sorry for delay :)

Real
 
Hi Smallman,

Sorry for such a late comeback! We had a very busy two week due to the year end and therefore I could not prepare an answer properly!

I have attached a new test file based on your file. The problem with the previous macro was that, once it did job, I somehow consolidated all information together, which did not suit our purpose.

In this file, I have added two more tables (in real life they are more than 15), made sure that not all but some of them are duplicated.

The requirement remains as before: to run through C column, look for duplicate values, if found, check for dates on F column, if they are less than the date on R2, then sum the corresponding values on columns I and J and then delete the row where the duplicate value resides.

I am really grateful for what you have done so far! It is very kind of you!

Looking forward to your reply!

Kind regard,
Real
 

Attachments

  • Test 3 (1).xlsm
    24.2 KB · Views: 1
Oh

You have several tables not one??? What a crucial bit of information to leave out of your example.

Your 6 line gap between each table contain the same characteristics as your deleted lines. How are you supposed to distinguish the difference between the lines which are separating your tables and the lines you just removed. I have left all the blanks in till you answer this question.

I would include the whole table in a tabular format so everything was in one clean table. If the tables belong to different traders/departments then add another column and give it an original name like "Trader" and now you can differentiate it. The way you have this set out needs work in my opinion.

Here is the modified code.

Code:
Option Explicit

Sub DelDupsA()
Dim ar As Variant
Dim i As Long
Dim j As Long
Dim n As Long

ar = Range("B3", Range("J65536").End(xlUp)).Value 'Changed C8 to B3
n = 1
With CreateObject("Scripting.Dictionary")
For i = 2 To UBound(ar, 1)
    If Not .exists(ar(i, 2)) Then
        n = n + 1: .Item(ar(i, 2)) = n
        For j = 1 To UBound(ar, 2)
            ar(n, j) = ar(i, j)
        Next
    Else
        If ar(i, 5) < [R2] Then 'Change H3 to R2, Changed ar(i, 2) to ar(i, 5)
            ar(.Item(ar(i, 2)), 8) = ar(.Item(ar(i, 2)), 8) + ar(i, 8) 'Changed 3 to Col 8
            ar(.Item(ar(i, 2)), 9) = ar(.Item(ar(i, 2)), 9) + ar(i, 9) 'Changed 4 to Col 9
            For j = 1 To UBound(ar, 2)
                ar(i, j) = ""
            Next
            n = n + 1
        End If
    End If
    Next
End With
Range("B3", Range("J65536").End(xlUp)).ClearContents
With ActiveSheet.[B3].Resize(UBound(ar, 1), UBound(ar, 2)) 'Change C8 to B3
    .Value = ar
End With
End Sub

Take care

Smallman
 

Attachments

  • Test 3Smallman.xlsm
    33.1 KB · Views: 4
Hi Smallman,

Thank you for your feedback!

Sorry for not disclosing on time: there are many tables instead of one. I totally understand your view regarding the tabular format, but unfortunately I can not change it, because this is a shared table and I just own one sheet of it.

Regarding the new blank lines that appear after we delete the duplicate values, we need to delete the whole Row itself! It is like delete and shift rows up.

I have run your code several times and there were few problems. I upload to comparison tables, one before macro runs, the other after it. I have put comments next to the lines that seemed problematic.

Thank you very much again for your help!

Real
 

Attachments

  • Test 3Smallman (3)_Before Macro.xlsm
    35.1 KB · Views: 1
  • Test 3Smallman (3)_After macro.xlsm
    35.5 KB · Views: 1
Your blank rows were the item I had not considered. Shame you are not able to make your data a best practice tabular layout. It makes a lot more sense to start with strong design and work for me there.

I have altered the coding by putting A Yellow and Blue line at the bottom of the test file. These are both duplicated. One is before the date in R2 and one after. The one before gets removed and consolidated, the one after is left in place.

Revised coding and file.

Code:
Option Explicit

Sub DelDupsA()
Dim ar As Variant
Dim i As Long
Dim j As Long
Dim n As Long

ar = Range("B3", Range("J65536").End(xlUp)).Value 'Changed C8 to B3
n = 1
With CreateObject("Scripting.Dictionary")
For i = 2 To UBound(ar, 1)
    If Not .exists(ar(i, 2)) Then
        n = n + 1: .Item(ar(i, 2)) = n
    Else
        If ar(i, 5) < [R2] Then 'Change H3 to R2, Changed ar(i, 2) to ar(i, 5)
            ar(.Item(ar(i, 2)), 8) = ar(.Item(ar(i, 2)), 8) + ar(i, 8) 'Changed 3 to Col 8
            ar(.Item(ar(i, 2)), 9) = ar(.Item(ar(i, 2)), 9) + ar(i, 9) 'Changed 4 to Col 9
            For j = 1 To UBound(ar, 2)
                ar(i, j) = ""
            Next
            n = n + 1
        End If
    End If
    Next
End With
Range("B3", Range("J65536").End(xlUp)).ClearContents
With ActiveSheet.[B3].Resize(UBound(ar, 1), UBound(ar, 2)) 'Change C8 to B3
    .Value = ar
End With
End Sub

You will need to start to learn the coding so you can make these small tweaks as your file changes.

Take care

Smallman
 

Attachments

  • Test SmallmanV4.xlsm
    54.7 KB · Views: 4
Hi Smallman,

Thank you for your help! The file now works as intended.

It was very kind of you to help me on this!

Thank you very much!
Real
 
Back
Top