• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

highlight duplicates with macro only

estatefinds

New Member
I looked all over the internet and found some macro programs to highlight duplicates but it ended up not responding.


Sub DupFinder()

Dim r As Range, t As Range

Set t = Range(“c1:c350000”)

For Each r In t

v = r.Value

If Application.WorksheetFunction.CountIf(t, v) > 1 Then

r.Interior.ColorIndex = 3

End If

Next

End Sub

I know very little right now about VBA; is there something that could be done to this macro that would allow macro to run with out ending up with the "not responding". When I click to run macro I can see that it does what it suppossed to by highlighting, but then freezes up with the "not responding" message. Thanks
 
Hi, estatefinds!


First of all welcome to Chandoo's website Excel forums. Thank you for your joining us and glad to have you here.


As a starting point I'd recommend you to read the green sticky topics at this forums main page. There you'll find general guidelines about how this site and community operates (introducing yourself, posting files, netiquette rules, and so on).


Among them you're prompted to perform searches within this site before posting, because maybe your question had been answered yet.


Feel free to play with different keywords so as to be led thru a wide variety of articles and posts, and if you don't find anything that solves your problem or guides you towards a solution, you'll always be welcome back here. Tell us what you've done, consider uploading a sample file as recommended, and somebody surely will read your post and help you.


And about your question...


If you haven't performed yet the search herein, try going to the topmost right zone of this page (Custom Search), type the keywords used in Tags field when creating the topic or other proper words and press Search button. You'd retrieve many links from this website, like the following one(s) -if any posted below-, maybe you find useful information and even the solution. If not please advise so as people who read it could get back to you as soon as possible.


I tried with the magic words "highlight duplicate macro" and got 10 pages, aka 100 entries.


And specifically about your question, the problem is this: you have a huge amount of data, 350K cells and for each one you're calling an Excel function, ... well, if I were you I'd try looking for a 2nd hand Cray One, they should be cheap today.


Regards!
 
if I were you I'd try looking for a 2nd hand Cray One, they should be cheap today.


I dont understand what the 2nd hand cray one is? I had looked through this site and cant really find how

I can get this to work I have other macros that handle the entire data no problem that was done. I dont understand stand why it doesnt work with this macro. this is what i need help on. Thanks
 
Hi, estatefinds!


Just fyi it's been one of the first supercomputers: http://en.wikipedia.org/wiki/Cray-1.


The problem with your code is that the running instance of Excel in your machine (and in any machine, I tested it on an i7-920 with 6GB/2000Mhz RAM and had to use Task Manager to end the job) is unable to handle so much calculations (350K*350K = 122.5Tb = 1.225 E11 check of cell values and 350K of attribute cells settings... lot of work.


I added just a line for control each 1K of lines and they took 30 seconds, so the 350K would take 3 hours, maybe that's why Excel shows the "not responding" message.


This is the tweaked code:

-----

[pre]
Code:
Option Explicit

Sub DupFinder()
Dim r As Range, t As Range, v As Variant
Set t = Range("c1:c350000")
For Each r In t
v = r.Value
If Application.WorksheetFunction.CountIf(t, v) > 1 Then
r.Interior.ColorIndex = 3
End If
If (r.Row Mod 1000) = 0 Then Debug.Print Now(), r.Row
Next
End Sub
[/pre]
-----


This is the time control:

-----

29/04/2013 09:21:40 p.m. 1000

29/04/2013 09:22:10 p.m. 2000

29/04/2013 09:22:40 p.m. 3000

29/04/2013 09:23:10 p.m. 4000

29/04/2013 09:23:40 p.m. 5000

29/04/2013 09:24:10 p.m. 6000

29/04/2013 09:24:40 p.m. 7000

-----


Doing it by conditional formatting with this formula condition:

=CONTAR.SI(C$1:C$350000;C1)>1 -----> in english: =COUNTIF(C$1:C$350000,C1)>1

... it took... sorry, I couldn't time it, it finished yet... 3 seconds?


Regards!
 
Hi, estatefinds!


I don't know if you understood me: the tweaked code will also take 3 hours! It only displays the actual time each thousand cells.


The formula for conditional formatting (Start tab, Styles group, Conditional Format icon, New Rule, Use Formula -last option-) is the approach that takes no time.


Regards!
 
Estatefinds


I think your code is slow as you are using a Worksheet Function call which is very slow in itself and you are using it multiple times


Can you try the following which uses the Range.Find method


If this doesn't help, We can speed it up by bringing the whole Range into VBA and doing the work in Arrays

Then writing it out which should be substantially faster again.

[pre]
Code:
Sub DupFinder2()
Dim MaxRows As Long, i As Long

MaxRows = 350000

i = 1
For i = 1 To MaxRows
With Worksheets(1).Cells(i, 3)
t = .Value
With Worksheets(1).Range(Cells(i + 1, 3), Cells(MaxRows, 3))
Set c = .Find(t, LookIn:=xlValues)
If Not c Is Nothing Then
Worksheets(1).Cells(i, 3).Interior.ColorIndex = 3
firstAddress = c.Address
Do
c.Interior.ColorIndex = 3
Set c = .FindNext(c)
Loop While Not c Is Nothing And c.Address <> firstAddress
End If

End With
End With
Next
End Sub
[/pre]
 
Back
Top