• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Remove duplicates and move rows

Good evening.

I have a contact table.
I tried to remove the duplicates, selecting all the columns of the table, but I noticed that there are some duplicates that in certain columns differ from each other due to the fact that the cell of a column can be empty while the same of the double staff is full or the two cells of the same column have different content.

It is possible with a macro:
1) extrapolate the first word of each cell of column a of the table sheet by removing the duplicates (from e9 down of the sheet select);
2) to be able to indicate standard columns to identify any duplicates (a business name or a url that are unique) and standard columns to be filled only if the rows double the standard value than an empty cell (not if they are both full ), so as to make them the same and eliminate them (a1 of the sheet select with results in the unique sheet)?
3) after action n. 2 would I like to transfer the lines containing certain words (which have a space ahead and after the word or characters continuously such as the url) in the indicated sheet (a7 of the sheet select with results in the unique sheet)?

Thanks
 

Attachments

  • rimozione duplicati e spostamento righe.xlsx
    13.7 KB · Views: 6
Your English is good, but I suspect you mean not "extrapolate" in item 1 ("estrapolare") but "extract" ("estrarre"); am I right?

I'm not sure what you need here. I see several worksheets. I think you're talking about Sheet2 ("TABLE"), and you want to deal with rows that are duplicate. In TABLE I see several collections of company names and URLs, and in UNIQUE you have (correctly) eliminated the duplicates. I think you want to write a VBA program to do that—to read TABLE, select the non-duplicate data by looking at the full company name and URL, and write the results to UNIQUE.

About step 3 I'm more confused, but I think you want to put in worksheet SELECT some data about what the program did. I'm not sure what, though.

So far, though, I'm convinced that you can write a program to do what you want...once I understand better what you want :).
 
You got it all right. I use the google translator
In the third point I would like to move the contacts to the different sheets based on their name.
For example I want to move the rows that contain in the column of the name "a" of the sheet "unique" to another sheet (sheet1) while those that contain the word bar in the sheet 2 and so on
 
We can write a program together that'll do what you want. But let me point out that you can at least identify and eliminate the duplicates without a VBA program. I'll describe the steps using R1C1 notation rather than A1:

1) Create a temporary helping column. Let's say that's in column 7.
2) In column 7 put the formula "=RC1&RC3". Now column 7 has il nome ~and~ the URL.
3) Sort the rows on column 7.
4) In column 8 put this formula: "=RC7=R[-1]C7". Each value in column 8 is now True (if this row is a duplicate of the previous row) or False (if this row is the first one to have this exact name and URL).
5) Copy column 8 and paste it OVER ITSELF as Values (not formulae).
6) Now sort the data on column 8, so that all the Trues and all the Falses are together.

At this point I normally delete all the rows that have True in column 8, because my purpose is simply eliminate the duplicates. I gather you want to do something else, something I'm not sure of yet. But you have identified all the duplicates, ~without~ writing a program, if that pleases you.

Depending on where you want to go from here, we can start on writing a program or you can decide that does what you need.
 
I would first like to make the rows identical by filling only the empty cells with the contents of the full cells (this for the identical rows for certain colonists).
This action is useful to me because for the same data such as name, email etc. on one line there is a data that is not in the other line (on one line there could be a telephone number and on the other it may not and vice versa while the other could have the address in addition (so I would like to create a complete line: name, email, telephone and address).
for this reason I have indicated the possibility of selecting standard reference columns to find similar rows and then completing them making them identical and once made identical the superfluous row can be removed.
 
Ah! I understand now: The same company may appear on multiple lines, but each lne may have values filled in that are missing in others. So you want to create a list that has just one entry for each company, but as many fields for that company as can be gathered from the (perhaps multiple) records that mention it.

For that I would create a "class" to represent a company and containing all the properties that might be needed. (A "class" is a definition of data that contains more than one field, or "property". In this case we'll create a class called "Company" that has three properties to start with: Nome, URL and Telefono. Like this:
Code:
Public Nome 'company name
Public URL
Public Telefono

We can add more without much trouble if you think of more. This code will go in a "class module". Then, in a separate module, we'll write the program that uses that class.

...But not right now. Right now I'm going to bed. More tomorrow. If you don't hear from me, feel free to nag me about it :).
 
You're going to write a program that reads all the lines from the worksheet named TABLE, and creates a list of the companies it finds there. It might look something like this (I'll ask more questions below):
Code:
Const cCo=1, cURL=3, cTel=5 'set the columns used in the worksheet
Set owb=ThisWorkbook
Set ows=owb.Worksheets("TABLE")
Set Cos=New Collection 'create a Collection that will hold all our companies

rZ=ows.Cells(2,1).End(xlDown).Row 'find the last row
For jr=2 to rZ 'for every row in the worksheet
  vc=ows.Cells(jr,cCo).Value 'get the company name for this row
  vu = ows.Cells(jr, cURL).Value 'get the URL
  vt = ows.Cells(jr, cTel).Value  'get the telephone number
  If Exists(Cos, vc) then 'is this company name already in the collection?
    Set oco = Cos(vc) 'fetch that Company object
  Else
    Set oco = New Company 'create a new company object
    Cos.Add oco, vc 'add it to the collection
    oco.Nome = vc 'set the name
    End If
  If vu<>"" Then oco.URL = vu 'update the URL
  If vt<>"" Then oco.Telefono = vt 'update the telephone number
  Next jr

(Warning: I didn't test this code, so it may have mistakes in it.) I'm not sure how much you already understand, but I won't be surprised if you need to ask:
1) What is a "class" and an "object" and what's the difference between them?
2) What is a "collection"?
3) How do I create the "Company" class?
4) What is the "Exists" function you use in that program?

And there are some things we need to discuss:
a) Will there ever be two different companies that have the same name? If so, how do you tell them apart?
b) The above program fills in the URL and telephone for each company if not blank. But what do you want to do if the same company is listed twice with a different telephone number or URL? This program takes only the first and ignores any new value; is that what you want?
c) This program doesn't do anything, yet, with the values it collects; so far I'm just showing you how to make a list with each company name and a URL and telephone number for each. After we have that working right, we'll have to write something that places the values where you want them.
d) I ignored columns 2 and 4; if you need those collected, too, then please explain to me what they are.
 
Answers to your questions:
a) Will there ever be two different companies that have the same name? If so, how do they stand out?
- for this reason I thought about different procedures and to be able to choose the standard columns (I would like to proceed by identifying more columns such as name, url - there may be the same name but the url will be different, etc.)
b) The above program inserts each company's URL and phone if it is not empty. But what do you want to do if the same company is listed twice with a different phone number or URL?
I wrote empty just for this case (you are heavy). In this case I will manually / visually evaluate what to do (I don't think there are other ways).
d) I ignored columns 2 and 4; if you also need those collected, please explain to me what they are.
the possibility of being able to indicate the comparison columns and values included in a cell makes it possible not to have a standard model but to have a generic one so that it can be adapted to the various cases (the name column in other cases could be in a different column, etc.) .
 
I'm back.

a) If there can be two companies by the same name and you want to use the URL to distinguish them, that leads to another question: What if you run into a situation like this:
Code:
Company name  URL
------------  ---
Arbuzzi LLC
Arbuzzi LLC   http://arbuzzi.com
The first has no URL listed and the second one does. Are they to be considered two different companies? My suggestion is that only if the two companies have different non-blank URLs should they put in two different slots. You know best, though.

d) I don't understand this yet; perhaps you can show me an example that will make it clear. But I'm pretty sure we can add this later, if the need arises.

Next: How much do you know already about writing a VBA program? Because I don't plan to write it for you. I'll teach you how to write programs; I want you to know what your program does, so you'll be able to change it later if you need to.
 
In the answer "a" you thought well.
as for my answer b if not everything can be automtato I should fix the few lines left manually.
I can't program in vba but I'm ready to learn. If you are willing and have time you could teach me. But I don't think that's the case here.
 
Back
Top