• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

concatenating data from fields with more than one duplicate entry

Chart68

New Member
Hi I have some data where there is more than one entry per person & I need to concatenate data from a particular field for each duplicate entry.


Candidate QUALWT from lookup

1000026 6 6;10

1000026 10

1000091 13 13;6; 10

1000091 6

1000091 10

1000133 6 6;10

1000133 10

1000139 13 13;39

1000139 39


As you can see I'm taking the value from QUALWT & concatenating it in to one field by

=IF(A2=A3, CONCATENATE(J2,";",J3)) but there are more than two duplicates for a candidate.


Could you give me some Idea how to work out how many duplicates there are for each occurence then create the necessary formula to concatenate. My spreadsheet has 6666 lines of data.


Thank you


Claire
 
Hi Claire,

I do not have a way to access the file you uploaded (due to firewall restrictions on my side). However, I think I understand what you are asking.


Using a helper column, you could get the results you are looking for. Please try out the following:


I setup the sample data as follows:

[pre]
Code:
List1	List2	Helper	Result
1000026	6	6	6;10
1000026	10	6;10	6;10
1000091	13	13	13;6;10
1000091	6	13;6	13;6;10
1000091	10	13;6;10	13;6;10
1000133	6	6	6;10
1000133	10	6;10	6;10
1000139	13	13	13;39
1000139	39	13;39	13;39
[/pre]
For ease of writing the formula, I named the data ranges as follows:

A2:A10 is List1

B2:B10 is List2

C2:C10 is Helper


I have also assumed that your data in List1 is sorted, so that all similar entries will appear next to each other.

(I have also assumed that a given item in List1 is not repeated more than a few dozen times. If that is not correct, we may need to tweak the approach.)


The Helper column is calculated as (shown for cell C2)

=IF(A2<>A1,B2,CONCATENATE(C1,";",B2))

Copy C2 down into the remaining cells through C10


The Result column is calculated as (shown for cell D2)

=INDEX(Helper,MAX((COUNTIF(A2,list1)>0)*ROW($Z$1:INDEX($Z:$Z,ROWS(list1)))))

entered with Ctrl + Shift + Enter

Copy D2 down into the remaining cells through D10


If you don't want the concatenated result repeated for identical values in List1, we can tweak the formula for the Result.


Hope this helps.


Cheers,

Sajan.
 
Hi Sajan,


Thank you for this, Ive hit a problem, I can get the first piece of formula to work for the Helper colum, but I cannot get the result coulmn to work, I just get a #NAME? error. I understand the code for the Helper column but am completely lost for the code for the result column.


Thanks


Claire
 
Hi Chart,


As per Sajan's formula if the named range are replaced with original ranges it will be:

=INDEX($C$2:$C$15,MAX((COUNTIF(A2,$A$2:$A$15)>0)*ROW($Z$1:INDEX($Z:$Z,ROWS($A$2:$A$15)))))


A2:A10 is List1

B2:B10 is List2

C2:C10 is Helper

But why Z column is selected? no idea.


Thanks,

Suresh Kumar S
 
Hi Claire,

Glad to hear that the solution worked for you. If you are unsure of any segment of the formulas, we would be to happy to explain.


Hi Suresh,

ROW($Z$1:INDEX($Z:$Z, ....) is a technique to generate an array of sequential numbers. (There are many other ways to do so. I just happen to use this approach the most!) One could substitute column A for Z in the formula, and it will work the same. However, if you delete column A for any reason, Excel rewrites the formula, and sometimes not in a desirable way! By using column Z, since it is "out of the way" of most data ranges folks use, it is relatively safe! Of course it depends on your specific workbook's data organization, etc. Hope that helps.


Cheers,

Sajan.
 
Back
Top