
Analysing sets of data


New Member
Hi all,

I've got a (hopefully) interesting problem; I need to analyse multiple sets of data to get the best match possible. Let me explain.

I'm starting with 2000 location IDs, each of which is assigned to one of about 200 groups identified by a group ID. The locations have been regrouped due to a change in the grouping criteria. What I need to do is compare the original groups with the new groups and rename each new group after the original group it overlaps with most, so that as few locations as possible end up with a different group ID.


Original groups:

ID    Group ID
123   CE01
124   CE01
125   CE01
126   CE02
127   CE02
128   CE02
129   CE03
130   CE03
131   CE03

New groups:

ID    Temp Group ID
123   T01
124   T01
131   T01
126   T02
127   T02
130   T02
129   T03
125   T03
128   T03

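In the example above, T01 would become CE01 because two of its three IDs (about 67%) come from CE01. Similarly, T02 would become CE02. T03 shares one ID with each original group, so it would become CE03 by default, as that is the only original name left.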

Can anyone help?




New Member
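You can write a simple UDF (user-defined function) that takes two parameters:

(1) the old group name

(2) the list of new groups

The function then scans the list of new groups to find the closest match, i.e. the new group that shares the most IDs with the old group.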
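
To illustrate the matching logic, here is a minimal sketch in Python rather than a finished Excel UDF. The function names (`best_match`, `rename_groups`) and the dictionary layout are assumptions made for this example. `best_match` does what is described above for a single old group. `rename_groups` adds a simple greedy pass so that each original name is used only once, which is what gives T03 its "by default" name. A greedy pass is not guaranteed to give the overall minimum change in every case, so treat it as a starting point.

```python
from collections import Counter

# Sample data from the question: location ID -> group ID.
OLD = {123: "CE01", 124: "CE01", 125: "CE01",
       126: "CE02", 127: "CE02", 128: "CE02",
       129: "CE03", 130: "CE03", 131: "CE03"}
NEW = {123: "T01", 124: "T01", 131: "T01",
       126: "T02", 127: "T02", 130: "T02",
       129: "T03", 125: "T03", 128: "T03"}


def best_match(old_group, old_groups, new_groups):
    """Return (temp group, share) for the temp group holding most IDs of old_group."""
    members = [i for i, g in old_groups.items() if g == old_group]
    counts = Counter(new_groups[i] for i in members if i in new_groups)
    if not counts:
        return None, 0.0
    temp, hits = counts.most_common(1)[0]
    return temp, hits / len(members)


def rename_groups(old_groups, new_groups):
    """Greedy one-to-one mapping: temp group ID -> original group ID."""
    # Count how many IDs each (temp group, original group) pair has in common.
    overlap = Counter((new_groups[i], old_groups[i])
                      for i in new_groups if i in old_groups)
    mapping, used = {}, set()
    # Largest overlaps first; ties are broken alphabetically to stay deterministic.
    for (temp, old), _ in sorted(overlap.items(), key=lambda kv: (-kv[1], kv[0])):
        if temp not in mapping and old not in used:
            mapping[temp] = old
            used.add(old)
    return mapping


if __name__ == "__main__":
    print(best_match("CE01", OLD, NEW))
    print(rename_groups(OLD, NEW))
```

On the sample data this maps T01 to CE01, T02 to CE02 and T03 to CE03. A temp group that shares no IDs with any unused original group is left out of the mapping and would need a fresh name.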