ShyamExcel
New Member
Here's an interesting question for you rockstars. its in 2 parts
1. I have a excel sheet with 6 columns. A, B, C, D, E and F.
All columns are text except F.
Column E contains the core phrase that I'm interesting in finding duplicates of.
A, B, C and D can be looked at as phrases that narrow down the location of the phrase E.
As as example, you can think of A as a country name, B as state name, C as district name and D as city name, E is the person's name. F is his age (the only number column as I mentioned earlier)
So A could be India, B Karnataka, C Shimgoa D Sagar, E John Peter smith. F 25
So the entries A to D could be in duplicates and that's fine. We want to mark rows only when the names are same AND from the same city. I used to do this easily by concatenating the various columns and using countif and so any non zero entry would indicate a duplicate.
Here comes the challenge.
The name column in E can be in any order of the same words in the names. so I would like to consider as a duplicate even if the words (of varying number of words in each name set) are in different order, such as 'Peter John smith' or 'smith john peter' and having the same A, B, C and D.
To clarify, I would like to highlight all people from a particular city (so obviously same district, state and country names) whose name may have been written in any word order.
How would you approach this? I already searched in the forum before posting.
2. Another variation of this question would be to highlight the john peter smith who is the oldest, if there are many john peter smith entries from the same city, with names jumbled up. So in this part, if there are multiple entries for john peter smith from the same city, I would like to highlight the eldest or all but the eldest (which I want to delete)
Thanks!
Shyam
1. I have a excel sheet with 6 columns. A, B, C, D, E and F.
All columns are text except F.
Column E contains the core phrase that I'm interesting in finding duplicates of.
A, B, C and D can be looked at as phrases that narrow down the location of the phrase E.
As as example, you can think of A as a country name, B as state name, C as district name and D as city name, E is the person's name. F is his age (the only number column as I mentioned earlier)
So A could be India, B Karnataka, C Shimgoa D Sagar, E John Peter smith. F 25
So the entries A to D could be in duplicates and that's fine. We want to mark rows only when the names are same AND from the same city. I used to do this easily by concatenating the various columns and using countif and so any non zero entry would indicate a duplicate.
Here comes the challenge.
The name column in E can be in any order of the same words in the names. so I would like to consider as a duplicate even if the words (of varying number of words in each name set) are in different order, such as 'Peter John smith' or 'smith john peter' and having the same A, B, C and D.
To clarify, I would like to highlight all people from a particular city (so obviously same district, state and country names) whose name may have been written in any word order.
How would you approach this? I already searched in the forum before posting.
2. Another variation of this question would be to highlight the john peter smith who is the oldest, if there are many john peter smith entries from the same city, with names jumbled up. So in this part, if there are multiple entries for john peter smith from the same city, I would like to highlight the eldest or all but the eldest (which I want to delete)
Thanks!
Shyam