• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Search for a word (string) and return that string if found

Hi All,

I have a list of names and addresses.


Each row in my spreadsheet is for one person.

The column headings are:


ColA ColB ColC ColD ColE ColF

Name Address Address1 Address2 Address3 Address4 and other information in further columns


I need to insert a new column which will tell me which county each person is from


Every entry is an Irish address


The problem is you could have somebody's address under anyone of the 4 address headings as:

Dublin City, Co. Dublin, Dublin 1

Waterford City, Co. Waterford, County Waterford

and so on for all 32 counties of Ireland


The one recurring feature will be that either one of ColB, ColC, ColD, ColE, ColF will have to contain one of the 32 counties


I have a control sheet with all 32 counties in it:


Antrim

Armagh

Carlow

Cavan

Clare

Cork

Donegal

Down

Dublin

Fermanagh

Galway

Kerry

Kildare

Kilkenny

Laois

Leitrim

Limerick

Derry

Longford

Louth

Mayo

Meath

Monaghan

Offaly

Roscommon

Sligo

Tipperary

Tyrone

Waterford

Westmeath

Wexford

Wicklow


Any suggestions??


Thanks
 
try this array formula:

=INDEX(DataList,MATCH(TRUE,ISNUMBER(SEARCH(DataList,B3&C3&D3&E3&F3)),0))


where DataList is the range containing the 32 counties. Remeber to confirm formula using Ctrl+Shift+Enter, not just Enter.
 
Hi Kevin,


You will have to adjust the sheet reference:


In an empty column of first sheet row 2:

=LOOKUP(999,SEARCH(County_List,CONCATENATE(C2,D2,E2,F2),1),County_List)


WHERE:

"County_List" refers to a named range of all irish counties.
 
Hi,

Data scrubbing is an art form, to say the least!


You may need to tweak both formulas slightly, assuming that data will always have the counties prefixed with "Co. ". If that is not a correct assumption, the formulas will need to be adjusted further.


Luke's formula can be tweaked as:

=INDEX(DataList,MATCH(TRUE,ISNUMBER(SEARCH("Co. " & DataList,B2&C2&D2&E2&F2)),0))


Shrivallabha's formula can be tweaked as:

=LOOKUP(999,SEARCH("Co. " & County_List,CONCATENATE(B2,C2,D2,E2,F2),1),County_List)


I tested these formulas with the following dummy address:

[pre]
Code:
Address	              Address1	        Address2
Waterford City	      Co. Tyrone	Dublin 1
[/pre]
expecting the result "Tyrone".


Some things to pay attention to:

The MATCH function returns the first match it finds.

The LOOKUP function expects a sorted list, but the result of the SEARCH does not guarantee that.


Cheers,

Sajan.
 
Excellent formula posted by all three of you...


@Shri:


Could you plz tell me why do we use "999" as the lookup value in lookup function?


It looks like a "missing value" kind of a statement...just guessing...


Actually I am not aware of this technique....


Can you plz explain it to me?


Regards,

Kaushik
 
Hi All,

these formulas work perfectly and I haven't had to tweak them!!


I am interested to know what the 999 does in the lookup formula?


This site is amazing!
 
the 999 is an arbitrary large number. We really just want to find any number that the SEARCH function spits out. SEARCH returns the numerical position of the text we're looking for, so we could expect it to be anywhere from 1-100, or #N/A if not found. So, we just pick a random large number and tell the formula to LOOKUP the closest match. So, if 5 is the largest number, we'll take that. If 89 is the largest number, that'll work as well.


The formula I posted didn't have this step because I used the ISNUMBER function. This changed all the random numbers and errors into boolean outputs, so mine just looked for a TRUE value. The overall function is the same, however.
 
Hi,


Glad that the formulas posted served purpose.


LOOKUP is a bit of unique function which forces the nested function (SEARCH in above case) to work out the whole ARRAY. This behavior is somewhat similar to SUMPRODUCT as we don't have to do CTRL + SHIFT + ENTER to force ARRAY behavior.


Secondly, it has "good" ability to ignore errors so there is no error handling for SEARCH function. So as long as the word is contained somewhere in the CONCATENATED part it will return matching result.

@Sajan: Try the formula without tweak, it should work.


Regarding 999 (as Luke as pointed out rightly) it is number large enough to look back at the smaller numbers that SEARCH will throw at us.


Also, if you want to understand the formula behavior [step by step how it works] then try to use following option:

Formula Auditing | Evaluate Formula

Do it on smaller samples so that results can be understood easily enough.


I owe this formula to Aladin Akyurek (of MrExcel). Here are few links:

http://www.mrexcel.com/forum/showthread.php?t=102091

http://www.mrexcel.com/forum/showthread.php?t=18954

http://www.mrexcel.com/forum/showthread.php?t=310278 (Post #7)
 
Hi Shrivallabha,

Could you check the sample address I posted against your original formula? I got a different result than expected. (i.e. the formula returned Waterford, when I was expecting Tyrone.)


Glad to hear that the OP does not have any such data conditions, and was able to use the original formulas as is.


-Sajan.
 
Hi Sajan,


Its because you mixed up 2 counties (Waterford and Tyrone) and which wasn't a case with OP's data. So OP posted he didn't have to tweak any of the formulas.


If that was the case then it'd need different approach to list out all counties that feature in the addresses.
 
Back
Top