• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Excel Formula to Identify similar Sounding words

Status
Not open for further replies.
Hi,


I wish to know if there is a formula in excel to find out similar sounding words even if the spelling is wrong. We need to match 2 names in database where there are spelling mistakes. Nothing is working out to know whether its just a spelling mistake or altogether a different word.


Any idea is welcome. Example is as below.


Sample 1 Sample 2

Michael Mchale

Alabbar Al Abbar

Hamza Hemza

Abdul Azeez Abdul Aziz

Sheikh Ahmead Sh Ahmed

Mouaaz Muaz

Reza Raza

Kishor Kishore

A AlSharhan Abdulla Al Sharhan


Regards,

Ramnath
 
I would do a quick pass through and find all the duplicates

http://chandoo.org/wp/tag/duplicates/


Add a Fuzzy search

http://chandoo.org/wp/2008/09/25/handling-spelling-mistakes-in-excel-fuzzy-search/

to narrow the list


Then spend some time on the non-duplicate ones manually
 
I share with you several vba script i found.

Conseils Microsoft Access : Correspondances floues - Soundex (allenbrowne.com)

>>> use code - tags <<<
Code:
Public Function Soundex(varText As Variant) As Variant
On Error GoTo Err_Handler
    'Purpose:   Return Soundex value for the text passed in.
    'Return:    Soundex code, or Null for Error, Null or zero-length string.
    'Argument:  The value to generate the Soundex for.
    'Author:    Allen Browne (allen@allenbrowne.com), November 2007.
    'Algorithm: Based on http://en.wikipedia.org/wiki/Soundex
    Dim strSource As String     'varText as a string.
    Dim strOut As String        'Output string to build up.
    Dim strValue As String      'Value for current character.
    Dim strPriorValue As String 'Value for previous character.
    Dim lngPos As Long          'Position in source string
   
    'Do not process Error, Null, or zero-length strings.
    If Not IsError(varText) Then
        strSource = Trim$(Nz(varText, vbNullString))
        If strSource <> vbNullString Then
            'Retain the initial character, and process from 2nd.
            strOut = Left$(strSource, 1&)
            strPriorValue = SoundexValue(strOut)
            lngPos = 2&
           
            'Examine a character at a time, until we output 4 characters.
            Do
                strValue = SoundexValue(Mid$(strSource, lngPos, 1&))
                'Omit repeating values (except the zero for padding.)
                If ((strValue <> strPriorValue) And (strValue <> vbNullString)) Or (strValue = "0") Then
                    strOut = strOut & strValue
                    strPriorValue = strValue
                End If
                lngPos = lngPos + 1&
            Loop Until Len(strOut) >= 4&
        End If
    End If
   
    'Return the output string, or Null if nothing generated.
    If strOut <> vbNullString Then
        Soundex = strOut
    Else
        Soundex = Null
    End If
   
Exit_Handler:
    Exit Function
   
Err_Handler:
    MsgBox "Error " & Err.Number & ": " & Err.Description, vbExclamation, "Soundex()"
    'Call LogError(Err.Number, Err.Description, conMod & ".Soundex")
    Resume Exit_Handler
End Function
Private Function SoundexValue(strChar As String) As String
    Select Case strChar
    Case "B", "F", "P", "V"
        SoundexValue = "1"
    Case "C", "G", "J", "K", "Q", "S", "X", "Z"
        SoundexValue = "2"
    Case "D", "T"
        SoundexValue = "3"
    Case "L"
        SoundexValue = "4"
    Case "M", "N"
        SoundexValue = "5"
    Case "R"
        SoundexValue = "6"
    Case vbNullString
        'Pad trailing zeros if no more characters.
        SoundexValue = "0"
    Case Else
        'Return nothing for "A", "E", "H", "I", "O", "U", "W", "Y", non-alpha.
    End Select
End Function




**************************************
Function Soundex(Surname As String) As String
' Developed by Richard J. Yanco
' This function follows the Soundex rules given at
' http://home.utah-inter.net/kinsearch/Soundex.html

    Dim Result As String, c As String * 1
    Dim Location As Integer

    Surname = UCase(Surname)

'   First character must be a letter
    If Asc(Left$(Surname, 1)) < 65 Or Asc(Left$(Surname, 1)) > 90 Then
        Soundex = ""
        Exit Function
    Else
'       St. is converted to Saint
        If Left(Surname, 3) = "ST." Then
            Surname = "SAINT" & Mid(Surname, 4)
        End If

'       Convert to Soundex: letters to their appropriate digit,
'                     A,E,I,O,U,Y ("slash letters") to slashes
'                     H,W, and everything else to zero-length string

        Result = Left(Surname, 1)
        For Location = 2 To Len(Surname)
            Result = Result & Category(Mid(Surname, Location, 1))
        Next Location
     
'       Remove double letters
        Location = 2
        Do While Location < Len(Result)
            If Mid$(Result, Location, 1) = Mid$(Result, Location + 1, 1) Then
                Result = Left$(Result, Location) & Mid$(Result, Location + 2)
            Else
                Location = Location + 1
            End If
        Loop
   
'       If category of 1st letter equals 2nd character, remove 2nd character
        If Category(Left$(Result, 1)) = Mid$(Result, 2, 1) Then
            Result = Left$(Result, 1) & Mid$(Result, 3)
        End If
   
'       Remove slashes
        For Location = 2 To Len(Result)
            If Mid$(Result, Location, 1) = "/" Then
                Result = Left$(Result, Location - 1) & Mid$(Result, Location + 1)
            End If
        Next
   
'       Trim or pad with zeroes as necessary
        If Len(Result) = 4 Then
            Soundex = Result
        ElseIf Len(Result) < 4 Then
            Soundex = Result & String(4 - Len(Result), "0")
        Else
            Soundex = Left(Result, 4)
        End If
    End If
End Function

Private Function Category(c) As String
'   Returns a Soundex code for a letter
        If c Like "[AEIOUY]" Then
            Category = "/"
        ElseIf c Like "[BPFV]" Then
            Category = "1"
        ElseIf c Like "[CSKGJQXZ]" Then
            Category = "2"
        ElseIf c Like "[DT]" Then
            Category = "3"
        ElseIf c = "L" Then
            Category = "4"
        ElseIf c Like "[MN]" Then
            Category = "5"
        ElseIf c = "R" Then
            Category = "6"
        Else 'This includes H and W, spaces, punctuation, etc.
            Category = ""
    End If
End Function


*************************************************************************
Function Soundex2(s) As String
    Dim i As Integer, s1 As String
'   Suppression des espaces et
'       transformation du mot en majuscule
    s = UCase(Trim(s))
    s1 = Left(s, 1)
    Select Case True
        Case s1 Like "[ÀÂÄ]": s1 = "A"
        Case s1 Like "[ÉÈÊË]": s1 = "E"
        Case s1 Like "[ÎÏ]": s1 = "I"
        Case s1 Like "[ÔÖ]": s1 = "O"
        Case s1 Like "[ÙÛÜ]": s1 = "U"
        Case s1 = "Ç": s1 = "C"
    End Select
    s = s1 & Mid(s, 2)
'   Calcul du soundex
'   Premier caractère
    Soundex2 = Left(s, 1)
'   Autres caractères
    For i = 2 To Len(s)
        If Len(Soundex2) = 4 Then
            Exit Function
        Else
            s1 = Mid(s, i, 1)
            Select Case True
                Case s1 Like "[BP]": s1 = "1"
                Case s1 Like "[CKQ]": s1 = "2"
                Case s1 Like "[DT]": s1 = "3"
                Case s1 = "L": s1 = "4"
                Case s1 Like "[MN]": s1 = "5"
                Case s1 = "R": s1 = "6"
                Case s1 Like "[GJ]": s1 = "7"
                Case s1 Like "[XZS]": s1 = "8"
                Case s1 Like "[FV]": s1 = "9"
                Case Else
                    s1 = ""
            End Select
'           Elimination des doubles
            If s1 <> "" Then
                If s1 <> Right(Soundex2, 1) Then
                    Soundex2 = Soundex2 & s1
                End If
            End If
        End If
    Next i
End Function

**************************************
Private Function SoundEx(ByVal WordString As String, _
   Optional SoundExLen As Integer = 4) As String

Dim Counter As Integer
Dim CurrChar As String

If SoundExLen > 10 Then
    SoundExLen = 10
ElseIf SoundExLen < 4 Then
    SoundExLen = 4
End If
SoundExLen = SoundExLen - 1

WordString = UCase(WordString)

For Counter = 1 To Len(WordString)
    If Asc(Mid(WordString, Counter, 1)) < 65 Or Asc(Mid(WordString, Counter, 1)) > 90 Then
       Mid(WordString, Counter, 1) = " "
    End If
Next Counter
WordString = Trim(WordString)

SoundEx = WordString
'fr :
'   1       B, P
'   2       C, K, Q
'   3       D, T
'   4       L
'   5       M, N
'   6       R
'   7       G, J
'   8       X, Z, S
'   9       F, V
SoundEx = Replace(SoundEx, "A", "0")
SoundEx = Replace(SoundEx, "E", "0")
SoundEx = Replace(SoundEx, "I", "0")
SoundEx = Replace(SoundEx, "O", "0")
SoundEx = Replace(SoundEx, "U", "0")
SoundEx = Replace(SoundEx, "Y", "0")
SoundEx = Replace(SoundEx, "H", "0")
SoundEx = Replace(SoundEx, "W", "0")
SoundEx = Replace(SoundEx, "B", "1")
SoundEx = Replace(SoundEx, "P", "1")
SoundEx = Replace(SoundEx, "F", "1")
SoundEx = Replace(SoundEx, "V", "1")
SoundEx = Replace(SoundEx, "C", "2")
SoundEx = Replace(SoundEx, "S", "2")
SoundEx = Replace(SoundEx, "G", "2")
SoundEx = Replace(SoundEx, "J", "2")
SoundEx = Replace(SoundEx, "K", "2")
SoundEx = Replace(SoundEx, "Q", "2")
SoundEx = Replace(SoundEx, "X", "2")
SoundEx = Replace(SoundEx, "Z", "2")
SoundEx = Replace(SoundEx, "D", "3")
SoundEx = Replace(SoundEx, "T", "3")
SoundEx = Replace(SoundEx, "L", "4")
SoundEx = Replace(SoundEx, "M", "5")
SoundEx = Replace(SoundEx, "N", "5")
SoundEx = Replace(SoundEx, "R", "6")

CurrChar = Left(SoundEx, 1)
For Counter = 2 To Len(SoundEx)
    If Mid(SoundEx, Counter, 1) = CurrChar Then
        Mid(SoundEx, Counter, 1) = " "
    Else
        CurrChar = Mid(SoundEx, Counter, 1)
    End If
Next Counter
SoundEx = Replace(SoundEx, " ", "")

SoundEx = Mid(SoundEx, 2)
SoundEx = Replace(SoundEx, "0", "")

SoundEx = SoundEx & String(SoundExLen, "0")
SoundEx = Left(WordString, 1) & Left(SoundEx, SoundExLen)
End Function


I'm sure these vba can be merge for best result
if english or french text.


Can some one help for that job?
kind regards
 
Last edited by a moderator:
Tetonne
As this is about 11 years ago opened thread by other member.
As well as per Forum Rules.
Please open a new thread.
This thread is closed.
 
Status
Not open for further replies.
Back
Top