DZone Snippets is a public source code repository. Easily build up your personal collection of code snippets, categorize them with tags / keywords, and share them with the world

Snippets has posted 5883 posts at DZone. View Full User Profile


  • submit to reddit
        From Greg Jorgensen's <a href=>recipe</a>.
def soundex(name, len=4):

    # digits holds the soundex values for the alphabet
    digits = '01230120022455012623010202'
    sndx = ''
    fc = ''

    # translate alpha chars in name to soundex digits
    for c in name.upper():
        if c.isalpha():
            if not fc: fc = c   # remember first letter
            d = digits[ord(c)-ord('A')]
            # duplicate consecutive soundex digits are skipped
            if not sndx or (d != sndx[-1]):
                sndx += d
    sndx = fc + sndx[1:]   # replace first digit with first char
    sndx = sndx.replace('0','')       # remove all 0s
    return (sndx + (len * '0'))[:len] # padded to len characters
Similar words will return the same soundex
>>> soundex('hello')
>>> soundex('hola')