python - I have an array of strings; is there a way to see which one an argument string is closest to?


I'm working in Python, and I've decided to break down and make a huge array of phrases for the result of the speech recognition module to be compared to. So far I've got:

phrases = [
    "what time it",
    "what's weather",
    "what's date",
    "hello",
    "hi",
    "what's up",
    "how you",
]

(I only started a few minutes ago, so I haven't got far yet... this is just an outline.) Anyway, I'd like a function kind of like this...

def match(phrase):
    # match_greatest starts at 0 and is continuously updated if the string
    # being compared has a higher percentage match
    match_greatest = 0

    # match stores the actual string that is closest
    match = ""

    for candidate in phrases:
        # this is the part I need help with...
        # somehow get the percentage by which the argument phrase matches
        # the phrase it's being compared to
        match_current = ...

        # if the current phrase is a closer match than the one before, update
        if match_current > match_greatest:
            match_greatest = match_current
            match = candidate

    return match

...so for example, if I call match("what time a") or match("what time sit") (these are examples of misreadings the speech recognition can give) and use the current set of phrases, it would return "what time it".

One reasonable distance between strings is the "edit distance", or Levenshtein distance. It counts the number of edits (insertions, deletions, and substitutions) needed to turn one string into another.

There is a Python implementation here; the algorithm relies on dynamic programming:

https://pypi.python.org/pypi/python-levenshtein/

You can also implement the algorithm yourself; it is pretty simple.
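For illustration, here is a minimal pure-Python sketch of the dynamic-programming approach. The function name edit_distance is my own choice and is not the API of the package linked above. It keeps only two rows of the table at a time.

```python
def edit_distance(a, b):
    """Return the Levenshtein distance between strings a and b."""
    # previous[j] = distance between the prefix of a processed so far
    # (before the current character) and the first j characters of b
    previous = list(range(len(b) + 1))

    for i, char_a in enumerate(a, start=1):
        # turning the first i characters of a into "" takes i deletions
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current

    return previous[-1]


print(edit_distance("kitten", "sitting"))             # 3
print(edit_distance("what time sit", "what time it"))  # 1
```

The running time is proportional to len(a) * len(b), which is fine for short phrases like the ones in the question.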

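If you want a speech-oriented distance, it is worth considering Soundex, a phonetic algorithm that encodes words by how they sound, so that similar-sounding words get the same code. You can then compare the codes instead of the raw spellings. See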

https://pypi.python.org/pypi/fuzzy
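To give a rough idea of what a phonetic code looks like, here is a small pure-Python sketch of the classic American Soundex rules. It does not use the package linked above, and the names soundex and phrase_key are hypothetical helpers introduced only for this example.

```python
# Map each consonant to its Soundex digit; vowels, h, w and y get no digit.
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Return the 4-character American Soundex code of a word."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""

    code = letters[0].upper()
    previous = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        # skip letters without a digit and repeats of the same digit
        if digit and digit != previous:
            code += digit
        # h and w do not separate equal digits; vowels do
        if ch not in "hw":
            previous = digit

    # pad with zeros or truncate to exactly four characters
    return (code + "000")[:4]


def phrase_key(phrase):
    """Encode a phrase word by word (an assumption, not a standard)."""
    return " ".join(soundex(word) for word in phrase.split())


print(soundex("Robert"), soundex("Rupert"))     # R163 R163
print(soundex("weather"), soundex("whether"))   # W360 W360
```

Soundex was designed for surnames, so treat it as a starting point; whether it helps with speech recognition misreadings is something to check against real output.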

You can then iterate over the strings and find the one with the smallest edit distance.
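Putting that together with the phrase list from the question, a sketch could look like the following. The name closest_phrase is hypothetical, and a compact edit-distance function is included only so the snippet runs on its own; any Levenshtein implementation could be swapped in.

```python
phrases = [
    "what time it",
    "what's weather",
    "what's date",
    "hello",
    "hi",
    "what's up",
    "how you",
]


def edit_distance(a, b):
    """Levenshtein distance, computed row by row."""
    row = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        new_row = [i]
        for j, char_b in enumerate(b, start=1):
            new_row.append(min(
                row[j] + 1,                       # deletion
                new_row[j - 1] + 1,               # insertion
                row[j - 1] + (char_a != char_b),  # substitution
            ))
        row = new_row
    return row[-1]


def closest_phrase(phrase, candidates):
    """Return the candidate with the smallest edit distance to phrase."""
    return min(candidates, key=lambda c: edit_distance(phrase, c))


print(closest_phrase("what time a", phrases))    # what time it
print(closest_phrase("what time sit", phrases))  # what time it
```

If two candidates are equally close, min returns the first one in the list. If you want a percentage as in the question's outline, one option is 1 - distance / max(len(phrase), len(candidate)).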

