Python - I have an array of strings; is there a way to see which one an argument string is closest to?
I'm working in Python, and I've decided to break down and make a huge array of phrases for the result of a speech recognition module to be compared to. So far I've got:
    phrases = ["what time it", "what's weather", "what's date", "hello", "hi", "what's up", "how you"]
(I only started a few minutes ago, so I haven't got far yet... this is just an outline.) Anyway, I'd like a function kind of like this...
    def match(phrase):
        # match_greatest starts at 0 and is continuously updated if the string
        # being compared has a higher percentage match
        match_greatest = 0
        # match stores the actual string that is the closest match
        match = ""
        for candidate in phrases:
            # this is the part I need help with...
            match_current =  # somehow get the percentage by which the argument phrase matches the phrase it's being compared to
            # if the current phrase is a closer match than the one before, update
            if match_current > match_greatest:
                match_greatest = match_current
                match = candidate
        return match
...so for example, if I call match("what time a") or match("what time sit") (these are examples of misreadings that the speech recognition could give) and use the current set of phrases, it would return "what time it".
One reasonable distance between strings is the "edit distance", or Levenshtein distance. It calculates the number of edits (insertions, deletions, and substitutions) needed to turn one string into another.
There is a Python implementation here (the algorithm itself relies on dynamic programming):
https://pypi.python.org/pypi/python-levenshtein/
You can also implement the algorithm yourself; it's pretty simple.
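If you do want to roll your own, something along these lines should work. This is only a minimal sketch of the standard dynamic-programming approach; the function name `levenshtein` is made up for this example and is not taken from the package linked above.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] = distance between the first i-1 characters of a
    # and the first j characters of b (row 0: j insertions).
    previous = list(range(len(b) + 1))

    for i, char_a in enumerate(a, start=1):
        # Turning the first i characters of a into "" takes i deletions.
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or keep
            ))
        previous = current

    return previous[-1]


print(levenshtein("kitten", "sitting"))              # 3
print(levenshtein("what time sit", "what time it"))  # 1
```

Only two rows of the table are kept at a time, so memory stays proportional to the shorter string rather than to the product of both lengths.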
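If you want a speech-oriented distance, it is worth considering Soundex, a phonetic algorithm that encodes words by how they sound so that similar-sounding words get the same code, or an extension of Levenshtein distance that accounts for the phonetic properties of words. See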
https://pypi.python.org/pypi/fuzzy
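To give an idea of what a Soundex code looks like, here is a rough, hand-written sketch of simplified American Soundex. It is for illustration only and is not the API of the package linked above; the function name `soundex` is an assumption of this example.

```python
def soundex(word: str) -> str:
    """Simplified American Soundex: first letter plus three digits."""
    # Consonant groups that share a digit; vowels, y, h, and w get no digit.
    codes = {}
    for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6")):
        for letter in letters:
            codes[letter] = digit

    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""

    digits = []
    previous = codes.get(word[0], "")
    for ch in word[1:]:
        if ch in "hw":
            # h and w do not separate two consonants with the same code.
            continue
        code = codes.get(ch, "")
        # Skip vowels and collapse adjacent letters with the same code.
        if code and code != previous:
            digits.append(code)
        previous = code

    # Pad with zeros and truncate to the usual four characters.
    return (word[0].upper() + "".join(digits) + "000")[:4]


print(soundex("Robert"))  # R163
print(soundex("Rupert"))  # R163
```

Two words that sound alike, such as "Robert" and "Rupert", end up with the same code, so comparing codes word by word gives a crude phonetic match.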
You can then iterate over the strings and find the one that has the smallest edit distance.
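The loop itself can be as short as a call to `min`. In the sketch below, the names `closest` and `ratio_distance` are made up for this example. To keep it runnable without installing anything, the default distance is a stand-in built on the standard library's `difflib.SequenceMatcher`, which is a similarity ratio and not Levenshtein distance; any edit-distance function taking two strings (for example `Levenshtein.distance` from the package linked above) could be passed in its place.

```python
from difflib import SequenceMatcher


def ratio_distance(a: str, b: str) -> float:
    # Stand-in distance: 0.0 for identical strings, 1.0 for nothing in common.
    # This is difflib's similarity ratio inverted, NOT Levenshtein distance.
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def closest(phrase, candidates, distance=ratio_distance):
    # Return the candidate with the smallest distance to phrase.
    # Raises ValueError if candidates is empty.
    return min(candidates, key=lambda candidate: distance(phrase, candidate))


phrases = ["what time it", "what's weather", "what's date",
           "hello", "hi", "what's up", "how you"]

print(closest("what time a", phrases))    # what time it
print(closest("what time sit", phrases))  # what time it
```

With a large list of phrases it may also be worth rejecting matches whose distance is above some threshold, so that unrelated input does not get mapped to an arbitrary phrase.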