This is an example of using the Microsoft Speech SDK 5.1 under Windows for command and control speech recognition in Python.
"""Sample code for using the Microsoft Speech SDK 5.1 via COM in Python.

Requires that the SDK be installed (it's a free download from
http://microsoft.com/speech) and that MakePy has been run on it (in
PythonWin, select Tools | COM MakePy Utility | Microsoft Speech Object
Library 5.1).

After running this, saying "One", "Two", "Three" or "Four" should display
"You said: One" etc. on the console. The recognition can be a bit shaky at
first until you've trained it (via the Speech entry in the Windows Control
Panel).
"""
import time

from win32com.client import constants
import win32com.client
import pythoncom


class SpeechRecognition:
    """Initialize the speech recognition with the passed-in list of words."""

    def __init__(self, wordsToAdd):
        # For text-to-speech
        self.speaker = win32com.client.Dispatch("SAPI.SpVoice")
        # For speech recognition - first create a listener
        self.listener = win32com.client.Dispatch("SAPI.SpSharedRecognizer")
        # Then a recognition context
        self.context = self.listener.CreateRecoContext()
        # which has an associated grammar
        self.grammar = self.context.CreateGrammar()
        # Do not allow free word recognition - only command and control,
        # recognizing the words in the grammar only
        self.grammar.DictationSetState(0)
        # Create a new rule for the grammar, that is top level (so it begins
        # a recognition) and dynamic (i.e. we can change it at runtime)
        self.wordsRule = self.grammar.Rules.Add(
            "wordsRule", constants.SRATopLevel + constants.SRADynamic, 0)
        # Clear the rule (not necessary first time, but if we're changing it
        # dynamically then it's useful)
        self.wordsRule.Clear()
        # And go through the list of words, adding each to the rule
        for word in wordsToAdd:
            self.wordsRule.InitialState.AddWordTransition(None, word)
        # Commit the rule so the grammar contains it
        self.grammar.Rules.Commit()
        # Set the wordsRule to be active
        self.grammar.CmdSetRuleState("wordsRule", 1)
        # Commit the changes to the grammar
        self.grammar.Rules.Commit()
        # And add an event handler that's called back when recognition occurs
        self.eventHandler = ContextEvents(self.context)
        # Announce we've started using speech synthesis
        self.say("Started successfully")

    def say(self, phrase):
        """Speak a word or phrase."""
        self.speaker.Speak(phrase)


class ContextEvents(win32com.client.getevents("SAPI.SpSharedRecoContext")):
    """The callback class that handles the events raised by the speech object.

    See "Automation | SpSharedRecoContext (Events)" in the MS Speech SDK
    online help for documentation of the other events supported.
    """

    def OnRecognition(self, StreamNumber, StreamPosition, RecognitionType, Result):
        """Called when a word/phrase is successfully recognized, i.e. it is
        found in a currently open grammar with a sufficiently high confidence.
        """
        newResult = win32com.client.Dispatch(Result)
        print("You said: " + newResult.PhraseInfo.GetText())


if __name__ == '__main__':
    wordsToAdd = ["One", "Two", "Three", "Four"]
    speechReco = SpeechRecognition(wordsToAdd)
    # Pump COM messages so that recognition events are delivered when the
    # script runs from the command line; sleep briefly to avoid a busy loop
    while True:
        pythoncom.PumpWaitingMessages()
        time.sleep(0.1)

Python is a natural choice for a speech recognition control application, since it's very easy to support user scripting.
I have a simple voice recognition application based on the above code that sits in the system tray and runs short chunks of Python script via exec when it recognizes a word. I've found the Windows Scripting Host especially useful, in particular its SendKeys method. For example:
shell = win32com.client.Dispatch("WScript.Shell")
shell.SendKeys("{PGUP}")
is mapped to saying "Page up", and so on. (The full code for the GUI version is on my web page at http://inigo.0catch.com; it uses wxWindows.)
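The exec-based dispatch described above might look roughly like the sketch below. It is not the code from the GUI version: the names COMMANDS, run_command and send_keys are hypothetical, and send_keys is a stand-in that only records keystrokes so the example runs without Windows. On a real system the namespace would presumably expose shell.SendKeys instead.

```python
# Hypothetical table mapping spoken phrases to short Python scripts.
# In a real application these would come from user configuration.
COMMANDS = {
    "Page up": 'send_keys("{PGUP}")',
    "Page down": 'send_keys("{PGDN}")',
}


def run_command(phrase, commands, namespace):
    """Run the script mapped to a recognized phrase.

    Returns True if a script was found and executed, False otherwise.
    """
    script = commands.get(phrase)
    if script is None:
        return False
    # exec runs arbitrary code, so only trusted scripts belong in the table.
    exec(script, namespace)
    return True


# Stand-in for shell.SendKeys (assumption: records keys instead of sending).
sent = []
namespace = {"send_keys": sent.append}

run_command("Page up", COMMANDS, namespace)
print(sent)
```

In the recognizer, OnRecognition would pass newResult.PhraseInfo.GetText() as the phrase. Keeping the scripts as plain strings is what makes user scripting easy, at the cost of running whatever the table contains.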
Program runs, but aborts? Isn't there a loop missing? The program starts, a voice says 'Started successfully' and then the program ends.
Now fixed. Yes, you're right. It worked fine running under PythonWin, but needed a message loop running from the command line. I've now added it.
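For readers wondering what such a message loop involves, here is a minimal sketch. The function run_message_loop and its parameters are hypothetical; the pump is passed in as a callable so the example runs anywhere. On Windows with pywin32 installed, the pump would be pythoncom.PumpWaitingMessages.

```python
import time


def run_message_loop(pump, should_stop, poll_interval=0.05):
    """Call pump() repeatedly until should_stop() returns True.

    Returns the number of times pump() was called.
    """
    calls = 0
    while not should_stop():
        pump()
        calls += 1
        # Sleep between pumps so the loop does not spin at full CPU.
        time.sleep(poll_interval)
    return calls


# Demonstration with a stand-in pump that just counts its invocations.
pumped = []
total = run_message_loop(
    pump=lambda: pumped.append(1),
    should_stop=lambda: len(pumped) >= 3,
    poll_interval=0.0,
)
print(total)
```

Under PythonWin the host application already pumps messages, which would explain why the recipe worked there without an explicit loop.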
wxPython Version. Can you please tell me which version of wxPython you use? The one I installed keeps giving me errors in the time conversion. Maybe I have the wrong wxPython version or am using a newer one. Thanks in advance.
Andre
Program aborts. Downloaded code and Microsoft Speech SDK 5.1 to same file. Attempted to run code.
Traceback (most recent call last):
  File "L:\SpeechRecognition\Speech.py", line 55, in <module>
    class ContextEvents(win32com.client.getevents("SAPI.SpSharedRecoContext")):
TypeError: Error when calling the metaclass bases
I'm using Python 2.5.1 on Windows XP system.
Any thoughts on speech recognition?
I've made a somewhat cleaner module for working with Microsoft Speech Recognition, based on Surguy's example above.
The 'speech' module is available by typing 'easy_install speech' at the Windows command prompt (if you've got easy_install installed). You'll still need to do the Speech SDK install and run MakePy, but I'm trying to figure out how to eliminate the MakePy step :)
The project lives at http://pyspeech.googlecode.com and is on PyPI.
The code looks like this:
It works great -- I've successfully used the module to build a music robot that understands instructions like "Play me some Radiohead, any album."
Hope this helps! Feedback at pyspeech.googlecode.com would be appreciated.
Michael
I've released speech v0.3.5, which simplifies the above code somewhat. You no longer call pump_waiting_messages(), as that is taken care of for you in the background. Just sleep for as long as you want the program to keep running, or do whatever other tasks you wish. Also, listeners now have stoplistening() and islistening() methods, so you no longer call speech.islistening(listener).
The revised code:
Thanks for the code. I have tried it on Windows 7 with Python 3.3 and pywin build 218 but I get this exception, please help me with this, thanks.
    class ContextEvents(win32com.client.getevents("SAPI.SpSharedRecoContext")):
TypeError: NoneType takes no arguments
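A possible explanation for the TypeError reported in the comments above, not confirmed anywhere in this thread: win32com.client.getevents returns None when pywin32 has no MakePy-generated support for the class, and subclassing None fails at class-definition time. The sketch below uses a hypothetical helper, require_events_base, that turns this into a readable error. The commented usage assumes pywin32 is installed and that gencache.EnsureDispatch generates the missing support.

```python
def require_events_base(base, progid):
    """Return base unchanged, or raise a readable error if it is None."""
    if base is None:
        raise RuntimeError(
            "No event class found for %s; generate MakePy support for the "
            "Microsoft Speech Object Library first." % progid
        )
    return base


# Intended use on Windows (assumption: pywin32 and the Speech SDK installed):
#   import win32com.client
#   win32com.client.gencache.EnsureDispatch("SAPI.SpSharedRecognizer")
#   base = require_events_base(
#       win32com.client.getevents("SAPI.SpSharedRecoContext"),
#       "SAPI.SpSharedRecoContext")
#   class ContextEvents(base): ...

# Demonstration of the failure path without any COM dependency.
try:
    require_events_base(None, "SAPI.SpSharedRecoContext")
except RuntimeError as error:
    print(error)
```

If the helper raises, running MakePy on the Microsoft Speech Object Library as described in the recipe's docstring is the first thing to try.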