Clean up api.ai support, remove AT&T support since AT&T is shutting it down
Uberi committed Apr 2, 2016
1 parent 3d2377a commit 266ad1f
Showing 7 changed files with 105 additions and 187 deletions.
22 changes: 16 additions & 6 deletions README.rst
@@ -21,7 +21,15 @@ Speech Recognition
:target: https://pypi.python.org/pypi/SpeechRecognition/
:alt: License

Library for performing speech recognition with support for `CMU Sphinx <http://cmusphinx.sourceforge.net/wiki/>`__, Google Speech Recognition, `Wit.ai <https://wit.ai/>`__, `IBM Speech to Text <http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/speech-to-text.html>`__, and `AT&T Speech to Text <http://developer.att.com/apis/speech>`__.
Library for performing speech recognition, with support for several engines and APIs, online and offline.

Speech recognition engine/API support:

* `CMU Sphinx <http://cmusphinx.sourceforge.net/wiki/>`__ (works offline)
* Google Speech Recognition
* `Wit.ai <https://wit.ai/>`__
* `api.ai <https://api.ai/>`__
* `IBM Speech to Text <http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/speech-to-text.html>`__

**Quickstart:** ``pip install SpeechRecognition``. See the "Installing" section for more details.

@@ -135,7 +143,7 @@ The solution is to decrease this threshold, or call ``recognizer_instance.adjust
The recognizer doesn't understand my particular language/dialect.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Try setting the recognition language to your language/dialect. To do this, see the documentation for ``recognizer_instance.recognize_sphinx``, ``recognizer_instance.recognize_google``, ``recognizer_instance.recognize_wit``, ``recognizer_instance.recognize_ibm``, and ``recognizer_instance.recognize_att``.
Try setting the recognition language to your language/dialect. To do this, see the documentation for ``recognizer_instance.recognize_sphinx``, ``recognizer_instance.recognize_google``, ``recognizer_instance.recognize_wit``, ``recognizer_instance.recognize_api``, and ``recognizer_instance.recognize_ibm``.

For example, if your language/dialect is British English, it is better to use ``"en-GB"`` as the language rather than ``"en-US"``.
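
In code, that looks something like the following (a sketch; ``r`` and ``audio`` stand in for a ``Recognizer`` instance and an ``AudioData`` instance)::

    r.recognize_google(audio, language = "en-GB")  # British English instead of the default US English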

@@ -144,7 +152,7 @@ The code examples throw ``UnicodeEncodeError: 'ascii' codec can't encode charact

When you're using Python 2, and your language uses non-ASCII characters, and the terminal or file-like object you're printing to only supports ASCII, an error is thrown when trying to write non-ASCII characters.

This is because in Python 2, ``recognizer_instance.recognize_sphinx``, ``recognizer_instance.recognize_google``, ``recognizer_instance.recognize_wit``, ``recognizer_instance.recognize_ibm``, and ``recognizer_instance.recognize_att`` return unicode strings (``u"something"``) rather than byte strings (``"something"``). In Python 3, all strings are unicode strings.
This is because in Python 2, ``recognizer_instance.recognize_sphinx``, ``recognizer_instance.recognize_google``, ``recognizer_instance.recognize_wit``, ``recognizer_instance.recognize_api``, and ``recognizer_instance.recognize_ibm`` return unicode strings (``u"something"``) rather than byte strings (``"something"``). In Python 3, all strings are unicode strings.

To make printing of unicode strings work in Python 2 as well, replace all print statements in your code of the following form:
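
For instance, a sketch of the change (using a placeholder string)::

    print(u"something")

would be rewritten as::

    print(u"something".encode("utf-8"))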

@@ -225,18 +233,20 @@ Authors
haas85
DelightRun <[email protected]>
maverickagm
kamushadenes <[email protected]> (Kamus Hadenes)
sbraden <[email protected]> (Sarah Braden)

Please report bugs and suggestions at the `issue tracker <https://github.com/Uberi/speech_recognition/issues>`__!

How to cite this library (APA style):

Zhang, A. (2016). Speech Recognition (Version 3.2) [Software]. Available from https://github.com/Uberi/speech_recognition#readme.
Zhang, A. (2016). Speech Recognition (Version 3.3) [Software]. Available from https://github.com/Uberi/speech_recognition#readme.

How to cite this library (Chicago style):

Zhang, Anthony. 2016. *Speech Recognition* (version 3.2).
Zhang, Anthony. 2016. *Speech Recognition* (version 3.3).

Also check out the `Python Baidu Yuyin API <https://github.com/DelightRun/PyBaiduYuyin>`__, which is based on an older version of this project, and adds support for `Baidu Yuyin <http://yuyin.baidu.com/>`__.
Also check out the `Python Baidu Yuyin API <https://github.com/DelightRun/PyBaiduYuyin>`__, which is based on an older version of this project, and adds support for `Baidu Yuyin <http://yuyin.baidu.com/>`__. Note that Baidu Yuyin is only available inside China.

License
-------
21 changes: 11 additions & 10 deletions examples/extended_results.py
@@ -43,6 +43,17 @@
except sr.RequestError as e:
print("Could not request results from Wit.ai service; {0}".format(e))

# recognize speech using api.ai
API_AI_CLIENT_ACCESS_TOKEN = "INSERT API.AI API KEY HERE" # api.ai keys are 32-character lowercase hexadecimal strings
try:
from pprint import pprint
print("api.ai recognition results:")
pprint(r.recognize_api(audio, client_access_token=API_AI_CLIENT_ACCESS_TOKEN, show_all=True)) # pretty-print the recognition result
except sr.UnknownValueError:
print("api.ai could not understand audio")
except sr.RequestError as e:
print("Could not request results from api.ai service; {0}".format(e))

# recognize speech using IBM Speech to Text
IBM_USERNAME = "INSERT IBM SPEECH TO TEXT USERNAME HERE" # IBM Speech to Text usernames are strings of the form XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
IBM_PASSWORD = "INSERT IBM SPEECH TO TEXT PASSWORD HERE" # IBM Speech to Text passwords are mixed-case alphanumeric strings
@@ -54,13 +65,3 @@
print("IBM Speech to Text could not understand audio")
except sr.RequestError as e:
print("Could not request results from IBM Speech to Text service; {0}".format(e))

# recognize speech using AT&T Speech to Text
ATT_APP_KEY = "INSERT AT&T SPEECH TO TEXT APP KEY HERE" # AT&T Speech to Text app keys are 32-character lowercase alphanumeric strings
ATT_APP_SECRET = "INSERT AT&T SPEECH TO TEXT APP SECRET HERE" # AT&T Speech to Text app secrets are 32-character lowercase alphanumeric strings
try:
print("AT&T Speech to Text thinks you said " + r.recognize_att(audio, app_key=ATT_APP_KEY, app_secret=ATT_APP_SECRET))
except sr.UnknownValueError:
print("AT&T Speech to Text could not understand audio")
except sr.RequestError as e:
print("Could not request results from AT&T Speech to Text service; {0}".format(e))
19 changes: 9 additions & 10 deletions examples/microphone_recognition.py
@@ -38,6 +38,15 @@
except sr.RequestError as e:
print("Could not request results from Wit.ai service; {0}".format(e))

# recognize speech using api.ai
API_AI_CLIENT_ACCESS_TOKEN = "INSERT API.AI API KEY HERE" # api.ai keys are 32-character lowercase hexadecimal strings
try:
print("api.ai thinks you said " + r.recognize_api(audio, client_access_token=API_AI_CLIENT_ACCESS_TOKEN))
except sr.UnknownValueError:
print("api.ai could not understand audio")
except sr.RequestError as e:
print("Could not request results from api.ai service; {0}".format(e))

# recognize speech using IBM Speech to Text
IBM_USERNAME = "INSERT IBM SPEECH TO TEXT USERNAME HERE" # IBM Speech to Text usernames are strings of the form XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
IBM_PASSWORD = "INSERT IBM SPEECH TO TEXT PASSWORD HERE" # IBM Speech to Text passwords are mixed-case alphanumeric strings
@@ -47,13 +56,3 @@
print("IBM Speech to Text could not understand audio")
except sr.RequestError as e:
print("Could not request results from IBM Speech to Text service; {0}".format(e))

# recognize speech using AT&T Speech to Text
ATT_APP_KEY = "INSERT AT&T SPEECH TO TEXT APP KEY HERE" # AT&T Speech to Text app keys are 32-character lowercase alphanumeric strings
ATT_APP_SECRET = "INSERT AT&T SPEECH TO TEXT APP SECRET HERE" # AT&T Speech to Text app secrets are 32-character lowercase alphanumeric strings
try:
print("AT&T Speech to Text thinks you said " + r.recognize_att(audio, app_key=ATT_APP_KEY, app_secret=ATT_APP_SECRET))
except sr.UnknownValueError:
print("AT&T Speech to Text could not understand audio")
except sr.RequestError as e:
print("Could not request results from AT&T Speech to Text service; {0}".format(e))
30 changes: 9 additions & 21 deletions examples/wav_transcribe.py
@@ -39,6 +39,15 @@
except sr.RequestError as e:
print("Could not request results from Wit.ai service; {0}".format(e))

# recognize speech using api.ai
API_AI_CLIENT_ACCESS_TOKEN = "INSERT API.AI API KEY HERE" # api.ai keys are 32-character lowercase hexadecimal strings
try:
print("api.ai thinks you said " + r.recognize_api(audio, client_access_token=API_AI_CLIENT_ACCESS_TOKEN))
except sr.UnknownValueError:
print("api.ai could not understand audio")
except sr.RequestError as e:
print("Could not request results from api.ai service; {0}".format(e))

# recognize speech using IBM Speech to Text
IBM_USERNAME = "INSERT IBM SPEECH TO TEXT USERNAME HERE" # IBM Speech to Text usernames are strings of the form XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
IBM_PASSWORD = "INSERT IBM SPEECH TO TEXT PASSWORD HERE" # IBM Speech to Text passwords are mixed-case alphanumeric strings
@@ -48,24 +57,3 @@
print("IBM Speech to Text could not understand audio")
except sr.RequestError as e:
print("Could not request results from IBM Speech to Text service; {0}".format(e))

# recognize speech using api.ai Speech to Text
# Note: Use the developer access token for managing entities and intents, and use the client access token for making queries.
API_AI_CLIENT_ACCESS_TOKEN = "INSERT API.AI SPEECH TO TEXT ACCESS TOKEN HERE" # api.ai access tokens are 32-character lowercase alphanumeric strings
API_AI_SUBSCRIPTION_KEY = "INSERT API.AI SPEECH TO TEXT SUBSCRIPTION KEY HERE" # api.ai subscription_keys are strings of the form XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
try:
print("api.ai Speech to Text thinks you said " + r.recognize_api(audio, username=API_AI_CLIENT_ACCESS_TOKEN, password=API_AI_SUBSCRIPTION_KEY))
except sr.UnknownValueError:
print("api.ai Speech to Text could not understand audio")
except sr.RequestError as e:
print("Could not request results from api.ai Speech to Text service; {0}".format(e))

# recognize speech using AT&T Speech to Text
ATT_APP_KEY = "INSERT AT&T SPEECH TO TEXT APP KEY HERE" # AT&T Speech to Text app keys are 32-character lowercase alphanumeric strings
ATT_APP_SECRET = "INSERT AT&T SPEECH TO TEXT APP SECRET HERE" # AT&T Speech to Text app secrets are 32-character lowercase alphanumeric strings
try:
print("AT&T Speech to Text thinks you said " + r.recognize_att(audio, app_key=ATT_APP_KEY, app_secret=ATT_APP_SECRET))
except sr.UnknownValueError:
print("AT&T Speech to Text could not understand audio")
except sr.RequestError as e:
print("Could not request results from AT&T Speech to Text service; {0}".format(e))
36 changes: 16 additions & 20 deletions reference/library-reference.rst
@@ -192,45 +192,41 @@ Raises a ``speech_recognition.UnknownValueError`` exception if the speech is uni

Performs speech recognition on ``audio_data`` (an ``AudioData`` instance), using the Wit.ai API.

The Wit.ai API key is specified by ``key``. Unfortunately, these are not available without `signing up for an account <https://wit.ai/getting-started>`__ and creating an app. You will need to add at least one intent (recognizable sentence) before the API key can be accessed, though the actual intent values don't matter.
The Wit.ai API key is specified by ``key``. Unfortunately, these are not available without `signing up for an account <https://wit.ai/>`__ and creating an app. You will need to add at least one intent to the app before you can see the API key, though the actual intent settings don't matter.

To get the API key for a Wit.ai app, go to the app settings, go to the section titled "API Details", and look for "Server Access Token" or "Client Access Token". If the desired field is blank, click on the "Reset token" button on the right of the field. Wit.ai API keys are 32-character uppercase alphanumeric strings.

Though Wit.ai is designed to be used with a fixed set of phrases, it still provides services for general-purpose speech recognition.
To get the API key for a Wit.ai app, go to the app's overview page, go to the section titled "Make an API request", and look for something along the lines of ``Authorization: Bearer XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX``; ``XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX`` is the API key. Wit.ai API keys are 32-character uppercase alphanumeric strings.

The recognition language is configured in the Wit.ai app settings.

Returns the most likely transcription if ``show_all`` is false (the default). Otherwise, returns the `raw API response <https://wit.ai/docs/http/20141022#get-intent-via-text-link>`__ as a JSON dictionary.

Raises a ``speech_recognition.UnknownValueError`` exception if the speech is unintelligible. Raises a ``speech_recognition.RequestError`` exception if the key isn't valid, the quota for the key is maxed out, or there is no internet connection.
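
A minimal usage sketch (mirroring the example scripts above; the microphone source requires PyAudio and is just one possible ``AudioSource``)::

    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.Microphone() as source:  # any AudioSource works here
        audio = r.listen(source)

    WIT_AI_KEY = "INSERT WIT.AI API KEY HERE"  # Wit.ai keys are 32-character uppercase alphanumeric strings
    try:
        print("Wit.ai thinks you said " + r.recognize_wit(audio, key=WIT_AI_KEY))
    except sr.UnknownValueError:
        print("Wit.ai could not understand audio")
    except sr.RequestError as e:
        print("Could not request results from Wit.ai service; {0}".format(e))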

``recognizer_instance.recognize_ibm(audio_data, username, password, language = "en-US", show_all = False)``
-----------------------------------------------------------------------------------------------------------

Performs speech recognition on ``audio_data`` (an ``AudioData`` instance), using the IBM Speech to Text API.
``recognizer_instance.recognize_api(audio_data, client_access_token, show_all = False)``
------------------------------------------------------------------------------------------

The IBM Speech to Text username and password are specified by ``username`` and ``password``, respectively. Unfortunately, these are not available without an account. IBM has published instructions for obtaining these credentials in the `IBM Watson Developer Cloud documentation <https://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/doc/getting_started/gs-credentials.shtml>`__.
Performs speech recognition on ``audio_data`` (an ``AudioData`` instance), using the api.ai Speech to Text API.

The recognition language is determined by ``language``, an IETF language tag with a dialect like ``"en-US"`` or ``"es-ES"``, defaulting to US English. At the moment, this supports the tags ``"en-US"`` and ``"es-ES"``.
The api.ai API client access token is specified by ``client_access_token``. Unfortunately, this is not available without `signing up for an account <https://console.api.ai/api-client/#/signup>`__ and creating an agent. To get the API client access token, go to the agent settings, go to the section titled "API keys", and look for "Client access token". API client access tokens are 32-character lowercase hexadecimal strings.

Returns the most likely transcription if ``show_all`` is false (the default). Otherwise, returns the `raw API response <http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/speech-to-text/api/v1/#recognize>`__ as a JSON dictionary.
The recognition language is set when creating an agent in the web console.

Raises a ``speech_recognition.UnknownValueError`` exception if the speech is unintelligible. Raises a ``speech_recognition.RequestError`` exception if an error occurred, such as an invalid key, or a broken internet connection.
Returns the most likely transcription if ``show_all`` is false (the default). Otherwise, returns the `raw API response <https://api.ai/docs/reference/#a-namepost-multipost-query-multipart>`__ as a JSON dictionary.

``recognizer_instance.recognize_att(audio_data, app_key, app_secret, language = "en-US", show_all = False)``
------------------------------------------------------------------------------------------------------------
Raises a ``speech_recognition.UnknownValueError`` exception if the speech is unintelligible. Raises a ``speech_recognition.RequestError`` exception if the key isn't valid, the quota for the key is maxed out, or there is no internet connection.

Performs speech recognition on ``audio_data`` (an ``AudioData`` instance), using the AT&T Speech to Text API.
``recognizer_instance.recognize_ibm(audio_data, username, password, language = "en-US", show_all = False)``
-----------------------------------------------------------------------------------------------------------

The AT&T Speech to Text app key and app secret are specified by ``app_key`` and ``app_secret``, respectively. Unfortunately, these are not available without `signing up for an account <http://developer.att.com/apis/speech>`__ and creating an app.
Performs speech recognition on ``audio_data`` (an ``AudioData`` instance), using the IBM Speech to Text API.

To get the app key and app secret for an AT&T app, go to the `My Apps page <https://matrix.bf.sl.attcompute.com/apps>`__ and look for "APP KEY" and "APP SECRET". AT&T app keys and app secrets are 32-character lowercase alphanumeric strings.
The IBM Speech to Text username and password are specified by ``username`` and ``password``, respectively. Unfortunately, these are not available without `signing up for an account <https://console.ng.bluemix.net/registration/>`__. Once logged into the Bluemix console, follow the instructions for `creating an IBM Watson service instance <http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/doc/getting_started/gs-credentials.shtml>`__, where the Watson service is "Speech To Text". IBM Speech to Text usernames are strings of the form XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX, while passwords are mixed-case alphanumeric strings.

The recognition language is determined by ``language``, an IETF language tag with a dialect like ``"en-US"`` or ``"es-ES"``, defaulting to US English. At the moment, this supports the tags ``"en-US"`` and ``"es-ES"``.
The recognition language is determined by ``language``, an IETF language tag with a dialect like ``"en-US"`` or ``"es-ES"``, defaulting to US English. The supported languages are listed under the ``model`` parameter of the `audio recognition API documentation <http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/speech-to-text/api/v1/#recognize_audio_sessionless12>`__.

Returns the most likely transcription if ``show_all`` is false (the default). Otherwise, returns the `raw API response <https://developer.att.com/apis/speech/docs#resources-speech-to-text>`__ as a JSON dictionary.
Returns the most likely transcription if ``show_all`` is false (the default). Otherwise, returns the `raw API response <http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/speech-to-text/api/v1/#recognize_audio_sessionless12>`__ as a JSON dictionary.

Raises a ``speech_recognition.UnknownValueError`` exception if the speech is unintelligible. Raises a ``speech_recognition.RequestError`` exception if the key isn't valid, or there is no internet connection.
Raises a ``speech_recognition.UnknownValueError`` exception if the speech is unintelligible. Raises a ``speech_recognition.RequestError`` exception if an error occurred, such as an invalid key, or a broken internet connection.
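
A minimal usage sketch (again assuming ``r`` is a ``Recognizer`` instance and ``audio`` is an ``AudioData`` instance obtained as in the sketch above)::

    IBM_USERNAME = "INSERT IBM SPEECH TO TEXT USERNAME HERE"  # usernames are strings of the form XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
    IBM_PASSWORD = "INSERT IBM SPEECH TO TEXT PASSWORD HERE"  # passwords are mixed-case alphanumeric strings
    try:
        print("IBM Speech to Text thinks you said " + r.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD))
    except sr.UnknownValueError:
        print("IBM Speech to Text could not understand audio")
    except sr.RequestError as e:
        print("Could not request results from IBM Speech to Text service; {0}".format(e))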

``AudioSource``
---------------
2 changes: 1 addition & 1 deletion setup.py
@@ -43,7 +43,7 @@ def run(self):
description = speech_recognition.__doc__,
long_description = open("README.rst").read(),
license = speech_recognition.__license__,
keywords = "speech recognition google wit ibm att",
keywords = "speech recognition google wit api ibm",
url = "https://github.com/Uberi/speech_recognition#readme",
classifiers = [
"Development Status :: 5 - Production/Stable",