Commit 11c1e8b

Use language parameter instead of a separate config parameter, for simplicity
1 parent 6d6b4ac commit 11c1e8b
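
As a rough illustration of what the commit message describes: a caller that used to pass model locations through a separate ``config`` dictionary would now pass them through ``language`` as a 3-tuple. The sketch below assumes only a recognizer object exposing ``recognize_sphinx(audio_data, language=...)`` with the signature shown in this diff. The helper name ``recognize_with_custom_model`` and the ``model_dir`` layout are hypothetical; the file names simply mirror the bundled defaults that appear in the diff.

```python
import os


def recognize_with_custom_model(recognizer, audio_data, model_dir):
    """Call recognize_sphinx with a 3-tuple of model paths instead of a language tag.

    ``recognizer`` is assumed to expose ``recognize_sphinx(audio_data, language=...)``
    as introduced by this commit. The file names are illustrative only; use
    whatever your own Sphinx model actually contains.
    """
    language = (
        os.path.join(model_dir, "acoustic-model"),                  # acoustic parameters directory
        os.path.join(model_dir, "language-model.lm.bin"),           # language model file
        os.path.join(model_dir, "pronounciation-dictionary.dict"),  # phoneme dictionary file
    )
    return recognizer.recognize_sphinx(audio_data, language=language)
```

Keeping the paths in one positional tuple means the order matters: acoustic parameters directory first, then the language model file, then the phoneme dictionary file.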

2 files changed, +13 -28 lines changed

reference/library-reference.rst

+2 -5
@@ -181,16 +181,13 @@ The ``callback`` parameter is a function that should accept two parameters - the
 
 Performs speech recognition on ``audio_data`` (an ``AudioData`` instance), using CMU Sphinx.
 
-The recognition language is determined by ``language``, an IETF language tag like ``"en-US"`` or ``"en-GB"``, defaulting to US English. Out of the box, only ``en-US`` is supported. See `Notes on using `PocketSphinx <https://github.com/Uberi/speech_recognition/blob/master/reference/pocketsphinx.rst>`__ for information about installing other languages. This document is also included under ``reference/pocketsphinx.rst``.
+The recognition language is determined by ``language``, an RFC5646 language tag like ``"en-US"`` or ``"en-GB"``, defaulting to US English. Out of the box, only ``en-US`` is supported. See `Notes on using `PocketSphinx <https://github.com/Uberi/speech_recognition/blob/master/reference/pocketsphinx.rst>`__ for information about installing other languages. This document is also included under ``reference/pocketsphinx.rst``. The ``language`` parameter can also be a tuple of filesystem paths, of the form ``(acoustic_parameters_directory, language_model_file, phoneme_dictionary_file)`` - this allows you to load arbitrary Sphinx models.
 
 If specified, the keywords to search for are determined by ``keyword_entries``, an iterable of tuples of the form ``(keyword, sensitivity)``, where ``keyword`` is a phrase, and ``sensitivity`` is how sensitive to this phrase the recognizer should be, on a scale of 0 (very insensitive, more false negatives) to 1 (very sensitive, more false positives) inclusive. If not specified or ``None``, no keywords are used and Sphinx will simply transcribe whatever words it recognizes. Specifying ``keyword_entries`` is more accurate than just looking for those same keywords in non-keyword-based transcriptions, because Sphinx knows specifically what sounds to look for.
 
-If specified, config is a dictionary that can contain the following keys: language_directory, acoustic_parameters_directory, language_model_file and phoneme_dictionary_file. If set,
-their value will be used instead of the preset value. Any other key will be ignored.
-
 Sphinx can also handle FSG or JSGF grammars. The parameter ``grammar`` expects a path to the grammar file. Note that if a JSGF grammar is passed, an FSG grammar will be created at the same location to speed up execution in the next run. If ``keyword_entries`` are passed, content of ``grammar`` will be ignored.
 
-Returns the most likely transcription if ``show_all`` is false (the default). Otherwise, returns the Sphinx ``pocketsphinx.pocketsphinx.Hypothesis`` object generated by Sphinx.
+Returns the most likely transcription if ``show_all`` is false (the default). Otherwise, returns the Sphinx ``pocketsphinx.pocketsphinx.Decoder`` object resulting from the recognition.
 
 Raises a ``speech_recognition.UnknownValueError`` exception if the speech is unintelligible. Raises a ``speech_recognition.RequestError`` exception if there are any issues with the Sphinx installation.
 
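
The ``keyword_entries`` format documented above, ``(keyword, sensitivity)`` pairs with sensitivity between 0 and 1 inclusive, can be checked before calling the recognizer. The following is a minimal sketch that mirrors the assertion in ``recognize_sphinx``; ``validate_keyword_entries`` is a hypothetical helper and not part of the library.

```python
def validate_keyword_entries(keyword_entries):
    """Return True if keyword_entries is None or an iterable of (phrase, sensitivity) pairs.

    Hypothetical helper: it follows the documented contract (string phrase,
    sensitivity in the closed range 0..1) rather than any library API.
    """
    if keyword_entries is None:
        return True
    try:
        return all(
            isinstance(keyword, str) and 0 <= sensitivity <= 1
            for keyword, sensitivity in keyword_entries
        )
    except (TypeError, ValueError):
        # An entry was not a 2-item sequence, or the sensitivity was not a number.
        return False
```

For example, ``[("hello world", 0.8), ("stop", 1)]`` passes this check, while an entry with a sensitivity of ``1.5`` or a missing sensitivity does not.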

speech_recognition/__init__.py

+11 -23
@@ -752,27 +752,23 @@ def stopper(wait_for_stop=True):
         listener_thread.start()
         return stopper
 
-    def recognize_sphinx(self, audio_data, language="en-US", keyword_entries=None, grammar=None, show_all=False, config={}):
+    def recognize_sphinx(self, audio_data, language="en-US", keyword_entries=None, grammar=None, show_all=False):
         """
         Performs speech recognition on ``audio_data`` (an ``AudioData`` instance), using CMU Sphinx.
 
-        The recognition language is determined by ``language``, an RFC5646 language tag like ``"en-US"`` or ``"en-GB"``, defaulting to US English. Out of the box, only ``en-US`` is supported. See `Notes on using `PocketSphinx <https://github.com/Uberi/speech_recognition/blob/master/reference/pocketsphinx.rst>`__ for information about installing other languages. This document is also included under ``reference/pocketsphinx.rst``.
+        The recognition language is determined by ``language``, an RFC5646 language tag like ``"en-US"`` or ``"en-GB"``, defaulting to US English. Out of the box, only ``en-US`` is supported. See `Notes on using `PocketSphinx <https://github.com/Uberi/speech_recognition/blob/master/reference/pocketsphinx.rst>`__ for information about installing other languages. This document is also included under ``reference/pocketsphinx.rst``. The ``language`` parameter can also be a tuple of filesystem paths, of the form ``(acoustic_parameters_directory, language_model_file, phoneme_dictionary_file)`` - this allows you to load arbitrary Sphinx models.
 
         If specified, the keywords to search for are determined by ``keyword_entries``, an iterable of tuples of the form ``(keyword, sensitivity)``, where ``keyword`` is a phrase, and ``sensitivity`` is how sensitive to this phrase the recognizer should be, on a scale of 0 (very insensitive, more false negatives) to 1 (very sensitive, more false positives) inclusive. If not specified or ``None``, no keywords are used and Sphinx will simply transcribe whatever words it recognizes. Specifying ``keyword_entries`` is more accurate than just looking for those same keywords in non-keyword-based transcriptions, because Sphinx knows specifically what sounds to look for.
 
         Sphinx can also handle FSG or JSGF grammars. The parameter ``grammar`` expects a path to the grammar file. Note that if a JSGF grammar is passed, an FSG grammar will be created at the same location to speed up execution in the next run. If ``keyword_entries`` are passed, content of ``grammar`` will be ignored.
 
-        If specified, config is a dictionary that can contain the following keys: language_directory, acoustic_parameters_directory, language_model_file and phoneme_dictionary_file.
-        If set, their value will be used instead of the preset value. Any other key will be ignored.
-
         Returns the most likely transcription if ``show_all`` is false (the default). Otherwise, returns the Sphinx ``pocketsphinx.pocketsphinx.Decoder`` object resulting from the recognition.
 
         Raises a ``speech_recognition.UnknownValueError`` exception if the speech is unintelligible. Raises a ``speech_recognition.RequestError`` exception if there are any issues with the Sphinx installation.
         """
         assert isinstance(audio_data, AudioData), "``audio_data`` must be audio data"
-        assert isinstance(language, str), "``language`` must be a string"
+        assert isinstance(language, str) or (isinstance(language, tuple) and len(language) == 3), "``language`` must be a string or 3-tuple of Sphinx data file paths of the form ``(acoustic_parameters, language_model, phoneme_dictionary)``"
         assert keyword_entries is None or all(isinstance(keyword, (type(""), type(u""))) and 0 <= sensitivity <= 1 for keyword, sensitivity in keyword_entries), "``keyword_entries`` must be ``None`` or a list of pairs of strings and numbers between 0 and 1"
-        assert isinstance(config, dict), "``config` must be a dictionary"
 
         # import the PocketSphinx speech recognition module
         try:
@@ -784,28 +780,20 @@ def recognize_sphinx(self, audio_data, language="en-US", keyword_entries=None, g
             raise RequestError("bad PocketSphinx installation; try reinstalling PocketSphinx version 0.0.9 or better.")
         if not hasattr(pocketsphinx, "Decoder") or not hasattr(pocketsphinx.Decoder, "default_config"):
             raise RequestError("outdated PocketSphinx installation; ensure you have PocketSphinx version 0.0.9 or better.")
-        if "language_directory" in config:
-            language_directory = config["language_directory"]
-        else:
+
+        if isinstance(language, str): # directory containing language data
             language_directory = os.path.join(os.path.dirname(os.path.realpath(__file__)), "pocketsphinx-data", language)
-        if not os.path.isdir(language_directory):
-            raise RequestError("missing PocketSphinx language data directory: \"{}\"".format(language_directory))
-        if "acoustic_parameters_directory" in config:
-            acoustic_parameters_directory = config["acoustic_parameters_directory"]
-        else:
+            if not os.path.isdir(language_directory):
+                raise RequestError("missing PocketSphinx language data directory: \"{}\"".format(language_directory))
             acoustic_parameters_directory = os.path.join(language_directory, "acoustic-model")
+            language_model_file = os.path.join(language_directory, "language-model.lm.bin")
+            phoneme_dictionary_file = os.path.join(language_directory, "pronounciation-dictionary.dict")
+        else: # 3-tuple of Sphinx data file paths
+            acoustic_parameters_directory, language_model_file, phoneme_dictionary_file = language
         if not os.path.isdir(acoustic_parameters_directory):
             raise RequestError("missing PocketSphinx language model parameters directory: \"{}\"".format(acoustic_parameters_directory))
-        if "language_model_file" in config:
-            language_model_file = config["language_model_file"]
-        else:
-            language_model_file = os.path.join(language_directory, "language-model.lm.bin")
         if not os.path.isfile(language_model_file):
             raise RequestError("missing PocketSphinx language model file: \"{}\"".format(language_model_file))
-        if "phoneme_dictionary_file" in config:
-            phoneme_dictionary_file = config["phoneme_dictionary_file"]
-        else:
-            phoneme_dictionary_file = os.path.join(language_directory, "pronounciation-dictionary.dict")
         if not os.path.isfile(phoneme_dictionary_file):
             raise RequestError("missing PocketSphinx phoneme dictionary file: \"{}\"".format(phoneme_dictionary_file))
 
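
The branching introduced in this hunk can be read in isolation as a small path-resolution step: a string selects a bundled language directory, and a 3-tuple is used as given. The sketch below restates that logic as a standalone function. It is an approximation, not the library's code: ``resolve_sphinx_paths`` and its ``data_root`` parameter are hypothetical, and unlike the method in the diff it does not check that the paths exist on disk.

```python
import os


def resolve_sphinx_paths(language, data_root):
    """Map ``language`` to (acoustic_dir, language_model_file, phoneme_dictionary_file).

    ``data_root`` is a hypothetical stand-in for the package's bundled
    "pocketsphinx-data" directory. No filesystem checks are performed here.
    """
    if isinstance(language, str):
        # Language tag: derive all three paths from the per-language directory.
        language_directory = os.path.join(data_root, language)
        return (
            os.path.join(language_directory, "acoustic-model"),
            os.path.join(language_directory, "language-model.lm.bin"),
            os.path.join(language_directory, "pronounciation-dictionary.dict"),
        )
    if isinstance(language, tuple) and len(language) == 3:
        # Explicit paths: use them unchanged, in the documented order.
        return language
    raise TypeError("language must be a string or a 3-tuple of paths")
```

Resolving the three paths up front is what lets the existence checks that follow in the diff run once, regardless of which form of ``language`` was passed.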
