merged master

bobkentt · Jun 12, 2019 · 9eaba82
2 parents 7d1c1a5 + e4d3335
commit 9eaba82
Showing 43 changed files with 1,018 additions and 457 deletions.
diff --git a/CHANGELOG.rst b/CHANGELOG.rst
@@ -11,15 +11,24 @@ This project adheres to `Semantic Versioning`_ starting with version 1.0.
 
 Added
 -----
+- nlu configs can now be directly compared for performance on a dataset in ``rasa test nlu``
 
 Changed
 -------
+- update the tracker in interactive learning through reverting and appending events
+  instead of replacing the tracker
+- ``POST /conversations/{conversation_id}/tracker/events`` supports a list of events
 
 Removed
 -------
 
 Fixed
 -----
+- fixed creation of ``RasaNLUHttpInterpreter``
+- form actions are included in domain warnings
+- default actions overridden by custom actions and listed in the domain are excluded
+  from domain warnings
+- changed SQL ``data`` column type to ``Text`` for compatibility with MySQL
 
 [1.0.9] - 2019-06-10
 ^^^^^^^^^^^^^^^^^^^^

diff --git a/docs/_static/spec/rasa.yml b/docs/_static/spec/rasa.yml
@@ -139,9 +139,9 @@ paths:
       - JWT: []
       tags:
       - Tracker
-      summary: Append an event to a tracker
+      summary: Append events to a tracker
       description: >-
-        Appends a new event to the tracker state of the conversation.
+        Appends one or multiple new events to the tracker state of the conversation.
         Any existing events will be kept and the new events will be
         appended, updating the existing state.
       parameters:
@@ -152,7 +152,11 @@ paths:
         content:
           application/json:
             schema:
-              $ref: '#/components/schemas/Event'
+              oneOf:
+              - $ref: '#/components/schemas/Event'
+              - type: array
+                items:
+                  $ref: '#/components/schemas/Event'
       responses:
         200:
           $ref: '#/components/responses/200Tracker'
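
Review note: with the ``oneOf`` schema above, the request body may be either a single event object or an array of event objects. A minimal sketch of how a handler could accept both shapes follows. The helper name ``normalize_events`` and the sample event payloads are assumptions made for illustration and are not taken from this commit.

```python
from typing import Any, Dict, List, Union

Event = Dict[str, Any]


def normalize_events(payload: Union[Event, List[Event]]) -> List[Event]:
    """Return the request body as a list of events (hypothetical helper)."""
    if isinstance(payload, dict):
        # A single event object is wrapped into a one-element list.
        return [payload]
    if isinstance(payload, list) and all(isinstance(e, dict) for e in payload):
        # A list of event objects is returned as a shallow copy.
        return list(payload)
    raise ValueError("Body must be an event object or a list of event objects.")


# Illustrative payloads; the field values are made up.
single = {"event": "slot", "name": "cuisine", "value": "thai"}
batch = [single, {"event": "restart"}]

print(normalize_events(single))
print(normalize_events(batch))
```

Normalizing early keeps the rest of the handler independent of which of the two body shapes the client sent.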

diff --git a/docs/nlu/choosing-a-pipeline.rst b/docs/nlu/choosing-a-pipeline.rst
@@ -32,8 +32,6 @@ use the ``supervised_embeddings`` pipeline:
 
     pipeline: "supervised_embeddings"
 
-It's good practice to define the ``language`` parameter in your configuration, but
-for the ``supervised_embeddings`` pipeline this parameter doesn't affect anything.
 
 A Longer Answer
 ---------------
@@ -56,6 +54,9 @@ so it will work with any language that you can tokenize (on whitespace or using
 
 You can read more about this topic `here <https://medium.com/rasa-blog/supervised-word-vectors-from-scratch-in-rasa-nlu-6daf794efcd8>`__ .
 
+Rasa gives you the tools to compare the performance of both of these pipelines on your data directly.
+See :ref:`comparing-nlu-pipelines`.
+
 
 You can also use MITIE as a source of word vectors in your pipeline, see :ref:`section_mitie_pipeline`.
 We do not recommend that you use these; mitie support is likely to be deprecated in a future release.

diff --git a/docs/user-guide/connectors/slack.rst b/docs/user-guide/connectors/slack.rst
@@ -21,7 +21,7 @@ Getting Credentials
      ``message.groups``, ``message.im`` and ``message.mpim`` events)
   3. The ``slack_channel`` is the target your bot posts to.
      This can be a channel or an individual person. You can leave out
-     the argument to post to the bot's "App" page.
+     the argument to post DMs to the bot.
   4. Use the entry for ``Bot User OAuth Access Token`` in the
      "OAuth & Permissions" tab as your ``slack_token``. It should start
      with ``xoxb``.

diff --git a/docs/user-guide/evaluating-models.rst b/docs/user-guide/evaluating-models.rst
@@ -48,6 +48,28 @@ The full list of options for the script is:
 
 .. program-output:: rasa test nlu --help
 
+.. _comparing-nlu-pipelines:
+
+Comparing NLU Pipelines
+^^^^^^^^^^^^^^^^^^^^^^^
+
+If you pass multiple pipeline configurations (or a folder containing them) to the CLI, Rasa will run
+a comparative examination between the pipelines.
+
+.. code-block:: bash
+
+  $ rasa test nlu --config pretrained_embeddings_spacy.yml supervised_embeddings.yml \
+    --nlu data/nlu.md --runs 3 --percentages 0 25 50 70 90
+
+
+The command in the example above will create a train/test split from your data,
+then train each pipeline multiple times with 0, 25, 50, 70 and 90% of your intent data excluded from the training set.
+The models are then evaluated on the test set and the f1-score for each exclusion percentage is recorded. This process
+runs three times (i.e. with 3 test sets in total) and then a graph is plotted using the means and standard deviations of
+the f1-scores.
+
+The f1-score graph - along with all train/test sets, the trained models, classification and error reports - will be saved into a folder
+called ``nlu_comparison_results``.
 
 
 Intent Classification
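
Review note: the aggregation described in the new docs section (several runs per exclusion percentage, then means and standard deviations of the f1-scores) can be sketched as below. This is not the code from this commit. ``summarize_runs`` and ``fake_score`` are hypothetical names, the scoring function stands in for "train and evaluate one model", and the use of the population standard deviation is an assumption.

```python
import statistics
from typing import Callable, Dict, List, Tuple


def summarize_runs(
    runs: int,
    percentages: List[int],
    score: Callable[[int, int], float],
) -> Dict[int, Tuple[float, float]]:
    """Collect one score per run for each exclusion percentage and
    return (mean, standard deviation) keyed by percentage."""
    summary = {}
    for pct in percentages:
        scores = [score(run, pct) for run in range(runs)]
        # Population standard deviation is an assumption of this sketch.
        summary[pct] = (statistics.mean(scores), statistics.pstdev(scores))
    return summary


def fake_score(run: int, pct: int) -> float:
    """Deterministic stand-in for training and evaluating a model."""
    return 1.0 - pct / 100


result = summarize_runs(runs=3, percentages=[0, 50], score=fake_score)
print(result)
```

In a real comparison each pipeline would get its own summary, and the plotted curve would show how quickly each pipeline degrades as training data is removed.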

diff --git a/docs/user-guide/running-rasa-with-docker.rst b/docs/user-guide/running-rasa-with-docker.rst
@@ -75,6 +75,15 @@ To check that the command completed correctly, look at the contents of your work
 
 The initial project files should all be there, as well as a ``models`` directory that contains your trained model.
 
+
+.. note::
+
+   By default, Docker runs containers as the ``root`` user. Hence, all files created by
+   these containers will be owned by ``root``. See the `documentation of docker
+   <https://docs.docker.com/v17.12/edge/engine/reference/commandline/run/>`_
+   and `docker-compose <https://docs.docker.com/compose/compose-file/>`_ if you want to
+   run the containers with a different user.
+
 Talking to Your Assistant
 ~~~~~~~~~~~~~~~~~~~~~~~~~
 

diff --git a/rasa/cli/arguments/default_arguments.py b/rasa/cli/arguments/default_arguments.py
@@ -61,13 +61,14 @@ def add_domain_param(
 
 
 def add_config_param(
-    parser: Union[argparse.ArgumentParser, argparse._ActionsContainer]
+    parser: Union[argparse.ArgumentParser, argparse._ActionsContainer],
+    default: Optional[Text] = DEFAULT_CONFIG_PATH,
 ):
     parser.add_argument(
         "-c",
         "--config",
         type=str,
-        default=DEFAULT_CONFIG_PATH,
+        default=default,
         help="The policy and NLU pipeline configuration of your bot.",
     )
 

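Review note: making the default a parameter lets one subcommand keep the usual config path while another passes ``None`` to detect that no config was given. A minimal standalone sketch of that pattern follows. The value of ``DEFAULT_CONFIG_PATH`` is assumed here, and the two parsers are illustrative only.

```python
import argparse
from typing import Optional

DEFAULT_CONFIG_PATH = "config.yml"  # assumed value, for illustration only


def add_config_param(
    parser: argparse.ArgumentParser,
    default: Optional[str] = DEFAULT_CONFIG_PATH,
) -> None:
    """Register -c/--config with a caller-chosen default."""
    parser.add_argument(
        "-c",
        "--config",
        type=str,
        default=default,
        help="The policy and NLU pipeline configuration of your bot.",
    )


# One parser keeps the standard default, the other opts out of it.
train_parser = argparse.ArgumentParser(prog="train-sketch")
add_config_param(train_parser)

test_parser = argparse.ArgumentParser(prog="test-sketch")
add_config_param(test_parser, default=None)

print(train_parser.parse_args([]).config)
print(test_parser.parse_args([]).config)
```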
diff --git a/rasa/cli/arguments/test.py b/rasa/cli/arguments/test.py
@@ -110,6 +110,16 @@ def add_test_nlu_argument_group(
         default="confmat.png",
         help="Output path for the confusion matrix plot.",
     )
+    parser.add_argument(
+        "-c",
+        "--config",
+        nargs="+",
+        default=None,
+        help="Model configuration file. If a single file is passed and cross "
+        "validation mode is chosen, cross-validation is performed. If "
+        "multiple configs or a folder of configs are passed, models "
+        "will be trained and compared directly.",
+    )
 
     cross_validation_arguments = parser.add_argument_group("Cross Validation")
     cross_validation_arguments.add_argument(
@@ -118,20 +128,31 @@ def add_test_nlu_argument_group(
         default=False,
         help="Switch on cross validation mode. Any provided model will be ignored.",
     )
-    cross_validation_arguments.add_argument(
-        "-c",
-        "--config",
-        type=str,
-        default=DEFAULT_CONFIG_PATH,
-        help="Model configuration file (cross validation only).",
-    )
     cross_validation_arguments.add_argument(
         "-f",
         "--folds",
         required=False,
         default=10,
         help="Number of cross validation folds (cross validation only).",
     )
+    comparison_arguments = parser.add_argument_group("Comparison Mode")
+    comparison_arguments.add_argument(
+        "-r",
+        "--runs",
+        required=False,
+        default=3,
+        type=int,
+        help="Number of comparison runs to make.",
+    )
+    comparison_arguments.add_argument(
+        "-p",
+        "--percentages",
+        required=False,
+        nargs="+",
+        type=int,
+        default=[0, 25, 50, 75, 90],
+        help="Percentages of training data to exclude during comparison.",
+    )
 
 
 def add_test_core_model_param(parser: argparse.ArgumentParser):
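
Review note: the new arguments rely on ``nargs="+"``, which makes argparse collect one or more values into a list, combined with ``type=int`` for the percentages. The standalone parser below is a sketch that mirrors the options added in this hunk. The program name and the sample file names are made up.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a small parser mirroring the comparison-mode options."""
    parser = argparse.ArgumentParser(prog="compare-sketch")
    # One or more config paths; None when the flag is not given.
    parser.add_argument("-c", "--config", nargs="+", default=None)

    group = parser.add_argument_group("Comparison Mode")
    group.add_argument("-r", "--runs", type=int, default=3)
    group.add_argument(
        "-p",
        "--percentages",
        nargs="+",
        type=int,  # each value is converted to int individually
        default=[0, 25, 50, 75, 90],
    )
    return parser


args = build_parser().parse_args(
    ["--config", "a.yml", "b.yml", "--percentages", "0", "50"]
)
print(args.config, args.runs, args.percentages)
```

Because ``--config`` defaults to ``None``, calling code can tell "no config given" apart from "one config given", which is what the comparison dispatch depends on.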

diff --git a/rasa/cli/arguments/train.py b/rasa/cli/arguments/train.py
@@ -8,7 +8,7 @@
     add_out_param,
     add_domain_param,
 )
-from rasa.constants import DEFAULT_CONFIG_PATH, DEFAULT_DATA_PATH
+from rasa.constants import DEFAULT_DATA_PATH, DEFAULT_CONFIG_PATH
 
 
 def set_train_arguments(parser: argparse.ArgumentParser):

diff --git a/rasa/cli/interactive.py b/rasa/cli/interactive.py
@@ -44,7 +44,6 @@ def add_subparser(
 
 
 def interactive(args: argparse.Namespace):
-    args.finetune = False  # Don't support finetuning
     args.fixed_model_name = None
     args.store_uncompressed = False
 
@@ -58,7 +57,6 @@ def interactive(args: argparse.Namespace):
 
 
 def interactive_core(args: argparse.Namespace):
-    args.finetune = False  # Don't support finetuning
     args.fixed_model_name = None
     args.store_uncompressed = False
 

diff --git a/rasa/cli/test.py b/rasa/cli/test.py
@@ -12,8 +12,11 @@
     DEFAULT_ENDPOINTS_PATH,
     DEFAULT_MODELS_PATH,
     DEFAULT_RESULTS_PATH,
+    DEFAULT_NLU_RESULTS_PATH,
+    CONFIG_SCHEMA_FILE,
 )
-from rasa.test import test_compare
+from rasa.test import test_compare_core, compare_nlu_models
+from rasa.utils.validation import validate_yaml_schema, InvalidYamlFileError
 
 logger = logging.getLogger(__name__)
 
@@ -27,7 +30,7 @@ def add_subparser(
         parents=parents,
         conflict_handler="resolve",
         formatter_class=argparse.ArgumentDefaultsHelpFormatter,
-        help="Tests a trained Rasa model using your test NLU data and stories.",
+        help="Tests Rasa models using your test NLU data and stories.",
     )
 
     arguments.set_test_arguments(test_parser)
@@ -38,15 +41,15 @@ def add_subparser(
         parents=parents,
         conflict_handler="resolve",
         formatter_class=argparse.ArgumentDefaultsHelpFormatter,
-        help="Tests a trained Rasa Core model using your test stories.",
+        help="Tests Rasa Core models using your test stories.",
     )
     arguments.set_test_core_arguments(test_core_parser)
 
     test_nlu_parser = test_subparsers.add_parser(
         "nlu",
         parents=parents,
         formatter_class=argparse.ArgumentDefaultsHelpFormatter,
-        help="Tests a trained Rasa NLU model using your test NLU data.",
+        help="Tests Rasa NLU models using your test NLU data.",
     )
     arguments.set_test_nlu_arguments(test_nlu_parser)
 
@@ -83,23 +86,61 @@ def test_core(args: argparse.Namespace) -> None:
         )
 
     else:
-        test_compare(args.model, stories, output)
+        test_compare_core(args.model, stories, output)
 
 
 def test_nlu(args: argparse.Namespace) -> None:
-    from rasa.test import test_nlu, test_nlu_with_cross_validation
+    from rasa.test import test_nlu, perform_nlu_cross_validation
+    import rasa.utils.io
 
     nlu_data = get_validated_path(args.nlu, "nlu", DEFAULT_DATA_PATH)
     nlu_data = data.get_nlu_directory(nlu_data)
 
-    if not args.cross_validation:
+    if args.config is not None and len(args.config) == 1:
+        args.config = os.path.abspath(args.config[0])
+        if os.path.isdir(args.config):
+            config_dir = args.config
+            config_files = os.listdir(config_dir)
+            args.config = [
+                os.path.join(config_dir, config)
+                for config in config_files
+            ]
+
+    if isinstance(args.config, list):
+        logger.info(
+            "Multiple configuration files specified, running nlu comparison mode."
+        )
+
+        config_files = []
+        for file in args.config:
+            try:
+                validate_yaml_schema(
+                    rasa.utils.io.read_file(file),
+                    CONFIG_SCHEMA_FILE,
+                    show_validation_errors=False,
+                )
+                config_files.append(file)
+            except InvalidYamlFileError:
+                logger.debug(
+                    "Ignoring file '{}' as it is not a valid config file.".format(file)
+                )
+                continue
+
+        output = args.report or DEFAULT_NLU_RESULTS_PATH
+        compare_nlu_models(
+            configs=config_files,
+            nlu=nlu_data,
+            output=output,
+            runs=args.runs,
+            exclusion_percentages=args.percentages,
+        )
+    elif args.cross_validation:
+        logger.info("Test model using cross validation.")
+        config = get_validated_path(args.config, "config", DEFAULT_CONFIG_PATH)
+        perform_nlu_cross_validation(config, nlu_data, vars(args))
+    else:
         model_path = get_validated_path(args.model, "model", DEFAULT_MODELS_PATH)
         test_nlu(model_path, nlu_data, vars(args))
-    else:
-        print ("No model specified. Model will be trained using cross validation.")
-        config = get_validated_path(args.config, "config", DEFAULT_CONFIG_PATH)
-
-        test_nlu_with_cross_validation(config, nlu_data, vars(args))
 
 
 def test(args: argparse.Namespace):
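
Review note: the way ``test_nlu`` turns the ``--config`` value into either a single path or a list of paths can be sketched in isolation as below. ``expand_config_arg`` is a hypothetical helper, not part of this commit. Each name returned by ``os.listdir`` is a bare file name, so the sketch joins it directly onto the folder path. Sorting the result is an addition of this sketch to make the order deterministic.

```python
import os
import tempfile
from typing import List, Optional, Union


def expand_config_arg(
    config: Optional[List[str]],
) -> Union[None, str, List[str]]:
    """Return None, a single absolute path, or a list of config paths."""
    if config is None:
        return None
    if len(config) > 1:
        # Several files were passed explicitly: comparison mode.
        return list(config)
    path = os.path.abspath(config[0])
    if os.path.isdir(path):
        # A folder was passed: list its entries as full paths.
        return sorted(os.path.join(path, name) for name in os.listdir(path))
    # A single file was passed: cross validation or plain testing.
    return path


# Usage with a throwaway folder holding two empty files.
folder = tempfile.mkdtemp()
for name in ("b.yml", "a.yml"):
    open(os.path.join(folder, name), "w").close()

print(expand_config_arg([folder]))
```

A list result corresponds to comparison mode, while a string result leaves the choice between cross validation and plain model testing to the remaining flags.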