fix chatbot example (asyml#711)
* fix chatbot example

* remove path insertion

* edit tutorial based on the review

* edit tutorial based on the review

Co-authored-by: Hector <[email protected]>
hepengfe and hunterhector authored Mar 31, 2022
1 parent c5a6996 commit 3de4bee
Showing 3 changed files with 54 additions and 34 deletions.
83 changes: 50 additions & 33 deletions examples/chatbot/README.md
@@ -1,23 +1,40 @@
# Retrieval-based Chatbot

This example showcases the use of `Forte` to build a retrieval-based chatbot and perform text
analysis on the retrieved results. We use the dataset released as part of this paper
[Target-Guided Open-Domain Conversation](https://arxiv.org/abs/1905.11553). The dataset consists
of conversations between two entities A and B. We finetune a BERT model that helps retrieve a
response for a context.

**Note**: All the commands below should be run from `examples/chatbot/` directory.

In this example, the user speaks in German and the bot extracts information stored in English. The
bot finally translates the response to German. For text analysis, we run a *Semantic Role
Labeler* (SRL) on the retrieved response to identify predicate mentions and arguments. Let us see a
step-by-step guide on how to run this example.

## Packages
First, set up the `forte` environment by installing the packages listed in `requirements.txt` in the project root folder. Then install the additional packages below:
```
pip install termcolor
pip install tensorflow==2.4 # version compatible with numpy==1.19.5
pip install faiss-gpu
```

Then follow the installation instructions in [forte-wrappers](https://github.com/asyml/forte-wrappers).
After cloning that repository, run the following commands from the `forte-wrappers` folder to install the required wrapper packages:
```
pip install src/faiss
pip install src/huggingface
pip install src/nltk
```


## Using the example in inference mode

### Downloading the models

Before we run the chatbot, we need to download the models.

- Download chatbot model

@@ -37,8 +54,8 @@ python download_models.py --model-name indexer
python download_models.py --model-name srl
```

**Note**: All the models will be saved in the `model/` directory. To change the save directory, use the
`--path` option in the commands above. If you change the model directory, please ensure that you
update the path for each of the processors in the `config.yml` file.

### Running the example
@@ -49,22 +66,22 @@ Now to see the example in action, just run
python chatbot_example.py
```

This starts an interactive Python program that prompts the user for an input in German. The
program translates the input to English, retrieves the most relevant response from the corpus,
translates it back to German, and runs analysis on the output. The user can quit the program by
pressing `Ctrl + D`.
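
At a high level, the example runs a translate → retrieve → translate-back → analyze loop. The sketch below shows only that control flow; every helper name in it is a hypothetical stand-in, not the actual code (`chatbot_example.py` wires Forte processors into a pipeline over a `MultiPack` instead of calling plain functions).

```python
# Minimal sketch of the chatbot loop; NOT the actual implementation in
# chatbot_example.py. All helper names below are hypothetical stand-ins.

def translate_de_to_en(text: str) -> str:
    # Stand-in for the machine-translation step (German -> English).
    return text

def retrieve_response(query_en: str) -> str:
    # Stand-in for the BERT/faiss-based retriever over the English corpus.
    return "i have two dogs and a cat ."

def translate_en_to_de(text: str) -> str:
    # Stand-in for the machine-translation step (English -> German).
    return text

def analyze_srl(response_en: str) -> None:
    # Stand-in for the Semantic Role Labeler run on the retrieved response.
    print("[SRL analysis of]:", response_en)

def chat_loop() -> None:
    try:
        while True:
            user_de = input("You (German): ")
            query_en = translate_de_to_en(user_de)
            response_en = retrieve_response(query_en)
            analyze_srl(response_en)
            print("Bot (German):", translate_en_to_de(response_en))
    except EOFError:
        # Ctrl + D ends the session, as in the real example.
        print("\nBye!")

if __name__ == "__main__":
    chat_loop()
```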


## Training a Chatbot

We have designed this example to explain how to build and train a chatbot using Forte. Follow the steps
below.

### Downloading the dataset

We use the conversation dataset used in the paper
[Target-Guided Open-Domain Conversation](https://arxiv.org/abs/1905.11553). Refer to
[this](https://github.com/squareRoot3/Target-Guided-Conversation) repository to download the
dataset. The dataset consists of several conversations between two entities A and B. Download and
extract the dataset into the `data/` folder.

@@ -77,19 +94,19 @@ sed -i '810s/.*/2 mine too ! ! ! ! ! now i can play quake and feed my dogs\tnice

You can replace *"nice. do you have farms?"* with any text as long as the response is
reasonable for the conversation.

### Prepare the dataset

We augment each `(sentenceA, sentenceB)` pair with historical context. The contextual information
will improve the search results. We prepare the data through the following steps:

- To extract the context, we augment historical information (of up to length 2) for every sentence
pair, i.e., if the conversation is `[A1, B1], [A2, B2], [A3, B3]...`, then for the sentence pair
`[A3, B3]` we create a pair using a history of length 2 as `[(A1,B1,A2,B2,A3), B3]`, where `(...)`
indicates concatenation.

- We generate negative examples for each context by randomly shuffling the responses, i.e., for a
sentence pair `[A, B]` we label the pair `(A, B)` as `positive` and the pair `(A, B')` as `negative`,
where `B'` is randomly picked from the pool of responses. To prepare the dataset as above, run the
command below; a short sketch of this pairing logic follows it.

```bash
python prepare_chatbot_data.py
```
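
The following is a minimal sketch of the pairing logic described above, assuming the conversation is given as a list of `(A, B)` turn pairs. The function names and details are illustrative only, not the actual contents of `prepare_chatbot_data.py`.

```python
import random

# Simplified illustration of the data preparation described above; NOT the
# actual implementation in prepare_chatbot_data.py.

def build_pairs(conversation, history=2):
    """conversation: [(A1, B1), (A2, B2), ...] -> list of (context, response)."""
    pairs = []
    for i, (a, b) in enumerate(conversation):
        # Concatenate up to `history` previous (A, B) turns plus the current A.
        past = conversation[max(0, i - history):i]
        context_parts = [utt for turn in past for utt in turn] + [a]
        pairs.append((" ".join(context_parts), b))
    return pairs

def add_negatives(pairs, seed=0):
    """Label each true (context, response) pair 1 and a randomly drawn response 0."""
    rng = random.Random(seed)
    responses = [resp for _, resp in pairs]
    examples = []
    for context, resp in pairs:
        examples.append((context, resp, 1))                   # positive pair
        examples.append((context, rng.choice(responses), 0))  # negative pair
    return examples

conv = [("A1", "B1"), ("A2", "B2"), ("A3", "B3")]
for example in add_negatives(build_pairs(conv)):
    print(example)
```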

### Finetune BERT for chatbot dataset

We finetune BERT for a sentence-similarity task using the above dataset. In particular, we use a
Siamese BERT network structure and finetune it for a binary classification task. This idea is inspired
by [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](https://arxiv.org/abs/1908.10084).
After this finetuning process, semantically related contexts and responses will be geometrically
close to each other. This finetuned model is used in the indexer to retrieve semantically meaningful
responses. To run this finetuning:

```bash
python finetune_bert_chatbot.py
```
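
For intuition, here is a minimal sketch of how a Siamese (shared-weight) BERT encoder can score a context/response pair with cosine similarity, using the Hugging Face `transformers` package. This is an illustration under stated assumptions (an assumed checkpoint name, mean pooling, and an extra `transformers` dependency); it is not the training code in `finetune_bert_chatbot.py`.

```python
# Minimal sketch of a Siamese BERT similarity scorer; NOT the code in
# finetune_bert_chatbot.py. Model name, pooling, and scoring are illustrative.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumption: any BERT checkpoint would do
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

def embed(texts):
    """Mean-pool the last hidden states into one vector per text."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state           # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)               # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)                # (B, H)

# Shared encoder = "Siamese": the same weights embed both context and response.
context_vec = embed(["i have two dogs . do you have pets ?"])
response_vec = embed(["yes , i have a cat named max ."])
score = F.cosine_similarity(context_vec, response_vec)
print(float(score))  # higher = more likely to be the matching response

# During finetuning, the positive/negative pairs prepared earlier would drive a
# binary classification (or contrastive) loss on scores like this one.
```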

We finetune for 1 epoch using the Adam optimizer. This process takes ~1.5 hours to train on a single
GeForce GTX 1080 Ti with 11GB of GPU memory. After 1 epoch, test accuracy should be around `81%`.
The model is saved in `model/`.

```bash
Evaluating on eval dataset. Accuracy based on Cosine Similarity: 0.8014592933947773,Accuracy
based on logits: 0.8031753897666931,nsamples: 7812
step: 8050; loss: 0.2928136885166168
step: 8100; loss: 0.4982462227344513
step: 8150; loss: 0.5195412039756775
step: 8200; loss: 0.15928469598293304
Evaluating on test dataset. Accuracy based on Cosine Similarity: 0.8077812018489985,Accuracy
based on logits: 0.8115370273590088,nsamples: 7788
Saving the model...
```
@@ -143,7 +160,7 @@ python chatbot_example.py

to see your chatbot in action.

Below is a sample run of the chatbot example:

![](example.gif)

@@ -156,4 +173,4 @@ following environment variable to use the processor
```
export MICROSOFT_API_KEY=<YOUR_MICROSOFT_KEY>
export LOCATION=<YOUR_API_RESOURCE_LOCATION>
```
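
A quick way to check that both variables are visible before running the example (plain standard-library Python; the variable names come from the snippet above):

```python
import os

# Fail fast if the Microsoft translation credentials are not set.
for var in ("MICROSOFT_API_KEY", "LOCATION"):
    if not os.environ.get(var):
        raise RuntimeError(f"Please export {var} before running the example.")
print("Translator credentials found.")
```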
2 changes: 2 additions & 0 deletions examples/chatbot/chatbot_example.py
@@ -14,6 +14,7 @@
import yaml
from termcolor import colored
import torch

from fortex.nltk import NLTKSentenceSegmenter, NLTKWordTokenizer, NLTKPOSTagger
from forte.common.configuration import Config
from forte.data.multi_pack import MultiPack
@@ -123,6 +124,7 @@ def main(config: Config):

input(colored("Press ENTER to continue...\n", "green"))


if __name__ == "__main__":
    all_config = Config(yaml.safe_load(open("config.yml", "r")), None)
    main(all_config)
3 changes: 2 additions & 1 deletion forte/processors/ir/search_processor.py
@@ -59,8 +59,9 @@ def _process(self, input_pack: MultiPack):
    def default_configs(cls) -> Dict[str, Any]:
        return {
            "model_dir": None,
            "query_pack_name": "query",
            "response_pack_name_prefix": "doc",
-           "indexer_class": "forte.faiss.embedding_based_indexer"
+           "indexer_class": "fortex.faiss.embedding_based_indexer"
            ".EmbeddingBasedIndexer",
            "indexer_configs": {
                "index_type": "GpuIndexFlatIP",
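
The `indexer_class` fixed above points at an embedding-based indexer. Conceptually, such an indexer stores response embeddings in a faiss inner-product index and returns the nearest responses for a query embedding. Below is a minimal conceptual sketch with plain `faiss` and random vectors, using the CPU `IndexFlatIP` rather than the `GpuIndexFlatIP` named in the config; it is not the `EmbeddingBasedIndexer` implementation itself.

```python
# Conceptual sketch of inner-product retrieval with faiss; NOT the
# EmbeddingBasedIndexer referenced in the config above.
import faiss
import numpy as np

dim = 768                                   # e.g. BERT hidden size
rng = np.random.default_rng(0)
response_vecs = rng.standard_normal((1000, dim)).astype("float32")

index = faiss.IndexFlatIP(dim)              # exact inner-product index (CPU)
index.add(response_vecs)                    # index all response embeddings

query_vec = rng.standard_normal((1, dim)).astype("float32")
scores, ids = index.search(query_vec, 5)    # top-5 most similar responses
print(ids[0], scores[0])
```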
