fix chatbot example (asyml#711)
* fix chatbot example

* remove path insertion

* edit tutorial based on the review

* edit tutorial based on the review

Co-authored-by: Hector <[email protected]>
hepengfe and hunterhector authored Mar 31, 2022
1 parent c5a6996 commit 3de4bee
Showing 3 changed files with 54 additions and 34 deletions.
83 changes: 50 additions & 33 deletions examples/chatbot/README.md
@@ -1,23 +1,40 @@
# Retrieval-based Chatbot

This example showcases the use of `Forte` to build a retrieval-based chatbot and perform text
analysis on the retrieved results. We use the dataset released as part of this paper
[Target-Guided Open-Domain Conversation](https://arxiv.org/abs/1905.11553). The dataset consists
of conversations between two entities A and B. We finetune a BERT model that helps retrieve a
response for a context.

**Note**: All the commands below should be run from `examples/chatbot/` directory.

In this example, the user speaks in German and the bot extracts information stored in English. The
bot finally translates the response to German. For text analysis, we run a *Semantic Role
Labeler* (SRL) on the retrieved response to identify predicate mentions and arguments. Let us see a
step-by-step guide on how to run this example.

## Packages
First, set up the `forte` environment by installing the packages listed in `requirements.txt` in the project root folder. Then install the additional packages below:
```
pip install termcolor
pip install tensorflow==2.4 # version compatible with numpy==1.19.5
pip install faiss-gpu
```

Then follow the installation instructions in [forte-wrappers](https://github.com/asyml/forte-wrappers).
After cloning that repository, run the following commands from the `forte-wrappers` folder to install the required wrapper packages:
```
pip install src/faiss
pip install src/huggingface
pip install src/nltk
```


## Using the example in inference mode

### Downloading the models

Before we run the chatbot, we need to download the models.

- Download chatbot model

@@ -37,8 +54,8 @@ python download_models.py --model-name indexer
python download_models.py --model-name srl
```

**Note**: All the models will be saved in the `model/` directory. To change the save directory, use the
`--path` option in the commands above. If you change the model directory, please ensure that you
update the path for each of the processors in the `config.yml` file.

### Running the example
@@ -49,22 +66,22 @@ Now to see the example in action, just run
python chatbot_example.py
```

This starts an interactive Python program that prompts the user for an input in German. The
program translates the input to English, retrieves the most relevant response from the corpus,
translates it back to German, and runs analysis on the output. The user can quit the program by
pressing `Ctrl + D`.
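
At a high level, the example runs a translate → retrieve → translate-back → analyze loop. The sketch below shows only that control flow; every helper name in it is a hypothetical stand-in, not the actual code (`chatbot_example.py` wires Forte processors into a pipeline over a `MultiPack` instead of calling plain functions).

```python
# Minimal sketch of the chatbot loop; NOT the actual implementation in
# chatbot_example.py. All helper names below are hypothetical stand-ins.

def translate_de_to_en(text: str) -> str:
    # Stand-in for the machine-translation step (German -> English).
    return text

def retrieve_response(query_en: str) -> str:
    # Stand-in for the BERT/faiss-based retriever over the English corpus.
    return "i have two dogs and a cat ."

def translate_en_to_de(text: str) -> str:
    # Stand-in for the machine-translation step (English -> German).
    return text

def analyze_srl(response_en: str) -> None:
    # Stand-in for the Semantic Role Labeler run on the retrieved response.
    print("[SRL analysis of]:", response_en)

def chat_loop() -> None:
    try:
        while True:
            user_de = input("You (German): ")
            query_en = translate_de_to_en(user_de)
            response_en = retrieve_response(query_en)
            analyze_srl(response_en)
            print("Bot (German):", translate_en_to_de(response_en))
    except EOFError:
        # Ctrl + D ends the session, as in the real example.
        print("\nBye!")

if __name__ == "__main__":
    chat_loop()
```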


## Training a Chatbot

We have designed this example to explain how to build and train a chatbot using Forte. Follow the steps
below.

### Downloading the dataset

We use the conversation dataset used in the paper
[Target-Guided Open-Domain Conversation](https://arxiv.org/abs/1905.11553). Refer to
[this](https://github.com/squareRoot3/Target-Guided-Conversation) repository to download the
dataset. The dataset consists of several conversations between two entities A and B. Download and
extract the dataset into the `data/` folder.

@@ -77,19 +94,19 @@ sed -i '810s/.*/2 mine too ! ! ! ! ! now i can play quake and feed my dogs\tnice

You can replace *"nice. do you have farms?"* with any text as long as the response is
reasonable for the conversation.

### Prepare the dataset

We augment each `(sentenceA, sentenceB)` pair with historical context. The contextual information
will improve the search results. We prepare the data through the following steps:

- To extract the context, we augment historical information (of up to length 2) for every sentence
pair, i.e., if the conversation is `[A1, B1], [A2, B2], [A3, B3]...`, then for the sentence pair
`[A3, B3]` we create a pair using a history of length 2 as `[(A1,B1,A2,B2,A3), B3]`, where `(...)`
indicates concatenation.

- We generate negative examples for each context by randomly shuffling the responses, i.e., for a
sentence pair `[A, B]` we label the pair `(A, B)` as `positive` and the pair `(A, B')` as `negative`,
where `B'` is randomly picked from the pool of responses. To prepare the dataset as above, run the
command below; a short sketch of this pairing logic follows it.

```bash
python prepare_chatbot_data.py
```
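
The following is a minimal sketch of the pairing logic described above, assuming the conversation is given as a list of `(A, B)` turn pairs. The function names and details are illustrative only, not the actual contents of `prepare_chatbot_data.py`.

```python
import random

# Simplified illustration of the data preparation described above; NOT the
# actual implementation in prepare_chatbot_data.py.

def build_pairs(conversation, history=2):
    """conversation: [(A1, B1), (A2, B2), ...] -> list of (context, response)."""
    pairs = []
    for i, (a, b) in enumerate(conversation):
        # Concatenate up to `history` previous (A, B) turns plus the current A.
        past = conversation[max(0, i - history):i]
        context_parts = [utt for turn in past for utt in turn] + [a]
        pairs.append((" ".join(context_parts), b))
    return pairs

def add_negatives(pairs, seed=0):
    """Label each true (context, response) pair 1 and a randomly drawn response 0."""
    rng = random.Random(seed)
    responses = [resp for _, resp in pairs]
    examples = []
    for context, resp in pairs:
        examples.append((context, resp, 1))                   # positive pair
        examples.append((context, rng.choice(responses), 0))  # negative pair
    return examples

conv = [("A1", "B1"), ("A2", "B2"), ("A3", "B3")]
for example in add_negatives(build_pairs(conv)):
    print(example)
```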

### Finetune BERT for chatbot dataset

We finetune BERT for a sentence-similarity task using the above dataset. In particular, we use a
Siamese BERT network structure and finetune it for a binary classification task. This idea is inspired
by [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](https://arxiv.org/abs/1908.10084).
After this finetuning process, semantically related contexts and responses will be geometrically
close to each other. This finetuned model is used in the indexer to retrieve semantically meaningful
responses. To run this finetuning:

```bash
python finetune_bert_chatbot.py
```
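
For intuition, here is a minimal sketch of how a Siamese (shared-weight) BERT encoder can score a context/response pair with cosine similarity, using the Hugging Face `transformers` package. This is an illustration under stated assumptions (an assumed checkpoint name, mean pooling, and an extra `transformers` dependency); it is not the training code in `finetune_bert_chatbot.py`.

```python
# Minimal sketch of a Siamese BERT similarity scorer; NOT the code in
# finetune_bert_chatbot.py. Model name, pooling, and scoring are illustrative.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumption: any BERT checkpoint would do
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

def embed(texts):
    """Mean-pool the last hidden states into one vector per text."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state           # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)               # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)                # (B, H)

# Shared encoder = "Siamese": the same weights embed both context and response.
context_vec = embed(["i have two dogs . do you have pets ?"])
response_vec = embed(["yes , i have a cat named max ."])
score = F.cosine_similarity(context_vec, response_vec)
print(float(score))  # higher = more likely to be the matching response

# During finetuning, the positive/negative pairs prepared earlier would drive a
# binary classification (or contrastive) loss on scores like this one.
```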

We finetune for 1 epoch using the Adam optimizer. This process takes ~1.5 hours to train on a single
GeForce GTX 1080 Ti with 11GB of GPU memory. After 1 epoch, test accuracy should be around `81%`.
The model is saved in `model/`.

```bash
Evaluating on eval dataset. Accuracy based on Cosine Similarity: 0.8014592933947773,Accuracy
based on logits: 0.8031753897666931,nsamples: 7812
step: 8050; loss: 0.2928136885166168
step: 8100; loss: 0.4982462227344513
step: 8150; loss: 0.5195412039756775
step: 8200; loss: 0.15928469598293304
Evaluating on test dataset. Accuracy based on Cosine Similarity: 0.8077812018489985,Accuracy
based on logits: 0.8115370273590088,nsamples: 7788
Saving the model...
```
@@ -143,7 +160,7 @@ python chatbot_example.py

to see your chatbot in action.

Below is a sample run of the chatbot example:

![](example.gif)

@@ -156,4 +173,4 @@ following environment variable to use the processor
```
export MICROSOFT_API_KEY=<YOUR_MICROSOFT_KEY>
export LOCATION=<YOUR_API_RESOURCE_LOCATION>
```
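
A quick way to check that both variables are visible before running the example (plain standard-library Python; the variable names come from the snippet above):

```python
import os

# Fail fast if the Microsoft translation credentials are not set.
for var in ("MICROSOFT_API_KEY", "LOCATION"):
    if not os.environ.get(var):
        raise RuntimeError(f"Please export {var} before running the example.")
print("Translator credentials found.")
```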
2 changes: 2 additions & 0 deletions examples/chatbot/chatbot_example.py
@@ -14,6 +14,7 @@
import yaml
from termcolor import colored
import torch

from fortex.nltk import NLTKSentenceSegmenter, NLTKWordTokenizer, NLTKPOSTagger
from forte.common.configuration import Config
from forte.data.multi_pack import MultiPack
@@ -123,6 +124,7 @@ def main(config: Config):

input(colored("Press ENTER to continue...\n", "green"))


if __name__ == "__main__":
    all_config = Config(yaml.safe_load(open("config.yml", "r")), None)
    main(all_config)
3 changes: 2 additions & 1 deletion forte/processors/ir/search_processor.py
@@ -59,8 +59,9 @@ def _process(self, input_pack: MultiPack):
    def default_configs(cls) -> Dict[str, Any]:
        return {
            "model_dir": None,
            "query_pack_name": "query",
            "response_pack_name_prefix": "doc",
-           "indexer_class": "forte.faiss.embedding_based_indexer"
+           "indexer_class": "fortex.faiss.embedding_based_indexer"
            ".EmbeddingBasedIndexer",
            "indexer_configs": {
                "index_type": "GpuIndexFlatIP",
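
The `indexer_class` fixed above points at an embedding-based indexer. Conceptually, such an indexer stores response embeddings in a faiss inner-product index and returns the nearest responses for a query embedding. Below is a minimal conceptual sketch with plain `faiss` and random vectors, using the CPU `IndexFlatIP` rather than the `GpuIndexFlatIP` named in the config; it is not the `EmbeddingBasedIndexer` implementation itself.

```python
# Conceptual sketch of inner-product retrieval with faiss; NOT the
# EmbeddingBasedIndexer referenced in the config above.
import faiss
import numpy as np

dim = 768                                   # e.g. BERT hidden size
rng = np.random.default_rng(0)
response_vecs = rng.standard_normal((1000, dim)).astype("float32")

index = faiss.IndexFlatIP(dim)              # exact inner-product index (CPU)
index.add(response_vecs)                    # index all response embeddings

query_vec = rng.standard_normal((1, dim)).astype("float32")
scores, ids = index.search(query_vec, 5)    # top-5 most similar responses
print(ids[0], scores[0])
```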
