Fine-tuning a chatbot on Messenger conversations
This was trained on a single RTX 4090. For other GPUs, adjust the model size or LoRA parameters accordingly.
Set some environment variables:
export RUN_DIR=/path/to/run/dir # This is where all your training logs and checkpoints will be written
export EVAL_RUN_DIR=/path/to/eval/run/dir # This is where all your evaluation logs will be written
It's also a good idea to set these variables:
export DATA_DIR=/path/to/data/dir # This is where your datasets are stored
export MODEL_DIR=/path/to/model/dir # This is where your pretrained models are stored
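The run scripts presumably pick these paths up from the process environment; the helper below is only an illustrative sketch of that lookup, not part of the repo:

```python
import os
from pathlib import Path

def env_dir(name: str) -> Path:
    """Read a directory path from the environment and create it if needed."""
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"{name} is not set; export it before launching a run")
    path = Path(value).expanduser()
    path.mkdir(parents=True, exist_ok=True)
    return path

RUN_DIR = env_dir("RUN_DIR")            # training logs and checkpoints
EVAL_RUN_DIR = env_dir("EVAL_RUN_DIR")  # evaluation logs
DATA_DIR = env_dir("DATA_DIR")          # datasets
MODEL_DIR = env_dir("MODEL_DIR")        # pretrained models
```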
- Create a new directory called $DATA_DIR/messenger
- Request a new JSON dump of your Messenger conversations from Facebook's Download Your Information page
- Download the ZIP file to the newly-created directory; it should end up at a path like $DATA_DIR/messenger/facebook-{username}.zip
- Run python -m chatbot.tasks.dataset to preprocess the dataset (a rough sketch of what this step iterates over is shown after the directory tree below)
So ultimately you should have a directory structure like this:
```
$DATA_DIR
└── messenger
    ├── facebook-{username}.zip
    ├── packed
    │   └── rwkv.bin
    └── messages
        └── inbox
            ├── {conversation_1}
            │   ├── message_1.json
            │   ├── message_2.json
            │   ├── ...
            │   └── message_n.json
            ├── {conversation_2}
            │   ├── message_1.json
            │   ├── message_2.json
            │   ├── ...
            │   └── message_n.json
            ├── ...
            └── {conversation_n}
                ├── message_1.json
                ├── message_2.json
                ├── ...
                └── message_n.json
```
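For reference, each message_N.json in the Facebook export contains a participants list and a messages list whose entries carry sender_name, timestamp_ms, and (for text messages) content fields. The actual packing into packed/rwkv.bin is done by chatbot.tasks.dataset; the sketch below only illustrates the kind of traversal that preprocessing has to do, assuming the ZIP has already been extracted under $DATA_DIR/messenger:

```python
import json
import os
from pathlib import Path

def iter_conversations(data_dir: Path):
    """Yield (conversation_name, messages) pairs from an extracted Messenger export."""
    inbox = data_dir / "messenger" / "messages" / "inbox"
    for conversation in sorted(inbox.iterdir()):
        if not conversation.is_dir():
            continue
        messages = []
        for dump_path in sorted(conversation.glob("message_*.json")):
            with dump_path.open(encoding="utf-8") as f:
                messages.extend(json.load(f).get("messages", []))
        # The export usually lists messages newest-first; sort chronologically.
        messages.sort(key=lambda m: m.get("timestamp_ms", 0))
        yield conversation.name, messages

if __name__ == "__main__":
    for name, messages in iter_conversations(Path(os.environ["DATA_DIR"])):
        texts = [m["content"] for m in messages if "content" in m]
        print(f"{name}: {len(texts)} text messages")
```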
Launch a training job:
runml train configs/rwkv.yaml
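If you're adapting the run to a smaller GPU (see the note at the top about model size and LoRA parameters), the knobs live in the YAML config. The exact key names depend on this repo's config schema, so the snippet below first dumps configs/rwkv.yaml to show what it exposes, then shows the general pattern of saving a modified copy; the overridden keys are placeholders, not real field names:

```python
import copy

import yaml  # pip install pyyaml

with open("configs/rwkv.yaml") as f:
    base = yaml.safe_load(f)

# See which model-size and LoRA settings the config actually exposes.
print(yaml.safe_dump(base, sort_keys=False))

# General pattern for a smaller-GPU variant: copy, override, save under a new name.
# "lora_rank" and "micro_batch_size" are hypothetical keys; replace them with
# whatever the dump above actually shows.
small = copy.deepcopy(base)
small.update({"lora_rank": 8, "micro_batch_size": 1})
with open("configs/rwkv-small.yaml", "w") as f:
    yaml.safe_dump(small, f, sort_keys=False)
```

You can then point the same command at the new file, e.g. runml train configs/rwkv-small.yaml.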