MMChat

🌟 Demos

  • QLoRA fine-tune for InternLM-7B Open In Colab
  • Chat with Llama2-7B-Plugins Open In Colab
  • Use MMChat in HuggingFace training pipeline Open In Colab

🧭 Introduction

MMChat is a toolkit for quickly fine-tuning LLMs, developed by the MMRazor and MMDeploy teams. It has the following core features:

  • Embrace HuggingFace, providing fast support for new models, datasets, and algorithms.
  • Provide a comprehensive solution and related models for MOSS plugin datasets.
  • Support arbitrary combinations of multiple datasets during fine-tuning.
  • Compatible with DeepSpeed, enabling efficient fine-tuning of LLMs on multiple GPUs.
  • Support QLoRA, enabling efficient fine-tuning of LLMs using free resources on Colab (see the sketch below).
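For context, QLoRA combines a 4-bit quantized base model with small trainable LoRA adapters. The snippet below is a minimal, illustrative sketch of that idea using plain HuggingFace transformers + peft; it is not MMChat's own API, and the model name and target modules are placeholders that depend on the architecture:

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model

    # Load the base model with 4-bit NF4 quantization (the "Q" in QLoRA)
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        "internlm/internlm-7b",          # placeholder model
        quantization_config=bnb_config,
        trust_remote_code=True,
    )

    # Attach small trainable LoRA adapters (the "LoRA" in QLoRA)
    lora_config = LoraConfig(
        r=64,
        lora_alpha=16,
        target_modules=["q_proj", "v_proj"],  # illustrative; depends on the model
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # only the adapter weights are trainable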

💥 The MMRazor and MMDeploy teams have also collaborated in developing LMDeploy, a toolkit for compressing, deploying, and serving LLMs. Subscribe to stay updated on our latest developments.

🔥 Supports

  • Models
  • Datasets
  • Strategies
  • Algorithms

🛠️ Quick Start

Installation

Below are quick steps for installation:

conda create -n mmchat python=3.10
conda activate mmchat
git clone XXX
cd MMChat
pip install -v -e .

Chat Open In Colab

MMChat provides the tools to chat with pretrained / fine-tuned LLMs.

  • For example, we can start the chat with Llama2-7B-Plugins by

    python ./tools/chat_hf.py meta-llama/Llama-2-7b --adapter XXX --bot-name Llama2 --prompt plugins --with-plugins --command-stop-word "<eoc>" --answer-stop-word "<eom>" --no-streamer

For more usage examples, please see TODO.
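Conceptually, chatting with a fine-tuned adapter boils down to loading the base model, attaching the adapter, and generating. Below is a minimal sketch with transformers + peft, assuming a hypothetical adapter path; it shows the underlying idea, not what chat_hf.py does verbatim:

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base = "meta-llama/Llama-2-7b-hf"   # placeholder base model
    adapter = "path/to/hf_adapter"      # placeholder, see --adapter above

    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")
    model = PeftModel.from_pretrained(model, adapter)  # attach the fine-tuned adapter

    inputs = tokenizer("Hello, who are you?", return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))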

Fine-tune Open In Colab

MMChat supports efficient fine-tuning (e.g., QLoRA) of Large Language Models (LLMs).

Taking QLoRA fine-tuning as an example:

  • We can start the QLoRA fine-tuning of InternLM-7B with the oasst1 dataset by
    # On a single GPU
    python ./tools/train.py ./configs/internlm/internlm_7b/internlm_7b_qlora_oasst1.py
    # On multiple GPUs
    bash ./tools/dist_train.sh ./configs/internlm/internlm_7b/internlm_7b_qlora_oasst1.py ${GPU_NUM}

For more usage examples, please see TODO.

Deploy

  • Step 0, convert the pth adapter to a HuggingFace adapter by

    python ./tools/model_converters/adapter_pth2hf.py \
        ${CONFIG_FILE} \
        ${PATH_TO_PTH_ADAPTER} \
        ${SAVE_PATH_TO_HF_ADAPTER}
  • Step 1, merge the HuggingFace adapter into the pretrained LLM (see the sketch after these steps for what this does conceptually) by

    python ./tools/model_converters/merge_lora_hf.py \
        ${NAME_OR_PATH_TO_HF_MODEL} \
        ${NAME_OR_PATH_TO_HF_ADAPTER} \
        ${SAVE_PATH}
  • Step 2, deploy the merged LLM with any other framework, such as LMDeploy 🚀.

    pip install lmdeploy
    python -m lmdeploy.pytorch.chat ${NAME_OR_PATH_TO_HF_MODEL} \
        --max_new_tokens 256 \
        --temperature 0.8 \
        --top_p 0.95 \
        --seed 0

    🎯 We are working closely with the LMDeploy team to support the deployment of dialogues with plugins!
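For reference, the merge in Step 1 is conceptually the standard PEFT merge-and-unload operation, i.e. folding the LoRA weights back into the base model so the result is a plain HuggingFace checkpoint. A hedged sketch (not MMChat's own script; the paths are the same placeholders as above):

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base = "NAME_OR_PATH_TO_HF_MODEL"
    adapter = "NAME_OR_PATH_TO_HF_ADAPTER"
    save_path = "SAVE_PATH"

    model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto")
    model = PeftModel.from_pretrained(model, adapter)
    merged = model.merge_and_unload()    # fold the LoRA deltas into the base weights
    merged.save_pretrained(save_path)    # a plain HuggingFace model, no peft needed to load it
    AutoTokenizer.from_pretrained(base).save_pretrained(save_path)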

Evaluation

  • We recommend using OpenCompass, a comprehensive and systematic LLM evaluation library, which currently supports 50+ datasets with about 300,000 questions.

🔜 Roadmap

🎖️ Acknowledgement

🎫 License
