TigerBot

A cutting-edge foundation for your very own LLM.

🌐 TigerBot • 🤗 Hugging Face

English | Chinese

News

TigerBot is a multilingual, multitask LLM. We evaluated our MVP model on public NLP datasets and found that it reaches 96% of the performance of OpenAI InstructGPT at the same model size. We are open-sourcing the following:

  • Model: TigerBot-7B, TigerBot-7B-base, TigerBot-180B (research version).
  • Code:
    1. The complete training code, covering model pretraining and supervised fine-tuning.
    2. Model quantization with GPTQ.
    3. Inference on a single GPU or multiple GPUs.
  • Data:
    1. Pre-training data: 100GB of pretraining data, deduplicated and filtered for low-quality content from a 2TB corpus.
    2. SFT data: 1GB (millions) of textual instructions, covering 10 major user-instruction categories and 120 subcategories.
    3. Domain-specific data: datasets for specific domains: finance, law, and Wikipedia.
  • API: chat, plugin, and fine-tune APIs that let users build their own models and applications easily.

We pretrained and fine-tuned our models starting from a vanilla BLOOM, and have made the following algorithmic innovations so far:

  • A stronger yet more elegant supervised learning algorithm that achieves higher learnability in supervised fine-tuning.
  • A probabilistic modeling and ensemble approach that achieves better factuality and generativeness.
  • Improved memory management and multi-node communication for distributed training with DeepSpeed, enabling months of training in a thousand-GPU environment with zero downtime.
  • A specialized tokenizer and supervised training algorithm better suited to the more skewed distribution of the Chinese language.

Install

conda create --name tigerbot python=3.8
conda activate tigerbot
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia

git clone https://github.com/TigerResearch/TigerBot
cd TigerBot
pip install -r requirements.txt

Model Weights

Tigerbot-7B

Model                        Bits  Memory (GB)
tigerbot-7b-base             16    17.2
tigerbot-7b-sft              16    17.2
tigerbot-7b-sft-4bit-128g    4     8.5

Tigerbot-180B-Research

Model                        Bits  Memory (GB)
tigerbot-180b-sft            16    347.6
tigerbot-180b-sft-4bit-128g  4     108.5
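
As a rough aid for choosing a checkpoint, the sketch below looks up which models fit in a given amount of GPU memory. The memory figures are copied from the tables above; the function name `models_that_fit` is hypothetical, and the check assumes memory pools perfectly across GPUs and ignores any runtime overhead beyond the listed figures, so treat the result as a lower bound on what you need.

```python
# Memory requirements in GB, copied from the model weight tables.
MODEL_MEMORY_GB = {
    "tigerbot-7b-base": 17.2,
    "tigerbot-7b-sft": 17.2,
    "tigerbot-7b-sft-4bit-128g": 8.5,
    "tigerbot-180b-sft": 347.6,
    "tigerbot-180b-sft-4bit-128g": 108.5,
}


def models_that_fit(gpu_memory_gb, num_gpus=1):
    """Return the models whose listed memory fits in the combined GPU memory.

    Assumption: memory is pooled evenly across GPUs with no extra overhead.
    """
    total_gb = gpu_memory_gb * num_gpus
    return sorted(
        name for name, needed_gb in MODEL_MEMORY_GB.items() if needed_gb <= total_gb
    )


if __name__ == "__main__":
    print(models_that_fit(24))       # one 24 GB card
    print(models_that_fit(80, 2))    # two 80 GB cards
```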

Training and Inference

Pre-training

Install DeepSpeed

git clone https://github.com/microsoft/DeepSpeed/
cd DeepSpeed
rm -rf build
TORCH_CUDA_ARCH_LIST="8.0" DS_BUILD_CPU_ADAM=1 DS_BUILD_UTILS=1 pip install . \
--global-option="build_ext" --global-option="-j8" --no-cache -v \
--disable-pip-version-check 2>&1 | tee build.log

Set TORCH_CUDA_ARCH_LIST to the compute capability of the GPUs you intend to use. To find it, run:

CUDA_VISIBLE_DEVICES=0 python -c "import torch; print(torch.cuda.get_device_capability())"

If this prints (8, 0), use TORCH_CUDA_ARCH_LIST="8.0".
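
If a machine mixes GPU generations, the conversion from capability tuples to the environment variable can be sketched as follows. The helper name `arch_list` is hypothetical; it only formats the tuples, which you would collect by calling `torch.cuda.get_device_capability(i)` for each device index.

```python
def arch_list(capabilities):
    """Format (major, minor) capability tuples as a TORCH_CUDA_ARCH_LIST value.

    Duplicates are dropped and entries are sorted, then joined with ';'.
    """
    unique = sorted(set(capabilities))
    return ";".join(f"{major}.{minor}" for major, minor in unique)


if __name__ == "__main__":
    # Example input: capability tuples as returned by
    # torch.cuda.get_device_capability() on each visible GPU.
    print(arch_list([(8, 0)]))
    print(arch_list([(8, 6), (8, 0), (8, 0)]))
```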

Command to start training

deepspeed \
--include="localhost:0,1,2,3" \
./train_clm.py \
--deepspeed ./ds_config/ds_config_zero3.json \
--model_name_or_path TigerResearch/tigerbot-7b-base \
--dataset_name TigerResearch/dev_pretrain \
--do_train \
--output_dir ./ckpt-clm \
--overwrite_output_dir \
--preprocess_num_workers 8 \
--num_train_epochs 5 \
--learning_rate 1e-5 \
--evaluation_strategy steps \
--eval_steps 10 \
--bf16 True \
--save_strategy steps \
--save_steps 10 \
--save_total_limit 2 \
--logging_steps 10 \
--tf32 True \
--per_device_train_batch_size 2 \
--per_device_eval_batch_size 2
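
To see how many samples one optimizer step consumes with the command above, the arithmetic can be sketched as below. Both helper names are hypothetical. The gradient accumulation value lives in the DeepSpeed config file, which is not shown here, so the default of 1 is an assumption; the parser also assumes every host in `--include` lists its GPU slots explicitly.

```python
def count_gpus(include):
    """Count GPU slots in a DeepSpeed --include string like 'localhost:0,1,2,3'.

    Hosts are separated by '@'. Assumption: every host lists its slots
    explicitly; a host given without slots is counted as zero here.
    """
    total = 0
    for host_spec in include.split("@"):
        _, _, slots = host_spec.partition(":")
        total += len([slot for slot in slots.split(",") if slot])
    return total


def global_batch_size(num_gpus, per_device_batch_size, gradient_accumulation_steps=1):
    """Samples consumed per optimizer step across all GPUs."""
    return num_gpus * per_device_batch_size * gradient_accumulation_steps


if __name__ == "__main__":
    gpus = count_gpus("localhost:0,1,2,3")
    print(gpus, global_batch_size(gpus, per_device_batch_size=2))
```

With four GPUs and a per-device batch size of 2, each step processes 8 samples unless the config sets gradient accumulation above 1.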

Fine-tuning

Command to start training

deepspeed \
--include="localhost:0,1,2,3" \
./train_sft.py \
--deepspeed ./ds_config/ds_config_zero3.json \
--model_name_or_path TigerResearch/tigerbot-7b-base \
--dataset_name TigerResearch/dev_sft \
--do_train \
--output_dir ./ckpt-sft \
--overwrite_output_dir \
--preprocess_num_workers 8 \
--num_train_epochs 5 \
--learning_rate 1e-5 \
--evaluation_strategy steps \
--eval_steps 10 \
--bf16 True \
--save_strategy steps \
--save_steps 10 \
--save_total_limit 2 \
--logging_steps 10 \
--tf32 True \
--per_device_train_batch_size 2 \
--per_device_eval_batch_size 2

Inference

You can run inference from the command line. Enter clear to clear the conversation history and exit to quit.
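
The clear and exit commands can be pictured with the minimal loop step below. This is not the code in infer.py; `step` and the `generate` callback are hypothetical stand-ins, and the history is assumed to be a list of (input, reply) pairs.

```python
def step(history, user_input, generate):
    """Process one line of input and return (history, reply, keep_running).

    `generate` is a hypothetical callback taking (history, prompt) and
    returning the model's reply as a string.
    """
    command = user_input.strip()
    if command == "exit":
        return history, None, False   # stop the session
    if command == "clear":
        return [], None, True         # drop the conversation history
    reply = generate(history, command)
    return history + [(command, reply)], reply, True


if __name__ == "__main__":
    # Stand-in generator that echoes the prompt instead of calling a model.
    echo = lambda history, prompt: f"echo: {prompt}"
    history, running = [], True
    for line in ["hello", "clear", "exit"]:
        history, reply, running = step(history, line, echo)
        print(line, "->", reply, "| turns kept:", len(history), "| running:", running)
```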

[Image: command-line inference]

Infer with a single GPU

tigerbot-7b-sft can be loaded for inference on a single RTX 3090 GPU

CUDA_VISIBLE_DEVICES=0 python infer.py --model_path ${MODEL_DIR}

Infer with multiple GPUs

tigerbot-180b-sft can be loaded for parallel inference on 5 A100 (80G) GPUs

CUDA_VISIBLE_DEVICES=0,1,2,3,4 python infer.py --model_path ${MODEL_DIR}

Quantization

We use GPTQ and GPTQ-for-LLaMa to quantize models.

Go to the gptq directory

cd gptq

Model quantization

CUDA_VISIBLE_DEVICES=0 python tigerbot.py ${MODEL_DIR} c4 --wbits 4 --act-order --groupsize 128 --save ${MODEL_DIR}/tigerbot-7b-4bit-128g.pt
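
To make the `--wbits 4 --groupsize 128` flags concrete, here is a minimal sketch of group-wise 4-bit quantization using plain round-to-nearest. It is not GPTQ: GPTQ additionally uses approximate second-order information to compensate for rounding error, and `--act-order` is not modeled. All function names are hypothetical.

```python
def quantize_group(weights, wbits=4):
    """Round-to-nearest quantization of one group of floats.

    Returns integer codes in [0, 2**wbits - 1] plus the scale and offset
    needed to map them back to floats.
    """
    levels = 2 ** wbits - 1
    low, high = min(weights), max(weights)
    scale = (high - low) / levels
    if scale == 0.0:
        scale = 1.0  # constant group: any scale works, codes are all 0
    codes = [round((w - low) / scale) for w in weights]
    return codes, scale, low


def dequantize_group(codes, scale, low):
    """Map integer codes back to approximate float weights."""
    return [code * scale + low for code in codes]


def quantize(weights, wbits=4, groupsize=128):
    """Quantize a flat list of weights, one (codes, scale, low) per group."""
    return [
        quantize_group(weights[start:start + groupsize], wbits)
        for start in range(0, len(weights), groupsize)
    ]


if __name__ == "__main__":
    weights = [i / 100 for i in range(256)]
    groups = quantize(weights)
    print(len(groups), "groups; first scale:", groups[0][1])
```

Each group of 128 weights gets its own scale and offset, which is why a smaller group size improves accuracy at the cost of storing more metadata.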

Inference with a quantized model on a single GPU

tigerbot-7b-sft-4bit-128g can be loaded for inference on a single RTX 3090 GPU

CUDA_VISIBLE_DEVICES=0 python tigerbot_infer.py ${MODEL_DIR} --wbits 4 --groupsize 128 --load ${MODEL_DIR}/tigerbot-7b-4bit-128g.pt

tigerbot-180b-research-4bit-128g can be loaded for parallel inference on 2 A100 (80G) GPUs

CUDA_VISIBLE_DEVICES=0,1 python tigerbot_infer.py ${MODEL_DIR} --wbits 4 --groupsize 128 --load ${MODEL_DIR}/tigerbot-4bit-128g.pt

For quantized model shards:

CUDA_VISIBLE_DEVICES=0,1 python tigerbot_infer.py ${MODEL_DIR} --wbits 4 --groupsize 128 --load "${MODEL_DIR}/tigerbot-4bit-128g-*.pt"

Datasets

Pretraining Datasets

  • Distribution of zh-book and coding data.

[Figures: Chinese book categories; code languages]

Supervised Fine-tuning Datasets

Data collection

  • We collect SFT data in three ways:
    a. self-instruct
    b. human labeling
    c. cleaning of open-source data

Data cleaning

We clean and filter data as follows:

  • Rule-based and keyword-based filtering of low-quality and unsafe content.
  • Deduplication.
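
The two cleaning steps above can be sketched as follows. This is a minimal illustration, not the project's pipeline: the keyword list, the length threshold, and all function names are hypothetical, and deduplication here only catches exact matches after whitespace and case normalization.

```python
import hashlib
import re

# Hypothetical rules; a real pipeline would use curated lists and thresholds.
BLOCKED_KEYWORDS = {"spam-keyword", "unsafe-keyword"}
MIN_CHARS = 10


def normalize(text):
    """Collapse whitespace and lowercase so trivial variants compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


def is_clean(text):
    """Rule-based filter: reject very short samples and blocked keywords."""
    normalized = normalize(text)
    if len(normalized) < MIN_CHARS:
        return False
    return not any(keyword in normalized for keyword in BLOCKED_KEYWORDS)


def clean_and_deduplicate(samples):
    """Keep the first occurrence of each clean sample, preserving order."""
    seen = set()
    kept = []
    for text in samples:
        if not is_clean(text):
            continue
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        kept.append(text)
    return kept


if __name__ == "__main__":
    print(clean_and_deduplicate([
        "Translate this sentence into English.",
        "translate  this sentence into english.",
        "too short",
    ]))
```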

Open-sourced datasets

Domain-specific Data

Evaluation

We evaluate our SFT models on seven public NLP datasets and compare them with OpenAI-InstructGPT. Results against OpenAI-InstructGPT-6B-SFT:

[Figure: SFT model results against OpenAI-InstructGPT-6B-SFT]

We evaluate our pretrained models on seven public NLP datasets. Results against bloom-7b1:

[Figure: pretrained model results against bloom-7b1]

API

TigerBot provides APIs including Chat-API, Plug-ins, and Fine-Tunes.

How to Use APIs

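import requests

# Replace with your own API key.
API_KEY = "Your API key"

url = "https://api.tigerbot.com/bot-service/ft/call"

headers = {
    'Authorization': 'Bearer ' + API_KEY
}
payload = {
    # ID of your fine-tuned model.
    'ftId': 'Your ftId',
    # Prompt in Chinese: "Translate the following Chinese into English: ..."
    'text': '将以下中文翻译为英文:对此美国的政策制定者目前陷入了困境:一方面要促进增长,另一方面又得降低总债务水平'
}

response = requests.post(url, headers=headers, json=payload)

print(response.text)

Example response (the msg value "操作成功" means "operation successful"):

{
  "code": 200,
  "msg": "操作成功",
  "data": {
    "result": [
      "The dilemma facing US policymakers is how to stimulate growth while lowering the level of total debt."
    ]
  }
}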
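
To pull the generated text out of a response shaped like the example above, a small helper along these lines may be useful. The function name `extract_results` is hypothetical, and treating any code other than 200 as a failure is an assumption, since only a successful response is shown here.

```python
import json


def extract_results(response_text):
    """Return the list of generated strings from a response body.

    Assumption: a "code" other than 200 signals a failed call.
    """
    body = json.loads(response_text)
    if body.get("code") != 200:
        raise RuntimeError(
            f"API call failed: code={body.get('code')} msg={body.get('msg')}"
        )
    return body["data"]["result"]


if __name__ == "__main__":
    sample = (
        '{"code": 200, "msg": "ok", '
        '"data": {"result": ["The dilemma facing US policymakers ..."]}}'
    )
    for line in extract_results(sample):
        print(line)
```

In the request example, this would be called as `extract_results(response.text)`.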

You can apply for API access on TigerBot

You can use Tigerbot-7B or Tigerbot-180B through the Chat-API

Tailor a model to your specific training data

Cases

Chat Cases

[Images: six chat examples]

Join Us

Our product

https://www.tigerbot.com

Call us

021-63888086

Email us

[email protected]

[email protected]

WeChat

[Image: Tiger]

Limitations and Disclaimers

Current models may generate hallucinatory, misleading, or discriminatory content. Please use content generated by TigerBot series models with caution, and do not spread harmful generated content.

The project developers are not responsible for any harm or loss caused by the use of this project (including but not limited to its data, models, and code).