GitHub - tzlby/FlagAI: FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale models.


FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale models. Our goal is to support training, fine-tuning, and deployment of large-scale models on various multi-modal downstream tasks.

It currently supports the text-image representation model AltCLIP and the text-to-image generation model AltDiffusion, as well as WuDao GLM with up to 10 billion parameters (see Introduction to GLM). It also supports EVA-CLIP, OPT, BERT, RoBERTa, GPT2, T5, ALM, and models from Huggingface Transformers.
It provides APIs to quickly download and use those pre-trained models on a given text, fine-tune them on widely used datasets collected from the SuperGLUE and CLUE benchmarks, and then share them with the community on our model hub. It also provides a prompt-learning toolkit for few-shot tasks.
These models can be applied to (Chinese/English) text for tasks such as text classification, information extraction, question answering, summarization, and text generation.
FlagAI is backed by the three most popular data/model parallel libraries (PyTorch, DeepSpeed, and Megatron-LM), with seamless integration between them. Users can parallelize their training/testing process with fewer than ten lines of code.

The code is partially based on GLM, Transformers and DeepSpeedExamples.

News

[28 Nov 2022] Released v1.5.0, with support for 1.1B EVA-CLIP and ALM: A large Arabic Language Model based on GLM; examples in ALM
[10 Nov 2022] Released v1.4.0, with support for AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities; examples in AltCLIP and AltDiffusion
[29 Aug 2022] Released v1.3.0, which added the CLIP module and redesigned the tokenizer APIs in #81
[21 Jul 2022] Released v1.2.0; ViTs are supported in #71
[29 Jun 2022] Released v1.1.0, with support for downloading OPT models and for inference/fine-tuning in #63
[17 May 2022] Made our first contribution in #1

Requirements and Installation
Quick Start
Pretrained Models and examples
Tutorials
Contributing
Contact us
License

Requirements and Installation

PyTorch version >= 1.8.0
Python version >= 3.8
For training/testing models on GPUs, you'll also need to install CUDA and NCCL
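The version requirements above can be checked before installing. The following is a minimal sketch, not part of FlagAI: the helper names `parse_version` and `check_environment` are hypothetical, and the version parsing is deliberately simple (it only reads the leading `major.minor.patch` digits).

```python
import sys

# Minimum versions taken from the requirements list above.
MIN_PYTHON = (3, 8)
MIN_TORCH = (1, 8, 0)


def parse_version(text):
    """Turn a string such as '1.13.1+cu117' into (1, 13, 1).

    Hypothetical helper: local build tags and pre-release suffixes are ignored.
    """
    core = text.split("+")[0]
    parts = []
    for piece in core.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_environment():
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python >= 3.8 is required")
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if parse_version(torch.__version__) < MIN_TORCH:
            problems.append("PyTorch >= 1.8.0 is required")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("Problem:", problem)
```

For GPU training, `torch.cuda.is_available()` can additionally be used to confirm that PyTorch sees a CUDA device.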

To install FlagAI with pip:

pip install -U flagai

[Optional] To install FlagAI and develop locally:

git clone https://github.com/FlagAI-Open/FlagAI.git
cd FlagAI
python setup.py install

[Optional] For faster training, install NVIDIA's apex:

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

[Optional] For ZeRO optimizers, install DeepSpeed:

git clone https://github.com/microsoft/DeepSpeed
cd DeepSpeed
DS_BUILD_CPU_ADAM=1 DS_BUILD_AIO=1 DS_BUILD_UTILS=1 pip install -e .
ds_report # check the DeepSpeed status

[Tips] For single-node docker environments, you need to set up a port for SSH, e.g., root@127.0.0.1 with port 7110:

>>> vim ~/.ssh/config
Host 127.0.0.1
    Hostname 127.0.0.1
    Port 7110
    User root

[Tips] For multi-node docker environments, generate SSH keys and copy the public key to all nodes (in ~/.ssh/):

>>> ssh-keygen -t rsa -C "[email protected]"

Quick Start

We provide many models that are trained to perform different tasks. You can load these models with AutoLoader to make predictions. See more in FlagAI/quickstart.

Load model and tokenizer

We provide the AutoLoader class to load the model and tokenizer quickly, for example:

from flagai.auto_model.auto_loader import AutoLoader

auto_loader = AutoLoader(
    task_name="title-generation",
    model_name="BERT-base-en"
)
model = auto_loader.get_model()
tokenizer = auto_loader.get_tokenizer()

This example is for the title-generation task; you can load models for other tasks by changing task_name. You can then use the model and tokenizer for fine-tuning or testing.
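Since switching tasks only means changing task_name, the loading steps can be wrapped in one small function. This is a sketch that uses only the AutoLoader calls shown above; the wrapper name `load_model_and_tokenizer` is hypothetical and not part of FlagAI, and the import is deferred so the function can be defined even where FlagAI is not yet installed.

```python
def load_model_and_tokenizer(task_name, model_name):
    """Load a model and its tokenizer for the given task.

    Hypothetical convenience wrapper around the AutoLoader usage shown above.
    """
    # Deferred import: FlagAI is only needed when the function is called.
    from flagai.auto_model.auto_loader import AutoLoader

    auto_loader = AutoLoader(task_name=task_name, model_name=model_name)
    return auto_loader.get_model(), auto_loader.get_tokenizer()


# Example call, using the task and model names from the snippet above:
# model, tokenizer = load_model_and_tokenizer("title-generation", "BERT-base-en")
```

Which task_name and model_name combinations are valid depends on the installed FlagAI version, so check FlagAI/quickstart before relying on a particular pair.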

Predictor

We provide the Predictor class to predict for different tasks, for example:

from flagai.model.predictor.predictor import Predictor
predictor = Predictor(model, tokenizer)
test_data = [
    "Four minutes after the red card, Emerson Royal nodded a corner into the path of the unmarked Kane at the far post, who nudged the ball in for his 12th goal in 17 North London derby appearances. Arteta's misery was compounded two minutes after half-time when Kane held the ball up in front of goal and teed up Son to smash a shot beyond a crowd of defenders to make it 3-0.The goal moved the South Korea talisman a goal behind Premier League top scorer Mohamed Salah on 21 for the season, and he looked perturbed when he was hauled off with 18 minutes remaining, receiving words of consolation from Pierre-Emile Hojbjerg.Once his frustrations have eased, Son and Spurs will look ahead to two final games in which they only need a point more than Arsenal to finish fourth.",
]

for text in test_data:
    print(
        predictor.predict_generate_beamsearch(text,
                                              out_max_length=50,
                                              beam_size=3))
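To give some intuition for the beam_size and out_max_length arguments, here is a self-contained toy beam search. It is an illustrative sketch only, not FlagAI's implementation: `TOY_MODEL`, `beam_search`, and the `<s>`/`</s>` markers are assumptions made for this example, and the "model" is just a lookup table of next-token probabilities.

```python
import math

# Hypothetical toy "model": next-token probabilities depend only on the last token.
TOY_MODEL = {
    "<s>": {"spurs": 0.6, "kane": 0.4},
    "spurs": {"beat": 0.7, "win": 0.3},
    "kane": {"scores": 0.9, "</s>": 0.1},
    "beat": {"arsenal": 0.8, "</s>": 0.2},
    "win": {"</s>": 1.0},
    "scores": {"</s>": 1.0},
    "arsenal": {"</s>": 1.0},
}


def beam_search(next_probs, beam_size=3, out_max_length=50, bos="<s>", eos="</s>"):
    """Return the highest-scoring token sequence found by beam search."""
    beams = [([bos], 0.0)]  # each entry: (tokens so far, summed log-probability)
    finished = []
    for _ in range(out_max_length):
        # Extend every live hypothesis by every possible next token.
        candidates = []
        for tokens, score in beams:
            for tok, p in next_probs[tokens[-1]].items():
                candidates.append((tokens + [tok], score + math.log(p)))
        # Keep only the beam_size best extensions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            if tokens[-1] == eos:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off by the length limit
    best_tokens, _ = max(finished, key=lambda c: c[1])
    return [t for t in best_tokens if t not in (bos, eos)]


print(beam_search(TOY_MODEL, beam_size=3))  # ['kane', 'scores']
print(beam_search(TOY_MODEL, beam_size=1))  # ['spurs', 'beat', 'arsenal']
```

With beam_size=1 the search is greedy and commits to "spurs" because it is the most likely first token; with beam_size=3 it keeps "kane" alive and ends up with a sequence of higher overall probability (0.36 versus 0.336). A larger beam explores more alternatives at a higher compute cost, while out_max_length caps the number of generated tokens.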

Pretrained Models and examples

This section explains how the base NLP classes work, how you can load pre-trained models to tag your text, how you can embed your text with different word or document embeddings, and how you can train your own language models, sequence labeling models, and text classification models. Let us know if anything is unclear. See more in FlagAI/examples.

Tutorials

We provide a set of quick tutorials to get you started with the library:

Contributing

Thanks for your interest in contributing! There are many ways to get involved; start with our contributor guidelines and then check these open issues for specific tasks.

Contact us

License

The majority of FlagAI is licensed under the Apache 2.0 license; however, portions of the project are available under separate license terms:

Megatron-LM is licensed under the Megatron-LM license
GLM is licensed under the MIT license
