# Model

## Model List

To keep pace with the rapid progress of PLMs in text generation, TextBox 2.0 incorporates 47 models/modules. The following table lists the name and reference of each model/module; click a model/module name for detailed usage instructions.

| Category | Model Name | Reference |
| --- | --- | --- |
| General CLM | OpenAI-GPT | (Radford et al., 2018) |
| | GPT2 | (Radford et al., 2019) |
| | GPT_Neo | (Gao et al., 2021) |
| | OPT | (Artetxe et al., 2022) |
| Seq2Seq | BART | (Lewis et al., 2020) |
| | T5 | (Raffel et al., 2020) |
| | UniLM | (Dong et al., 2019) |
| | MASS | (Song et al., 2019) |
| | Pegasus | (Zhang et al., 2019) |
| | ProphetNet | (Qi et al., 2020) |
| | MVP | (Tang et al., 2022) |
| | BERT2BERT | (Rothe et al., 2020) |
| | BigBird-Pegasus | (Zaheer et al., 2020) |
| | LED | (Beltagy et al., 2020) |
| | LongT5 | (Guo et al., 2021) |
| | PegasusX | (Phang et al., 2022) |
| Multilingual Models | mBART | (Liu et al., 2020) |
| | mT5 | (Xue et al., 2020) |
| | Marian | (Tiedemann et al., 2020) |
| | M2M_100 | (Fan et al., 2020) |
| | NLLB | (NLLB Team, 2022) |
| | XLM | (Lample et al., 2019) |
| | XLM-RoBERTa | (Conneau et al., 2019) |
| | XLM-ProphetNet | (Qi et al., 2020) |
| Chinese Models | CPM | (Zhang et al., 2020) |
| | CPT | (Shao et al., 2021) |
| | Chinese-BART | |
| | Chinese-GPT2 | (Zhao et al., 2019) |
| | Chinese-T5 | |
| | Chinese-Pegasus | |
| Dialogue Models | Blenderbot | (Roller et al., 2020) |
| | Blenderbot-Small | |
| | DialoGPT | (Zhang et al., 2019) |
| Conditional Models | CTRL | (Keskar et al., 2019) |
| | PPLM | (Dathathri et al., 2019) |
| Distilled Models | DistilGPT2 | (Sanh et al., 2019) |
| | DistilBART | (Shleifer et al., 2020) |
| Prompting Models | PTG | (Li et al., 2022a) |
| | Context-Tuning | (Tang et al., 2022) |
| Lightweight Modules | Adapter | (Houlsby et al., 2019) |
| | Prefix-tuning | (Li and Liang, 2021) |
| | Prompt tuning | (Lester et al., 2021) |
| | LoRA | (Hu et al., 2021) |
| | BitFit | (Ben-Zaken et al., 2021) |
| | P-Tuning v2 | (Liu et al., 2021a) |
| Non-Pre-training Models | RNN | (Sutskever et al., 2014) |
| | Transformer | (Vaswani et al., 2017b) |

## Pre-trained Model Parameters

TextBox 2.0 is compatible with Hugging Face, so `model_path` accepts either the name of a model on the Hugging Face Hub, such as `facebook/bart-base`, or a local path. `config_path` and `tokenizer_path` (set to the same value as `model_path` by default) likewise accept a Hugging Face model name or a local path.
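To make the Hugging Face compatibility concrete, here is a minimal sketch of what these three options resolve to; the checkpoint `facebook/bart-base` is purely illustrative, and the exact loading code inside TextBox may differ:

```python
# A minimal sketch, assuming model_path / config_path / tokenizer_path are
# resolved through the standard Hugging Face from_pretrained() interface.
from transformers import AutoConfig, AutoModelForSeq2SeqLM, AutoTokenizer

model_path = "facebook/bart-base"   # Hub name, or a local directory
config_path = model_path            # defaults to model_path
tokenizer_path = model_path         # defaults to model_path

config = AutoConfig.from_pretrained(config_path)
tokenizer = AutoTokenizer.from_pretrained(tokenizer_path)
model = AutoModelForSeq2SeqLM.from_pretrained(model_path, config=config)
```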

In addition, `config_kwargs` and `tokenizer_kwargs` are useful when extra parameters need to be passed to the configuration or the tokenizer.

For example, when building a task-oriented dialogue system, special tokens can be added with `additional_special_tokens`, and fast tokenization can be toggled with `use_fast`:

```yaml
config_kwargs: {}
tokenizer_kwargs: {'use_fast': False, 'additional_special_tokens': ['[db_0]', '[db_1]', '[db_2]']}
```
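These dictionaries amount to extra keyword arguments for the Hugging Face `from_pretrained()` calls. The sketch below shows the direct equivalent of the `tokenizer_kwargs` above; treating them as a straight pass-through is an assumption for illustration, and the checkpoint name is a placeholder:

```python
# A hedged sketch of what the tokenizer_kwargs above correspond to when
# passed directly to Hugging Face Transformers.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "facebook/bart-base",  # illustrative checkpoint
    use_fast=False,
    additional_special_tokens=["[db_0]", "[db_1]", "[db_2]"],
)
print(tokenizer.additional_special_tokens)  # ['[db_0]', '[db_1]', '[db_2]']
```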

Other commonly used parameters include `label_smoothing: <smooth-loss-weight>`.

The full set of keyword arguments can be found in the documentation of `PreTrainedTokenizer` or of the corresponding tokenizer.

## Generation Parameters

Pre-trained models can generate text with various decoding strategies by combining different parameters. By default, beam search is adopted:

```yaml
generation_kwargs: {'num_beams': 5, 'early_stopping': True}
```

Nucleus sampling is also supported by pre-trained models:

```yaml
generation_kwargs: {'do_sample': True, 'top_k': 10, 'top_p': 0.9}
```
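As a point of reference, the sketch below shows the direct Hugging Face `generate()` calls these two settings correspond to; it assumes `generation_kwargs` is forwarded unchanged, and the checkpoint and input text are only placeholders:

```python
# A minimal sketch mapping generation_kwargs onto model.generate().
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")
inputs = tokenizer("TextBox 2.0 supports many pre-trained models.", return_tensors="pt")

# Beam search: {'num_beams': 5, 'early_stopping': True}
beam_ids = model.generate(**inputs, num_beams=5, early_stopping=True)

# Nucleus sampling: {'do_sample': True, 'top_k': 10, 'top_p': 0.9}
sample_ids = model.generate(**inputs, do_sample=True, top_k=10, top_p=0.9)

print(tokenizer.decode(beam_ids[0], skip_special_tokens=True))
print(tokenizer.decode(sample_ids[0], skip_special_tokens=True))
```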