GitHub - basusourya/mirostat: Code for the paper-"Mirostat: A Perplexity-Controlled Neural Text Decoding Algorithm" (https://arxiv.org/abs/2007.14966).

Code for the Mirostat sampling algorithm proposed in our ICLR 2021 paper, "Mirostat: A Perplexity-Controlled Neural Text Decoding Algorithm". The paper is available at https://arxiv.org/abs/2007.14966.

TL;DR: We provide a new text decoding algorithm that directly controls the statistics of the generated text and hence generates more human-like text with large language models such as GPT-2 and CTRL.

Installation requirement (tested with version 4.16.2):

pip install transformers

Example Use:

python mirostat.py --num_tokens 200 --tau 3.0 --context "/context.txt"

where num_tokens is the number of tokens to generate, tau is the target average surprise value (i.e. the log of the perplexity), and context.txt is a text file containing the context.
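To illustrate what tau controls, here is a minimal sketch of a surprise-feedback loop in the spirit of Mirostat. It is not the code in mirostat.py: the algorithm in the paper also estimates a Zipf exponent to pick a top-k cutoff, which this sketch replaces with a plain surprise threshold. The toy Zipf-like distribution, the function names, the learning rate of 0.1, the initial threshold of 2 * tau, and the use of base-2 logarithms are all assumptions made for illustration.

```python
import math
import random


def zipf_probs(vocab_size):
    """Toy next-token distribution with p_i proportional to 1/i (illustrative only)."""
    weights = [1.0 / rank for rank in range(1, vocab_size + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def feedback_step(probs, mu, tau, learning_rate, rng):
    """Sample one token with surprise at most mu, then nudge mu toward the target tau."""
    # Surprise of a token in bits is -log2(p); keep tokens at or below the threshold mu.
    candidates = [(i, p) for i, p in enumerate(probs) if -math.log2(p) <= mu]
    if not candidates:
        # Always keep at least the most probable token.
        best = max(range(len(probs)), key=probs.__getitem__)
        candidates = [(best, probs[best])]
    # random.choices normalizes the weights, so this samples from the truncated distribution.
    token, p = rng.choices(candidates, weights=[p for _, p in candidates], k=1)[0]
    observed = -math.log2(p)  # surprise under the original, untruncated distribution
    # Feedback: lower the threshold after a surprising token, raise it after a predictable one.
    mu -= learning_rate * (observed - tau)
    return token, observed, mu


def simulate(tau, num_tokens, learning_rate=0.1, seed=0, vocab_size=1000):
    """Run the feedback loop on the toy distribution and return the observed surprises."""
    rng = random.Random(seed)
    probs = zipf_probs(vocab_size)
    mu = 2.0 * tau  # assumed starting threshold
    surprises = []
    for _ in range(num_tokens):
        _, observed, mu = feedback_step(probs, mu, tau, learning_rate, rng)
        surprises.append(observed)
    return surprises


if __name__ == "__main__":
    tau = 3.0
    surprises = simulate(tau=tau, num_tokens=2000)
    average = sum(surprises) / len(surprises)
    print(f"target tau: {tau}")
    print(f"average observed surprise (bits): {average:.3f}")
    print(f"corresponding perplexity: {2 ** average:.3f}")
```

Because every step moves the threshold against the error (observed surprise minus tau), the average observed surprise settles near tau over a long enough run. With base-2 logarithms this corresponds to a perplexity of roughly 2 ** tau.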

If you find the code useful in your work, please cite it as:

@inproceedings{
BasuRKV2021,
title={MIROSTAT: A NEURAL TEXT DECODING ALGORITHM THAT DIRECTLY CONTROLS PERPLEXITY},
author={Sourya Basu and Govardana Sachitanandam Ramachandran and Nitish Shirish Keskar and Lav R. Varshney},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=W1G1JZEIy5_}
}

Repository files:

LICENSE
Mirostat_sampling_experiments.ipynb
README.md
mirostat.py
