A small research project: how much would it cost to create an Alpaca-like dataset using a slightly different approach? All data byproducts are CC0-licensed.
Remember that developing a model based on data generated via a model API might violate the API provider's terms of service.
- Clone the repo
git clone https://github.com/mobarski/alpaca-libre && cd alpaca-libre
- Install the required Python modules
pip install -r requirements.txt
- View / edit generate.py
- Set the API key (the OPENAI_KEY environment variable)
export OPENAI_KEY=...
- Run the script
python3 generate.py
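
The steps above rely on the key being exported before the script runs. As a minimal sketch of how a script could check for it and fail early with a readable message: the helper name `read_api_key` is hypothetical, and whether generate.py reads `OPENAI_KEY` this way is an assumption, not something taken from the repo.

```python
import os


def read_api_key(env=os.environ, name="OPENAI_KEY"):
    """Return the API key stored under `name`, or raise if it is missing.

    Hypothetical helper for illustration; `env` can be any mapping,
    which makes the check easy to try without touching the real environment.
    """
    key = env.get(name, "").strip()
    if not key:
        # Fail before any API call is attempted.
        raise RuntimeError(f"{name} is not set; run `export {name}=...` first")
    return key
```

Calling `read_api_key()` with no arguments looks the key up in the real process environment, so a missing `export` shows up as a clear error instead of a failed API request later on.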
GitHub repos:
- https://github.com/tatsu-lab/stanford_alpaca
- https://github.com/yizhongw/self-instruct
- https://github.com/orhonovich/unnatural-instructions
Papers: