len_gen_lm

Prepare Experiment Environment (Only for the first time)

Clone this repository

git clone git@github.com:kazemnejad/len_gen_lm.git

Create a conda environment

conda create -n len_gen_lm python=3.9
conda activate len_gen_lm

Install requirements

# Install pytorch
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
# Install other requirements
pip install -r requirements.txt

Fill in the environment variables in env.sh:

# Checkpoints and logs are saved here. This should be shared network storage accessible from all nodes.
export PROJECT_DIR=/path/to/network/storage/projects/len_gen_lm

# Get your API key from comet.ml
export COMET_API_KEY="..."
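
Before launching a run, it can help to confirm that both variables are actually set. The following is a minimal sketch, not part of the repository: `check_env` is a hypothetical helper, and it assumes that a value of `...` means the placeholder was never replaced.

```shell
# Hypothetical helper (not part of the repository): report which of the
# variables named in env.sh are unset, empty, or still the "..." placeholder.
check_env() {
    missing=0
    for name in PROJECT_DIR COMET_API_KEY; do
        # Look up the value of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ] || [ "$value" = "..." ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    # Exit status 0 when everything is set, 1 otherwise.
    return "$missing"
}
```

One way to use it is `. ./env.sh && check_env`, which prints one line per missing variable and returns a non-zero status if any is missing.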

Train

./run_training.sh <pe> <size>

<pe> selects the positional encoding and is one of:

alibi: ALiBi
none: NoPE (no positional encoding)

<size> selects the model size and is one of:

100m
300m
1b
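
For example, `./run_training.sh alibi 100m` would train the 100m model with ALiBi. The sketch below shows one way to reject a mistyped argument before a job is submitted. `validate_args` is a hypothetical helper, not something the repository provides, and it only knows the values listed above.

```shell
# Hypothetical helper (not part of the repository): check <pe> and <size>
# against the values listed in this README before calling run_training.sh.
validate_args() {
    pe="$1"
    size="$2"
    case "$pe" in
        alibi|none) ;;
        *) echo "unknown <pe>: $pe (expected alibi or none)" >&2; return 1 ;;
    esac
    case "$size" in
        100m|300m|1b) ;;
        *) echo "unknown <size>: $size (expected 100m, 300m, or 1b)" >&2; return 1 ;;
    esac
}

# Example usage (commented out so that sourcing this file starts no training):
# validate_args alibi 100m && ./run_training.sh alibi 100m
```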

What compute resources should be used?

At least:

CPU: 6 cores
Memory: 32GB

Training uses all GPUs available on the node, so the more GPUs the node has, the faster training will be.
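
A quick pre-flight check against these minimums could look like the sketch below. The function names are hypothetical, and the check assumes a Linux node with `nproc` and `/proc/meminfo`. A node sold as 32GB may report slightly less usable memory, so treat a borderline failure as a hint rather than a verdict.

```shell
# Hypothetical pre-flight check (not part of the repository).

# Succeeds when the given core count and memory (in GB) meet the minimums above.
meets_minimum() {
    cores="$1"
    mem_gb="$2"
    [ "$cores" -ge 6 ] && [ "$mem_gb" -ge 32 ]
}

# Prints the number of GPUs listed by nvidia-smi, or 0 if it is not installed.
count_gpus() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi -L | grep -c '^GPU ' || true
    else
        echo 0
    fi
}

# Prints a one-line summary of this node and returns the meets_minimum status.
node_report() {
    cores=$(nproc)
    # MemTotal is reported in kB; convert to GB, rounded to the nearest integer.
    mem_gb=$(awk '/^MemTotal:/ {printf "%d", $2 / 1048576 + 0.5}' /proc/meminfo)
    echo "cores=$cores mem_gb=$mem_gb gpus=$(count_gpus)"
    meets_minimum "$cores" "$mem_gb"
}
```

Running `node_report` on a compute node prints the detected resources and returns a non-zero status when the node falls below the stated minimums.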

Latest commit: kazemnejad, "Fix" (674a841), Oct 21, 2023. 71 commits in history.
