This Python library allows pre-training and fine-tuning contrastive embeddings of audio with the COLA method. In particular, one can:
- Pre-train COLA embeddings, which use a simple contrastive learning method
- Train a linear classifier on pre-trained embeddings
- Train a supervised neural network from scratch
- Initialize a classifier with pre-trained COLA embeddings and fine-tune on a new dataset
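The contrastive idea behind COLA can be sketched in a few lines. The NumPy function below is an illustrative version of the objective described in the paper (an anchor and a positive segment from the same audio clip are pulled together, with the other examples in the batch acting as negatives, via softmax cross-entropy); it is not the repository's implementation, and it uses a plain dot product where the paper uses a learned bilinear similarity:

```python
import numpy as np

def cola_contrastive_loss(anchors, positives):
    """Illustrative COLA-style contrastive loss (not the repo's code).

    anchors, positives: (batch, dim) embeddings, where anchors[i] and
    positives[i] come from the same audio clip.
    """
    # Pairwise similarities: entry [i, j] compares anchor i with positive j.
    # A plain dot product stands in for the paper's bilinear similarity.
    logits = anchors @ positives.T  # shape (batch, batch)
    # The matching pair sits on the diagonal, so the "correct class" for
    # row i is column i; everything else in the row is a negative.
    log_probs = logits - np.log(np.sum(np.exp(logits), axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

Minimizing this loss makes each anchor most similar to its own positive and dissimilar to the rest of the batch, which is what lets pre-training proceed without labels.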
Training has three modes:

- `SSL`, to pre-train a model with self-supervised contrastive learning
- `DS`, to fine-tune a pre-trained model on a downstream task
- `SUP`, to train a simple supervised system
Pre-train a COLA embedding on a dataset from `tensorflow_datasets` (here, LibriSpeech):

```shell
python -m main --experiment_id=cola_pretrain --model_dir=/tmp/cola \
--training_mode=SSL --ssl_dataset=LBS --strategy=gpu
```
Note that no labels are needed at this stage.
After pre-training, the model is saved in `/tmp/cola/librispeech/cola_pretrain`.
One can then train a linear classifier on these embeddings, in a supervised fashion, on the Speech Commands dataset:

```shell
python -m main --experiment_id=cola_downstream --ssl_checkpoint_id=cola_pretrain \
--model_dir=/tmp/cola --training_mode=DS --ssl_dataset=LBS --ds_dataset=SPCV2 \
--strategy=gpu --freeze_encoder=true
```
The flags `--ssl_checkpoint_id` and `--ssl_dataset` indicate that the pre-trained model is stored in `/tmp/cola/librispeech/cola_pretrain`.
Note the `--freeze_encoder` flag. If set to `false`, the entire network is fine-tuned.
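To illustrate what freezing means in practice, here is a minimal Keras sketch, not the repository's code (the layer sizes and architecture are made up): the pre-trained encoder's weights are excluded from gradient updates, so only the linear classification head is trained.

```python
import tensorflow as tf

# Hypothetical stand-in for a pre-trained COLA encoder (sizes are made up).
encoder = tf.keras.Sequential([tf.keras.layers.Dense(64, activation="relu")])

# Rough equivalent of --freeze_encoder=true: the encoder's weights stay
# fixed, so only the classification head below receives gradient updates.
encoder.trainable = False

inputs = tf.keras.Input(shape=(128,))
features = encoder(inputs)
logits = tf.keras.layers.Dense(10)(features)  # trainable linear head
model = tf.keras.Model(inputs, logits)
```

With `encoder.trainable = True` instead, every weight in the network would be updated during fine-tuning.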
Pre-training and fine-tuning only handle tfds datasets, for simplicity. One can easily use arbitrary datasets by overriding the `get_self_supervised_data` and `get_downstream_dataset` methods in `data.py`.
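As an example, a replacement for `get_downstream_dataset` could build a `tf.data` pipeline from in-memory arrays. The function below is a hypothetical sketch: the waveform length, label count, and return format are invented for illustration, and the real method's signature and expected output should be checked against `data.py`.

```python
import numpy as np
import tensorflow as tf

def get_downstream_dataset(batch_size=4):
  """Hypothetical replacement for the tfds-based downstream loader.

  Yields (audio, label) batches from in-memory arrays instead of a
  tfds dataset; shapes and class count here are made up.
  """
  # 32 fake 1-second clips at 16 kHz, with labels from 10 classes.
  audio = np.random.randn(32, 16000).astype(np.float32)
  labels = np.random.randint(0, 10, size=32).astype(np.int64)
  ds = tf.data.Dataset.from_tensor_slices((audio, labels))
  return ds.shuffle(32).batch(batch_size)
```

The same pattern (build a `tf.data.Dataset` of audio/label pairs from any source) applies to `get_self_supervised_data`, except that labels can be omitted there.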
If you use this repository, please consider citing:
```
@misc{saeed2020contrastive,
      title={Contrastive Learning of General-Purpose Audio Representations},
      author={Aaqib Saeed and David Grangier and Neil Zeghidour},
      year={2020},
      eprint={2010.10915},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}
```