This is the official code release for OPERA: OPEn Respiratory Acoustic foundation models.
OPERA is an OPEn Respiratory Acoustic foundation model pretraining and benchmarking system. We curate large-scale respiratory audio datasets (136K samples, 440 hours), pretrain three pioneering foundation models, and build a benchmark consisting of 19 downstream respiratory health tasks for evaluation. Our pretrained models demonstrate superior performance (against existing acoustic models pretrained with general audio on 16 out of 19 tasks) and generalizability (to unseen datasets and new respiratory audio modalities). This highlights the great promise of respiratory acoustic foundation models and encourages more studies using OPERA as an open resource to accelerate research on respiratory audio for health.
To reproduce the results in our paper, develop your own foundation models, or deploy our pretrained models for downstream healthcare applications, please follow the guidelines below.
The environment with all the needed dependencies can be created on a Linux machine by running:
git clone https://github.com/evelyn0414/OPERA.git
cd ./OPERA
conda env create --file environment.yml
sh ./prepare_env.sh
source ~/.bashrc
conda init
conda activate audio
sh ./prepare_code.sh
Note: after installation, you only need to activate the audio environment with 'conda activate audio' the next time you run the code.
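Before launching a longer run, it can help to confirm that the right environment is active. The following is a minimal sketch and not part of the repository: check_env is a hypothetical helper name, and it assumes conda's standard behaviour of setting the CONDA_DEFAULT_ENV variable to the name of the activated environment.

```shell
#!/bin/sh
# Hypothetical helper (not part of OPERA): report whether the expected
# conda environment is active. Relies on CONDA_DEFAULT_ENV, which
# 'conda activate' sets to the name of the active environment.
check_env() {
    expected="$1"
    if [ "${CONDA_DEFAULT_ENV:-}" = "$expected" ]; then
        echo "ok: conda environment '$expected' is active"
    else
        echo "expected conda environment '$expected', found '${CONDA_DEFAULT_ENV:-none}'" >&2
        return 1
    fi
}
```

Calling 'check_env audio' returns a non-zero status when the audio environment is not active, so it can guard the scripts that follow.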
Dataset | Source | Access | License |
---|---|---|---|
UK COVID-19 | IC | https://zenodo.org/records/10043978 | OGL 3.0 |
COVID-19 Sounds | UoC | https://covid-19-sounds.org/blog/neurips_dataset | Custom license |
CoughVID | EPFL | https://zenodo.org/records/4048312 | CC BY 4.0 |
ICBHI | * | https://bhichallenge.med.auth.gr | CC0 |
HF Lung | * | https://gitlab.com/techsupportHF/HF_Lung_V1 and https://gitlab.com/techsupportHF/HF_Lung_V1_IP | CC BY-NC 4.0 |
Coswara | IISc | https://github.com/iiscleap/Coswara-Data | CC BY 4.0 |
KAUH | KAUH | https://data.mendeley.com/datasets/jwyy9np4gv/3 | CC BY 4.0 |
Respiratory@TR | ITU | https://data.mendeley.com/datasets/p9z4h98s6j/1 | CC BY 4.0 |
SSBPR | WHU | https://github.com/xiaoli1996/SSBPR | CC BY 4.0 |
MMlung | UoS | https://github.com/MohammedMosuily/mmlung | Custom license |
NoseMic | UoC | https://github.com/evelyn0414/OPERA/tree/main/datasets/nosemic | Custom license |
*The ICBHI and HF Lung datasets come from multiple sources. COVID-19 Sounds, SSBPR, MMlung and NoseMic are available upon request, while the other datasets can be downloaded from the URLs above. The custom licenses are detailed in the DTA (data transfer agreement).
We provide some curated datasets, which can be downloaded from Google Drive (replace the datasets folder with the downloaded one).
Example training code can be found in cola_pretraining.py and mae_pretraining.py.
Start by running:
sh scripts/multiple_pretrain.sh
The pretrained weights are available on Zenodo or HuggingFace.
Our pretrained model checkpoints: OPERA-CT, OPERA-CE, OPERA-GT.
They will be automatically downloaded before feature extraction.
Example run for Task 10:
sh datasets/KAUH/download_data.sh
sh scripts/kauh_eval.sh > cks/logs/Test_Task10_results.log
Example run for Task 11:
sh datasets/copd/download_data.sh
sh scripts/copd_eval.sh > cks/logs/Test_Task11_results.log
The logs are included under 'cks/logs/' for reference. The results for all tasks are summarised in Tables 4 and 5.
sh scripts/benchmark.sh
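The Task 10 and Task 11 examples above redirect output into cks/logs/, and a shell redirection fails if the target folder does not exist. The sketch below wraps that pattern in a small function. It is an illustration rather than part of the repository: run_and_log is a hypothetical name, and unlike the commands above it also captures stderr in the log.

```shell
#!/bin/sh
# Hypothetical helper (not part of OPERA): run a script with sh and write
# its output to a log file, creating the log directory first so the
# redirection cannot fail on a missing folder.
run_and_log() {
    script="$1"
    log="$2"
    mkdir -p "$(dirname "$log")" || return 1
    # Capture stderr too, so error messages end up in the same log.
    sh "$script" > "$log" 2>&1
}

# Example (run from the repository root):
#   run_and_log scripts/kauh_eval.sh cks/logs/Test_Task10_results.log
```

The function returns the exit status of the wrapped script, so a failed evaluation can still be detected by the caller.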
If you use OPERA, please consider citing: