This project investigates audio spoofing detection using state-of-the-art self-supervised learning (SSL) and transformer architectures. Several models are evaluated on ASVspoof benchmark datasets, showing significant improvements in distinguishing genuine speech from synthesized speech.
- Wav2Vec 2.0
- HuBERT
- SSL Wav2Vec 2.0 with PSFAN Backend
- Audio Spectrogram Transformer (AST)
- Sound Event Detection Model (EfficientNet-B0)
- WavLM Base
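The SSL models above share a common recipe: a pretrained encoder turns raw audio into frame-level embeddings, and a lightweight head pools those frames and emits a bonafide/spoof score. A minimal NumPy sketch of the pooling-plus-head stage (the 768-dim embedding size, the random inputs, and the untrained weights are placeholder assumptions; in practice the embeddings come from the pretrained encoder and the head is trained):

```python
import numpy as np

def classify_utterance(frame_embeddings, w, b):
    """Pool SSL frame embeddings and score an utterance as bonafide vs. spoof.

    frame_embeddings: (T, D) array of frame vectors from a pretrained encoder
    w: (D,) head weights, b: scalar bias (learned during fine-tuning)
    """
    pooled = frame_embeddings.mean(axis=0)   # temporal mean pooling -> (D,)
    logit = pooled @ w + b                   # linear classification head
    return 1.0 / (1.0 + np.exp(-logit))     # sigmoid -> P(bonafide)

rng = np.random.default_rng(0)
emb = rng.normal(size=(200, 768))            # e.g. 200 frames of 768-dim features
w = rng.normal(scale=0.01, size=768)         # placeholder (untrained) weights
score = classify_utterance(emb, w, 0.0)
print(score)
```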
- Gaussian noise injection
- Signal-to-noise ratio modifications
- Dynamic gain variations
- Background noise injection
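The first two augmentations above can be combined into one operation: inject white Gaussian noise scaled to hit a target signal-to-noise ratio. A small NumPy sketch (the 440 Hz test tone and 10 dB target are illustrative choices, not values from the project):

```python
import numpy as np

def add_noise_at_snr(signal, snr_db, rng=None):
    """Inject white Gaussian noise scaled so the result has the target SNR (dB)."""
    rng = rng or np.random.default_rng()
    noise = rng.standard_normal(signal.shape)
    sig_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    # Scale noise so that 10*log10(sig_power / scaled_noise_power) == snr_db
    scale = np.sqrt(sig_power / (noise_power * 10 ** (snr_db / 10)))
    return signal + scale * noise

sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
clean = np.sin(2 * np.pi * 440 * t)          # 1 s, 440 Hz tone at 16 kHz
noisy = add_noise_at_snr(clean, snr_db=10, rng=np.random.default_rng(0))
achieved = 10 * np.log10(np.mean(clean**2) / np.mean((noisy - clean)**2))
print(round(achieved, 1))  # → 10.0
```

Because the injected noise is measured and rescaled sample-exactly, the achieved SNR matches the target exactly rather than only in expectation.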
- Losses: Focal Loss, Cross-Entropy Loss, BCEWithLogitsLoss
- Optimizer: AdamW with linear/cosine schedulers
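Of the losses listed above, focal loss is the least standard: it down-weights easy, confidently-classified examples so training focuses on hard ones, which helps with bonafide/spoof class imbalance. A NumPy sketch of the binary form (gamma=2.0 and alpha=0.25 are the commonly used defaults, not necessarily this project's settings):

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss on predicted probabilities p for labels y in {0, 1}.

    The (1 - p_t)**gamma factor shrinks the loss of well-classified examples;
    alpha_t re-weights the two classes.
    """
    p_t = np.where(y == 1, p, 1 - p)            # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return np.mean(-alpha_t * (1 - p_t) ** gamma * np.log(p_t))

y = np.array([1, 0, 1, 0])
easy = np.array([0.95, 0.05, 0.90, 0.10])   # confident, correct predictions
hard = np.array([0.55, 0.45, 0.60, 0.40])   # uncertain predictions
print(focal_loss(easy, y) < focal_loss(hard, y))  # → True: easy batch penalized far less
```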
The Audio Spectrogram Transformer and SSL Wav2Vec outperformed the other models, with AST reaching near-perfect precision and recall (0.999), demonstrating the strength of transformer architectures for spoof detection.
| Model | Public LB EER | Precision | Recall | F1-Score |
|---|---|---|---|---|
| Wav2Vec 2.0 | 0.46516 | 0.888 | 0.788 | 0.835 |
| SSL Wav2Vec | 0.02925 | - | - | - |
| Audio Spectrogram Transformer | 0.01384 | 0.999 | 0.999 | 0.999 |
| HuBERT | 8.11672 | 0.877 | 0.764 | 0.817 |
| Pretrained Wav2Vec | 0.77492 | 0.845 | 0.725 | 0.780 |
| WavLM Base | 1.87658 | 0.820 | 0.690 | 0.750 |
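The EER (equal error rate) reported above is the operating point where the false-acceptance rate (spoofs accepted) equals the false-rejection rate (bonafide rejected); lower is better. A NumPy sketch of how it can be computed from raw scores (the scores and labels below are made up for illustration):

```python
import numpy as np

def compute_eer(scores, labels):
    """Equal error rate: the point where FAR and FRR curves cross.

    scores: higher = more bonafide-like; labels: 1 = bonafide, 0 = spoof.
    """
    order = np.argsort(scores)               # sweep thresholds from low to high
    labels = np.asarray(labels)[order]
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    frr = np.cumsum(labels) / n_pos          # bonafide at or below threshold
    far = 1 - np.cumsum(1 - labels) / n_neg  # spoof above threshold
    idx = np.argmin(np.abs(far - frr))       # closest crossing point
    return (far[idx] + frr[idx]) / 2

scores = np.array([0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1])
labels = np.array([1,   1,   1,   0,   1,   0,   0,   0])
print(compute_eer(scores, labels))  # → 0.25
```

Production evaluations typically interpolate between thresholds rather than picking the nearest one, but the crossing-point idea is the same.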