Stars
Audio Deepfake Detection using the Stationary Wavelet Transform
A Flask-based content recommendation system that suggests movies based on user preferences using content similarity
Food/Diet Recommendation system using machine learning
LookSync is an intelligent product recommendation system that leverages cutting-edge machine learning and deep learning technologies to provide personalized recommendations. As a bonus, it includes…
Real-time face swap and one-click video deepfake with only a single image
Real-time face swap for PC streaming or video calls
Official implementation of the INTERSPEECH 2024 paper: Temporal-Channel Modeling in Multi-head Self-Attention for Synthetic Speech Detection
This repository includes the code to reproduce our paper "End-to-end anti-spoofing with RawNet2" (https://arxiv.org/abs/2011.01108) published in ICASSP '21.
[ECCV2024 - Oral] Adaptive Parametric Activation
Official repository for RawNet, RawNet2, and RawNet3
Dataset and detection code for the paper "AI-Synthesized Voice Detection Using Neural Vocoder Artifacts", accepted at the CVPR Workshop on Media Forensics 2023.
PyTorch implementation of "Leveraging Positional-Related Local-Global Dependency for Synthetic Speech Detection"
Learning discriminative and robust time-frequency representations for environmental sound classification: Convolutional neural networks (CNN) are one of the best-performing neural network architect…
2D residual U-Net (ResUNet) and a lead combiner (LC) for 12-lead ECG Abnormality Classification
Code for our manuscript accepted to the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Simple synthetic audio feature extractor
This repository contains the code for the INTERSPEECH 2023 paper: "Alzheimer Disease Classification through ASR-based Transcriptions: Exploring the Impact of Punctuation and Pauses"
Baseline method for sound event localization task of DCASE 2020 challenge
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
MetricGAN+ PyTorch Implementation
Counter speech classification using adversarial training
NeuroMelNet: an end-to-end binary classification model for AI speech detection, implemented in PyTorch. Includes a Python API and a Tg-bot API
Real-time binaural target sound extraction model.
Official PyTorch implementation of "Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation"