Stars
A library for calculating the FLOPs in the forward() process based on torch.fx
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement
wsj0-{2, 3, 4, 5} mix generation scripts, in Python.
INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processin…
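Use ChatGPT to summarize the arXiv papers. Speed up the entire research workflow: use ChatGPT for full-paper summarization, professional translation, polishing, reviewing, and responses to reviewers.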
This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.
PyTorch implementation of the Quasi-Recurrent Neural Network - up to 16 times faster than NVIDIA's cuDNN LSTM
Python implementation of performance metrics in Loizou's Speech Enhancement book
Transformer-based neural network for speech enhancement in the time domain
PyTorch implementation of "DCCRN-Deep Complex Convolution Recurrent Network for Phase-Aware Speech Enhancement"
Uformer: A Unet based dilated complex & real dual-path conformer network for simultaneous speech enhancement and dereverberation
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
A collection of resources on speech algorithms (Resource for Speech Processing) || NEWS: the official VoxCeleb link has recently stopped working, so an external download link has been added
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
Speech enhancement, speech separation, and sound source localization
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Context-aware, transformer-based U-Net for speech denoising
Wave-U-Net implemented in PyTorch and adapted to speech enhancement.
The Microsoft Scalable Noisy Speech Dataset (MS-SNSD) is a noisy speech dataset that can scale to arbitrary sizes depending on the number of speakers, noise types, and Speech to Noise Ratio (SNR) l…
Implementation of paper "DPCRN: Dual-Path Convolution Recurrent Network for Single Channel Speech Enhancement"
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
You can find the speech algorithms you want here
The PyTorch-based audio source separation toolkit for researchers
Speech Enhancement Generative Adversarial Network in PyTorch