Stars
Official repository for the paper "xLSTM-SENet: xLSTM for Single-Channel Speech Enhancement"
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
Modular implementation of the Steered Response Power method and its variants
⚡️ A collection of HarmonyOS Next HAP installation packages. If you find it helpful, please give it a Star 🌟. Many thanks!
Bolt is a deep learning library with high performance and heterogeneous flexibility.
The official repo of NBC & SpatialNet for multichannel speech separation, denoising, and dereverberation
SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios
[NIPS2023] Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
[TPAMI2024] Codes and Models for VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
Deep model with built-in self-attention alignment for acoustic echo cancellation, PyTorch implementation
This is an open-source implementation of the ITU P.808 standard for "Subjective evaluation of speech quality with a crowdsourcing approach" (see https://www.itu.int/rec/T-REC-P.808/en). It uses Ama…
Noise suppression using deep filtering
pyMetaheuristic: A Comprehensive Python Library for Optimization
Some traditional algorithms used for sound source localization and DOA estimation of speech signals
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
Some useful features for speech processing, such as MFCC, gammatone filterbank, GFCC, spectrum (power spectrum and log-power spectrum), Amplitude Modulation Spectrum (AMS), and so on.
Python loaders for many Real Room Impulse Response databases
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Experiments with sound source localization algorithms
A framework for quick testing and comparing multi-channel speech enhancement and separation methods, such as DSB, MVDR, LCMV, GEVD beamforming and ICA, FastICA, IVA, AuxIVA, OverIVA, ILRMA, FastMNMF.