Stars
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
🚀🚀 "Large model": train a small 26M-parameter GPT entirely from scratch in just 2 hours! 🌏
A Python package to build AI-powered real-time audio applications
Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.
Algorithm practice is all about patterns, and labuladong is all you need! English version supported! Crack LeetCode, not only how, but also why.
Unsupervised domain adaptation for conversational speech enhancement using RemixIT
Dual-Path Attention and Recurrent Network for speech separation
Implementation of paper "DPCRN: Dual-Path Convolution Recurrent Network for Single Channel Speech Enhancement"
A Mandarin version of speech separation datasets like WSJMix and LibriMix
Spatial signal processing toolkit, a.k.a. beamforming toolkit 2.0 (BTK2.0)
Implementation of the paper "SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement."
🎤 Microphone sound source localization by SRP-PHAT and other numerical methods.
MADE: Masked Autoencoder for Distribution Estimation
Analyze, visualize, and process sound field data recorded by spherical microphone arrays.
An implementation of SpecAugment, introduced by Google Brain, in TensorFlow and PyTorch
Python implementation of the Multiple Hypothesis Tracking algorithm
A MATLAB implementation of the CHiME4 baseline BeamformIt
Speech Enhancement Generative Adversarial Network in TensorFlow
Speech Enhancement based on DNN (Spectral-Mapping, TF-Masking), DNN-NMF, NMF