Stars
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
PyTorch implementation of "DCCRN: Deep Complex Convolution Recurrent Network for Phase-Aware Speech Enhancement"
A PyTorch implementation of Conv-TasNet described in "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" with Permutation Invariant Training (PIT).
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 on Vox1_O when trained only on Vox2)
ethanjperez / film
Forked from facebookresearch/clevr-iep
FiLM: Visual Reasoning with a General Conditioning Layer
Attention-based scaling adaptation for target speech extraction
PyTorch implementation of "Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation"
Code for the paper Hybrid Spectrogram and Waveform Source Separation
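《代码随想录》 LeetCode practice guide: a recommended order for working through 200 classic problems, with 600,000 words of detailed illustrated explanations, video breakdowns of the difficult points, and more than 50 mind maps. Supports C++, Java, Python, Go, JavaScript, and other languages, so studying algorithms is no longer bewildering! 🔥🔥 Take a look and you will wish you had found it sooner! 🚀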
A tool for transcribing orthographic text as IPA (International Phonetic Alphabet)
Multilingual G2P in 100 languages
High-speed downloads from a mirror site using HuggingFace's official download tool.
Keyword spotting and forced alignment in any language
AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning (ICLR 2023).
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deplo…
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Open-Sora: Democratizing Efficient Video Production for All
PyTorch implementation of Diffusion Models (https://arxiv.org/pdf/2006.11239.pdf)
Wav2Keyword is a keyword spotting (KWS) model based on Wav2Vec 2.0. It shows state-of-the-art results on the Speech Commands dataset V1 and V2.
Seeing Wake Words: Audio-visual Keyword Spotting
[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
This repository contains code for applying Data2Vec to pretrain the Keyword Transformer model, as described in "Improving Label-Deficient Keyword Spotting Through Self-Supervised Pretraining".
FLOPs counter for convolutional networks in the PyTorch framework