Stars
List of diffusion related active submissions on OpenReview for ICLR 2025.
Easily train a good VC model with voice data <= 10 mins!
zero-shot voice conversion & singing voice conversion, with real-time support
Bittensor's Voice Guard subnet, which works against voice deepfakes.
Apply Score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation, clipping, equalization (EQ) distortion, packet loss, codec…
Word alignments generated by the Montreal Forced Aligner for the Librispeech dataset
StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation
Conditional Diffusion Probabilistic Model for Speech Enhancement
Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation
Release for Improved Denoising Diffusion Probabilistic Models
PyTorch reimplementation of DiffWave unconditional generation: a high-quality waveform synthesizer.
The official code of the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate".
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
Comprehensive Gradio WebUI for audio processing, powered by Whisper engines (Whisper, Faster-Whisper, Whisper-Timestamped). Features Voice Changer, zero-shot Voice Cloning (E2, F5-TTS), YouTube dow…
Use Clash on Linux without sudo privileges
SA-toolkit: Speaker speech anonymization toolkit in Python
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
Optimized implementation for color-icon-matrix barcodes
Remote Sensing Image Classification Dataset for Aircraft Fine-Grained Recognition
Official repository for "Speaking Style Conversion With Discrete Self-Supervised Units" (EMNLP 2023). https://arxiv.org/abs/2212.09730
A curated list of awesome audio adversarial examples papers (with code & demo if available).
[NeurIPS 2024 spotlight] Official implementation of MSFA and release of SARDet_100K dataset for Large-Scale Synthetic Aperture Radar (SAR) Object Detection
Awesome-LLM-Tabular: a curated list of Large Language Models applied to Tabular Data