StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion

162 10 Updated Sep 27, 2024

a list of demo websites for automatic music generation research

643 43 Updated Nov 20, 2024

High-quality Text-to-Audio Generation with Efficient Diffusion Transformer

Python 250 9 Updated Nov 12, 2024

SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.

Python 71 6 Updated Nov 14, 2024

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,298 90 Updated Aug 13, 2024

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Python 1,990 510 Updated Jul 27, 2024

PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.

Python 321 45 Updated Feb 9, 2024

A benchmarking suite for disentanglement algorithms, suited for evaluating robustness to correlated factors. Codebase for the paper "Disentanglement of Correlated Factors via Hausdorff Factorized S…

Python 71 9 Updated Feb 25, 2023

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 5,031 427 Updated Aug 10, 2024

Implementation of Band Split Roformer, SOTA Attention network for music source separation out of ByteDance AI Labs

Python 443 16 Updated Aug 6, 2024
Python 64 60 Updated Oct 13, 2023

Official Implementation of StyleTTS-VC

Python 164 22 Updated Apr 23, 2023

Demucs Lightning: A PyTorch lightning version of Demucs with Hydra and Tensorboard features

Python 85 10 Updated May 3, 2023

A self-supervised learning framework for audio-visual speech

Python 859 137 Updated Dec 7, 2023

g2pK: g2p module for Korean

Python 237 43 Updated Mar 1, 2022

PyTorch Implementation of ProDiff (ACM-MM'22) with an extremely fast diffusion speech synthesis pipeline

Python 434 55 Updated Apr 19, 2023

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

Python 4,361 716 Updated May 2, 2023

Korean TTS, Tacotron2, Wavenet

Python 165 96 Updated Jun 15, 2020

Unofficial PyTorch implementation of Google AI's VoiceFilter system

Python 1,103 228 Updated Jul 25, 2024

Avocodo: Generative Adversarial Network for Artifact-free Vocoder

Python 115 15 Updated Jul 14, 2022

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

Python 789 113 Updated Mar 26, 2024

PyTorch Implementation of FastDiff (IJCAI'22)

Python 410 64 Updated Jun 20, 2024

[Zoom & Facebook Live] Weekly AI Arxiv Season 2

972 41 Updated Aug 27, 2023

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Python 8,455 1,085 Updated Apr 24, 2024

Reading self-supervised learning backwards, Part 1

49 8 Updated Oct 30, 2022

Trends, Tools, News timeline ...

17 1 Updated Nov 4, 2024

This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.

Python 155 28 Updated Jul 24, 2024

The official PyTorch implementation of "FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement".

Python 247 56 Updated Apr 23, 2024

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch

Jupyter Notebook 1,579 343 Updated Apr 22, 2024