- Russia, Khabarovsk
Stars
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Robust Speech Recognition via Large-Scale Weak Supervision
Clone a voice in 5 seconds to generate arbitrary speech in real-time
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and can retrieve information dynamically to do so.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
Faster Whisper transcription with CTranslate2
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…
TensorFlow CNN for fast style transfer ⚡🖥🎨🖼
Manipulate audio with a simple and easy high level interface
so-vits-svc fork with realtime support, improved interface and more features.
Code for the paper "Jukebox: A Generative Model for Music"
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Real-Time High-Resolution Background Matting
ECCV18 Workshops - Enhanced SRGAN. Champion PIRM Challenge on Perceptual Super-Resolution. The training codes are in BasicSR.
Single Image to 3D using Cross-Domain Diffusion for 3D Generation
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)
An unofficial PyTorch implementation of the audio LM VALL-E
A Python package to analyze and compare voices with deep learning
This is an open-source project (formerly named Listen, Attend and Spell - PyTorch Implementation) for end-to-end ASR implemented with PyTorch, the well-known deep learning toolkit.
🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer-based neural network for text-to-speech.
AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
Large-scale pretrained models for goal-directed dialog
A Generative Flow for Text-to-Speech via Monotonic Alignment Search