Stars
Robust Speech Recognition via Large-Scale Weak Supervision
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
Clone a voice in 5 seconds to generate arbitrary speech in real-time
ChatGLM-6B: An Open Bilingual Dialogue Language Model
TensorFlow code and pre-trained models for BERT
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
Generative Models by Stability AI
A generative world for general-purpose robotics & embodied AI learning.
Open-Sora: Democratizing Efficient Video Production for All
Magenta: Music and Art Generation with Machine Intelligence
PyTorch implementations of Generative Adversarial Networks.
Fast and memory-efficient exact attention
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
Train transformer language models with reinforcement learning.
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
TensorFlow-based neural network library
Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!
ImageBind: One Embedding Space to Bind Them All
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
vits2 backbone with multilingual-bert
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech