Stars
Instant voice cloning by MIT and MyShell. Audio foundation model.
Image super resolution models for PyTorch.
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
The fastai book, published as Jupyter Notebooks
This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training and extraction of audio embeddings.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Unofficial PyTorch implementation of Google AI's VoiceFilter system
Controllable and fast Text-to-Speech for over 7000 languages!
Easy to use, state-of-the-art Neural Machine Translation for 100+ languages
KAN-TTS is a speech-synthesis training framework. Please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-to-speech
ModelScope: bring the notion of Model-as-a-Service to life.
Neural Machine Translation (NMT) tutorial. Data preprocessing, model training, evaluation, and deployment.
Code for the paper Hybrid Spectrogram and Waveform Source Separation
Dockerized Facebook Demucs library, packaged to make it easy to run
news-please - an integrated web crawler and information extractor for news that just works