Stars
A multilingual large voice generation model with full-stack support for inference, training, and deployment.
Learning audio concepts from natural language supervision
Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)
An unofficial PyTorch implementation of the audio LM VALL-E
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
In defence of metric learning for speaker recognition
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
A Python package to analyze and compare voices with deep learning
Repository for the paper: VoiceMe: Personalized voice generation in TTS
RROS is a dual-kernel OS for satellites or other scenarios that need both real-time and general-purpose abilities. RROS = RTOS (Rust) + Linux (C).
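Use ChatGPT to summarize arXiv papers. Speeds up the whole research workflow by using ChatGPT for full-paper summarization, professional translation, polishing, peer review, and review responses.
Continues training on the Biaobei (标贝) dataset and improves the original FastSpeech2 model by adding prosody representation and prosody prediction modules, making Chinese pronunciation more vivid and rhythmic.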
Manipulate audio with a simple and easy high level interface
Implementation of the VITS model
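End-to-end Chinese speech recognition implemented with PaddlePaddle, from getting started to hands-on practice, with very simple introductory examples and practical enterprise projects. Supports the currently most popular models: DeepSpeech2, Conformer, and Squeezeformer.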
This is a python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles and it does not require an API key nor a headles…
PESQ (Perceptual Evaluation of Speech Quality) Wrapper for Python Users (narrow band and wide band)
VITS Chinese, TTS Chinese, TTS Mandarin: the easiest-to-train speech synthesis system with the best audio quality to date
AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data
A PyTorch-based VITS-BigVGAN Chinese TTS model with an added prosody prediction model.