Robust Speech Recognition via Large-Scale Weak Supervision
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Making large AI models cheaper, faster and more accessible
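A chatbot built on large language models that integrates with WeChat Official Accounts, WeCom (Enterprise WeChat) apps, Feishu, DingTalk, and more. Supports GPT3.5/GPT-4o/GPT-o1/Claude/ERNIE Bot/iFlytek Spark/Tongyi Qianwen/Gemini/GLM-4/Kimi/LinkAI; handles text, voice, and images; can access the operating system and the internet; and supports building customized enterprise customer-service bots on your own knowledge base.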
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
Deezer source separation library including pretrained models.
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
Generate high-definition short videos with one click using AI large language models.
A Deep Learning based project for colorizing and restoring old images (and video!)
Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
<⚡️> SuperAGI - A dev-first open source autonomous AI agent framework. Enabling developers to build, manage & run useful autonomous agents quickly and reliably.
SMSBoom - Deprecated: For legal reasons, the repository has been suspended!
Bringing Old Photos Back to Life (CVPR 2020 oral)
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
A Deep-Learning-Based Chinese Speech Recognition System
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
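text2vec, text to vector. A text vector representation tool that converts text into vector matrices; it implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT, ready to use out of the box.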
A TensorFlow Implementation of the Transformer: Attention Is All You Need
Production First and Production Ready End-to-End Speech Recognition Toolkit
Portrait cartoonization exploration project (photo-to-cartoon translation project)
Official TensorFlow implementation for CVPR 2020 paper “Learning to Cartoonize Using White-box Cartoon Representations”
[CVPR 2023] DepGraph: Towards Any Structural Pruning
Noise suppression using deep filtering
Data manipulation and transformation for audio signal processing, powered by PyTorch