Starred repositories
Label Studio is a multi-type data labeling and annotation tool with a standardized output format
(Realtime) Temporal Convolutions in PyTorch
Make a wake word detection engine like "OK Google"
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
This repository contains the code for the SOTA model on the Google Speech Commands V2 dataset.
Conversion between Traditional and Simplified Chinese
Robust Speech Recognition via Large-Scale Weak Supervision
Streamlit — A faster way to build and share data apps.
Speech-to-text server framework with next-gen Kaldi
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
GRAFX: An Open-Source Library for Audio Processing Graphs in PyTorch
Official inference repo for FLUX.1 models
This repository is a curated list of awesome Speech Keyword Spotting (Wake-Up Word Detection).
Real time interactive streaming digital human
Master programming by recreating your favorite technologies from scratch.
A one-of-a-kind resume builder that keeps your privacy in mind. Completely secure, customizable, portable, open-source and free forever. Try it out today!
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
A programmer's guide to cooking at home (Simplified Chinese only).
🚀 A curated list of awesome articles, videos, and other resources to learn and practice software architecture, patterns, and principles.
Enjoy the magic of Diffusion models!
Official code for MUSE: Flexible Voiceprint Receptive Fields and Multi-Path Fusion Enhanced Taylor Transformer for U-Net-based Speech Enhancement
This repository contains the official implementation of GhostFaceNets, State-Of-The-Art lightweight face recognition models.