Stars
Translate manga/image: one-click translation of text in all kinds of images https://cotrans.touhou.ai/
Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
Summary of anime face detection methods based on Python.
✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
An open-source remote desktop application designed for self-hosting, as an alternative to TeamViewer.
Go compiler for small places. Microcontrollers, WebAssembly (WASM/WASI), and command-line tools. Based on LLVM.
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
kaldi-asr/kaldi is the official location of the Kaldi project.
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Emscripten: An LLVM-to-WebAssembly Compiler
Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC…
ModelScope: bring the notion of Model-as-a-Service to life.
Production First and Production Ready End-to-End Keyword Spotting Toolkit
Qwen2.5-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
👻 Ghostty is a fast, feature-rich, and cross-platform terminal emulator that uses platform-native UI and GPU acceleration.
🍒 Cherry Studio is a desktop client that supports multiple LLM providers. Supports deepseek-r1
Multilingual large voice generation model with full-stack support for inference, training, and deployment.
📖 Go Programmer Interview and Written Test Handbook | Starts from questions and ties together all Go language knowledge into a coherent whole. https://golang.design/go-questions
🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time
rust-analyzer extension for coc.nvim