Stars
Stable Diffusion web UI
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Robust Speech Recognition via Large-Scale Weak Supervision
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and…
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Official Code for DragGAN (SIGGRAPH 2023)
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
A generative world for general-purpose robotics & embodied AI learning.
Chat with your documents on your local device using GPT models. No data leaves your device and 100% private.
DeepFaceLab is the leading software for creating deepfakes.
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Open source code for AlphaFold 2.
Private chat with local GPT with documents, images, video, etc. 100% private, Apache 2.0. Supports Ollama, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
Open Source Image and Video Restoration Toolbox for Super-resolution, Denoise, Deblurring, etc. Currently, it includes EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR, BasicVSR, SwinIR, ECBSR, etc. Also …
Arch Linux installer - guided, templates etc.
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
Reverse Engineering: Decompiling Binary Code with Large Language Models
Segment Anything for Stable Diffusion WebUI
Edit anything in images powered by segment-anything, ControlNet, StableDiffusion, etc. (ACM MM)
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Unofficial Qualcomm Firehose / Sahara / Streaming / Diag Tools :)
Fast & Simple repository for pre-training and fine-tuning T5-style models
A Dataset of Python Challenges for AI Research
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis