Starred repositories
LLMPerf is a library for validating and benchmarking LLMs
A pipeline parallel training script for diffusion models.
The official implementation of the paper "BrushEdit: All-In-One Image Inpainting and Editing"
deepbeepmeep / Cosmos1GP
Forked from NVIDIA/Cosmos
Cosmos1GP for the GPU Poor by DeepBeepMeep
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Official repository for our work on micro-budget training of large-scale diffusion models.
Live2Diff: A Pipeline that processes Live video streams by a uni-directional video Diffusion model.
STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution
NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training
[WIP] The all in one inference optimization solution for ComfyUI, universal, flexible, and fast.
Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…
UI components and hooks for building video/audio players on the web. Robust, customizable, and accessible. Modern alternative to JW Player and Video.js.
Segment Anything 2, 100% in the browser (with WebGPU!)
A lightweight, object-oriented finite state machine implementation in Python with many extensions
A frontend for transitions state machines
A Python-based voice assistant integrating speech-to-text (STT), text-to-speech (TTS), and powerful AI capabilities using either a local LLM via llama.cpp or OpenAI API. Includes clipboard integrat…
The Hugging Face Course on Transformers for Audio
A set of ComfyUI nodes providing additional control for the LTX Video model
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
Android application for running Windows applications with Wine and Box86/Box64
Official code for VEnhancer: Generative Space-Time Enhancement for Video Generation