Stars
LLM Frontend for Power Users.
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
The “Quite OK Image Format” for fast, lossless image compression
A node-based image processing GUI aimed at making chaining image processing tasks easy and customizable. Born as an AI upscaling application, chaiNNer has grown into an extremely flexible and power…
A repository collecting image and video upscaling resources as well as my own super resolution models.
High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models.
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
[ACL 2024] Code for "MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation"
RWKV-SpeechChat is a real-time dialogue script based on a frozen 3B RWKV model with trained adapters and initial states. Various trained weights can be applied to perform a range of audio tasks, in…
An extension for the Stable Diffusion WebUI that lets you create simple comics.
A highly integrated, high-end, open-source laptop. Attempt the impossible.
Edit, preview and share mermaid charts/diagrams. New implementation of the live editor.
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
Ultimate camera streaming application with support for RTSP, RTMP, HTTP-FLV, WebRTC, MSE, HLS, MP4, MJPEG, HomeKit, FFmpeg, etc.
A suite of image and video neural tokenizers
Build an ad-blocking DNS using Cloudflare Gateway Zero Trust.
Python tool for converting files and office documents to Markdown.
GameStream client for PCs (Windows, Mac, Linux, and Steam Link)
Self-hosted game stream host for Moonlight.
🔥 Cloudflare (Workers + R2) edge container image repository
A sampler based on Euler, aimed at generating better pictures.
Official Implementation of "ADOPT: Modified Adam Can Converge with Any β2 with the Optimal Rate"
Tencent Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation