Stars
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF into Markdown and JSON formats.
Arbitrary-steps Image Super-resolution via Diffusion Inversion
A simple screen parsing tool towards pure vision based GUI agent
Quantized training of Stable Diffusion 3 Medium to significantly reduce memory usage.
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
End-to-end recipes for optimizing diffusion models with torchao and diffusers (inference and FP8 training).
rensortino / ColorizeNet
Forked from lllyasviel/ControlNet
Let us control diffusion models for colorization!
This repository aims to implement an Image Search engine powered by the CLIP model.
Instant voice cloning by MIT and MyShell. Audio foundation model.
Foundational model for human-like, expressive TTS
A RAG LLM co-pilot for browsing the web, powered by local LLMs
A unified framework for 3D content generation.
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Finetune ModelScope's Text To Video model using Diffusers 🧨
21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.