Generative
Text2Cinemagraph: Text-Guided Synthesis of Eulerian Cinemagraphs [SIGGRAPH ASIA 2023]
Community interface for generative AI
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
MATLABER: Material-Aware Text-to-3D via LAtent BRDF auto-EncodeR
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
[ICCV 2023] One-shot Implicit Animatable Avatars with Model-based Priors
[3DV 2024] Official repo of "TeCH: Text-guided Reconstruction of Lifelike Clothed Humans"
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/
[ICCV 2023] ProPainter: Improving Propagation and Transformer for Video Inpainting
✨ Hotshot-XL: State-of-the-art AI text-to-GIF model trained to work alongside Stable Diffusion XL
[CVPR 2024] PAIR Diffusion: A Comprehensive Multimodal Object-Level Image Editor
🔊 Text-Prompted Generative Audio Model
FreeU: Free Lunch in Diffusion U-Net (CVPR 2024 Oral)
Unofficial implementation of RealFill
3D to Photo is an open-source package by Dabble that combines threeJS and Stable Diffusion to build a virtual photo studio for product photography. Load a 3D model into the browser and virtual sho…
Single Image to 3D using Cross-Domain Diffusion for 3D Generation
A prompting enhancement library for transformers-type text embedding systems
FLUX, Stable Diffusion, SDXL, SD3, LoRA, Fine Tuning, DreamBooth, Training, Automatic1111, Forge WebUI, SwarmUI, DeepFake, TTS, Animation, Text To Video, Tutorials, Guides, Lectures, Courses, Comfy…
Latent Consistency Model for AUTOMATIC1111 Stable Diffusion WebUI
A free and open-source inpainting & image-upscaling tool powered by WebGPU and WASM, running entirely in the browser.
Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation
Code for the paper "Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models"
Let us democratise high-resolution generation! (CVPR 2024)