A linear estimator on top of CLIP to predict the aesthetic quality of pictures
InstantIR: Blind Image Restoration with Instant Generative Reference 🔥
OmniGen: Unified Image Generation.
The Dawn of Video Generation: Preliminary Explorations with SORA-like Models
A streamlined implementation of Grounding DINO and SAM for advanced image segmentation. This lightweight solution simplifies the integration of powerful object detection and segmentation models, of…
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
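A practical interactive interface for LLMs such as GPT and GLM, specially optimized for reading, polishing, and writing papers. Modular design; supports custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other codebases; PDF/LaTeX paper translation and summarization; parallel queries to multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…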
CSGO: Content-Style Composition in Text-to-Image Generation 🔥
An IP-Adapter better suited to the DiT architecture
[AAAI 2025]👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing. It enables customizable human image generation with flexible garment, pose, and scene control, ensuring high …
Inference pipeline for some Text-to-Image metrics.
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20h on one machine.
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation 🔥
Official code for "Style Aligned Image Generation via Shared Attention"
The human face subset of LAION-400M for large-scale face pretraining.
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images from an image prompt.
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
[NeurIPS 2022 Spotlight] A Unified Model for Multi-class Anomaly Detection