Stars
[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation
Official model and network release for my CVPR2022 paper.
Repo for the paper "Intrinsic Single-Image HDR Reconstruction" (ECCV 2024)
Public code release for: ColorfulCurves: Palette-Aware Lightness Control and Color Editing via Sparse Optimization (SIGGRAPH 2023) [Ted Chao, Jason Klein, Jianchao Tan, Jose Echevarria, Yotam Gingold]
A Python library for extracting color palettes from supplied images.
[Information Fusion (Vol.103, Mar. '24)] Boosting Image Matting with Pretrained Plain Vision Transformers
[Image and Vision Computing (Vol.147 Jul. '24)] Interactive Natural Image Matting with Segment Anything Models
Implementation of "Automatic Portrait Segmentation" and "Deep Automatic Portrait Matting" with Chainer.
Matting Anything Model (MAM), an efficient and versatile framework for estimating the alpha matte of any instance in an image with flexible and interactive visual or linguistic user prompt guidance.
[NeurIPS 2024] SHMT: Self-supervised Hierarchical Makeup Transfer via Latent Diffusion Models
PyTorch implementation of "Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model"
[ICCV 2023] DDColor: Towards Photo-Realistic Image Colorization via Dual Decoders
Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
Code for "Coloring with Words: Guiding Image Colorization through Text-based Palette Generation" - ECCV 2018
The official implementation of SIGGRAPH 2023 conference paper, FashionTex: Controllable Virtual Try-on with Text and Texture.
Official code for paper "PICTURE: PhotorealistIC virtual Try-on from UnconstRained dEsigns"
Outfit Anyone: Ultra-high quality virtual try-on for Any Clothing and Any Person
✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
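A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train, including base models, domain-specific fine-tunes and applications, datasets, and tutorials.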
GPT4V-level open-source multi-modal model based on Llama3-8B
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
Image augmentation for machine learning experiments.
PyTorch code and models for the DINOv2 self-supervised learning method.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥