
HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 5,797 406 Updated Dec 13, 2024

Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024

Python 1,534 130 Updated Dec 12, 2024

[IJCV-2021] FairMOT: On the Fairness of Detection and Re-Identification in Multi-Object Tracking

Python 4,040 932 Updated Sep 19, 2023

Mini-DALLE3: Interactive Text to Image by Prompting Large Language Models

Python 303 28 Updated Dec 28, 2023

[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation

Python 2,446 135 Updated Dec 14, 2024

Latent-based SR using MoE and frequency augmented VAE decoder

Python 152 4 Updated Nov 26, 2023

Consistency Distilled Diff VAE

Python 2,143 76 Updated Nov 7, 2023

Generative Representational Instruction Tuning

Jupyter Notebook 574 41 Updated Nov 18, 2024

Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis

Jupyter Notebook 413 12 Updated May 24, 2024

Learning from synthetic data - code and models

Python 303 13 Updated Jan 6, 2024

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Python 1,103 57 Updated Jul 17, 2024

Python 5,399 892 Updated Dec 9, 2024

Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image …

Python 1,980 260 Updated Dec 14, 2024

A large-scale text-to-image prompt gallery dataset based on Stable Diffusion

Python 1,221 68 Updated Jul 11, 2024

CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets

849 11 Updated Jun 21, 2024

Davidsonian Scene Graph (DSG) for Text-to-Image Evaluation (ICLR 2024)

Jupyter Notebook 78 5 Updated Dec 9, 2024

TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering

Python 140 10 Updated Apr 29, 2024

AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation

Jupyter Notebook 417 31 Updated Oct 31, 2024

Data-Efficient Multimodal Fusion on a Single GPU

Python 48 7 Updated May 7, 2024

Lumina-T2X is a unified framework for Text to Any Modality Generation

Python 2,108 88 Updated Aug 6, 2024

Kolors Team

Python 4,000 289 Updated Nov 13, 2024

[NeurIPS 2024] Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image

Python 3,134 252 Updated Sep 18, 2024

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,796 117 Updated Oct 30, 2024

GPT4V-level open-source multi-modal model based on Llama3-8B

Python 2,165 147 Updated Sep 3, 2024

From anything to mesh like human artists. Official impl. of "MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers"

Python 2,068 91 Updated Aug 5, 2024

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

Python 26,635 5,488 Updated Dec 14, 2024

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'

Python 1,359 101 Updated Oct 8, 2024

The official repo for the paper "VeCLIP: Improving CLIP Training via Visual-enriched Captions"

Jupyter Notebook 235 13 Updated Aug 24, 2024

Densely Captioned Images (DCI) dataset repository.

Python 162 5 Updated Jul 1, 2024

Data release for the ImageInWords (IIW) paper.

JavaScript 202 9 Updated Nov 17, 2024