[NeurIPS 2024] Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment

Python 3,039 214 Updated Nov 27, 2024

Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing>

Python 4,495 290 Updated Jun 21, 2024

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.

Python 11,862 1,046 Updated Jan 20, 2025

We introduce a novel approach for parameter generation, named neural network parameter diffusion (p-diff), which employs a standard latent diffusion model to synthesize a new set of parameters

Python 834 45 Updated Jan 3, 2025

Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 10,467 981 Updated Jan 22, 2025

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

Python 3,186 191 Updated Jan 30, 2025

The best OSS video generation models

Python 2,799 289 Updated Jan 8, 2025

Official repository for LTX-Video

Python 2,680 227 Updated Jan 3, 2025

Let's finetune video generation models!

Python 376 15 Updated Jan 30, 2025

[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…

Jupyter Notebook 6,478 423 Updated Jan 12, 2025

Code of Pyramidal Flow Matching for Efficient Video Generative Modeling

Python 2,731 268 Updated Dec 21, 2024

code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"

Python 698 35 Updated Dec 11, 2024

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 7,949 640 Updated Jan 24, 2025

Official inference repo for FLUX.1 models

Python 19,824 1,390 Updated Jan 9, 2025

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Python 1,748 86 Updated Oct 31, 2024

SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild. Our new online demo is also released at suppixel.ai.

Python 4,781 404 Updated Jul 30, 2024

Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

Python 4,435 232 Updated Jun 14, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 23,194 2,283 Updated Jan 22, 2025

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 6,760 600 Updated May 31, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 13,838 1,381 Updated Dec 25, 2024

Utilities intended for use with Llama models.

Python 5,703 956 Updated Jan 29, 2025

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Jupyter Notebook 15,642 1,443 Updated Sep 5, 2024

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

Python 7,273 734 Updated Aug 12, 2024

Collection of AWESOME vision-language models for vision tasks

2,460 195 Updated Dec 3, 2024
Python 2 1 Updated May 28, 2024

A collection of resources and papers on Diffusion Models

HTML 11,364 957 Updated Aug 1, 2024

ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering

Python 1,343 58 Updated Dec 10, 2024

EVA Series: Visual Representation Fantasies from BAAI

Python 2,403 174 Updated Aug 1, 2024

MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.

Jupyter Notebook 7,147 468 Updated Nov 6, 2024

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization

Python 533 61 Updated Jun 7, 2024