View WenxuanZhu1103's full-sized avatar

A procedural Blender pipeline for photorealistic training image generation

Python 2,873 454 Updated Dec 16, 2024

glTF – Runtime 3D Asset Delivery

HTML 7,219 1,144 Updated Dec 14, 2024

glTF Sample Models

Mathematica 3,157 1,308 Updated Dec 22, 2023

Sketchfab python library & CLI

Python 3 Updated Jul 26, 2021

A small Python module for downloading models from Sketchfab.

Python 10 4 Updated Dec 6, 2019

[NeurIPS 2024] Animate3D: Animating Any 3D Model with Multi-view Video Diffusion

Python 150 6 Updated Oct 24, 2024

Let your Claude think

TypeScript 10,033 1,145 Updated Dec 3, 2024

An LLM-based academic writing assistant

Python 33 1 Updated Aug 19, 2024

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 40,040 4,246 Updated Jul 28, 2024

Official code for PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking (ICCV 2023)

Python 132 6 Updated Nov 16, 2024

DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos

Python 1,054 51 Updated Dec 10, 2024

Computer vision utils for Blender (generate instance annotation, depth, and 6D pose with one line of code)

Python 474 59 Updated Mar 11, 2024

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model with performance approaching GPT-4o.

Python 6,398 493 Updated Dec 10, 2024

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 3,617 225 Updated Dec 4, 2024

Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.

Python 2,164 168 Updated Dec 16, 2024

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Python 14,766 1,195 Updated Dec 12, 2024

🔥🔥MLVU: Multi-task Long Video Understanding Benchmark

Python 164 Updated Nov 28, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 20,715 2,279 Updated Aug 12, 2024

Official repo for Detecting, Explaining, and Mitigating Memorization in Diffusion Models (ICLR 2024)

Python 59 7 Updated Apr 3, 2024

PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI

Python 814 33 Updated Dec 4, 2024

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 16,722 1,654 Updated Dec 13, 2024

📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).

503 14 Updated Dec 7, 2024

Chinese version of llm-numbers

113 5 Updated Dec 25, 2023

[CVPR 2024] On the Content Bias in Fréchet Video Distance

Python 99 6 Updated Sep 28, 2024

4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling

Python 317 8 Updated Dec 10, 2024

Implements VAR+CLIP for text-to-image (T2I) generation

Python 95 2 Updated Dec 16, 2024

Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs.

Python 90 2 Updated Mar 24, 2024

[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…

Python 6,242 421 Updated Dec 6, 2024

Open source implementation of "A Self-Supervised Descriptor for Image Copy Detection" (SSCD).

Python 267 20 Updated Aug 2, 2022
Python 14 2 Updated Sep 28, 2024