Zhaokai Wang wzk1015

😎

PhD candidate @ SJTU & Shanghai AI Lab; B.Eng @ BUAA

118 followers · 76 following

Shanghai Jiao Tong University
Shanghai
www.wzk.plus
https://scholar.google.com/citations?user=W0zVf-oAAAAJ

Highlights

Pro

Starred repositories

IAAR-Shanghai / SurveyX

Academic Survey Paper Generation.

Python · 42 stars · 3 forks · Updated Feb 24, 2025

LittleNyima / llm-from-scratch

2 stars · Updated Feb 23, 2025

TiffanyBlews / MozartsTouch

Official implementation of Mozart's Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models

Python · 29 stars · 6 forks · Updated Dec 19, 2024

OpenGVLab / LORIS

Long-Term Rhythmic Video Soundtracker, ICML2023

Python · 56 stars · 1 fork · Updated Jul 5, 2024

om-ai-lab / VLM-R1

Solve Visual Understanding with Reinforced VLMs

Python · 3,041 stars · 166 forks · Updated Feb 24, 2025

realtimeqa / realtimeqa_public

Python · 71 stars · 8 forks · Updated Jan 24, 2024

freshllms / freshqa

Data and code for FreshLLMs (https://arxiv.org/abs/2310.03214)

Jupyter Notebook · 343 stars · 16 forks · Updated Feb 24, 2025

daixiangzi / Awesome-Token-Compress

A paper list of recent works on token compression for ViT and VLM

328 stars · 16 forks · Updated Feb 9, 2025

PKU-YuanGroup / MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

Python · 2,092 stars · 132 forks · Updated Dec 3, 2024

Yaxin9Luo / Gamma-MOD

[ICLR2025] γ-MOD: Mixture-of-Depth Adaptation for Multimodal Large Language Models

Python · 31 stars · 3 forks · Updated Feb 14, 2025

showlab / UniMoD

The code repository of UniMoD

7 stars · Updated Feb 10, 2025

zhijie-group / Show-o-Turbo

Python · 28 stars · 1 fork · Updated Feb 14, 2025

zxxwxyyy / sonique

Video Background Music Generation Using Unpaired Audio-Visual Data

Python · 23 stars · 3 forks · Updated Oct 8, 2024

VisionXLab / mllm-mmrotate

A Simple Aerial Detection Baseline of Multimodal Language Models.

Python · 49 stars · 2 forks · Updated Feb 18, 2025

OthersideAI / self-operating-computer

A framework to enable multimodal models to operate a computer.

Python · 9,338 stars · 1,264 forks · Updated Feb 3, 2025

deepseek-ai / DeepSeek-R1

81,128 stars · 10,473 forks · Updated Feb 24, 2025

deepseek-ai / DeepSeek-V3

Python · 88,133 stars · 14,229 forks · Updated Feb 24, 2025

chouliuzuo / GVMGen

Python · 16 stars · Updated Jan 21, 2025

wzk1015 / video-bgm-generation

[ACM MM 2021 Best Paper Award] Video Background Music Generation with Controllable Music Transformer

Python · 303 stars · 35 forks · Updated Dec 15, 2024

OpenGVLab / V2PE

[ArXiv] V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding

Python · 29 stars · 1 fork · Updated Dec 13, 2024

ActiveVisionLab / Awesome-LLM-3D

Awesome-LLM-3D: a curated list of resources on multimodal large language models in the 3D world

1,468 stars · 88 forks · Updated Feb 14, 2025

shansongliu / MuMu-LLaMA

This is the official repository for M2UGen

Jupyter Notebook · 475 stars · 37 forks · Updated Jan 2, 2025

Lionelsy / Conference-Accepted-Paper-List

Accepted paper lists for several conferences (including AI, ML, and robotics)

Python · 1,024 stars · 75 forks · Updated Jan 23, 2025

yueyang130 / DeeR-VLA

Official code of paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution"

Python · 70 stars · 6 forks · Updated Feb 14, 2025

OS-Agent-Survey / OS-Agent-Survey

This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use".

203 stars · 9 forks · Updated Feb 18, 2025

friedrichor / Awesome-Multimodal-Papers

A curated list of awesome Multimodal studies.

HTML · 138 stars · 16 forks · Updated Feb 24, 2025

wbs2788 / VMB

Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation: A framework for generating multimodal music by bridging different representations and enhancing generation with RAG.

23 stars · 2 forks · Updated Jan 21, 2025

langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…

TypeScript · 71,685 stars · 10,460 forks · Updated Feb 24, 2025

jina-ai / reader

Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/

TypeScript · 7,995 stars · 627 forks · Updated Feb 24, 2025

opendatalab / magic-html

Python · 386 stars · 35 forks · Updated Nov 22, 2024

Starred topics

Awesome Lists