- Guangzhou/Shenzhen, China
- [email protected]
- https://yeungchenwa.github.io/
Starred repositories
Official Code for IJCV 2024 paper — Globally Correlation-Aware Hard Negative Generation
A generative world for general-purpose robotics & embodied AI learning.
[AAAI2025] Predicting the Original Appearance of Damaged Historical Documents
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Taming FLUX for Image Inversion & Editing; OpenSora for Video Inversion & Editing! (Official implementation for Taming Rectified Flow for Inversion and Editing.)
Official implementation of the OneDiffusion paper
Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
[NeurIPS 2024 Spotlight] Official repository of the CycleNet paper: "CycleNet: Enhancing Time Series Forecasting through Modeling Periodic Patterns". This work is developed by the Lab of Professor …
This repo contains the code for 1D tokenizer and generator
Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model
Tencent Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Various AI scripts. Mostly Stable Diffusion stuff.
PyTorch implementation of MIMO, Controllable Character Video Synthesis with Spatial Decomposed Modeling, from Alibaba Intelligence Group
Official inference repo for FLUX.1 models
[ICCV 2023] Consistent Image Synthesis and Editing
Official implementation of "MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling"
Kolmogorov-Arnold Transformer: A PyTorch Implementation with CUDA kernel
Run Segment Anything Model 2 on a live video stream
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Official PyTorch implementation of StreamV2V.
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
A general fine-tuning kit geared toward diffusion models.
This repository compiles a list of papers related to the application of video technology in the field of robotics! Star⭐ the repo and follow me if you like what you see🤩.
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
[ICML 2024 Oral] Official repository of the SparseTSF paper: "SparseTSF: Modeling Long-term Time Series Forecasting with 1k Parameters". This work is developed by the Lab of Professor Weiwei Lin (l…