- Nanjing University
- Nanjing, China
- https://z-jiaming.github.io/
Starred repositories
Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment
A pipeline parallel training script for diffusion models.
Enjoy the magic of Diffusion models!
Wan: Open and Advanced Large-Scale Video Generative Models
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
This is a repo to track the latest autoregressive visual generation papers.
Video Generation Foundation Models: https://saiyan-world.github.io/goku/
A collection of vision foundation models unifying understanding and generation.
Xiaomi Home Integration for Home Assistant
LoRAT_pytracking: reproduction of [ECCV2024] LoRAT
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
text window manager, shell multiplexer, integrated DevOps environment
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Official PyTorch implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States
A series of basic algorithms that are useful for video understanding, including Single Object Tracking (SOT), Video Object Segmentation (VOS) and so on.
[NeurIPS2024 Spotlight] The official implementation of GrootVL: Tree Topology is All You Need in State Space Model
This repository is the official implementation of our Autoregressive Pretraining with Mamba in Vision
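Extract WeChat chat history and export it to HTML, Word, or Excel documents for permanent storage; analyze the chat history to generate an annual chat report; and use the chat data to train a personal AI chat assistant.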
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
Automatically update arXiv papers about SOT & VLT, Multi-modal Learning, LLM and Video Understanding using Github Actions.
Official PyTorch Implementation of "The Hidden Attention of Mamba Models"
[ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding
[ICCV 2023] StableVideo: Text-driven Consistency-aware Diffusion Video Editing
Diffusion Model-Based Image Editing: A Survey (arXiv)