Starred repositories
llama3 implementation one matrix multiplication at a time
WiLoR: End-to-end 3D hand localization and reconstruction in-the-wild
A simple and flexible PyTorch implementation of StableDiffusion-3 based on diffusers for DIY and finetuning.
Official PyTorch implementation of ECCV 2024 Paper: ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback.
Code for "GVHMR: World-Grounded Human Motion Recovery via Gravity-View Coordinates", SIGGRAPH Asia 2024
Official implementation of "MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling"
Generate high-definition short videos with one click using large AI models.
Official implementation of "ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis"
HaMeR: Reconstructing Hands in 3D with Transformers
[TPAMI'23] Unifying Flow, Stereo and Depth Estimation
[ECCV 2024] HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance
A collection of awesome video generation studies.
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
📹 A more flexible CogVideoX that can generate videos at any resolution and create videos from images.
High-resolution models for human tasks.
AI-based tool for removing hard-coded subtitles and text-like watermarks from images and videos, producing output files at the original resolution. Runs locally, with no third-party API required.
Inpaint Anything performs stable diffusion inpainting on a browser UI using masks from Segment Anything.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
The Data and Code of Prompt2Sign: A Comprehensive Multilingual Sign Language Dataset.
Effortless data labeling with AI support from Segment Anything and other awesome models.
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
Open-Sora: Democratizing Efficient Video Production for All