Starred repositories
Command-line program to download videos from YouTube.com and other video sites
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time
OpenMMLab Detection Toolbox and Benchmark
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Open-Sora: Democratizing Efficient Video Production for All
Graph Neural Network Library for PyTorch
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Image inpainting tool powered by SOTA AI models. Remove any unwanted object, defect, or person from your pictures, or erase and replace anything in your pictures (powered by Stable Diffusion).
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
Hand-written implementations of all the algorithms in Li Hang's book "Statistical Learning Methods" (《统计学习方法》)
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
Emulator for rapid prototyping of Software Defined Networks
A concise but complete full-attention transformer with a set of promising experimental features from various papers
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Transfer Learning Library for Domain Adaptation, Task Adaptation, and Domain Generalization
Release for Improved Denoising Diffusion Probabilistic Models
Edit anything in images powered by segment-anything, ControlNet, StableDiffusion, etc. (ACM MM)
DeepFill v1/v2 with Contextual Attention and Gated Convolution, CVPR 2018, and ICCV 2019 Oral
AnimateDiff for AUTOMATIC1111 Stable Diffusion WebUI
Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Resources and Implementations of Generative Adversarial Nets: GAN, DCGAN, WGAN, CGAN, InfoGAN
[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors