Stars
HunyuanVideo: A Systematic Framework For Large Video Generation Model
We present StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without any post-processing, conditioned on a reference image and a se…
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022
Inpaint anything using Segment Anything and inpainting models.
ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
Multilingual Voice Understanding Model
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
A generative speech model for daily dialogue.
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
One minute of voice data is enough to train a good TTS model! (few-shot voice cloning)
Unofficial implementation of the paper "Underexposed Photo Enhancement using Deep Illumination Estimation" (CVPR 2019)
Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
Code for "Joint Denoising and Demosaicking with Green Channel Prior for Real-world Burst Images", TIP2021
Code of "ResDiff: Combining CNN and Diffusion Model for Image Super-Resolution"
Diff tool for comparing Win32 resources in PE images
End-to-End Learning for Joint Image Demosaicing, Denoising and Super-Resolution
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
Open-Sora: Democratizing Efficient Video Production for All
[CVPR2024] SeeSR: Towards Semantics-Aware Real-World Image Super-Resolution