Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …

Python 5,775 585 Updated Feb 18, 2025

STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution

Python 4 Updated Jan 31, 2025

The official implementation of paper "ColorFlow: Retrieval-Augmented Image Sequence Colorization"

Python 365 32 Updated Dec 23, 2024

Taming Stable Diffusion for Lip Sync!

Python 2,766 407 Updated Jan 19, 2025

Pixel manipulation tools using deep learning.

Python 26 4 Updated Jan 29, 2025

STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution

Python 1,035 58 Updated Jan 22, 2025

DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion

Python 1,201 68 Updated Dec 7, 2024

Tencent Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation

Python 3,325 251 Updated Jan 21, 2025

Allegro is a powerful text-to-video model that generates high-quality videos up to 6 seconds at 15 FPS and 720p resolution from simple text input.

Python 1,056 62 Updated Feb 7, 2025

Depth Any Video with Scalable Synthetic Data (ICLR 2025)

Python 454 29 Updated Dec 4, 2024

Hey, Computer, Make Me a Font

Python 474 27 Updated Nov 18, 2023

Synchronized Translation for Videos. Video dubbing

Python 1,036 208 Updated Jan 30, 2025

Convert ebooks to audiobooks with chapters and metadata using dynamic AI models and voice cloning. Supports 1,107+ languages!

Python 8,827 617 Updated Mar 3, 2025

[NeurIPS 2024] Generalizable and Animatable Gaussian Head Avatar

Python 418 38 Updated Feb 20, 2025

Memory-optimized training scripts for video models based on Diffusers

Python 900 97 Updated Mar 3, 2025

PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation (ECCV 2024)

Python 279 14 Updated Oct 24, 2024

The official repository of UniMuMo

Python 103 9 Updated Jan 9, 2025

[ICLR 2025] Official implementation of Posterior-Mean Rectified Flow: Towards Minimum MSE Photo-Realistic Image Restoration

Python 601 36 Updated Feb 5, 2025

Select a portrait, click to move the head around (please use your own space / GPU!)

JavaScript 851 89 Updated Nov 21, 2024

Official PyTorch implementation of "Expressive Whole-Body 3D Gaussian Avatar", ECCV 2024.

Python 526 43 Updated Dec 17, 2024

Tag manager and captioner for image datasets

Python 18 Updated Aug 27, 2024

StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion

171 11 Updated Sep 27, 2024

[NeurIPS 2024] Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment

Python 3,134 225 Updated Nov 27, 2024

SD.Next: All-in-one for AI generative image

Python 6,052 462 Updated Mar 2, 2025

Official implementation of EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars

Jupyter Notebook 360 19 Updated Feb 26, 2025

Dead simple FLUX LoRA training UI with LOW VRAM support

Python 2,075 219 Updated Feb 21, 2025

Bring portraits to life!

Python 24 2 Updated Jul 13, 2024