Stars
Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"
Memory-Guided Diffusion for Expressive Talking Video Generation
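Downloader for common web resources: WeChat Channels, mini programs, Douyin, Kuaishou, Xiaohongshu, live streams, m3u8, KuGou, QQ Music, and more!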
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Official implementation of the paper "TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio-Motion Embedding and Diffusion Interpolation"
Fay is an open-source digital human framework integrating language models and digital characters. It offers retail, assistant, and agent versions for diverse applications like virtual shopping guid…
[CVPR'24] MESA: Matching Everything by Segmenting Anything
High-resolution models for human tasks.
Background Remover lets you Remove Background from images and video using AI with a simple command line interface that is free and open source.
Workflow-to-APP, ScreenShare & FloatingVideo, GPT & 3D, SpeechRecognition & TTS
Rotation & scale invariant template matching
C++ implementation of a ScienceDirect paper "An accelerating cpu-based correlation-based image alignment for real-time automatic optical inspection"
[CVPR 2024] Official implementation of the paper "Visual In-context Learning"
Easy to use stem (e.g. instrumental/vocals) separation from CLI or as a python package, using a variety of amazing pre-trained models (primarily from UVR)
Real-time face swap and one-click video deepfake with only a single image
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
Collection of training data management explorations for large language models
Python library for analysing faces using PyTorch
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
A set of nodes for ComfyUI that can composite layers and masks to achieve Photoshop-like functionality.