
A generative world for general-purpose robotics & embodied AI learning.

Python 17,387 1,219 Updated Dec 22, 2024

CVPR-24 | Official codebase for ZONE: Zero-shot InstructiON-guided Local Editing

Python 68 1 Updated Nov 21, 2024

OneTrainer is a one-stop solution for all your stable diffusion training needs.

Python 1,867 158 Updated Dec 22, 2024

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 8,569 832 Updated Dec 18, 2024

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Python 478 20 Updated Dec 18, 2024

A port of muerrilla's sd-webui-Detail-Daemon as a node for ComfyUI, to adjust sigmas that control detail.

Python 432 12 Updated Nov 4, 2024

Code for fine-tuning a VAE

Python 16 Updated Sep 8, 2024

ChatGPT Advanced Voice Mode Gets an Avatar!

JavaScript 8 5 Updated Sep 29, 2024
Jupyter Notebook 112 12 Updated Sep 11, 2022

Netflix-level subtitle cutting, translation, alignment, and even dubbing: a one-click, fully automated AI video subtitle team

Python 8,610 832 Updated Dec 20, 2024

[ECCV 2024] PowerPaint, a versatile image inpainting model that supports text-guided object inpainting, object removal, image outpainting and shape-guided object inpainting with only a single model…

Python 721 45 Updated Sep 8, 2024

[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…

Jupyter Notebook 6,370 427 Updated Dec 22, 2024

Official Implementation of HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editing

Python 20 Updated Dec 6, 2024

The official repository of "SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model"

7 Updated Dec 6, 2024

Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.

Cuda 713 35 Updated Dec 21, 2024

TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control

Python 36 3 Updated Dec 16, 2024

Unofficial implementation of Magic Clothing for ComfyUI

Python 530 44 Updated Sep 4, 2024

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 6,486 479 Updated Dec 19, 2024

Cog inference for flux models

Python 308 35 Updated Dec 21, 2024

[arXiv 2024] Novel View Extrapolation with Video Diffusion Priors

Python 87 2 Updated Dec 12, 2024

Finetune Llama 3.3, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory

Python 19,553 1,374 Updated Dec 21, 2024

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Python 1,108 58 Updated Jul 17, 2024

WWW2025 Multimodal Intent Recognition for Dialogue Systems Challenge

Python 110 11 Updated Nov 11, 2024

[ECCV 2024 - Oral] HiT-SR: Hierarchical Transformer for Efficient Image Super-Resolution

Python 93 2 Updated Nov 14, 2024

🌈 A cross-platform software for text translation and recognition (selection-based translation and OCR).

JavaScript 10,859 486 Updated Nov 16, 2024

The state-of-the-art image restoration model without nonlinear activation functions.

Python 2,298 291 Updated Jul 3, 2024

LLM-powered multiagent persona simulation for imagination enhancement and business insights.

Python 5,076 390 Updated Dec 17, 2024

[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion

Python 1,788 344 Updated Nov 19, 2024

GIF is a photorealistic generative face model with explicit 3D geometric and photometric control.

Python 409 63 Updated Sep 13, 2022