Stars
Janus-Series: Unified Multimodal Understanding and Generation Models
Use PEFT or Full-parameter to finetune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, I…
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
woct0rdho / triton-windows
Forked from triton-lang/triton
Fork of the Triton language and compiler for Windows support
Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥
[ACM Multimedia 2023] Taming the Power of Diffusion Models for High-Quality Virtual Try-On with Appearance Flow.
[CVPR 2025] Learning Flow Fields in Attention for Controllable Person Image Generation
Official repository of "TryOffAnyone: Tiled Cloth Generation from a Dressed Person"
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
[AAAI 2025]👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing. It enables customizable human image generation with flexible garment, pose, and scene control, ensuring high …
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
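👮♂️The sensitive word tool for Java. (Sensitive words / prohibited words / illegal words / profanity. A high-performance Java sensitive-word filtering framework based on the DFA algorithm. Built-in support for classifying and grading words by tag. Please do not publish content involving politics, advertising, marketing, firewall circumvention, or violations of national laws and regulations. A high-performance sensitive-word detection and filtering component, with Traditional/Simplified Chinese conversion, full-width/half-width conversion, Chinese-character-to-pinyin conversion, fuzzy search, and other features.)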
FastGPT is a knowledge-based platform built on LLMs that offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, le…
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
💬 Ready-to-use & flexible RAG Chatbot, supporting mainstream large language models (LLMs) such as DeepSeek-R1, Llama 3.3, Qwen2, OpenAI and more.
Multilingual Voice Understanding Model
Automatically remove the mosaics in images and videos, or add mosaics to them.
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes; NeurIPS 2024; Official code
Unofficial PyTorch Implementation for FaceShifter (https://arxiv.org/abs/1912.13457)
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
Avatars for Zoom, Skype and other video-conferencing apps.
Official Code for DragGAN (SIGGRAPH 2023)
Generate high-definition short videos with one click using large AI models.