Stars
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.
⚡️SwanLab - an open-source, modern-design AI training tracking and visualization tool. Supports Cloud / Self-hosted use. Integrated with PyTorch / Transformers / LLaMA Factory / Ultralytics / veRL …
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.
Fully open reproduction of DeepSeek-R1
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Train transformer language models with reinforcement learning.
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
SGLang is a fast serving framework for large language models and vision language models.
Use PEFT or Full-parameter to finetune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, I…
A high-throughput and memory-efficient inference and serving engine for LLMs
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
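Extract WeChat chat history and export it to HTML, Word, or Excel documents for permanent storage; analyze the chat history to generate an annual chat report; and use the chat data to train a personal AI chat assistant.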
Collection of AWESOME vision-language models for vision tasks
Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo
Interactive Video Generation via Masked-Diffusion
[CVPR 2023 Highlight] Freestyle Layout-to-Image Synthesis
[CVPR 2024 Highlight] PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis
[SIGGRAPH Asia 2024] TrailBlazer: Trajectory Control for Diffusion-Based Video Generation
LVDM: Latent Video Diffusion Models for High-Fidelity Long Video Generation
An innovative method designed to augment the capabilities of existing video diffusion models
[ICCV 2023] Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
[ECCV 2024 Oral] MotionDirector: Motion Customization of Text-to-Video Diffusion Models.
A paper list of some recent Transformer-based CV works.
VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models (CVPR 2024)
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
[ICCV 2023] Tracking Anything with Decoupled Video Segmentation