Stars
A high-throughput and memory-efficient inference and serving engine for LLMs
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
A modular graph-based Retrieval-Augmented Generation (RAG) system
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
4-bit quantization of LLaMA using GPTQ
A repo listing papers related to LLM-based agents
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
A Massively Parallel Large Scale Self-Play Framework
🌍 Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents", ACL'24 Best Resource Paper.
LlamaTouch: A Faithful and Scalable Testbed for Mobile UI Task Automation
An environment for mobile agents to interact with a realistic Android device or Android emulator