Stars
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Official implementation of "Interpreting CLIP's Image Representation via Text-Based Decomposition"
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.
🤖 AgentVerse 🪐 is designed to facilitate the deployment of multiple LLM-based agents in various applications. It primarily provides two frameworks: task-solving and simulation.
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
ConvoKit is a toolkit for extracting conversational features and analyzing social phenomena in conversations. It includes several large conversational datasets along with scripts exemplifying the u…
Source code for end-to-end dialogue model from the MultiWOZ paper (Budzianowski et al. 2018, EMNLP)
🐝 GPTSwarm: LLM agents as (Optimizable) Graphs
Platform to experiment with the AI Software Engineer. Terminal based. NOTE: Very different from https://gptengineer.app
A Python library to extract tabular data from PDFs
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
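PDF parsing (text, sections, tables, images, references); PDF question answering, summarization, and information extraction based on large models (ChatGLM2-6B, RWKV) + langchain + streamlit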
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Instant neural graphics primitives: lightning fast NeRF and more
[EMNLP'24] EHRAgent: Code Empowers Large Language Models for Complex Tabular Reasoning on Electronic Health Records
An incremental parsing system for programming tools
Evolutionary algorithm toolbox and framework with high performance for Python
The official implementation of the paper "Read to Play (R2-Play): Decision Transformer with Multimodal Game Instruction".
A Database of Real Faults and an Experimental Infrastructure to Enable Controlled Experiments in Software Engineering Research