Stars
🌐 WebWalker: Benchmarking LLMs in Web Traversal
Repo for LinkNER: Linking Local Named Entity Recognition Models to Large Language Models using Uncertainty
A framework for few-shot evaluation of language models.
Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Open source annotation tool for machine learning practitioners.
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
A high-throughput and memory-efficient inference and serving engine for LLMs
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
E-NER: Evidential Deep Learning for Trustworthy Named Entity Recognition
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
Winning system (DAMO-NLP) in 10 out of 13 tracks of the SemEval 2022 MultiCoNER shared task.
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
Data and code for FreshLLMs (https://arxiv.org/abs/2310.03214)
Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning
This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
LlamaIndex is the leading framework for building LLM-powered agents over your data.
RARR: Researching and Revising What Language Models Say, Using Language Models
Use PEFT or Full-parameter to finetune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, I…
Project page for the paper: Trusted Multi-View Classification (ICLR 2021)
DAMO-ConvAI: The official repository which contains the codebase for Alibaba DAMO Conversational AI.