Stars
Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?"
The code for our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"
ToMBench: Benchmarking Theory of Mind in Large Language Models, ACL 2024.
[ACL 2024 Findings] The official repo for "ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models".
[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
Qwen2.5 is the large language model series developed by the Qwen team at Alibaba Cloud.
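A minimal sketch of loading one of these checkpoints with Hugging Face transformers; the model id, dtype, and generation settings below are illustrative assumptions, not this repo's prescribed usage.

```python
# Sketch: run a Qwen2.5 instruct checkpoint via Hugging Face transformers.
# The model id and settings are assumptions; see the Qwen2.5 README for the
# recommended configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-7B-Instruct"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Summarize attention in one line."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```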
[ACL'24 Outstanding] Data and code for L-Eval, a comprehensive evaluation benchmark for long-context language models
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
中文语言理解测评基准 (Chinese Language Understanding Evaluation Benchmark): datasets, baselines, pre-trained models, corpus, and leaderboard
Use ChatGPT to summarize arXiv papers. Accelerates the whole research workflow with ChatGPT: full-paper summarization, professional translation, polishing, review drafting, and review responses.
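The repo's actual pipeline (chunking, translation, polishing, review generation) is more elaborate; as a hedged sketch of the core idea, here is a minimal summarization call with the OpenAI Python SDK, where the model name and prompts are assumptions.

```python
# Hedged sketch of the core idea: send paper text to a chat model and ask for
# a summary. Model name and prompts are assumptions, not this repo's code.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def summarize(paper_text: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model choice
        messages=[
            {"role": "system", "content": "You summarize arXiv papers."},
            {"role": "user", "content": f"Summarize this paper:\n\n{paper_text}"},
        ],
    )
    return resp.choices[0].message.content
```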
GPT4 & LangChain Chatbot for large PDF docs
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
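If memory serves, the advertised one-liner is a single optimize_model call that patches the model in place with fused Triton kernels; treat the import path below as an assumption and check the repo's README.

```python
# Hedged sketch of Kernl's advertised one-liner. The import path is my
# recollection of the README and may have changed; verify against the repo.
import torch
from transformers import AutoModel
from kernl.model_optimization import optimize_model  # assumed entry point

model = AutoModel.from_pretrained("bert-base-uncased").eval().cuda()
optimize_model(model)  # the "single line": swaps in fused Triton kernels

with torch.inference_mode(), torch.cuda.amp.autocast():
    input_ids = torch.ones(1, 128, dtype=torch.long, device="cuda")
    out = model(input_ids=input_ids)
```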
A large-scale Chinese parabank built via machine translation
Improving Non-autoregressive Generation with Mixup Training
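As a reminder of what mixup training means in general, here is the classic classification-style recipe (Zhang et al., 2018): convexly combine inputs and targets. This is only the underlying idea; the paper adapts it to non-autoregressive generation, which differs in detail.

```python
# Generic mixup sketch, not the paper's exact method: interpolate a batch
# with a shuffled copy of itself using a Beta-distributed mixing weight.
import torch

def mixup_batch(x, y_onehot, alpha=0.2):
    """Return mixed inputs and soft targets for one training batch."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[perm]
    return x_mix, y_mix
```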
Official implementation of the ICML 2022 paper "Directed Acyclic Transformer for Non-Autoregressive Machine Translation"
A concise but complete full-attention transformer with a set of promising experimental features from various papers
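A short usage sketch along the lines of the x-transformers README; the hyperparameter values are illustrative, and the keyword names follow my reading of that README.

```python
# Usage sketch per the x-transformers README: a decoder-only transformer
# wrapped with token embedding and logits head. Hyperparameters are
# illustrative.
import torch
from x_transformers import TransformerWrapper, Decoder

model = TransformerWrapper(
    num_tokens=20000,
    max_seq_len=1024,
    attn_layers=Decoder(dim=512, depth=6, heads=8),
)

tokens = torch.randint(0, 20000, (1, 1024))
logits = model(tokens)  # shape: (1, 1024, 20000)
```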
Tracking the progress in non-autoregressive generation (translation, transcription, etc.)
The entmax mapping and its loss, a family of sparse softmax alternatives.
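To make "sparse softmax alternative" concrete, here is a from-scratch sparsemax (Martins & Astudillo, 2016), the alpha=2 member of the entmax family; this is an illustration of the mapping, not the package's API.

```python
# Sparsemax from scratch, as an illustration of the entmax family's idea:
# project logits onto the probability simplex so low-scoring entries get
# exactly zero probability (softmax never does). entmax-1.5 interpolates
# between softmax and this mapping.
import torch

def sparsemax(z: torch.Tensor, dim: int = -1) -> torch.Tensor:
    z_sorted, _ = torch.sort(z, dim=dim, descending=True)
    k = torch.arange(1, z.size(dim) + 1, device=z.device, dtype=z.dtype)
    shape = [1] * z.dim()
    shape[dim] = -1
    k = k.view(shape)
    z_cumsum = z_sorted.cumsum(dim) - 1
    # support: indices where 1 + k * z_(k) > cumulative sum of top-k logits
    support = (k * z_sorted > z_cumsum).to(z.dtype)
    k_max = support.sum(dim=dim, keepdim=True)
    tau = z_cumsum.gather(dim, k_max.long() - 1) / k_max  # threshold
    return torch.clamp(z - tau, min=0)

print(sparsemax(torch.tensor([2.0, 1.0, 0.1])))  # exact zeros appear
```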
CUDA kernels for generalized matrix-multiplication in PyTorch
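As a concept sketch of what "generalized matrix-multiplication" means (pure PyTorch, not this repo's API): swap the (multiply, add) semiring of ordinary matmul for another one, such as (add, logsumexp) for stable log-probability computations or (add, max) for Viterbi.

```python
# Concept sketch in pure PyTorch, not this repo's API: a log-space matmul,
# result[i, j] = logsumexp_k(a[i, k] + b[k, j]). This version materializes
# an (m, k, n) intermediate; avoiding that blow-up is exactly why custom
# CUDA kernels exist for these semirings.
import torch

def logsumexp_matmul(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # a: (m, k), b: (k, n), both holding log-space values
    return torch.logsumexp(a.unsqueeze(-1) + b.unsqueeze(0), dim=1)
```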
Development repository for the Triton language and compiler
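The canonical Triton "hello world", adapted from the tutorial in the Triton docs: an element-wise vector add written as a Triton kernel launched over a 1D grid of blocks.

```python
# Element-wise vector add as a Triton kernel (inputs must live on the GPU).
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements          # guard the ragged last block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)       # one program instance per block
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```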
A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).