
Update README.md
zhimin-z authored Oct 2, 2024
1 parent b4fb0df commit 7c11557
Showing 1 changed file with 1 addition and 2 deletions.
README.md
@@ -135,8 +135,7 @@ If you're interested in the field of LLM, you may find the above list of milesto
- [BeHonest](https://gair-nlp.github.io/BeHonest/#leaderboard) - a benchmark designed to comprehensively assess honesty in LLMs.
- [Berkeley Function-Calling Leaderboard](https://gorilla.cs.berkeley.edu/leaderboard.html) - evaluates an LLM's ability to call external functions/tools.
- [Chinese Large Model Leaderboard](https://github.com/jeinlee1991/chinese-llm-benchmark) - an expert-driven benchmark for Chinese LLMs.
- [CompassBench Large Language Model Leaderboard](https://rank.opencompass.org.cn/leaderboard-llm) - OpenCompass is an LLM evaluation platform, supporting a wide range of models (InternLM2, GPT-4, LLaMa 2, Qwen, GLM, Claude, etc.) over 100+ datasets.
- [CompassRank](https://rank.opencompass.org.cn) - CompassRank is dedicated to exploring the most advanced language and visual models, offering a comprehensive, objective, and neutral evaluation reference for the industry and research.
- [CompMix](https://qa.mpi-inf.mpg.de/compmix) - a benchmark evaluating QA methods that operate over a mixture of heterogeneous input sources (KB, text, tables, infoboxes).
- [DreamBench++](https://dreambenchplus.github.io/#leaderboard) - a human-aligned benchmark for evaluating personalized image generation models.
- [FELM](https://hkust-nlp.github.io/felm) - a meta-benchmark that evaluates how well factuality evaluators assess the outputs of large language models (LLMs).
