Diego Darcos923


🤖 AI human at work!

Engineer and Data Scientist ✔️ Currently working as an AI/ML developer 🤖📊 Eager to keep growing!

Stars

🧪 Evaluator | LLM

5 repositories

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Python · 15,583 stars · 2,670 forks · Updated Dec 18, 2024

Evaluation tool for LLM QA chains

Python · 1,070 stars · 93 forks · Updated May 10, 2023

Build, evaluate, understand, and fix LLM-based apps

Jupyter Notebook · 484 stars · 33 forks · Updated Jan 16, 2024

Supercharge Your LLM Application Evaluations 🚀

Python · 8,337 stars · 855 forks · Updated Feb 24, 2025

ACL 2023: Evaluating Open-Domain Question Answering in the Era of Large Language Models

Python · 43 stars · 1 fork · Updated Jan 12, 2024