
Llama Impact Hackathon

LlamaEval: Quick Evaluation Dashboard

LlamaEval is a rapid prototype developed during a hackathon to provide a user-friendly dashboard for evaluating and comparing Llama models using the TogetherAI API.

Features

  • Model Selection: Choose from various Llama models available through TogetherAI.

  • Benchmark Tasks: Evaluate models on predefined tasks such as question answering and text summarization.

  • Performance Metrics: View accuracy, BLEU scores, and other relevant metrics for each model.

  • User-Friendly Interface: Simple web interface for inputting prompts and viewing results.

  • Quick Comparison: Easily compare the performance of different Llama models side-by-side.
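To make the metrics concrete, here is a minimal sketch of the kind of score the dashboard reports. This is illustrative only, not the repository's actual code: it computes a simplified, BLEU-1-style unigram precision in pure Python (real BLEU adds higher-order n-grams, clipping across multiple references, and a brevity penalty).

```python
from collections import Counter

def unigram_precision(candidate: str, reference: str) -> float:
    """Fraction of candidate tokens that also appear in the reference,
    with counts clipped so repeated tokens are not over-credited."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[token])
                  for token, count in cand_counts.items())
    return matched / total

score = unigram_precision("the cat sat on the mat",
                          "the cat is on the mat")  # 5 of 6 tokens match
```

A production dashboard would lean on an established implementation (e.g. a BLEU scorer from an NLP library) rather than a hand-rolled metric like this.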

Note

For the prototype, we kept the benchmark small. In later steps, we plan to iterate on real-world datasets.

Installation

1. Clone the repository:

bash
git clone https://github.com/Paraskevi-KIvroglou/Hackathon-LlamaEval.git

2. Install dependencies:

bash
pip install -r requirements.txt

3. Set up your TogetherAI API key:

bash
export TOGETHERAI_API_KEY=your_api_key_here
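Inside the app, the exported key can be read back from the environment. The snippet below is a sketch of that pattern, not the repository's actual code; note the variable name here follows this README (`TOGETHERAI_API_KEY`), so passing the key to the client explicitly avoids any mismatch with a library's own default variable name.

```python
import os

def load_api_key(env=os.environ) -> str:
    """Fetch the TogetherAI key exported in the previous step,
    failing fast with a clear message if it is missing."""
    key = env.get("TOGETHERAI_API_KEY")
    if not key:
        raise RuntimeError("Set TOGETHERAI_API_KEY before running the app")
    return key

# Hypothetical usage with the `together` package (not shown here):
# client = Together(api_key=load_api_key())
```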

Usage

Run the application:

bash
streamlit run app.py

1. Open your web browser and navigate to http://localhost:8501.

2. Select a Llama model, choose a benchmark task, and input your prompt.

3. View the results and performance metrics on the dashboard.
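The side-by-side comparison in the steps above can be sketched as a small helper like the one below. This is a hypothetical illustration, not the app's actual code: the model names are placeholders, and `generate` and `score` stand in for the TogetherAI call and the benchmark metric.

```python
def compare_models(models, prompt, generate, score):
    """Run one prompt through several models and score each output.

    `generate(model, prompt)` and `score(output)` are placeholders for
    the TogetherAI completion call and the benchmark metric.
    """
    results = {m: score(generate(m, prompt)) for m in models}
    # Sort best-first so the dashboard can render a ranked comparison.
    return sorted(results.items(), key=lambda kv: kv[1], reverse=True)

# Toy usage with stubbed generation and scoring:
fake_generate = lambda model, prompt: f"{model}: answer to {prompt}"
fake_score = lambda output: len(output)  # placeholder metric
ranking = compare_models(["llama-3-8b", "llama-3-70b"], "What is 2+2?",
                         fake_generate, fake_score)
```

Keeping generation and scoring as injectable functions makes it easy to swap in new models or metrics without touching the comparison logic.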

Future Development

  • Custom dataset uploads

  • Support for additional AI models

  • Advanced visualization of performance metrics

  • Integration with other AI model providers

Contributors

License

This project is licensed under the MIT License - see the LICENSE file for details.
