xashru/cti-bench

cti-bench

This repository contains the data and evaluation scripts for the paper CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence, accepted at NeurIPS 2024. CTIBench is a comprehensive suite of benchmark tasks and datasets designed to evaluate Large Language Models (LLMs) in the field of Cyber Threat Intelligence (CTI).

Dataset details can be found on Hugging Face: https://huggingface.co/datasets/AI4Sec/cti-bench
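For readers who want to pull the data programmatically, a minimal sketch using the Hugging Face `datasets` library is shown here. The helper name `load_cti_task` is hypothetical, and the configuration and split names are left as parameters because this README does not list them; check the dataset card for the values that actually exist.

```python
REPO_ID = "AI4Sec/cti-bench"  # dataset id taken from the URL above


def load_cti_task(config_name: str, split: str):
    """Load one task of the benchmark from the Hugging Face Hub.

    `config_name` and `split` are NOT documented in this README; they are
    assumptions the caller must confirm against the dataset card.
    """
    # Imported lazily so the module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, name=config_name, split=split)
```

A call would look like `load_cti_task("cti-mcq", "test")`, where both argument values are assumed rather than confirmed by this README. The call needs network access and `pip install datasets`.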

The evaluation directory contains scripts for evaluating model performance, along with the responses from five LLMs: ChatGPT3.5, ChatGPT4, Gemini-1.5, LLAMA3-70B, and LLAMA3-8B.
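The repository's own scripts are not reproduced here. As a rough illustration of what scoring a set of model responses can look like, the sketch below computes exact-match accuracy for a task with single-label answers. The metric choice, the normalization step, and the function names `normalize` and `accuracy` are assumptions made for this example, not a description of the scripts in the evaluation directory.

```python
from typing import Sequence


def normalize(answer: str) -> str:
    """Trim whitespace and uppercase, so 'b ' and 'B' compare equal."""
    return answer.strip().upper()


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predictions that match the gold label after normalization."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot score an empty set of answers")
    correct = sum(normalize(p) == normalize(g) for p, g in zip(predictions, gold))
    return correct / len(gold)
```

With hypothetical predictions `["A", "b ", "C", "D"]` and gold labels `["A", "B", "D", "D"]`, three of the four answers match, so `accuracy` returns `0.75`.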

The logs directory contains the unprocessed responses from ChatGPT3.5, ChatGPT4, and Gemini-1.5 for the tasks.
