Stars
attack
3 repositories
Improving Alignment and Robustness with Circuit Breakers
A high-throughput and memory-efficient inference and serving engine for LLMs
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track]