A library for efficient patching and automatic circuit discovery.
Transformer Circuit Evaluation Metrics Are Not Robust (Oral spotlight, COLM 2024)
pip install auto-circuit
https://github.com/UFO-101/auto-circuit/blob/5777057b445d55b7d4695eed8aface9d6bcefbe3/experiments/demos/patch_some_edges.py#L49
@inproceedings{
miller2024transformer,
title={Transformer Circuit Evaluation Metrics Are Not Robust},
author={Joseph Miller and Bilal Chughtai and William Saunders},
booktitle={First Conference on Language Modeling},
year={2024},
url={https://openreview.net/forum?id=zSf8PJyQb2}
}