230724 RLCD.md

https://arxiv.org/abs/2307.12950

RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment (Kevin Yang, Dan Klein, Asli Celikyilmaz, Nanyun Peng, Yuandong Tian)