https://arxiv.org/abs/2307.12950
RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment (Kevin Yang, Dan Klein, Asli Celikyilmaz, Nanyun Peng, Yuandong Tian)