Popular repositories
- rlhf-poisoning (Public, forked from ethz-spylab/rlhf-poisoning)
  Code for the paper "Universal Jailbreak Backdoors from Poisoned Human Feedback"
  Python
- HALOs (Public, forked from ContextualAI/HALOs)
  A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
  Python