Stars
LLM-RedTeam
2 repositories
Papers about red teaming LLMs and multimodal models.
Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"