patch
fix type error
allow for an external LLM to play as reward model, as in DAP
address #15
fix misnamed hyperparameter, and add validation function for parsed reward, project management
make sure nucleus sampling and its threshold are customizable