optimizer_cls_and_kwargs
PPOTrainer
RLOOTrainer
_save_checkpoint
eval_dataset
max_new_tokens
KTOTrainer
processing_class
tokenizer
LogCompletionsCallback
get_batch_sample
num_items_in_batch
compute_loss
remove_unused_columns
[SFT/DPO/Reward]ScriptArguments
ScriptArguments
PPOv2
PPO
ORPOTrainer
trl env
"none"
decoder_input_ids
DPOTrainer
skip_prompt=True
TextIteratorStreamer
CPOTrainer
dataset_text_field
"text"