If you're looking to fine-tune a language model like Llama-2 or Mistral on a text dataset using autoregressive techniques, consider using trl's SFTTrainer. The SFTTrainer wraps the Trainer and is specially optimized for this particular task, with support for sequence packing, LoRA, quantization, and DeepSpeed for efficient scaling to any model size (see the sketch at the end of this section). On the other hand, the Trainer is a more versatile option, suitable for a broader spectrum of tasks.

Before instantiating your Trainer, create a TrainingArguments to access all the points of customization during training; a full example follows the method list below. The API supports distributed training on multiple GPUs/TPUs and mixed precision through NVIDIA Apex and native AMP for PyTorch. The Trainer contains the basic training loop which supports the above features. To inject custom behavior, you can subclass Trainer and override the following methods (a sketch follows the list):

- get_train_dataloader - Creates the training DataLoader.
- get_eval_dataloader - Creates the evaluation DataLoader.
- get_test_dataloader - Creates the test DataLoader.
- log - Logs information on the various objects watching training.
- create_optimizer_and_scheduler - Sets up the optimizer and learning rate scheduler if they were not passed at init. Note that you can also subclass or override the create_optimizer and create_scheduler methods separately.
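As a minimal sketch of this subclassing pattern (the custom behavior shown here is illustrative, not something the library prescribes), assuming a recent transformers release:

```python
from torch.utils.data import DataLoader
from transformers import Trainer


class CustomTrainer(Trainer):
    def get_train_dataloader(self):
        # Build the training DataLoader by hand; self.train_dataset,
        # self.args, and self.data_collator are set up by the parent class.
        return DataLoader(
            self.train_dataset,
            batch_size=self.args.per_device_train_batch_size,
            shuffle=True,
            collate_fn=self.data_collator,
        )

    def log(self, logs, *args, **kwargs):
        # Inspect the metrics before delegating to the default logging.
        print(f"[custom] {logs}")
        super().log(logs, *args, **kwargs)
```

The overrides only change the pieces you care about; everything else (distributed setup, mixed precision, checkpointing) still comes from the base Trainer.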
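For the standard Trainer workflow, a minimal end-to-end sketch might look like the following; the checkpoint, dataset, and hyperparameters are illustrative choices, not requirements:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Illustrative setup: a small sentiment model on a slice of IMDB.
checkpoint = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

dataset = load_dataset("imdb", split="train[:1000]")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
    batched=True,
)

# TrainingArguments is where all the customization points live.
args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=8,
    num_train_epochs=1,
    fp16=True,  # mixed precision via native PyTorch AMP; requires a CUDA GPU
)

trainer = Trainer(model=model, args=args, train_dataset=dataset, tokenizer=tokenizer)
trainer.train()
```

Passing the tokenizer lets Trainer fall back to its default padding collator, so no custom data collator is needed for this simple case.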
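Finally, returning to the SFTTrainer mentioned at the start, a minimal sketch could look like this. Exact keyword arguments (for example, where the text field or packing options are configured) have shifted across trl versions, so treat this as illustrative rather than definitive:

```python
from datasets import load_dataset
from trl import SFTTrainer

# A plain text dataset; SFTTrainer handles tokenization internally.
dataset = load_dataset("imdb", split="train[:1000]")

trainer = SFTTrainer(
    model="mistralai/Mistral-7B-v0.1",  # any causal LM checkpoint works
    train_dataset=dataset,
)
trainer.train()
```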