Rapidformer Trainer¶
rapidformer.trainer.finetuner module¶
- class rapidformer.trainer.finetuner.Finetuner(engine)¶
Bases:
rapidformer.trainer.trainer.Trainer
A Trainer for model finetuning.
- train()¶
Run the finetuning process
- metrics_provider(model, valid_dataloader)¶
Abstract method to provide eval metrics.
For example:
    def metrics_provider(self, model, eval_dataloader):
        args = get_args()
        model = model[0]
        metric = load_metric(args.dataset_path, args.dataset_name)
        for step, batch in enumerate(eval_dataloader):
            with torch.no_grad():
                outputs = model(**batch)
            predictions = outputs.logits.argmax(dim=-1)
            metric.add_batch(
                predictions=self.gather(predictions),
                references=self.gather(batch["labels"]),
            )
        eval_metric = metric.compute()
        return eval_metric
Arguments: model, valid_dataloader
Returns: metric
- build_data_loader(dataset, micro_batch_size, num_workers, task_collate_fn=None)¶
Build a PyTorch data loader over the given dataset (a usage sketch follows this entry).
- Parameters
dataset -- A PyTorch dataset object.
micro_batch_size -- The per-GPU batch size.
num_workers -- Number of data loading workers.
task_collate_fn -- Optional collate function for the task.
Returns: PyTorch DataLoader
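A minimal usage sketch. The dataset, batch size, and collate function below are hypothetical placeholders; in practice they usually come from train_valid_test_datasets_provider:

    # Hypothetical illustration; train_dataset and collate_fn are assumed to
    # come from the user's train_valid_test_datasets_provider.
    train_dataloader = self.build_data_loader(
        train_dataset,
        micro_batch_size=16,
        num_workers=2,
        task_collate_fn=collate_fn,
    )
    for batch in train_dataloader:
        ...  # each batch holds micro_batch_size samples per GPU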
- gather(tensor)¶
Gather the values in tensor across all processes and concatenate them on the first dimension. Useful to regroup the predictions from all processes when doing evaluation. Note: this gather happens in all processes.
- Parameters
tensor (torch.Tensor, or a nested tuple/list/dictionary of torch.Tensor) -- The tensors to gather across all processes.
- Returns
The gathered tensor(s). Note that the first dimension of the result is num_processes multiplied by the first dimension of the input tensors.
- Return type
torch.Tensor, or a nested tuple/list/dictionary of torch.Tensor
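As a shape illustration (sizes are hypothetical, assuming 4 data-parallel processes), a fragment from a metrics_provider might look like:

    # Fragment from a metrics_provider; the sizes are hypothetical.
    # With 4 processes and a per-process batch of 8:
    predictions = outputs.logits.argmax(dim=-1)  # shape (8,) on each process
    gathered = self.gather(predictions)          # shape (32,) = num_processes * 8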
rapidformer.trainer.pretrainer module¶
- rapidformer.trainer.pretrainer.cyclic_iter(iter)¶
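No docstring is given for cyclic_iter; the sketch below shows the conventional behaviour of a helper with this name, stated as an assumption rather than taken from the rapidformer source: it yields from the iterable indefinitely, restarting once it is exhausted.

    # Assumed behaviour, not copied from the rapidformer source:
    def cyclic_iter(iter):
        while True:
            for x in iter:
                yield x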
- class rapidformer.trainer.pretrainer.PreTrainer(engine)¶
Bases:
rapidformer.trainer.trainer.Trainer
A Trainer for model pretraining.
- train()¶
Run the pretraining process
rapidformer.trainer.trainer module¶
- class rapidformer.trainer.trainer.Trainer(engine)¶
Bases:
object
An abstract base class for pretraining and finetuning
- engine¶
__init__ method.
- Parameters
engine -- The rapidformer engine used to initialize the trainer.
- run_forward_step(batch_or_iterator, model)¶
Abstract method, implemented by the user, that runs a single forward step.
For example:
    def run_forward_step(self, batch, model):
        output_tensor = model(**batch)
        return output_tensor.loss
- Parameters
batch_or_iterator -- In finetuner mode the input is a data batch; in pretrainer mode it is a data iterator.
model -- A HuggingFace, ETM, or Megatron model.
Returns: loss
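In pretrainer mode the first argument is a data iterator rather than a batch, so a typical implementation first draws the next micro-batch from it. A hedged sketch of that variant:

    # Assumed pretrainer-mode variant: draw the next micro-batch from the
    # iterator before running the forward pass.
    def run_forward_step(self, data_iterator, model):
        batch = next(data_iterator)
        output_tensor = model(**batch)
        return output_tensor.loss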
- model_optimizer_lr_scheduler_provider(pre_process=True, post_process=True)¶
Abstract method, implemented by the user, that builds the model, optimizer, and lr_scheduler.
For example:
    def model_optimizer_lr_scheduler_provider(self):
        args = get_args()
        model = BertForSequenceClassification.from_pretrained(args.pretrained_model_name_or_path)
        return model, None, None
Returns: model, optimizer, lr_scheduler
- train_valid_test_datasets_provider(train_val_test_num_samples=None)¶
Abstract method, implemented by the user, that builds the train, valid, and test datasets.
For example:
    def train_valid_test_datasets_provider(self):
        args = get_args()
        tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

        def tokenize_function(examples):
            outputs = tokenizer(examples["sentence1"], examples["sentence2"],
                                truncation=True, max_length=None)
            return outputs

        datasets = load_dataset(args.dataset_path, args.dataset_name)
        tokenized_datasets = datasets.map(tokenize_function, batched=True,
                                          remove_columns=["idx", "sentence1", "sentence2"])
        tokenized_datasets.rename_column_("label", "labels")
        train_dataset = tokenized_datasets["train"]
        valid_dataset = tokenized_datasets["validation"]
        test_dataset = tokenized_datasets["test"]

        def collate_fn(examples):
            return tokenizer.pad(examples, padding="longest", return_tensors="pt")

        return train_dataset, valid_dataset, test_dataset, collate_fn
Returns: train_dataset, valid_dataset, test_dataset, collate_fn (collate_fn is only returned in finetune mode)
- train_step(run_forward_step, data_batch_or_iterator, model, optimizer, lr_scheduler)¶
Single training step (a schematic sketch of the general pattern follows the parameter list).
- Parameters
run_forward_step -- The forward-step function provided by the user.
data_batch_or_iterator -- A data batch or data iterator.
model -- A HuggingFace, ETM, or Megatron model.
optimizer -- Optimizer.
lr_scheduler -- Learning rate scheduler.
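Conceptually, one training step runs the forward pass, backpropagates, and then steps the optimizer and learning rate scheduler. The sketch below is a generic illustration of that pattern, not the library's implementation, which dispatches to the forward_backward_* schedules documented below:

    # Generic illustration of a single training step; rapidformer's real
    # train_step delegates to the forward_backward_* schedules below.
    def simple_train_step(run_forward_step, batch, model, optimizer, lr_scheduler):
        optimizer.zero_grad()
        loss = run_forward_step(batch, model)   # user-provided forward
        loss.backward()                         # backward pass
        optimizer.step()                        # update parameters
        if lr_scheduler is not None:
            lr_scheduler.step()                 # advance the learning rate
        return loss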
- forward_step(run_forward_step, data_iterator, model, input_tensor, losses_reduced)¶
Forward step for passed-in model.
If first stage, input tensor is obtained from data_iterator, otherwise passed-in input_tensor is used.
Returns output tensor.
- backward_step(optimizer, input_tensor=None, output_tensor=None, output_tensor_grad=None, is_gradient_accumulation_boundary=True)¶
Backward step through passed-in output tensor.
If last stage, output_tensor_grad is None, otherwise gradient of loss with respect to stage's output tensor.
Returns gradient of loss with respect to input tensor (None if first stage).
- forward_backward_no_pipelining(run_forward_step, data_iterator, model, optimizer, timers, forward_only)¶
Run forward and backward passes with no pipeline parallelism (no inter-stage communication).
Returns dictionary with losses.
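As a rough mental model (a simplified assumption that ignores mixed precision, timers, and distributed details), the no-pipelining schedule loops over the micro-batches, running forward and backward on each so that gradients accumulate until the caller takes the optimizer step:

    # Simplified, assumed sketch of the no-pipelining schedule: each
    # micro-batch runs forward then backward; gradients accumulate
    # until the optimizer step taken by the caller.
    def no_pipelining_sketch(run_forward_step, data_iterator, model,
                             num_micro_batches, forward_only=False):
        losses = []
        for _ in range(num_micro_batches):
            loss = run_forward_step(data_iterator, model)
            losses.append(loss.detach())
            if not forward_only:
                loss.backward()
        return losses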
- forward_backward_pipelining_without_interleaving(run_forward_step, data_iterator, model, optimizer, timers, forward_only)¶
Run non-interleaved 1F1B schedule, with communication between pipeline stages.
Returns dictionary with losses if the last stage, empty dict otherwise.
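The non-interleaved 1F1B ("one forward, one backward") schedule first runs a warm-up phase of forward passes (more for earlier pipeline stages), then alternates one forward with one backward in the steady state, and finally drains the remaining backward passes. A schematic sketch under those assumptions, not rapidformer's implementation (the real code also exchanges activations and gradients between stages):

    # Schematic 1F1B schedule (assumed, simplified): forward_step and
    # backward_step stand in for the stage's forward/backward plus the
    # inter-stage communication.
    def one_f_one_b_sketch(forward_step, backward_step, num_microbatches,
                           stage, num_stages):
        num_warmup = min(num_stages - stage - 1, num_microbatches)
        num_steady = num_microbatches - num_warmup

        outputs = []
        for _ in range(num_warmup):        # warm-up: forward passes only
            outputs.append(forward_step())
        for _ in range(num_steady):        # steady state: one forward, one backward
            outputs.append(forward_step())
            backward_step(outputs.pop(0))
        for _ in range(num_warmup):        # cool-down: drain remaining backwards
            backward_step(outputs.pop(0))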
- forward_backward_pipelining_with_interleaving(run_forward_step, data_iterator, model, optimizer, timers, forward_only)¶
Run interleaved 1F1B schedule (model split into model chunks), with communication between pipeline stages as needed.
Returns dictionary with losses if the last stage, empty dict otherwise.
- evaluate(run_forward_step, data_iterator, model, prefix, verbose=False)¶
Evaluation.
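Putting the pieces together, the typical workflow is to subclass Finetuner or PreTrainer, implement the abstract methods above, and call train(). The sketch below is illustrative; the engine construction and the provider bodies are assumptions, not the exact rapidformer API.

    # Illustrative end-to-end sketch; the engine wiring is an assumption.
    class GlueFinetuner(Finetuner):
        def train_valid_test_datasets_provider(self, train_val_test_num_samples=None):
            ...  # tokenize a dataset and return train/valid/test datasets + collate_fn

        def model_optimizer_lr_scheduler_provider(self, pre_process=True, post_process=True):
            ...  # return model, optimizer, lr_scheduler (the latter two may be None)

        def run_forward_step(self, batch, model):
            return model(**batch).loss

        def metrics_provider(self, model, valid_dataloader):
            ...  # compute and return an evaluation metric

    finetuner = GlueFinetuner(engine)  # engine built via the rapidformer engine API (assumed)
    finetuner.train()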