Rapidformer Trainer

rapidformer.trainer.finetuner module

class rapidformer.trainer.finetuner.Finetuner(engine)

Bases: rapidformer.trainer.trainer.Trainer

A Trainer for model finetuning.

train()

Run the finetuning process.
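
A minimal usage sketch, assuming the engine class is RapidformerEngine and is importable from the top-level rapidformer package (the MyFinetuner subclass is illustrative; it must implement the abstract provider methods documented below):

    # Illustrative sketch -- names other than Finetuner are assumptions.
    from rapidformer import RapidformerEngine          # assumed import path
    from rapidformer.trainer.finetuner import Finetuner

    class MyFinetuner(Finetuner):
        # Implement train_valid_test_datasets_provider,
        # model_optimizer_lr_scheduler_provider, run_forward_step
        # and metrics_provider as documented below.
        ...

    engine = RapidformerEngine()        # engine setup may differ in practice
    finetuner = MyFinetuner(engine=engine)
    finetuner.train()                   # runs the finetuning loop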

metrics_provider(model, valid_dataloader)

Abstract method, implemented by the user, that computes evaluation metrics on the validation dataloader.

For example:

def metrics_provider(self, model, eval_dataloader):
    args = get_args()
    model = model[0]  # the model is passed in as a list; unwrap the first entry
    metric = load_metric(args.dataset_path, args.dataset_name)
    for step, batch in enumerate(eval_dataloader):
        with torch.no_grad():
            outputs = model(**batch)
        predictions = outputs.logits.argmax(dim=-1)

        # Gather predictions and labels from all processes
        # before updating the metric.
        metric.add_batch(
            predictions=self.gather(predictions),
            references=self.gather(batch["labels"]),
        )

    eval_metric = metric.compute()
    return eval_metric

Parameters
  • model -- The model to evaluate.

  • valid_dataloader -- Dataloader over the validation dataset.

Returns: metric

build_data_loader(dataset, micro_batch_size, num_workers, task_collate_fn=None)

Build a PyTorch DataLoader over the given dataset.

Parameters
  • dataset -- A PyTorch dataset object.

  • micro_batch_size -- The per-GPU batch size.

  • num_workers -- Number of data-loading worker processes.

  • task_collate_fn -- Optional collate function used to batch samples.

Returns: PyTorch DataLoader
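
A usage sketch, assuming it is called from inside a Finetuner subclass; the dataset, collate function, and argument values are placeholders:

    # Illustrative only: train_dataset and collate_fn would come from
    # train_valid_test_datasets_provider (see the Trainer class below).
    train_loader = self.build_data_loader(
        train_dataset,
        micro_batch_size=16,        # per-GPU batch size
        num_workers=2,              # data-loading worker processes
        task_collate_fn=collate_fn,
    )
    for batch in train_loader:
        ...                         # each batch is ready for run_forward_step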

gather(tensor)

Gather the values in tensor across all processes and concatenate them on the first dimension. Useful to regroup the predictions from all processes when doing evaluation.

Note: this gather happens in all processes.

Parameters

tensor (torch.Tensor, or a nested tuple/list/dictionary of torch.Tensor) -- The tensors to gather across all processes.

Returns

The gathered tensor(s). Note that the first dimension of the result is num_processes multiplied by the first dimension of the input tensors.

Return type

torch.Tensor, or a nested tuple/list/dictionary of torch.Tensor
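
A minimal sketch of the shape semantics (tensor names are taken from the metrics_provider example above):

    # Each process holds predictions for its own shard of the eval data.
    predictions = outputs.logits.argmax(dim=-1)   # shape: [local_batch]
    all_predictions = self.gather(predictions)    # shape: [num_processes * local_batch]
    all_labels = self.gather(batch["labels"])
    metric.add_batch(predictions=all_predictions, references=all_labels)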

rapidformer.trainer.pretrainer module

rapidformer.trainer.pretrainer.cyclic_iter(iter)

class rapidformer.trainer.pretrainer.PreTrainer(engine)

Bases: rapidformer.trainer.trainer.Trainer

A Trainer for model pretraining.

train()

Run the pretraining process.
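
A minimal usage sketch, analogous to the Finetuner example above (the subclass and engine setup are illustrative; in pretrainer mode run_forward_step receives a data iterator rather than a batch):

    from rapidformer.trainer.pretrainer import PreTrainer

    class MyPreTrainer(PreTrainer):
        # Implement train_valid_test_datasets_provider,
        # model_optimizer_lr_scheduler_provider and run_forward_step
        # (iterator variant) as documented in the Trainer class below.
        ...

    pretrainer = MyPreTrainer(engine=engine)   # engine: a Rapidformer engine instance
    pretrainer.train()                         # runs the pretraining loop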

rapidformer.trainer.trainer module

class rapidformer.trainer.trainer.Trainer(engine)

Bases: object

An abstract base class for pretraining and finetuning.

engine

__init__ method.

Parameters

engine -- The Rapidformer engine used to initialize the trainer.

run_forward_step(batch_or_iterator, model)

Abstract method, implemented by the user, that runs a single forward step.

For example:

def run_forward_step(self, batch, model):
    output_tensor = model(**batch)
    return output_tensor.loss

Parameters
  • batch_or_iterator -- In finetuner mode the input is a data batch; in pretrainer mode it is a data iterator.

  • model -- A HuggingFace, ETM, or Megatron model.

Returns: loss
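
In pretrainer mode the method receives an iterator instead of a batch, so the micro-batch must be drawn from it first; a hedged sketch (the loss computation shown is illustrative):

    def run_forward_step(self, data_iterator, model):
        # Pretrainer mode: pull the next micro-batch from the iterator.
        batch = next(data_iterator)
        output_tensor = model(**batch)
        return output_tensor.loss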

model_optimizer_lr_scheduler_provider(pre_process=True, post_process=True)

Abstract method, implemented by the user, that builds the model, optimizer, and learning-rate scheduler.

For example:

def model_optimizer_lr_scheduler_provider(self):
    args = get_args()
    model = BertForSequenceClassification.from_pretrained(args.pretrained_model_name_or_path)
    return model, None, None

Returns: model, optimizer, lr_scheduler
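
In the example above, returning None for the optimizer and lr_scheduler presumably lets the engine construct defaults. A sketch of returning explicit choices instead (the AdamW/LinearLR pair and the args.lr attribute are illustrative assumptions):

    import torch
    from transformers import BertForSequenceClassification

    def model_optimizer_lr_scheduler_provider(self):
        args = get_args()
        model = BertForSequenceClassification.from_pretrained(args.pretrained_model_name_or_path)
        # Illustrative choices; any torch optimizer/scheduler pair could be used.
        optimizer = torch.optim.AdamW(model.parameters(), lr=args.lr)
        lr_scheduler = torch.optim.lr_scheduler.LinearLR(optimizer)
        return model, optimizer, lr_scheduler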

train_valid_test_datasets_provider(train_val_test_num_samples=None)

Abstract method, implemented by the user, that builds the train, validation, and test datasets.

For example:

def train_valid_test_datasets_provider(self):
    args = get_args()
    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

    def tokenize_function(examples):
        outputs = tokenizer(examples["sentence1"], examples["sentence2"], truncation=True, max_length=None)
        return outputs

    datasets = load_dataset(args.dataset_path, args.dataset_name)

    tokenized_datasets = datasets.map(tokenize_function, batched=True, remove_columns=["idx", "sentence1", "sentence2"])
    # rename_column returns a new dataset (the in-place rename_column_ variant is deprecated)
    tokenized_datasets = tokenized_datasets.rename_column("label", "labels")
    train_dataset = tokenized_datasets["train"]
    valid_dataset = tokenized_datasets['validation']
    test_dataset = tokenized_datasets['test']

    def collate_fn(examples):
        return tokenizer.pad(examples, padding="longest", return_tensors="pt")

    return train_dataset, valid_dataset, test_dataset, collate_fn

Returns: train_dataset, valid_dataset, test_dataset, collate_fn (only finetuning returns a collate_fn)

train_step(run_forward_step, data_batch_or_iterator, model, optimizer, lr_scheduler)

Single training step.

Parameters
  • run_forward_step -- The forward-step function provided by the user.

  • data_batch_or_iterator -- A data batch (finetuner mode) or data iterator (pretrainer mode).

  • model -- A HuggingFace, ETM, or Megatron model.

  • optimizer -- The optimizer.

  • lr_scheduler -- The learning-rate scheduler.
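
train_step is normally driven by train(); a conceptual sketch of how a single step is invoked (not the internal implementation):

    # Conceptual only: train() performs this call internally for each step.
    self.train_step(
        run_forward_step=self.run_forward_step,   # user-provided forward step
        data_batch_or_iterator=batch,             # batch (finetune) or iterator (pretrain)
        model=model,
        optimizer=optimizer,
        lr_scheduler=lr_scheduler,
    )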

forward_step(run_forward_step, data_iterator, model, input_tensor, losses_reduced)

Forward step for passed-in model.

If this is the first pipeline stage, the input tensor is obtained from data_iterator; otherwise the passed-in input_tensor is used.

Returns the output tensor.

backward_step(optimizer, input_tensor=None, output_tensor=None, output_tensor_grad=None, is_gradient_accumulation_boundary=True)

Backward step through passed-in output tensor.

If this is the last pipeline stage, output_tensor_grad is None; otherwise it is the gradient of the loss with respect to the stage's output tensor.

Returns the gradient of the loss with respect to the input tensor (None if this is the first stage).

forward_backward_no_pipelining(run_forward_step, data_iterator, model, optimizer, timers, forward_only)

Run forward and backward passes with no pipeline parallelism (no inter-stage communication).

Returns dictionary with losses.
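
Conceptually, this schedule runs forward and backward over each micro-batch in turn and accumulates gradients; a simplified illustration under those assumptions (not the Rapidformer implementation; num_micro_batches is a placeholder):

    # Simplified sketch of a no-pipelining forward/backward schedule.
    losses_reduced = []
    for i in range(num_micro_batches):
        output_tensor = self.forward_step(run_forward_step, data_iterator, model,
                                          input_tensor=None, losses_reduced=losses_reduced)
        if not forward_only:
            self.backward_step(optimizer, input_tensor=None, output_tensor=output_tensor,
                               is_gradient_accumulation_boundary=(i == num_micro_batches - 1))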

forward_backward_pipelining_without_interleaving(run_forward_step, data_iterator, model, optimizer, timers, forward_only)

Run non-interleaved 1F1B schedule, with communication between pipeline stages.

Returns dictionary with losses if the last stage, empty dict otherwise.

forward_backward_pipelining_with_interleaving(run_forward_step, data_iterator, model, optimizer, timers, forward_only)

Run interleaved 1F1B schedule (model split into model chunks), with communication between pipeline stages as needed.

Returns dictionary with losses if the last stage, empty dict otherwise.

evaluate(run_forward_step, data_iterator, model, prefix, verbose=False)

Run the evaluation loop over data_iterator using run_forward_step.