easynlp.core

Trainer

class easynlp.core.trainer.Trainer(model, train_dataset, evaluator, **kwargs)[source]
model_module
learning_rate
set_model_and_optimizer(model, args)[source]
resume_from_ckpt(model_module, args)[source]
set_tensorboard()[source]
set_train_loader(train_dataset, args)[source]
log_train_infos()[source]
before_epoch(_epoch)[source]
after_epoch()[source]
before_iter()[source]
autocast_context_manager()[source]
optimizer_step()[source]
after_iter(_step, _epoch, loss_dict)[source]
after_train()[source]
save_checkpoint(save_best=False)[source]
train()[source]
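The hook methods listed above imply a fixed training-loop lifecycle: `train()` iterates over epochs and steps, invoking the `before_*`/`after_*` hooks around each unit of work. The following is a minimal, self-contained sketch of that call order. `MiniTrainer` and its internals are illustrative stand-ins for exposition, not EasyNLP's actual implementation:

```python
class MiniTrainer:
    """Illustrative stand-in that mirrors the Trainer hook order."""

    def __init__(self, num_epochs, steps_per_epoch):
        self.num_epochs = num_epochs
        self.steps_per_epoch = steps_per_epoch
        self.calls = []  # records the order in which hooks fire

    def before_epoch(self, epoch):
        self.calls.append(f"before_epoch({epoch})")

    def before_iter(self):
        self.calls.append("before_iter")

    def optimizer_step(self):
        # In the real Trainer this wraps the backward pass and optimizer update.
        self.calls.append("optimizer_step")

    def after_iter(self, step, epoch, loss_dict):
        self.calls.append(f"after_iter({step})")

    def after_epoch(self):
        self.calls.append("after_epoch")

    def after_train(self):
        # The real Trainer typically saves a final checkpoint here.
        self.calls.append("after_train")

    def train(self):
        step = 0
        for epoch in range(self.num_epochs):
            self.before_epoch(epoch)
            for _ in range(self.steps_per_epoch):
                self.before_iter()
                self.optimizer_step()
                self.after_iter(step, epoch, loss_dict={})
                step += 1
            self.after_epoch()
        self.after_train()
```

Subclasses customize behavior by overriding individual hooks (as VanillaTrainer does) without reimplementing the loop itself.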

VanillaTrainer

class easynlp.core.trainer_vanilla.VanillaTrainer(model, train_dataset, evaluator, **kwargs)[source]
model_module
learning_rate
set_model_and_optimizer(model, args)[source]
resume_from_ckpt(model_module, args)[source]
set_train_loader(train_dataset, args)[source]
log_train_infos()[source]
optimizer_step()[source]
after_train()[source]
save_checkpoint(save_best=False)[source]
train()[source]

Evaluator

class easynlp.core.evaluator.Evaluator(valid_dataset, **kwargs)[source]
evaluate(model)[source]
eval_metrics
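`evaluate(model)` runs the model over `valid_dataset` and populates `eval_metrics`, which the trainer can consult when deciding whether to save a best checkpoint. A minimal stand-in illustrating that contract. The accuracy metric, the dataset shape, and the `(name, value)` metric format are assumptions chosen for illustration, not EasyNLP's actual code:

```python
class MiniEvaluator:
    """Illustrative stand-in for the evaluate(model) -> eval_metrics contract."""

    def __init__(self, valid_dataset):
        self.valid_dataset = valid_dataset  # assumed: list of (features, label)
        self.eval_metrics = []

    def evaluate(self, model):
        # Score every validation example with the model (assumed callable here).
        correct = sum(1 for x, y in self.valid_dataset if model(x) == y)
        accuracy = correct / len(self.valid_dataset)
        # Metrics stored as (name, value) pairs -- an assumed format.
        self.eval_metrics = [("accuracy", accuracy)]
        return self.eval_metrics
```

Usage: `MiniEvaluator(dev_set).evaluate(model)` returns the metrics and leaves them on `eval_metrics` for later inspection.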

Optimizer

class easynlp.core.optimizers.BertAdam(params, lr=<required parameter>, warmup=-1, t_total=-1, schedule='warmup_linear', b1=0.9, b2=0.999, e=1e-06, weight_decay=0.01, max_grad_norm=1.0, **kwargs)[source]

Implements the BERT version of the Adam algorithm with the weight decay fix.

Params:
    lr: Learning rate.
    warmup: Portion of t_total spent on warmup; -1 means no warmup. Default: -1
    t_total: Total number of training steps for the learning rate schedule; -1 means a constant learning rate (no warmup regardless of the warmup setting). Default: -1
    schedule: Schedule to use for the warmup. Can be 'warmup_linear', 'warmup_constant', 'warmup_cosine', 'none', None, or a _LRSchedule object. If None or 'none', the learning rate is always kept constant. Default: 'warmup_linear'
    b1: Adam's b1 (first-moment decay rate). Default: 0.9
    b2: Adam's b2 (second-moment decay rate). Default: 0.999
    e: Adam's epsilon. Default: 1e-6
    weight_decay: Weight decay. Default: 0.01
    max_grad_norm: Maximum norm for the gradients; -1 means no clipping. Default: 1.0
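The warmup schedules scale the base lr as a function of training progress, i.e. step / t_total. A plain-Python sketch of the three named schedules follows. The exact formulas are assumptions modeled on the common BERT-Adam implementation, given here only to illustrate the parameter semantics:

```python
import math

def schedule_factor(schedule, progress, warmup):
    """Multiplier applied to the base lr at `progress` = step / t_total.

    Assumed formulas (modeled on the common BERT-Adam implementation);
    `warmup` is the fraction of t_total spent warming up.
    """
    if schedule in (None, "none"):
        return 1.0                      # constant learning rate
    if 0 <= progress < warmup:
        return progress / warmup        # linear ramp-up, shared by all warmup_* schedules
    if schedule == "warmup_constant":
        return 1.0                      # hold the base lr after warmup
    if schedule == "warmup_linear":
        # linear decay from 1 at the end of warmup down to 0 at progress = 1
        return max((progress - 1.0) / (warmup - 1.0), 0.0)
    if schedule == "warmup_cosine":
        return 0.5 * (1.0 + math.cos(math.pi * progress))  # cosine decay to 0
    raise ValueError(f"unknown schedule: {schedule}")
```

For example, with warmup=0.1 and the default 'warmup_linear', the lr ramps from 0 to the base lr over the first 10% of steps, then decays linearly to 0.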

get_lr()[source]
get_current_lr()[source]
step(closure=None)[source]

Performs a single optimization step.

Parameters: closure (callable, optional) -- A closure that reevaluates the model and returns the loss.
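This follows the standard PyTorch optimizer protocol: when a closure is supplied, the optimizer calls it to recompute the loss before applying the update, and returns that loss. A minimal stand-in showing only the closure handling; the plain-SGD update body is illustrative and is not BertAdam's actual update rule:

```python
class MiniOptimizer:
    """Illustrative optimizer showing the step(closure=None) protocol."""

    def __init__(self, params, lr=0.1):
        self.params = params  # assumed: dicts with 'value' and 'grad' entries
        self.lr = lr

    def step(self, closure=None):
        loss = None
        if closure is not None:
            # The closure re-runs the forward (and usually backward) pass
            # so the loss and gradients are fresh before the update.
            loss = closure()
        for p in self.params:
            # Plain SGD update for illustration; BertAdam additionally applies
            # moment estimates, the warmup schedule, and weight decay.
            p["value"] -= self.lr * p["grad"]
        return loss
```

Usage mirrors PyTorch: `opt.step()` for the common case, or `opt.step(closure)` for optimizers that need to reevaluate the loss.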