LR Schedulers#
All schedulers in thunder are subclasses of torch.optim.lr_scheduler.LRScheduler. However, during initialization they do not require an optimizer to be passed.
Usage#
We will use Switch as an example. First create the scheduler, then bind it to an optimizer:
```python
from torch.optim import Adam
from thunder.policy import Switch

scheduler = Switch(...)
optimizer = Adam(...)

scheduler(optimizer)  # binds the optimizer to the scheduler
# or
# scheduler = scheduler(optimizer)

# You can also retrieve the optimizer:
opt = scheduler.optimizer
```
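Once bound, the scheduler behaves like any torch.optim.lr_scheduler.LRScheduler and is stepped in the usual training loop. A minimal sketch (the model and the mapping values below are hypothetical):

```python
from torch import nn
from torch.optim import Adam
from thunder.policy import Switch

model = nn.Linear(4, 2)                        # hypothetical model
optimizer = Adam(model.parameters(), lr=1e-3)
scheduler = Switch({2: 1e-5})                  # hypothetical mapping: switch lr to 1e-5 at epoch 2
scheduler(optimizer)

for epoch in range(5):
    # forward/backward passes would go here
    optimizer.step()
    scheduler.step()                           # advance the schedule
    print(scheduler.optimizer.param_groups[0]["lr"])
```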
Initial LR#
All schedulers have an lr_init parameter; if specified, it is used as the lr value at step 0.
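For example, a minimal sketch (the mapping and values are hypothetical):

```python
from thunder.policy import Switch

# lr is 1e-3 from step 0 until epoch 10, then switches to 1e-5
scheduler = Switch({10: 1e-5}, lr_init=1e-3)
```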
Reference#
thunder.policy.Multiply#
Bases: MappingPolicy
Multiplies the learning rate by the factor specified in mapping.
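Example (a minimal sketch; the mapping values are hypothetical, chosen to illustrate the semantics described under Parameters):

```python
from thunder.policy import Multiply

# hypothetical: the lr is multiplied by the factor mapped to each epoch,
# and the last factor is kept between the mapped epochs
scheduler = Multiply({10: 0.1, 20: 0.5}, lr_init=1e-3)
```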
Parameters#
mapping: Union[List[Dict[int, float]], Dict[int, float]]
Maps epoch to factor, keeping the last value between the epochs.
lr_init: Union[List[float], float]
Initial learning rate for each group of parameters.
thunder.policy.Schedule#
Bases: MappingPolicy
Assigns learning rate values produced by a callable mapping.
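Example (a minimal sketch, assuming np.cos is passed as the mapping, consistent with the statement below):

```python
import numpy as np
from thunder.policy import Schedule

scheduler = Schedule(np.cos)
```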
With this mapping, lr will take the values of np.cos(epoch_number).
Parameters#
mapping: Union[List[Callable], Callable]
Maps epoch to value.
lr_init: Union[List[float], float]
Initial learning rate for each group of parameters.
thunder.policy.Switch#
Bases: MappingPolicy
Assigns learning rate values received from a dict mapping.
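Example (a minimal sketch; the mapping is a hypothetical reconstruction that yields the lr sequence below):

```python
from thunder.policy import Switch

# lr starts at 1e-4 and switches to 1e-10 at epoch 2
scheduler = Switch({0: 1e-4, 2: 1e-10})
```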
This yields lr: 1e-4, 1e-4, 1e-10, 1e-10, ...
Parameters#
mapping: Union[List[Dict[int, float]], Dict[int, float]]
Maps specified epochs to specified values, preserving the learning rate between epochs.
lr_init: Union[List[float], float]
Initial learning rate for each group of parameters.
Base classes#
thunder.policy#
Policy#
Bases: _LRScheduler
Policy base class.
get_lr() abstractmethod#
Computes the learning rate value for each parameter group at the current step.
load_state_dict(state_dict) abstractmethod#
Loads the state dict of the scheduler.
Parameters
state_dict: Dict[str, Any]
State dict of the scheduler.
prepare_state_dict(*keys)#
Creates the state dict of the scheduler, excluding the optimizer and the specified keys. Be aware that this method does not save the state dict; it is only useful for preparing it.
Parameters
keys: str
Names of attributes to be excluded from the state dict.
Returns#
Dict[str, Any]
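For example, a subclass might call it when overriding state_dict. A minimal sketch (the excluded attribute name "mapping" is hypothetical):

```python
from thunder.policy import Switch

scheduler = Switch({10: 1e-5})
# exclude the optimizer (always excluded) and the hypothetical 'mapping' attribute
state = scheduler.prepare_state_dict("mapping")
```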
MappingPolicy#
Bases: Policy
__init__(mapping, lr_init=0.001)#
Base class for policies with a mapping. The mapping can be a dict or a function (or a list of either type in the case of multiple param groups); it is the binding between an epoch or step number and a learning rate value. A multi-group sketch follows the parameter list below.
Parameters
mapping
Binding of epoch or step number and learning rate.
lr_init: Union[List[float], float]
Initial learning rate for each group of parameters.
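When an optimizer has multiple parameter groups, both mapping and lr_init can be passed as lists with one entry per group. A minimal sketch with hypothetical modules and values:

```python
from torch import nn
from torch.optim import Adam
from thunder.policy import Switch

encoder, decoder = nn.Linear(4, 4), nn.Linear(4, 2)  # hypothetical modules
optimizer = Adam([
    {"params": encoder.parameters()},
    {"params": decoder.parameters()},
])
# one mapping and one initial lr per parameter group
scheduler = Switch([{10: 1e-5}, {10: 1e-6}], lr_init=[1e-3, 1e-4])
scheduler(optimizer)
```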