merlin.models.tf.MultiOptimizer#
- class merlin.models.tf.MultiOptimizer(optimizers_and_blocks: Sequence[merlin.models.tf.blocks.optimizer.OptimizerBlocks], default_optimizer: Union[str, keras.optimizers.legacy.optimizer_v2.OptimizerV2] = 'rmsprop', name: str = 'MultiOptimizer', **kwargs)[source]#
Bases:
keras.optimizers.legacy.optimizer_v2.OptimizerV2
An optimizer that composes multiple individual optimizers.
It allows different optimizers to be applied to different subsets of the model’s variables. For example, one optimizer can be applied to the blocks that contain the model’s embeddings (sparse variables) and another optimizer to the rest of its variables (the other blocks).
To specify which optimizer should apply to each block, pass a list of pairs of (optimizer instance, blocks the optimizer should apply to).
import tensorflow as tf
import merlin.models.tf as ml

user_tower = ml.InputBlock(schema.select_by_tag(Tags.USER)).connect(ml.MLPBlock([512, 256]))
item_tower = ml.InputBlock(schema.select_by_tag(Tags.ITEM)).connect(ml.MLPBlock([512, 256]))
third_tower = ml.InputBlock(schema.select_by_tag(Tags.ITEM)).connect(ml.MLPBlock([64]))
three_tower = ml.ParallelBlock({"user": user_tower, "item": item_tower, "third": third_tower})
model = ml.Model(three_tower, ml.BinaryClassificationTask("click"))

# The third_tower would be assigned the default_optimizer ("adagrad" in this example)
optimizer = ml.MultiOptimizer(
    default_optimizer=tf.keras.optimizers.legacy.Adagrad(),
    optimizers_and_blocks=[
        ml.OptimizerBlocks(tf.keras.optimizers.legacy.SGD(), user_tower),
        ml.OptimizerBlocks(tf.keras.optimizers.legacy.Adam(), item_tower),
    ],
)

# The string identifier of an optimizer is also acceptable, here "sgd" for the third_tower.
# The variables of BinaryClassificationTask("click") would still use the default_optimizer.
optimizer = ml.MultiOptimizer(
    default_optimizer="adam",
    optimizers_and_blocks=[
        ml.OptimizerBlocks("sgd", [user_tower, third_tower]),
        ml.OptimizerBlocks("adam", item_tower),
    ],
)
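The routing rule behind MultiOptimizer can be sketched without TensorFlow or Merlin: variables owned by a registered block go to that block’s optimizer, and anything left over falls through to the default. The names below (`route_variables`, the toy tower variable lists) are illustrative only, not part of the Merlin API.

```python
# Toy sketch of MultiOptimizer's routing rule. Variables claimed by a
# registered block go to that block's optimizer; unclaimed variables
# fall back to the default optimizer. All names here are illustrative.

def route_variables(all_vars, optimizers_and_blocks, default_optimizer):
    """Map each variable name to the optimizer responsible for it."""
    routing = {}
    for optimizer, block_vars in optimizers_and_blocks:
        for var in block_vars:
            routing[var] = optimizer
    # Variables not claimed by any block fall back to the default.
    return {var: routing.get(var, default_optimizer) for var in all_vars}

user_tower = ["user/embedding", "user/dense"]
item_tower = ["item/embedding"]
task_vars = ["click/logit"]  # stands in for the task head's variables

assignment = route_variables(
    all_vars=user_tower + item_tower + task_vars,
    optimizers_and_blocks=[("sgd", user_tower), ("adam", item_tower)],
    default_optimizer="rmsprop",
)
print(assignment["user/dense"])   # -> sgd
print(assignment["click/logit"])  # -> rmsprop
```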
- __init__(optimizers_and_blocks: Sequence[merlin.models.tf.blocks.optimizer.OptimizerBlocks], default_optimizer: Union[str, keras.optimizers.legacy.optimizer_v2.OptimizerV2] = 'rmsprop', name: str = 'MultiOptimizer', **kwargs)[source]#
Initializes a MultiOptimizer instance.
- Parameters
optimizers_and_blocks (Sequence[OptimizerBlocks]) – List of OptimizerBlocks (a dataclass); each OptimizerBlocks holds two items: an optimizer, and a block (or list of blocks) that the optimizer should apply to. See ‘class OptimizerBlocks’.
default_optimizer (Union[str, tf.keras.optimizers.legacy.Optimizer]) – Default optimizer for the remaining variables not covered by optimizers_and_blocks, by default “rmsprop”.
name (str) – The name of the MultiOptimizer.
Methods
__init__(optimizers_and_blocks[, ...]): Initializes a MultiOptimizer instance.
add(optimizer_blocks): Adds another optimizer and specifies which block to apply it to.
add_slot(var, slot_name[, initializer, shape]): Add a new slot variable for var.
add_weight(name, shape[, dtype, ...])
apply_gradients(grads_and_vars[, name, ...])
from_config(config)
get_gradients(loss, params): Returns gradients of loss with respect to params.
get_slot(var, slot_name)
get_slot_names(): A list of names for this optimizer's slots.
get_updates(loss, params)
get_weights(): Returns the current weights of the optimizer.
minimize(loss, var_list[, grad_loss, name, tape]): Minimize loss by updating var_list.
set_weights(weights): Set the weights of the optimizer.
update(optimizer_blocks): Updates the optimizer of a block, replacing whatever optimizer the block previously used.
variables(): Returns the optimizer's variables.
Attributes
clipnorm: float or None.
clipvalue: float or None.
global_clipnorm: float or None.
iterations: See base class.
optimizers: Returns the optimizers in MultiOptimizer (in the original order); default_optimizer is included here.
variables: Returns the optimizer's variables.
weights: Returns the optimizer's variables.
- apply_gradients(grads_and_vars: Sequence[Tuple[Union[tensorflow.python.framework.ops.Tensor, tensorflow.python.framework.sparse_tensor.SparseTensor, tensorflow.python.ops.ragged.ragged_tensor.RaggedTensor], Union[tensorflow.python.framework.ops.Tensor, tensorflow.python.framework.sparse_tensor.SparseTensor, tensorflow.python.ops.ragged.ragged_tensor.RaggedTensor]]], name: Optional[str] = None, experimental_aggregate_gradients: bool = True) → None[source]#
- add(optimizer_blocks: merlin.models.tf.blocks.optimizer.OptimizerBlocks)[source]#
Adds another optimizer and specifies which block to apply this optimizer to.
- update(optimizer_blocks: merlin.models.tf.blocks.optimizer.OptimizerBlocks)[source]#
Updates the optimizer of a block, replacing whatever optimizer the block previously used. If no optimizer was specified for the block before, this function behaves the same as self.add().
Note: the optimizer_blocks are kept in self.update_optimizers_and_blocks, not in self.optimizers_and_blocks.
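The difference between add() and update() can be mirrored in plain Python: add() appends a new (optimizer, block) pair, while update() records an override that takes precedence over earlier pairs, echoing the note above about the separate update_optimizers_and_blocks collection. This is an illustrative sketch of the documented semantics, not the Merlin implementation; all names below are toy stand-ins.

```python
# Illustrative sketch of add() vs update() semantics: updates take
# precedence over earlier add()-registered pairs, and a block without
# any assignment falls back to the default optimizer.
class OptimizerRegistry:
    def __init__(self, default_optimizer):
        self.default_optimizer = default_optimizer
        self.optimizers_and_blocks = []         # pairs registered via add()
        self.update_optimizers_and_blocks = []  # overrides registered via update()

    def add(self, optimizer, block):
        self.optimizers_and_blocks.append((optimizer, block))

    def update(self, optimizer, block):
        self.update_optimizers_and_blocks.append((optimizer, block))

    def optimizer_for(self, block):
        # Most recent override wins; then add()-registered pairs; then the default.
        for optimizer, b in reversed(self.update_optimizers_and_blocks):
            if b == block:
                return optimizer
        for optimizer, b in self.optimizers_and_blocks:
            if b == block:
                return optimizer
        return self.default_optimizer

registry = OptimizerRegistry("rmsprop")
registry.add("sgd", "item_tower")
registry.update("adam", "item_tower")     # overrides the earlier "sgd"
registry.update("adagrad", "user_tower")  # unassigned block: behaves like add()
print(registry.optimizer_for("item_tower"))   # -> adam
print(registry.optimizer_for("third_tower"))  # -> rmsprop
```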
- property iterations#
See base class.
- property weights: List[tensorflow.python.ops.variables.Variable]#
Returns the optimizer’s variables.
- property optimizers: List[keras.optimizers.legacy.optimizer_v2.OptimizerV2]#
Returns the optimizers in MultiOptimizer (in the original order). Note: default_optimizer is included here.