merlin.models.tf.MultiOptimizer#
- class merlin.models.tf.MultiOptimizer(optimizers_and_blocks: Sequence[merlin.models.tf.blocks.optimizer.OptimizerBlocks], default_optimizer: Union[str, keras.optimizers.legacy.optimizer_v2.OptimizerV2] = 'rmsprop', name: str = 'MultiOptimizer', **kwargs)[source]#
Bases:
keras.optimizers.legacy.optimizer_v2.OptimizerV2
An optimizer that composes multiple individual optimizers.
It allows different optimizers to be applied to different subsets of the model’s variables. For example, one optimizer can be applied to the blocks that contain the model’s embeddings (sparse variables) and another optimizer to the rest of its variables (the other blocks).
To specify which optimizer should apply to each block, pass a list of pairs of (optimizer instance, blocks the optimizer should apply to).
import tensorflow as tf
import merlin.models.tf as ml

user_tower = ml.InputBlock(schema.select_by_tag(Tags.USER)).connect(ml.MLPBlock([512, 256]))
item_tower = ml.InputBlock(schema.select_by_tag(Tags.ITEM)).connect(ml.MLPBlock([512, 256]))
third_tower = ml.InputBlock(schema.select_by_tag(Tags.ITEM)).connect(ml.MLPBlock([64]))
three_tower = ml.ParallelBlock({"user": user_tower, "item": item_tower, "third": third_tower})
model = ml.Model(three_tower, ml.BinaryClassificationTask("click"))

# The third_tower would be assigned the default_optimizer ("adagrad" in this example)
optimizer = ml.MultiOptimizer(
    default_optimizer=tf.keras.optimizers.legacy.Adagrad(),
    optimizers_and_blocks=[
        ml.OptimizerBlocks(tf.keras.optimizers.legacy.SGD(), user_tower),
        ml.OptimizerBlocks(tf.keras.optimizers.legacy.Adam(), item_tower),
    ],
)

# The string identifier of an optimizer is also acceptable, here "sgd" for the third_tower.
# The variables of BinaryClassificationTask("click") would still use the default_optimizer.
optimizer = ml.MultiOptimizer(
    default_optimizer="adam",
    optimizers_and_blocks=[
        ml.OptimizerBlocks("sgd", [user_tower, third_tower]),
        ml.OptimizerBlocks("adam", item_tower),
    ],
)
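The routing rule behind MultiOptimizer can be sketched without TensorFlow or Merlin: variables owned by a registered block go to that block’s optimizer, and anything left over falls through to the default. The names below (`route_variables`, the toy tower variable lists) are illustrative only, not part of the Merlin API.

```python
# Toy sketch of MultiOptimizer's routing rule. Variables claimed by a
# registered block go to that block's optimizer; unclaimed variables
# fall back to the default optimizer. All names here are illustrative.

def route_variables(all_vars, optimizers_and_blocks, default_optimizer):
    """Map each variable name to the optimizer responsible for it."""
    routing = {}
    for optimizer, block_vars in optimizers_and_blocks:
        for var in block_vars:
            routing[var] = optimizer
    # Variables not claimed by any block fall back to the default.
    return {var: routing.get(var, default_optimizer) for var in all_vars}

user_tower = ["user/embedding", "user/dense"]
item_tower = ["item/embedding"]
task_vars = ["click/logit"]  # stands in for the task head's variables

assignment = route_variables(
    all_vars=user_tower + item_tower + task_vars,
    optimizers_and_blocks=[("sgd", user_tower), ("adam", item_tower)],
    default_optimizer="rmsprop",
)
print(assignment["user/dense"])   # -> sgd
print(assignment["click/logit"])  # -> rmsprop
```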
- __init__(optimizers_and_blocks: Sequence[merlin.models.tf.blocks.optimizer.OptimizerBlocks], default_optimizer: Union[str, keras.optimizers.legacy.optimizer_v2.OptimizerV2] = 'rmsprop', name: str = 'MultiOptimizer', **kwargs)[source]#
Initializes a MultiOptimizer instance.
- Parameters
optimizers_and_blocks (Sequence[OptimizerBlocks]) – List of OptimizerBlocks (a dataclass); each OptimizerBlocks holds two items: an optimizer, and a block (or list of blocks) that the optimizer should apply to. See ‘class OptimizerBlocks’.
default_optimizer (Union[str, tf.keras.optimizers.legacy.Optimizer]) – Default optimizer for the remaining variables not covered by optimizers_and_blocks, by default “rmsprop”.
name (str) – The name of the MultiOptimizer.
Methods
__init__(optimizers_and_blocks[, ...]): Initializes a MultiOptimizer instance.
add(optimizer_blocks): Adds another optimizer and specifies which block to apply it to.
add_slot(var, slot_name[, initializer, shape]): Add a new slot variable for var.
add_weight(name, shape[, dtype, ...])
apply_gradients(grads_and_vars[, name, ...])
from_config(config)
get_gradients(loss, params): Returns gradients of loss with respect to params.
get_slot(var, slot_name)
get_slot_names(): A list of names for this optimizer's slots.
get_updates(loss, params)
get_weights(): Returns the current weights of the optimizer.
minimize(loss, var_list[, grad_loss, name, tape]): Minimize loss by updating var_list.
set_weights(weights): Set the weights of the optimizer.
update(optimizer_blocks): Updates the optimizer of a block, replacing whatever optimizer the block previously used.
variables(): Returns the optimizer's variables.
Attributes
clipnorm: float or None.
clipvalue: float or None.
global_clipnorm: float or None.
iterations: See base class.
optimizers: Returns the optimizers in MultiOptimizer (in the original order); default_optimizer is included here.
variables: Returns the optimizer's variables.
weights: Returns the optimizer's variables.
- apply_gradients(grads_and_vars: Sequence[Tuple[Union[tensorflow.python.framework.ops.Tensor, tensorflow.python.framework.sparse_tensor.SparseTensor, tensorflow.python.ops.ragged.ragged_tensor.RaggedTensor], Union[tensorflow.python.framework.ops.Tensor, tensorflow.python.framework.sparse_tensor.SparseTensor, tensorflow.python.ops.ragged.ragged_tensor.RaggedTensor]]], name: Optional[str] = None, experimental_aggregate_gradients: bool = True) → None[source]#
- add(optimizer_blocks: merlin.models.tf.blocks.optimizer.OptimizerBlocks)[source]#
Adds another optimizer and specifies which block to apply this optimizer to.
- update(optimizer_blocks: merlin.models.tf.blocks.optimizer.OptimizerBlocks)[source]#
Updates the optimizer of a block, replacing whatever optimizer the block previously used. If no optimizer was specified for the block before, this function behaves the same as self.add().
Note: the optimizer_blocks are kept in self.update_optimizers_and_blocks, not in self.optimizers_and_blocks.
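The difference between add() and update() can be mirrored in plain Python: add() appends a new (optimizer, block) pair, while update() records an override that takes precedence over earlier pairs, echoing the note above about the separate update_optimizers_and_blocks collection. This is an illustrative sketch of the documented semantics, not the Merlin implementation; all names below are toy stand-ins.

```python
# Illustrative sketch of add() vs update() semantics: updates take
# precedence over earlier add()-registered pairs, and a block without
# any assignment falls back to the default optimizer.
class OptimizerRegistry:
    def __init__(self, default_optimizer):
        self.default_optimizer = default_optimizer
        self.optimizers_and_blocks = []         # pairs registered via add()
        self.update_optimizers_and_blocks = []  # overrides registered via update()

    def add(self, optimizer, block):
        self.optimizers_and_blocks.append((optimizer, block))

    def update(self, optimizer, block):
        self.update_optimizers_and_blocks.append((optimizer, block))

    def optimizer_for(self, block):
        # Most recent override wins; then add()-registered pairs; then the default.
        for optimizer, b in reversed(self.update_optimizers_and_blocks):
            if b == block:
                return optimizer
        for optimizer, b in self.optimizers_and_blocks:
            if b == block:
                return optimizer
        return self.default_optimizer

registry = OptimizerRegistry("rmsprop")
registry.add("sgd", "item_tower")
registry.update("adam", "item_tower")     # overrides the earlier "sgd"
registry.update("adagrad", "user_tower")  # unassigned block: behaves like add()
print(registry.optimizer_for("item_tower"))   # -> adam
print(registry.optimizer_for("third_tower"))  # -> rmsprop
```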
- property iterations#
See base class.
- property weights: List[tensorflow.python.ops.variables.Variable]#
Returns the optimizer’s variables.
- property optimizers: List[keras.optimizers.legacy.optimizer_v2.OptimizerV2]#
Returns the optimizers in MultiOptimizer (in the original order). Note: default_optimizer is included here.