merlin.models.tf.MultiOptimizer#

class merlin.models.tf.MultiOptimizer(optimizers_and_blocks: Sequence[merlin.models.tf.blocks.optimizer.OptimizerBlocks], default_optimizer: Union[str, keras.optimizers.legacy.optimizer_v2.OptimizerV2] = 'rmsprop', name: str = 'MultiOptimizer', **kwargs)[source]#

Bases: keras.optimizers.legacy.optimizer_v2.OptimizerV2

An optimizer that composes multiple individual optimizers.

It allows different optimizers to be applied to different subsets of the model’s variables. For example, one optimizer can be applied to the blocks that contain the model’s embeddings (sparse variables) and another optimizer to the rest of its variables (the other blocks).

To specify which optimizer should apply to each block, pass a list of pairs of (optimizer instance, blocks the optimizer should apply to).

For example:

```python
import tensorflow as tf
import merlin.models.tf as ml
from merlin.schema import Tags

# `schema` is assumed to be the schema of the training dataset
user_tower = ml.InputBlock(schema.select_by_tag(Tags.USER)).connect(ml.MLPBlock([512, 256]))
item_tower = ml.InputBlock(schema.select_by_tag(Tags.ITEM)).connect(ml.MLPBlock([512, 256]))
third_tower = ml.InputBlock(schema.select_by_tag(Tags.ITEM)).connect(ml.MLPBlock([64]))
three_tower = ml.ParallelBlock({"user": user_tower, "item": item_tower, "third": third_tower})
model = ml.Model(three_tower, ml.BinaryClassificationTask("click"))

# The third_tower is assigned the default_optimizer (Adagrad in this example)
optimizer = ml.MultiOptimizer(
    default_optimizer=tf.keras.optimizers.legacy.Adagrad(),
    optimizers_and_blocks=[
        ml.OptimizerBlocks(tf.keras.optimizers.legacy.SGD(), user_tower),
        ml.OptimizerBlocks(tf.keras.optimizers.legacy.Adam(), item_tower),
    ],
)

# A string identifier for the optimizer is also acceptable, here "sgd" for the third_tower.
# The variables of BinaryClassificationTask("click") would still use the default_optimizer.
optimizer = ml.MultiOptimizer(
    default_optimizer="adam",
    optimizers_and_blocks=[
        ml.OptimizerBlocks("sgd", [user_tower, third_tower]),
        ml.OptimizerBlocks("adam", item_tower),
    ],
)
```
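Once constructed, the composed optimizer can be passed to the model like any other Keras optimizer. A minimal sketch, assuming `train_ds` is a merlin.io.Dataset matching the schema above (the name `train_ds` is illustrative, not part of the example):

```python
# Sketch: fit the model with the composed optimizer; `train_ds` is an
# assumed merlin.io.Dataset, not defined in the example above.
model.compile(optimizer=optimizer)
model.fit(train_ds, batch_size=1024, epochs=1)
```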

__init__(optimizers_and_blocks: Sequence[merlin.models.tf.blocks.optimizer.OptimizerBlocks], default_optimizer: Union[str, keras.optimizers.legacy.optimizer_v2.OptimizerV2] = 'rmsprop', name: str = 'MultiOptimizer', **kwargs)[source]#

Initializes a MultiOptimizer instance.

Parameters
  • optimizers_and_blocks (Sequence[OptimizerBlocks]) – A list of OptimizerBlocks dataclass instances. Each OptimizerBlocks holds two items: an optimizer, and a block (or list of blocks) that the optimizer should apply to. See ‘class OptimizerBlocks’.

  • default_optimizer (Union[str, tf.keras.optimizers.legacy.Optimizer]) – Default optimizer for the remaining variables not covered by optimizers_and_blocks, by default “rmsprop”.

  • name (str) – The name of the MultiOptimizer.

Methods

__init__(optimizers_and_blocks[, ...])

Initializes a MultiOptimizer instance.

add(optimizer_blocks)

Add another optimizer and specify which block(s) it should apply to.

add_slot(var, slot_name[, initializer, shape])

Add a new slot variable for var.

add_weight(name, shape[, dtype, ...])

apply_gradients(grads_and_vars[, name, ...])

from_config(config)

get_config()

get_gradients(loss, params)

Returns gradients of loss with respect to params.

get_slot(var, slot_name)

get_slot_names()

A list of names for this optimizer's slots.

get_updates(loss, params)

get_weights()

Returns the current weights of the optimizer.

minimize(loss, var_list[, grad_loss, name, tape])

Minimize loss by updating var_list.

set_weights(weights)

Set the weights of the optimizer.

update(optimizer_blocks)

Update the optimizer of a block, replacing whatever optimizer the block previously used.

variables()

Returns the optimizer's variables.

Attributes

clipnorm

float or None.

clipvalue

float or None.

global_clipnorm

float or None.

iterations

See base class.

optimizers

Returns the optimizers in MultiOptimizer (in the original order); default_optimizer is included.

weights

Returns the optimizer's variables.

apply_gradients(grads_and_vars: Sequence[Tuple[Union[tensorflow.python.framework.ops.Tensor, tensorflow.python.framework.sparse_tensor.SparseTensor, tensorflow.python.ops.ragged.ragged_tensor.RaggedTensor], Union[tensorflow.python.framework.ops.Tensor, tensorflow.python.framework.sparse_tensor.SparseTensor, tensorflow.python.ops.ragged.ragged_tensor.RaggedTensor]]], name: Optional[str] = None, experimental_aggregate_gradients: bool = True) None[source]#
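apply_gradients splits the incoming (gradient, variable) pairs among the configured optimizers: each pair is handled by the optimizer registered for the block that owns the variable, and pairs for unregistered variables fall back to the default optimizer. A minimal sketch of calling it from a custom training step (`train_step`, `features`, `labels`, and `loss_fn` are illustrative names, not part of the API):

```python
import tensorflow as tf

def train_step(features, labels, model, optimizer, loss_fn):
    # `model` and `optimizer` are assumed to be built as in the class
    # example above; `loss_fn` is any Keras loss chosen for illustration.
    with tf.GradientTape() as tape:
        predictions = model(features, training=True)
        loss = loss_fn(labels, predictions)
    grads = tape.gradient(loss, model.trainable_variables)
    # Each (grad, var) pair is dispatched to the optimizer whose block
    # owns `var`; unmatched variables use the default optimizer.
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```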
add(optimizer_blocks: merlin.models.tf.blocks.optimizer.OptimizerBlocks)[source]#

Add another optimizer and specify which block(s) it should apply to.
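Continuing the class example, a sketch of registering an optimizer for a newly added block (`fourth_tower` is a hypothetical block introduced for illustration):

```python
# Hypothetical: build one more tower and register an optimizer for it
# after the MultiOptimizer has been constructed.
fourth_tower = ml.InputBlock(schema.select_by_tag(Tags.ITEM)).connect(ml.MLPBlock([32]))
optimizer.add(ml.OptimizerBlocks(tf.keras.optimizers.legacy.Adam(), fourth_tower))
```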

update(optimizer_blocks: merlin.models.tf.blocks.optimizer.OptimizerBlocks)[source]#

Update the optimizer of a block, replacing whatever optimizer the block previously used. If the block had no optimizer assigned before, this function behaves the same as self.add().

Note: the given optimizer_blocks are kept in self.update_optimizers_and_blocks, not in self.optimizers_and_blocks.
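Continuing the class example, a sketch of switching a block to a different optimizer (the Adam instance here is illustrative):

```python
# Hypothetical: user_tower was registered with SGD above; after this
# call its variables are optimized with Adam instead.
optimizer.update(ml.OptimizerBlocks(tf.keras.optimizers.legacy.Adam(), user_tower))
```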

get_config()[source]#
classmethod from_config(config)[source]#
property iterations#

See base class.

variables()[source]#

Returns the optimizer’s variables.

property weights: List[tensorflow.python.ops.variables.Variable]#

Returns the optimizer’s variables.

property optimizers: List[keras.optimizers.legacy.optimizer_v2.OptimizerV2]#

Returns the optimizers in MultiOptimizer (in the original order). Note: default_optimizer is included here.
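A short sketch, continuing the first class example (the count of 3 assumes the two registered optimizers plus the default one):

```python
# Two per-block optimizers (SGD, Adam) plus the default Adagrad.
assert len(optimizer.optimizers) == 3
```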