merlin.models.tf.YoutubeDNNRetrievalModelV2

merlin.models.tf.YoutubeDNNRetrievalModelV2(schema: merlin.schema.schema.Schema, candidate_id_tag=Tags.ITEM_ID, top_block: typing.Optional[keras.engine.base_layer.Layer] = MLPBlock([64]), post: typing.Optional[keras.engine.base_layer.Layer] = None, inputs: typing.Optional[keras.engine.base_layer.Layer] = None, outputs: typing.Optional[typing.Union[merlin.models.tf.outputs.base.ModelOutput, typing.List[merlin.models.tf.outputs.base.ModelOutput]]] = None, logits_temperature: float = 1.0, num_sampled: int = 100, min_sampled_id: int = 0, **kwargs) → merlin.models.tf.models.base.RetrievalModelV2

Build the YouTube-DNN retrieval model. More details of the architecture can be found in [1]. Training with sampled softmax is enabled by default [2][3][4].
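The sampled-softmax idea from [2][3][4] can be sketched in plain Python (an illustrative toy, not Merlin's actual implementation, which uses log-uniform candidate sampling with logQ correction): instead of normalizing the softmax over the full candidate catalog, the loss is computed over the true candidate plus a small set of sampled negatives.

```python
import math
import random


def sampled_softmax_loss(scores, true_id, num_sampled, min_sampled_id=0, seed=0):
    """Toy sampled softmax: normalize over the true candidate plus
    `num_sampled` uniformly sampled negatives instead of the full catalog.
    `min_sampled_id` excludes the first reserved ids from sampling."""
    rng = random.Random(seed)
    catalog = [i for i in range(min_sampled_id, len(scores)) if i != true_id]
    negatives = rng.sample(catalog, num_sampled)
    subset = [true_id] + negatives
    max_s = max(scores[i] for i in subset)  # subtract max for numerical stability
    denom = sum(math.exp(scores[i] - max_s) for i in subset)
    # negative log-likelihood of the true candidate within the sampled subset
    return -(scores[true_id] - max_s - math.log(denom))
```

With a 50-item catalog and `num_sampled=5`, each step normalizes over 6 scores instead of 50; boosting the true candidate's score lowers the loss, as expected.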

Example Usage::

    model = YoutubeDNNRetrievalModelV2(schema, num_sampled=100)
    model.compile(optimizer="adam")
    model.fit(train_data, epochs=10)

References

[1] Covington, Paul, Jay Adams, and Emre Sargin. "Deep neural networks for youtube recommendations." Proceedings of the 10th ACM Conference on Recommender Systems. 2016.

[2] Bengio, Yoshua, and Jean-Sébastien Sénécal. "Quick Training of Probabilistic Neural Nets by Importance Sampling." Proceedings of the Conference on Artificial Intelligence and Statistics (AISTATS). 2003.

[3] Bengio, Y., and J. S. Senecal. "Adaptive Importance Sampling to Accelerate Training of a Neural Probabilistic Language Model." IEEE Transactions on Neural Networks 19, no. 4 (April 2008): 713–722. https://doi.org/10.1109/TNN.2007.912312

[4] Jean, Sébastien, et al. "On using very large target vocabulary for neural machine translation." arXiv preprint arXiv:1412.2007 (2014).

Parameters
  • schema (Schema) – The Schema with the input features.

  • candidate_id_tag (Tag) – The tag used to select the candidate-id feature. By default Tags.ITEM_ID.

  • top_block (tf.keras.layers.Layer) – The hidden layers to apply on top of the features representation vector. By default, an MLPBlock with a single 64-unit layer.

  • inputs (tf.keras.layers.Layer, optional) – The input layer used to encode the input features (sparse and context features). If not specified, the input layer is inferred from the schema. By default None.

  • post (Optional[tf.keras.layers.Layer], optional) – The optional layer to apply on top of the query encoder. By default None.

  • logits_temperature (float, optional) – Temperature parameter used to reduce model overconfidence: logits are divided by this value (logits / T). By default 1.0.

  • num_sampled (int, optional) – The number of negative candidates to sample for each batch when sampled softmax is enabled. By default 100.

  • min_sampled_id (int, optional) – The minimum id value eligible for sampling with sampled softmax. Useful to exclude the first categorical encoded ids, which are usually reserved for nulls, out-of-vocabulary values, or padding. By default 0.
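The effect of logits_temperature can be illustrated with a plain-Python softmax (a hypothetical helper for illustration, not part of the Merlin API): dividing logits by a temperature T > 1 before normalizing flattens the output distribution, which reduces overconfident predictions.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with temperature scaling: logits are divided by T,
    mirroring what logits_temperature does to the model's scores."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]
p_sharp = softmax(logits, temperature=1.0)  # more peaked distribution
p_soft = softmax(logits, temperature=2.0)   # flatter, less confident
```

With T = 1.0 the top candidate dominates; with T = 2.0 the same logits yield a flatter distribution, so the model's top score is less extreme.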