merlin.models.tf.YoutubeDNNRetrievalModel
- merlin.models.tf.YoutubeDNNRetrievalModel(schema: merlin.schema.schema.Schema, aggregation: str = 'concat', top_block: merlin.models.tf.core.base.Block = MLPBlock( (layers): List( (0): _Dense( (dense): Dense(64, activation=relu, use_bias=True) ) ) ), l2_normalization: bool = True, extra_pre_call: typing.Optional[merlin.models.tf.core.base.Block] = None, task_block: typing.Optional[merlin.models.tf.core.base.Block] = None, logits_temperature: float = 1.0, sampled_softmax: bool = True, num_sampled: int = 100, min_sampled_id: int = 0, embedding_options: merlin.models.tf.inputs.embedding.EmbeddingOptions = EmbeddingOptions(embedding_dims=None, embedding_dim_default=64, infer_embedding_sizes=False, infer_embedding_sizes_multiplier=2.0, infer_embeddings_ensure_dim_multiple_of_8=False, embeddings_initializers=None, embeddings_l2_reg=0.0, combiner='mean')) → merlin.models.tf.models.base.Model
- Build the Youtube-DNN retrieval model. More details of the architecture can be found in [1]. Sampled softmax is enabled by default [2] [3] [4].
- Example Usage::

    from merlin.models.tf import YoutubeDNNRetrievalModel

    # schema and train_data are assumed to be defined by the caller
    model = YoutubeDNNRetrievalModel(schema, num_sampled=100)
    model.compile(optimizer="adam")
    model.fit(train_data, epochs=10)
 - References
- [1] Covington, Paul, Jay Adams, and Emre Sargin. “Deep neural networks for youtube recommendations.” Proceedings of the 10th ACM conference on recommender systems. 2016.
- [2] Yoshua Bengio and Jean-Sébastien Sénécal. 2003. Quick Training of Probabilistic Neural Nets by Importance Sampling. In Proceedings of the conference on Artificial Intelligence and Statistics (AISTATS).
- [3] Y. Bengio and J. S. Senecal. 2008. Adaptive Importance Sampling to Accelerate Training of a Neural Probabilistic Language Model. Trans. Neur. Netw. 19, 4 (April 2008), 713–722. https://doi.org/10.1109/TNN.2007.912312
- [4] Jean, Sébastien, et al. “On using very large target vocabulary for neural machine translation.” arXiv preprint arXiv:1412.2007 (2014).
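The sampled softmax idea referenced above can be illustrated with a small, library-independent sketch. It is not the implementation used by this model: the function names, the uniform negative sampler, and the plain-list `scores` input are all assumptions made for illustration, and the sketch omits the sampling-probability correction that practical implementations usually apply. It only shows how the loss is computed over the positive item plus a sampled subset of negatives rather than over the full catalog.

```python
import math
import random


def sampled_softmax_loss(scores, positive_id, num_sampled, min_sampled_id=0, rng=None):
    """Cross-entropy over the positive item plus uniformly sampled negatives.

    `scores` is a hypothetical list of logits, one per item id in the catalog.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed keeps the illustration reproducible
    # Candidate negatives: every id >= min_sampled_id except the positive one.
    pool = [i for i in range(min_sampled_id, len(scores)) if i != positive_id]
    negatives = rng.sample(pool, min(num_sampled, len(pool)))
    # The softmax is computed over the positive and the sampled negatives only.
    logits = [scores[i] for i in [positive_id] + negatives]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return log_sum_exp - logits[0]


def full_softmax_loss(scores, positive_id):
    """Cross-entropy over the whole catalog, for comparison."""
    max_logit = max(scores)
    log_sum_exp = max_logit + math.log(sum(math.exp(x - max_logit) for x in scores))
    return log_sum_exp - scores[positive_id]
```

In this uncorrected sketch the sampled loss never exceeds the full-catalog loss, because the normalizer sums over fewer items. The two coincide once every eligible negative is sampled.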
 - Parameters
- schema (Schema) – The Schema with the input features 
- aggregation (str) – The aggregation method to use for the sequence of features. Defaults to concat. 
- top_block (Block) – The Block that combines the top features 
- l2_normalization (bool) – Whether to apply L2 normalization before computing dot interactions. Defaults to True. 
- extra_pre_call (Optional[Block]) – The optional Block to apply before the model. 
- task_block (Optional[Block]) – The optional Block to apply on the model. 
- logits_temperature (float) – Temperature T used to reduce model overconfidence; the logits are divided by T (logits / T). Defaults to 1.0. 
- sampled_softmax (bool) – Whether to compute the logits over a sampled subset of candidates (sampled softmax) instead of over all items of the catalog. Defaults to True. 
- num_sampled (int) – When sampled_softmax is enabled, the number of negative candidates to generate for each batch. Defaults to 100. 
- min_sampled_id (int) – The minimum id value to be sampled with sampled softmax. Useful to ignore the first categorical encoded ids, which are usually reserved for <nulls>, out-of-vocabulary or padding. Defaults to 0. 
- embedding_options (EmbeddingOptions, optional) – An EmbeddingOptions instance that configures the embedding tables. Defaults to EmbeddingOptions().
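To make the effect of l2_normalization and logits_temperature more concrete, the following is a minimal pure-Python sketch, not the library's code. The helper names (`l2_normalize`, `item_probabilities`) are hypothetical, and the sketch assumes that both the query vector and the item vectors are normalized before the dot product. It shows the logits being divided by the temperature before the softmax.

```python
import math


def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit Euclidean length (eps guards against zero norm)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / max(norm, eps) for x in vector]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def item_probabilities(query, items, temperature=1.0, normalize=True):
    """Score items against a query, then apply temperature-scaled softmax."""
    if normalize:
        # Assumption: both sides are normalized, so each logit is a cosine similarity.
        query = l2_normalize(query)
        items = [l2_normalize(item) for item in items]
    logits = [dot(query, item) / temperature for item in items]
    return softmax(logits)


# Hypothetical 2-d embeddings: one query vector and three item vectors.
items = [[1.0, 0.0], [0.0, 1.0], [6.0, 8.0]]
soft = item_probabilities([3.0, 4.0], items, temperature=1.0)
sharp = item_probabilities([3.0, 4.0], items, temperature=0.1)
```

With normalized vectors the logits are bounded in [-1, 1], so the temperature controls how peaked the resulting distribution can be: values above 1 flatten it, and values below 1 sharpen it.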