API Documentation

TensorFlow Models

Ranking Model Constructors

DCNModel(schema, depth[, deep_block, …])

Create a model using the architecture proposed in DCN V2: Improved Deep & Cross Network [1].

DLRMModel(schema, embedding_dim[, …])

DLRM-model architecture.

Retrieval Model Constructors

MatrixFactorizationModel(schema, dim[, …])

Builds a matrix factorization model.

TwoTowerModel(schema, query_tower[, …])

Builds the Two-tower architecture, as proposed in [1].

YoutubeDNNRetrievalModel(schema, max_seq_length)

Builds the YouTube-DNN retrieval model.

Input Block Constructors

InputBlock(schema[, branches, post, …])

The entry block of the model to process input features from a schema.

ContinuousFeatures(*args, **kwargs)

Input block for continuous features.

ContinuousEmbedding(inputs, embedding_block)

EmbeddingFeatures(*args, **kwargs)

Input block for embedding-lookups for categorical features.

SequenceEmbeddingFeatures(*args, **kwargs)

Input block for embedding lookups for categorical features. This module produces 3-D tensors, which is useful for sequential models such as transformers.

Parameters:
    feature_config (Dict[str, FeatureConfig]): Specifies which TableConfig to use for each feature. For shared embeddings, the same TableConfig can be used for multiple features.
    item_id (str, optional): The name of the feature that is used as the item_id.
    padding_idx (int): The symbol to use for padding.
    pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional): Transformations to apply to the inputs when the module is called (that is, before call).
    post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional): Transformations to apply to the outputs after the module is called (that is, after call).
    aggregation: Aggregation to apply after the call method in order to output a single Tensor.

Model Building Block Constructors

DLRMBlock(schema, embedding_dim[, …])

Builds the DLRM architecture, as proposed in the paper at https://arxiv.org/pdf/1906.00091.pdf [1].

MLPBlock(dimensions[, activation, use_bias, …])

A block that applies a multi-layer perceptron to the input.

CrossBlock([depth, filter, low_rank_dim, …])

This block provides a way to create high-order feature interactions
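As a rough illustration of how stacked cross layers build high-order feature interactions, the sketch below assumes the DCN V2 formulation referenced for DCNModel, where each layer computes x0 * (W x_l + b) + x_l. The function names and the plain-list representation are illustrative assumptions, not the CrossBlock implementation.

```python
def cross_layer(x0, xl, weight, bias):
    """One cross layer: x0 * (W @ xl + b) + xl, with an element-wise product."""
    projected = [
        sum(w_ij * x_j for w_ij, x_j in zip(row, xl)) + b_i
        for row, b_i in zip(weight, bias)
    ]
    return [x0_i * p_i + xl_i for x0_i, p_i, xl_i in zip(x0, projected, xl)]


def cross_network(x0, layers):
    """Stack cross layers; `layers` is a list of (weight, bias) pairs."""
    xl = list(x0)
    for weight, bias in layers:
        # Every layer multiplies by the original input x0 again, which raises
        # the polynomial degree of the interactions by one.
        xl = cross_layer(x0, xl, weight, bias)
    return xl
```

Each additional layer increases the interaction order by one, so the depth controls how high the order can go.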

TwoTowerBlock(*args, **kwargs)

Builds the Two-tower architecture, as proposed in the paper at https://doi.org/10.1145/3298689.3346996 [Xinyang19].

MatrixFactorizationBlock(schema, dim[, …])

Returns a block for Matrix Factorization, which creates the user and item embeddings based on the schema and computes the dot product between the L2-normalized user and item embeddings.
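The scoring step described above, a dot product between L2-normalized embeddings, can be sketched in plain Python. The helper names and the list-based vectors are illustrative assumptions and are not part of the Merlin Models API.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return list(vector) if norm == 0.0 else [x / norm for x in vector]


def mf_score(user_embedding, item_embedding):
    """Dot product of the L2-normalized embeddings, i.e. cosine similarity."""
    u = l2_normalize(user_embedding)
    v = l2_normalize(item_embedding)
    return sum(a * b for a, b in zip(u, v))
```

Because both vectors are normalized first, the score is bounded between -1 and 1 regardless of the embedding magnitudes.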

DotProductInteraction(*args, **kwargs)

Modeling Prediction Task Constructors

PredictionTasks(schema[, task_blocks, …])

PredictionTask(*args, **kwargs)

Base-class for prediction tasks.

BinaryClassificationTask(*args, **kwargs)

Prediction task for binary classification.

MultiClassClassificationTask(*args, **kwargs)

Prediction task for multi-class classification.

RegressionTask(*args, **kwargs)

Prediction task for regression.

ItemRetrievalTask(*args, **kwargs)

Prediction task for item retrieval.

NextItemPredictionTask(schema[, …])

Function to create the NextItemPrediction task with the right parameters.

Parameters:
    schema (Schema): The schema object, including the features to use and their properties.
    weight_tying (bool): Whether the item_id embedding weights are shared with the prediction network layer. Defaults to True.
    masking (bool): Whether masking is used to transform inputs and targets. Defaults to True.
    extra_pre_call (Optional[PredictionBlock]): Optional extra pre-call block. Defaults to None.
    target_name (Optional[str]): If specified, the name of the target tensor to retrieve from the dataloader. Defaults to None.
    task_name (Optional[str]): Name of the task. Defaults to None.
    task_block (Block): The block that applies additional layers to the inputs. Defaults to None.
    logits_temperature (float): Temperature T used to reduce model overconfidence by computing logits / T. Defaults to 1.
    l2_normalization (bool): Apply L2 normalization before computing dot interactions. Defaults to False.
    sampled_softmax (bool): Whether to generate a subset of candidates instead of computing the logits over all items of the catalog. Defaults to False.
    num_sampled (int): When sampled_softmax is enabled, the number of negative candidates to generate for each batch. Defaults to 100.
    min_sampled_id (int): The minimum id value to be sampled. Useful for ignoring the first encoded categorical ids, which are usually reserved for <nulls>, out-of-vocabulary values, or padding. Defaults to 0.
    post_logits (Optional[PredictionBlock]): Optional extra block for post-processing the logits. Defaults to None. For example, post_logits = mm.PopularitySamplingBlock(item_fequency) applies popularity sampling correction.
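To make the effect of logits_temperature concrete, the following minimal sketch divides the logits by T before a softmax. It is a standalone illustration in plain Python; the function names are assumptions and do not correspond to code inside NextItemPredictionTask.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_with_temperature(logits, temperature=1.0):
    """Divide the logits by T before the softmax; T > 1 flattens the output."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    return softmax([z / temperature for z in logits])
```

With T = 1 the probabilities are unchanged, while larger values of T move them toward a uniform distribution, which is how the parameter reduces overconfidence.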

Model Pipeline Constructors

SequentialBlock(*args, **kwargs)

The SequentialLayer represents a sequence of Keras layers. It is a Keras Layer that can be used instead of tf.keras.layers.Sequential, which is actually a Keras Model. In contrast to Keras Sequential, this layer can be used as a pure Layer in tf.functions and when exporting SavedModels, without having to pre-declare input and output shapes. In turn, this layer is usable as a preprocessing layer for TF Agents Networks, and can be exported via PolicySaver.

ParallelBlock(*args, **kwargs)

Merge multiple layers or TabularModules into a single output of TabularData.

ParallelPredictionBlock(*args, **kwargs)

Multi-task prediction block.

DenseResidualBlock([low_rank_dim, …])

A block that applies a dense residual block to the input.

DualEncoderBlock(*args, **kwargs)

ResidualBlock(*args, **kwargs)

TabularBlock(*args, **kwargs)

Layer specialized for tabular data that integrates many commonly used operations.

Filter(*args, **kwargs)

Transformation that filters out certain features from TabularData.

Masking Block Constructors

CausalLanguageModeling(*args, **kwargs)

In Causal Language Modeling (clm) you predict the next item based on past positions of the sequence. Future positions are masked.

Parameters:
    padding_idx (int): Index of the padding item, used for masking and for getting batches of sequences with the same length. Defaults to 0.
    eval_on_last_item_seq_only (bool): When set to True, predict only the last non-padded item during evaluation. Defaults to True.
    item_id_feature_name (str): Name of the column containing the item ids. Defaults to item_id.
    train_on_last_item_seq_only (Optional[bool]): Predict only the last item during training. Defaults to True.
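The next-item setup described above can be sketched by shifting a padded sequence by one position. This is a simplified illustration in plain Python; the function name and the last_item_only flag are assumptions that loosely mirror the "last item only" options, not the CausalLanguageModeling implementation.

```python
def causal_lm_targets(sequence, padding_idx=0, last_item_only=False):
    """Return (inputs, targets) for next-item prediction on one padded sequence.

    inputs[i] is the item at position i and targets[i] is the item at i + 1.
    Padded target positions stay equal to padding_idx so a loss can skip them.
    """
    inputs = sequence[:-1]
    targets = sequence[1:]
    if last_item_only:
        # Keep only the last non-padded target and mask every other position.
        valid = [i for i, t in enumerate(targets) if t != padding_idx]
        last = valid[-1] if valid else None
        targets = [t if i == last else padding_idx for i, t in enumerate(targets)]
    return inputs, targets
```

For the sequence [5, 7, 9, 0, 0] the inputs are [5, 7, 9, 0] and the targets are [7, 9, 0, 0]; with last_item_only=True only the target 9 is kept.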

MaskedLanguageModeling(*args, **kwargs)

In Masked Language Modeling (mlm) you randomly select some positions of the sequence to be predicted, which are masked. During training, the Transformer layer is allowed to use positions on the right (future information). During inference, all past items are visible to the Transformer layer, which tries to predict the next item.

Parameters:
    mlm_probability (Optional[float]): Probability of an item being selected (masked) as a label of the given sequence. At least one item is always masked in each sequence, so that the network can learn from it. Defaults to 0.15.
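The selection rule described above, random masking with a guarantee of at least one masked item per sequence, can be sketched as follows. The function name, the rng argument, and the handling of padding are illustrative assumptions rather than the MaskedLanguageModeling implementation.

```python
import random


def mlm_mask(sequence, mlm_probability=0.15, padding_idx=0, rng=None):
    """Return a boolean mask marking which non-padded positions become labels."""
    rng = rng or random.Random()
    candidates = [i for i, item in enumerate(sequence) if item != padding_idx]
    mask = [False] * len(sequence)
    for i in candidates:
        if rng.random() < mlm_probability:
            mask[i] = True
    # Enforce at least one masked item per sequence, as the description requires.
    if candidates and not any(mask):
        mask[rng.choice(candidates)] = True
    return mask
```

Passing a seeded random.Random instance as rng makes the mask reproducible, which is convenient in tests.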

Transformation Block Constructors

ExpandDims(*args, **kwargs)

Expand dims of selected input tensors.

AsDenseFeatures(*args, **kwargs)

Convert sparse inputs to dense tensors.

AsSparseFeatures(*args, **kwargs)

Convert inputs to sparse tensors.

StochasticSwapNoise(*args, **kwargs)

Applies stochastic replacement of sequence features.

AsTabular(*args, **kwargs)

Converts a Tensor to TabularData by converting it to a dictionary.

Multi-Task Block Constructors

MMOEBlock(outputs, expert_block, num_experts)

CGCBlock(*args, **kwargs)

Data Loader Customization Constructor

merlin.models.tf.dataset.BatchedDataset(…)

Override class to customize data loading for backward compatibility with older NVTabular releases.

Metrics

NDCGAt(*args, **kwargs)

AvgPrecisionAt(*args, **kwargs)

RecallAt(*args, **kwargs)

ranking_metrics(top_ks, **kwargs)

Sampling

ItemSampler(*args, **kwargs)

InBatchSampler(*args, **kwargs)

Provides in-batch sampling [1] for two-tower item retrieval models.
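The general idea behind in-batch sampling is that the positive items of the other examples in a batch serve as negatives for each query. The sketch below illustrates that idea with plain lists; the function names are assumptions, and it does not reproduce the InBatchSampler class itself.

```python
def in_batch_scores(query_embeddings, item_embeddings):
    """Score every query against every item in the batch with a dot product.

    Row i holds the scores of query i. Column i is its positive item and the
    remaining columns act as negatives.
    """
    return [
        [sum(q * v for q, v in zip(query, item)) for item in item_embeddings]
        for query in query_embeddings
    ]


def split_positive_negative(scores):
    """Separate the diagonal (positive) score from the off-diagonal negatives."""
    positives = [row[i] for i, row in enumerate(scores)]
    negatives = [
        [score for j, score in enumerate(row) if j != i]
        for i, row in enumerate(scores)
    ]
    return positives, negatives
```

A batch of size B therefore yields B - 1 negatives per query without any extra item lookups.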

CachedCrossBatchSampler(*args, **kwargs)

Provides efficient cached cross-batch [1] / inter-batch [2] negative sampling for two-tower item retrieval models.

CachedUniformSampler(*args, **kwargs)

Provides cached uniform negative sampling for two-tower item retrieval models.

PopularityBasedSampler(*args, **kwargs)

Provides popularity-based negative sampling for the softmax layer to ensure training efficiency when the catalog of items is very large.

Losses

CategoricalCrossEntropy([from_logits])

Extends tf.keras.losses.CategoricalCrossentropy by making from_logits=True the default. In this case an optimized softmax activation is applied within the loss, so you should not add a softmax activation to the output layer.

SparseCategoricalCrossEntropy([from_logits])

Extends tf.keras.losses.SparseCategoricalCrossentropy by making from_logits=True the default. In this case an optimized softmax activation is applied within the loss, so you should not add a softmax activation to the output layer.
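To show why the output layer should not apply its own softmax when from_logits=True, the sketch below computes sparse categorical cross-entropy directly from raw logits with the log-sum-exp trick. It is a plain-Python illustration under that assumption; the function name is hypothetical and this is not the TensorFlow implementation.

```python
import math


def sparse_categorical_crossentropy_from_logits(logits, label):
    """Cross-entropy of an integer label given raw (unnormalized) logits.

    The softmax is folded into the loss through log-sum-exp, so the caller
    passes logits rather than probabilities.
    """
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]
```

Feeding already-normalized probabilities into such a loss would apply the softmax twice and distort the gradients.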

BPRLoss([reduction, name])

The Bayesian Personalised Ranking (BPR) pairwise loss [1].

BPRmaxLoss(reg_lambda, **kwargs)

The BPR-max pairwise loss proposed in [1].

HingeLoss([reduction, name])

Pairwise hinge loss, as described in [1]: max(0, 1 + r_uj - r_ui), where r_ui is the score of the positive item and r_uj the score of negative items.
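The hinge formula above can be evaluated directly. The sketch below does so for one positive score and a list of negative scores; averaging over the negatives and the margin argument are assumptions made for illustration, not the reduction that HingeLoss necessarily uses.

```python
def pairwise_hinge_loss(positive_score, negative_scores, margin=1.0):
    """Mean of max(0, margin + r_uj - r_ui) over the negative scores r_uj."""
    losses = [max(0.0, margin + r_uj - positive_score) for r_uj in negative_scores]
    return sum(losses) / len(losses)
```

A negative item contributes zero loss once the positive score exceeds it by at least the margin.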

LogisticLoss([reduction, name])

Pairwise log loss, as described in [1]: log(1 + exp(r_uj - r_ui)), where r_ui is the score of the positive item and r_uj the score of negative items.
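The log loss formula above can overflow if exp is evaluated naively for large score differences. The following sketch uses a stable softplus; the function name and the mean over negatives are illustrative assumptions rather than the LogisticLoss implementation.

```python
import math


def pairwise_logistic_loss(positive_score, negative_scores):
    """Mean of log(1 + exp(r_uj - r_ui)) over the negative scores r_uj."""

    def softplus(x):
        # Stable log(1 + exp(x)): avoids overflow for large positive x.
        return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

    losses = [softplus(r_uj - positive_score) for r_uj in negative_scores]
    return sum(losses) / len(losses)
```

When the positive and negative scores are equal, the loss is log(2), and it approaches zero as the positive score pulls ahead.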

TOP1Loss([reduction, name])

The TOP1 pairwise loss proposed in [1].

TOP1maxLoss([reduction, name])

The TOP1-max pairwise loss proposed in [1].

TOP1v2Loss([reduction, name])

An adapted version of the TOP1 pairwise loss proposed in [1], but following the current GRU4Rec implementation [2].

Schema Functions

merlin.models.utils.schema_utils.select_targets(schema)

merlin.models.utils.schema_utils.schema_to_tensorflow_metadata_json(schema)

merlin.models.utils.schema_utils.tensorflow_metadata_json_to_schema(value)

merlin.models.utils.schema_utils.create_categorical_column(…)

merlin.models.utils.schema_utils.create_continuous_column(name)

merlin.models.utils.schema_utils.filter_dict_by_schema(…)

Filters out entries from input_dict and returns a dictionary where every entry corresponds to a column in the schema.
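The filtering behaviour described above amounts to keeping only the keys that match schema columns. The sketch below takes a plain list of column names in place of a schema object, which is an assumption made to keep the example self-contained; the function name is likewise hypothetical.

```python
def filter_dict_by_columns(input_dict, column_names):
    """Keep only the entries whose key is one of the given column names."""
    allowed = set(column_names)
    return {key: value for key, value in input_dict.items() if key in allowed}
```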

merlin.models.utils.schema_utils.categorical_cardinalities(schema)

merlin.models.utils.schema_utils.categorical_domains(schema)

merlin.models.utils.schema_utils.get_embedding_sizes_from_schema(schema)

Provides a heuristic (from Google) that suggests the embedding sizes as a function (fourth root) of the cardinalities of the categorical features, obtained from the schema.

merlin.models.utils.schema_utils.get_embedding_size_from_cardinality(…)

Provides a heuristic (from Google) that suggests the embedding dimension as a function (fourth root) of the feature cardinality.
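A fourth-root heuristic of this kind can be sketched in a few lines. The multiplier argument, its default value, and rounding up with ceil are assumptions made for illustration and may differ from what get_embedding_size_from_cardinality does.

```python
import math


def embedding_size_from_cardinality(cardinality, multiplier=2.0):
    """Suggest an embedding dimension from the fourth root of the cardinality."""
    return int(math.ceil(math.pow(cardinality, 0.25) * multiplier))
```

The fourth root grows slowly, so even features with millions of distinct values get embedding dimensions in the tens rather than the thousands.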

Utilities

Miscellaneous Utility Functions

merlin.models.utils.misc_utils.filter_kwargs(…)

merlin.models.utils.misc_utils.safe_json(data)

merlin.models.utils.misc_utils.get_filenames(…)

merlin.models.utils.misc_utils.get_label_feature_name(…)

Analyses the feature map config and returns the name of the label feature.

merlin.models.utils.misc_utils.get_timestamp_feature_name(…)

Analyses the feature map config and returns the name of the timestamp feature.

merlin.models.utils.misc_utils.get_parquet_files_names(…)

merlin.models.utils.misc_utils.Timing(message)

A context manager that prints the execution time of the block it manages
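A context manager with this behaviour can be written with the standard library alone. The sketch below is an assumption about the general shape of such a helper; its name and output format are hypothetical and are not taken from misc_utils.Timing.

```python
import time
from contextlib import contextmanager


@contextmanager
def timing(message):
    """Print how long the managed block took, prefixed with `message`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # The finally clause reports the time even if the block raises.
        elapsed = time.perf_counter() - start
        print(f"{message}: {elapsed:.4f} s")
```

It would be used as `with timing("loading data"): ...`, wrapping whichever statements should be measured.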

merlin.models.utils.misc_utils.get_object_size(obj)

Recursively finds size of objects
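One common way to measure object size recursively is to walk containers and sum sys.getsizeof, tracking visited ids so shared or cyclic references are counted once. The sketch below follows that approach; it is an assumption about the technique, with a hypothetical name, and not the get_object_size source.

```python
import sys


def object_size(obj, seen=None):
    """Recursively sum sys.getsizeof over an object and the objects it contains."""
    seen = set() if seen is None else seen
    if id(obj) in seen:
        return 0  # Already counted; also guards against reference cycles.
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(object_size(k, seen) + object_size(v, seen) for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(object_size(item, seen) for item in obj)
    return size
```

The result is in bytes and covers only built-in containers; custom classes would need their attributes walked as well.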

merlin.models.utils.misc_utils.validate_dataset(…)

Util function to load NVTabular Dataset from disk

Registry Functions

merlin.models.utils.registry.camelcase_to_snakecase(name)

merlin.models.utils.registry.snakecase_to_camelcase(name)
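The two name-conversion helpers above are listed without descriptions. As a hedged illustration of what such conversions typically do, the sketch below uses regular expressions; the function names are hypothetical and the handling of edge cases (acronyms, digits) may differ from the registry module.

```python
import re


def to_snake_case(name):
    """Convert 'CamelCase' to 'camel_case'."""
    partial = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", name)
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", partial).lower()


def to_camel_case(name):
    """Convert 'snake_case' to 'SnakeCase'."""
    return "".join(part.capitalize() for part in name.split("_"))
```

Note that the round trip is not always lossless: an acronym such as MLP becomes mlp and then Mlp.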

merlin.models.utils.registry.default_name(…)

Default name for a class or function.

merlin.models.utils.registry.default_object_name(obj)

merlin.models.utils.registry.Registry(…[, …])

Dict-like class for managing function registrations.
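A dict-like registry usually pairs a name-to-object mapping with a decorator for registration. The minimal sketch below shows that pattern; the class name, method names, and duplicate-name check are assumptions and do not describe the actual Registry interface.

```python
class SimpleRegistry:
    """Map names to registered callables, with dict-style lookup."""

    def __init__(self, registry_name):
        self.name = registry_name
        self._entries = {}

    def register(self, name):
        """Return a decorator that stores the decorated object under `name`."""

        def decorator(obj):
            if name in self._entries:
                raise KeyError(f"{name!r} is already registered in {self.name}")
            self._entries[name] = obj
            return obj

        return decorator

    def __getitem__(self, name):
        return self._entries[name]

    def __contains__(self, name):
        return name in self._entries

    def __len__(self):
        return len(self._entries)
```

Registering by name lets configuration files or string arguments select an implementation without importing it directly.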

merlin.models.utils.registry.RegistryMixin(…)

merlin.models.utils.registry.display_list_by_prefix(…)

Creates a help string for names_list grouped by prefix.