API Documentation#

TensorFlow Models#

Ranking Model Constructors#

DCNModel(schema, depth[, deep_block, ...])

Create a model using the architecture proposed in DCN V2: Improved Deep & Cross Network [1].

DeepFMModel(schema[, embedding_dim, ...])

DeepFM model architecture, which sums the 1-dim output of a Factorization Machine [2] and of a Deep Neural Network.

DLRMModel(schema, *[, embeddings, ...])

DLRM-model architecture.

WideAndDeepModel(schema, deep_block[, ...])

The Wide&Deep architecture [1] was proposed by Google in 2016 to balance the ability of neural networks to generalize with the capacity of linear models to memorize relevant feature interactions.
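
A ranking model is typically assembled directly from a dataset schema. A minimal sketch, assuming train is a merlin.io.Dataset with a binary "click" target; keyword names such as embedding_dim, bottom_block and top_block follow typical usage and may differ between releases::

    import merlin.models.tf as mm

    model = mm.DLRMModel(
        train.schema,
        embedding_dim=64,
        bottom_block=mm.MLPBlock([128, 64]),
        top_block=mm.MLPBlock([128, 64, 32]),
        prediction_tasks=mm.BinaryClassificationTask("click"),
    )
    model.compile(optimizer="adam")
    model.fit(train, batch_size=1024)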

Retrieval Model Constructors#

Encoder(*args, **kwargs)

Block that can be used for prediction and evaluation but not for training

EmbeddingEncoder(*args, **kwargs)

Creates an Encoder from an EmbeddingTable.

ItemRetrievalScorer(*args, **kwargs)

Block for ItemRetrieval, which expects query/user and item embeddings as input and uses dot product to score the positive item (inputs["item"]) and also sampled negative items (during training).

RetrievalModelV2(*args, **kwargs)

MatrixFactorizationModelV2(schema, dim[, ...])

Builds a matrix factorization (MF) model.

MatrixFactorizationModel(schema, dim[, ...])

Builds a matrix factorization model.

TwoTowerModelV2(query_tower, candidate_tower)

Builds the Two-tower architecture, as proposed in [1].

TwoTowerModel(schema, query_tower[, ...])

Builds the Two-tower architecture, as proposed in [1].

YoutubeDNNRetrievalModelV2(schema[, ...])

Build the Youtube-DNN retrieval model.

YoutubeDNNRetrievalModel(schema[, ...])

Build the Youtube-DNN retrieval model.
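
A retrieval model such as TwoTowerModelV2 is built from two Encoder towers, one for the query (user) and one for the candidate (item). A sketch under the assumption that train is a merlin.io.Dataset whose schema tags user and item features::

    import merlin.models.tf as mm
    from merlin.schema import Tags

    schema = train.schema
    query_tower = mm.Encoder(schema.select_by_tag(Tags.USER), mm.MLPBlock([128, 64]))
    candidate_tower = mm.Encoder(schema.select_by_tag(Tags.ITEM), mm.MLPBlock([128, 64]))

    model = mm.TwoTowerModelV2(query_tower, candidate_tower)
    model.compile(optimizer="adam")
    model.fit(train, batch_size=1024)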

Input Block Constructors#

Embeddings(schema[, dim, infer_dim_fn, ...])

Creates a ParallelBlock with an EmbeddingTable for each categorical feature in the schema.

EmbeddingTable(*args, **kwargs)

Embedding table that is backed by a standard Keras Embedding Layer.

AverageEmbeddingsByWeightFeature(*args, **kwargs)

ReplaceMaskedEmbeddings(*args, **kwargs)

Takes a 3D input tensor (batch size x seq. length x embedding dim) and replaces the embeddings at masked positions with a single trainable embedding.

L2Norm(*args, **kwargs)

Apply L2-normalization to input tensors along a given axis

InputBlockV2([schema, categorical, ...])

The entry block of the model to process input features from a schema.

InputBlock(schema[, branches, pre, post, ...])

The entry block of the model to process input features from a schema.

Continuous(*args, **kwargs)

Filters (keeps) only the continuous features.

ContinuousFeatures(*args, **kwargs)

Input block for continuous features.

ContinuousEmbedding(inputs, embedding_block)

Concatenates all numerical features and projects them using the given embedding_block.

ContinuousProjection(schema, projection)

Concatenates the continuous features and combines them using the given projection layer.

SequenceEmbeddingFeatures(*args, **kwargs)

Input block for embedding-lookups for categorical features.
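
Input blocks are normally created from the schema. For example, InputBlockV2 can be combined with Embeddings to set the dimension of every categorical embedding table; a sketch (argument names follow typical usage and may differ by release)::

    import merlin.models.tf as mm
    from merlin.schema import Tags

    input_block = mm.InputBlockV2(
        schema,
        categorical=mm.Embeddings(schema.select_by_tag(Tags.CATEGORICAL), dim=64),
    )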

Model Building Block Constructors#

DLRMBlock(schema, *[, embedding_dim, ...])

Builds the DLRM architecture, as proposed in the DLRM paper.

MLPBlock(dimensions[, activation, use_bias, ...])

A block that applies a multi-layer perceptron to the input.

CrossBlock([depth, filter, low_rank_dim, ...])

This block provides a way to create explicit high-order feature interactions by stacking cross layers, as used in the Deep & Cross Network.

TwoTowerBlock(*args, **kwargs)

Builds the Two-tower architecture, as proposed in the following `paper <https://doi.org/10.1145/3298689.3346996>`_ [Xinyang19].

MatrixFactorizationBlock(schema, dim[, ...])

Returns a block for Matrix Factorization, which creates the user and item embeddings based on the schema and computes the dot product between the L2-normalized user and item embeddings.

DotProductInteraction(*args, **kwargs)

FMBlock(schema[, fm_input_block, ...])

Implements the Factorization Machine, as introduced in [1].

FMPairwiseInteraction(*args, **kwargs)

Computes pairwise (2nd-order) feature interactions, as defined in Factorization Machines [1].
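
Building blocks are composed into a full model with Model. A sketch that stacks an input block, a DCN-style cross block and an MLP on top, assuming schema contains a binary "click" target::

    import merlin.models.tf as mm

    model = mm.Model(
        mm.InputBlockV2(schema),
        mm.CrossBlock(depth=2),
        mm.MLPBlock([64, 32]),
        mm.BinaryOutput("click"),
    )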

Modeling Prediction Task Constructors#

Note

The modeling prediction task classes are deprecated in favor of the prediction output classes.

PredictionTasks(schema[, task_blocks, ...])

Creates multi-task prediction blocks from the schema.

PredictionTask(*args, **kwargs)

Base-class for prediction tasks.

BinaryClassificationTask(*args, **kwargs)

Prediction task for binary classification.

MultiClassClassificationTask(*args, **kwargs)

Prediction task for multi-class classification.

RegressionTask(*args, **kwargs)

Prediction task for regression.

ItemRetrievalTask(*args, **kwargs)

Prediction task for item retrieval.
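
These task classes were passed directly to a model; a sketch of the older, now-deprecated style (prefer the prediction output classes below)::

    import merlin.models.tf as mm

    model = mm.Model(
        mm.InputBlock(schema),
        mm.MLPBlock([64, 32]),
        mm.BinaryClassificationTask("click"),
    )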

Modeling Prediction Output Constructors#

OutputBlock(schema[, model_outputs, pre, ...])

Creates model output(s) based on the columns tagged as target in the schema.

ModelOutput(*args, **kwargs)

Base-class for prediction blocks.

BinaryOutput(*args, **kwargs)

Binary-classification prediction block.

CategoricalOutput(*args, **kwargs)

Categorical output

ContrastiveOutput(*args, **kwargs)

Categorical output trained with contrastive negative sampling.

RegressionOutput(*args, **kwargs)

Regression prediction block

ColumnBasedSampleWeight(*args, **kwargs)

Allows using columns (features or targets) as sample weights for a given ModelOutput.
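
OutputBlock derives one output per target column, which keeps multi-target models short. A sketch, assuming the target columns in schema are tagged as targets::

    import merlin.models.tf as mm

    model = mm.Model(
        mm.InputBlockV2(schema),
        mm.MLPBlock([64, 32]),
        mm.OutputBlock(schema),
    )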

Model Pipeline Constructors#

SequentialBlock(*args, **kwargs)

The SequentialBlock represents a sequence of Keras layers. It is a Keras layer that can be used instead of tf.keras.layers.Sequential, which is actually a Keras Model. In contrast to Keras Sequential, this layer can be used as a pure layer in tf.functions and when exporting SavedModels, without having to pre-declare input and output shapes. In turn, this layer is usable as a preprocessing layer for TF Agents Networks, and can be exported via PolicySaver.

ParallelBlock(*args, **kwargs)

Merges multiple layers or TabularModules into a single output of TabularData.

ParallelPredictionBlock(*args, **kwargs)

Multi-task prediction block.

DenseResidualBlock([low_rank_dim, ...])

A block that applies a dense residual block to the input.

DualEncoderBlock(*args, **kwargs)

ResidualBlock(*args, **kwargs)

Creates a shortcut connection where the residuals are added to the output of the block.

TabularBlock(*args, **kwargs)

Layer that's specialized for tabular-data by integrating many often used operations.

Filter(*args, **kwargs)

Transformation that filters out certain features from TabularData.

Cond(*args, **kwargs)

Layer for conditionally applying other layers.
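
SequentialBlock and ParallelBlock are the main composition primitives: the former chains blocks, the latter runs branches side by side and merges their outputs. A sketch using the connect helper and a concat aggregation (both assumed from typical usage)::

    import merlin.models.tf as mm

    branches = mm.ParallelBlock(
        {"deep": mm.MLPBlock([64, 32]), "wide": mm.MLPBlock([16])},
        aggregation="concat",
    )
    tower = mm.InputBlockV2(schema).connect(branches, mm.MLPBlock([16]))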

Model Evaluation Constructors#

TopKEncoder(*args, **kwargs)

Block that can be used for top-k prediction & evaluation, initialized from a trained retrieval model

Model Optimizer Constructors#

MultiOptimizer(optimizers_and_blocks[, ...])

An optimizer that composes multiple individual optimizers.

LazyAdam([learning_rate, beta_1, beta_2, ...])

Variant of the Adam optimizer that handles sparse updates more efficiently.

OptimizerBlocks(optimizer, blocks)

Dataclass pairing an optimizer with the blocks that the optimizer should apply to.

split_embeddings_on_size(embeddings, threshold)

Splits the embedding tables in a ParallelBlock based on a size threshold (the first dimension of each embedding table) and returns a tuple of two lists containing the large and the small embeddings, respectively.
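
MultiOptimizer makes it possible to train different parts of the model with different optimizers, e.g. LazyAdam for the large embedding tables returned by split_embeddings_on_size. A sketch; the exact wiring is an assumption based on typical usage::

    import tensorflow as tf
    import merlin.models.tf as mm
    from merlin.schema import Tags

    embeddings = mm.Embeddings(schema.select_by_tag(Tags.CATEGORICAL))
    large, small = mm.split_embeddings_on_size(embeddings, threshold=10_000)

    optimizer = mm.MultiOptimizer(
        [
            mm.OptimizerBlocks(mm.LazyAdam(), large),
            mm.OptimizerBlocks(tf.keras.optimizers.Adagrad(), small),
        ]
    )
    model.compile(optimizer=optimizer)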

Transformation Block Constructors#

CategoryEncoding(*args, **kwargs)

A preprocessing layer which encodes integer features.

MapValues(*args, **kwargs)

Layer to map values of a dictionary of tensors.

PrepareListFeatures(*args, **kwargs)

Prepares all list (multi-hot/sequential) features, so that they are converted to tf.RaggedTensor or dense tf.Tensor based on the column schema.

PrepareFeatures(*args, **kwargs)

Prepares scalar and list (multi-hot/sequential) features to be used with a Merlin model.

ToSparse(*args, **kwargs)

Convert the features provided in the schema to sparse tensors.

ToDense(*args, **kwargs)

Convert the features provided in the schema to dense tensors.

ToTarget(*args, **kwargs)

Transform columns to targets

ToOneHot(*args, **kwargs)

Transform the categorical encoded labels into a one-hot representation

HashedCross(*args, **kwargs)

A transformation block which crosses categorical features using the "hashing trick". Conceptually, the transformation can be thought of as: hash(concatenation of features) % num_bins.

Example usage::

    model_body = ParallelBlock(
        TabularBlock.from_schema(
            schema=cross_schema,
            pre=ml.HashedCross(cross_schema, num_bins=1000),
        ),
        is_input=True,
    ).connect(ml.MLPBlock([64, 32]))
    model = ml.Model(model_body, ml.BinaryClassificationTask("click"))

Parameters:

schema (Schema) – The Schema with the input features.

num_bins (int) – Number of hash bins.

output_mode (string) – Specification for the output of the layer. Defaults to "one_hot". Values can be "int" or "one_hot": "int" returns the integer bin indices directly, while "one_hot" encodes each individual element in the input into an array of size num_bins, containing a 1 at the input's bin index.

sparse (bool) – Only applicable to "one_hot" mode. If True, returns a SparseTensor instead of a dense Tensor. Defaults to False.

output_name (string) – Name of the output feature. If not specified, defaults to cross_<feature_name>_<feature_name>_<...>.

infer_num_bins (bool) – If True, num_bins is set to the product of the feature cardinalities; if that product is larger than max_num_bins, it is clipped to max_num_bins.

max_num_bins (int) – Upper bound of num_bins, by default 100000.

HashedCrossAll(schema[, num_bins, ...])

Parallel block consisting of HashedCross blocks for all feature combinations of the schema, at all interaction levels.

BroadcastToSequence(*args, **kwargs)

Broadcast context features to match the timesteps of sequence features.

SequencePredictNext(*args, **kwargs)

Prepares sequential inputs and targets for next-item prediction.

SequencePredictLast(*args, **kwargs)

Prepares sequential inputs and targets for last-item prediction.

SequencePredictRandom(*args, **kwargs)

Prepares sequential inputs and targets for random-item prediction.

SequenceTargetAsInput(*args, **kwargs)

Creates targets to be equal to one of the sequential input features.

SequenceMaskLast(*args, **kwargs)

This block copies one of the sequence input features to be the target feature.

SequenceMaskRandom(*args, **kwargs)

This block implements the Masked Language Modeling (MLM) training approach introduced in BERT (NLP) and later adapted to RecSys by BERT4Rec [1].

ExpandDims(*args, **kwargs)

Expand dims of selected input tensors.

Example::

    inputs = {
        "cont_feat1": tf.random.uniform((NUM_ROWS,)),
        "cont_feat2": tf.random.uniform((NUM_ROWS,)),
        "multi_hot_categ_feat": tf.random.uniform(
            (NUM_ROWS, 4), minval=1, maxval=100, dtype=tf.int32
        ),
    }
    expand_dims_op = tr.ExpandDims(expand_dims={"cont_feat2": 0, "multi_hot_categ_feat": 1})
    expanded_inputs = expand_dims_op(inputs)

StochasticSwapNoise(*args, **kwargs)

Applies Stochastic replacement of sequence features

AsTabular(*args, **kwargs)

Converts a Tensor to TabularData by converting it to a dictionary.
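
Sequence transforms such as SequencePredictNext are usually passed to fit() or evaluate() through the pre argument, so inputs and targets are derived batch by batch. A sketch, assuming seq_schema selects the sequential features and valid is a validation Dataset; the keyword names follow typical usage and may differ by release::

    import merlin.models.tf as mm

    model.fit(
        train,
        batch_size=1024,
        pre=mm.SequencePredictNext(schema=seq_schema, target="item_id"),
    )
    model.evaluate(
        valid,
        batch_size=1024,
        pre=mm.SequencePredictLast(schema=seq_schema, target="item_id"),
    )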

Multi-Task Block Constructors#

MMOEBlock(outputs, expert_block, num_experts)

Implements the Multi-gate Mixture-of-Experts (MMoE) introduced in [1].

CGCBlock(*args, **kwargs)

Implements the Customized Gate Control (CGC) proposed in [1].

PLEBlock(num_layers, outputs, expert_block)

Implements the Progressive Layered Extraction (PLE) model from [1], by stacking CGC blocks (CGCBlock).
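
A multi-task block sits between the input block and the task outputs, giving each task its own gate over a shared set of experts. A sketch with MMOEBlock, assuming schema contains several binary targets; the exact wiring of the output block may differ by release::

    import merlin.models.tf as mm

    outputs = mm.OutputBlock(schema)
    mmoe = mm.MMOEBlock(outputs, expert_block=mm.MLPBlock([64, 32]), num_experts=4)
    model = mm.Model(mm.InputBlockV2(schema), mmoe, outputs)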

Data Loader Customization Constructor#

merlin.models.tf.Loader(paths_or_dataset, ...)

Override class to customize data loading for backward compatibility with older NVTabular releases.
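
A Loader gives direct control over batching and shuffling. A minimal sketch, assuming train is a merlin.io.Dataset::

    from merlin.models.tf import Loader

    loader = Loader(train, batch_size=1024, shuffle=True)
    model.fit(loader, epochs=1)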

Metrics#

AvgPrecisionAt(*args, **kwargs)

MRRAt(*args, **kwargs)

NDCGAt(*args, **kwargs)

PrecisionAt(*args, **kwargs)

RecallAt(*args, **kwargs)

TopKMetricsAggregator(*args, **kwargs)

Aggregator for top-k metrics (TopkMetric) that is optimized to sort top-k predictions only once for all metrics.
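
Top-k metrics can be grouped in a TopKMetricsAggregator so the predictions are sorted only once for all of them. A sketch; passing the metrics positionally follows typical usage::

    import merlin.models.tf as mm

    topk_metrics = mm.TopKMetricsAggregator(mm.RecallAt(10), mm.MRRAt(10), mm.NDCGAt(10))
    model.compile(optimizer="adam", metrics=[topk_metrics])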

Sampling#

ItemSampler(*args, **kwargs)

InBatchSampler(*args, **kwargs)

Provides in-batch sampling [1]_ for two-tower item retrieval models.

PopularityBasedSampler(*args, **kwargs)

Provides a popularity-based negative sampling for the softmax layer to ensure training efficiency when the catalog of items is very large.
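
Samplers provide the negatives for retrieval training. A sketch of the older TwoTowerModel-style wiring with an InBatchSampler; the samplers keyword is an assumption based on typical usage::

    import merlin.models.tf as mm

    model = mm.TwoTowerModel(
        schema,
        query_tower=mm.MLPBlock([128, 64]),
        samplers=[mm.InBatchSampler()],
    )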

Losses#

CategoricalCrossEntropy([from_logits])

Extends tf.keras.losses.CategoricalCrossentropy by making from_logits=True the default (in this case an optimized softmax activation is applied within the loss, so you should not include a softmax activation in the output layer).

SparseCategoricalCrossEntropy([from_logits])

Extends tf.keras.losses.SparseCategoricalCrossentropy by making from_logits=True the default (in this case an optimized softmax activation is applied within the loss, so you should not include a softmax activation in the output layer).

BPRLoss([reduction, name])

The Bayesian Personalised Ranking (BPR) pairwise loss [1]_

BPRmaxLoss([reg_lambda])

The BPR-max pairwise loss proposed in [1]_

HingeLoss([reduction, name])

Pairwise hinge loss, as described in [1]_: max(0, 1 + r_uj - r_ui), where r_ui is the score of the positive item and r_uj the score of negative items.

LogisticLoss([reduction, name])

Pairwise log loss, as described in [1]_: log(1 + exp(r_uj - r_ui)), where r_ui is the score of the positive item and r_uj the score of negative items.

TOP1Loss([reduction, name])

The TOP1 pairwise loss proposed in [1]_

TOP1maxLoss([reduction, name])

The TOP1-max pairwise loss proposed in [1]_

TOP1v2Loss([reduction, name])

An adapted version of the TOP1 pairwise loss proposed in [1]_, but following the current GRU4Rec implementation [2]_.
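
A loss is passed to compile() like any Keras loss; the pairwise losses compare the positive-item score against sampled negative scores. A minimal sketch::

    import merlin.models.tf as mm

    model.compile(optimizer="adam", loss=mm.BPRmaxLoss(reg_lambda=1.0))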

Schema Functions#

merlin.models.utils.schema_utils.select_targets(schema)

merlin.models.utils.schema_utils.schema_to_tensorflow_metadata_json(schema)

merlin.models.utils.schema_utils.tensorflow_metadata_json_to_schema(value)

merlin.models.utils.schema_utils.create_categorical_column(...)

merlin.models.utils.schema_utils.create_continuous_column(name)

merlin.models.utils.schema_utils.filter_dict_by_schema(...)

Filters out entries from input_dict and returns a dictionary where every entry corresponds to a column in the schema.

merlin.models.utils.schema_utils.categorical_cardinalities(schema)

merlin.models.utils.schema_utils.categorical_domains(schema)

merlin.models.utils.schema_utils.get_embedding_sizes_from_schema(schema)

Provides a heuristic (from Google) that suggests the embedding sizes as a function (fourth root) of the categorical feature cardinalities obtained from the schema.

merlin.models.utils.schema_utils.get_embedding_size_from_cardinality(...)

Provides a heuristic (from Google) that suggests the embedding dimension as a function (fourth root) of the feature cardinality.
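
The suggested embedding dimension grows roughly with the fourth root of the cardinality. A minimal sketch::

    from merlin.models.utils.schema_utils import get_embedding_size_from_cardinality

    # e.g. a categorical feature with 100,000 distinct values
    dim = get_embedding_size_from_cardinality(100_000)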

Utilities#

Tensor Utilities#

TensorInitializer(weights, **kwargs)

Initializer that returns a constant tensor (e.g. pre-trained weights).

Miscellaneous Utility Functions#

merlin.models.utils.misc_utils.filter_kwargs(...)

merlin.models.utils.misc_utils.safe_json(data)

merlin.models.utils.misc_utils.get_filenames(...)

merlin.models.utils.misc_utils.get_label_feature_name(...)

Analyses the feature map config and returns the name of the label feature.

merlin.models.utils.misc_utils.get_timestamp_feature_name(...)

Analyses the feature map config and returns the name of the timestamp feature.

merlin.models.utils.misc_utils.get_parquet_files_names(...)

merlin.models.utils.misc_utils.Timing(message)

A context manager that prints the execution time of the block it manages

merlin.models.utils.misc_utils.get_object_size(obj)

Recursively finds size of objects

merlin.models.utils.misc_utils.validate_dataset(...)

Utility function to load an NVTabular Dataset from disk.
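
Timing is a small convenience for measuring a block of work; a minimal sketch::

    from merlin.models.utils.misc_utils import Timing

    with Timing("model.fit"):
        model.fit(train, batch_size=1024)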

Registry Functions#

merlin.models.utils.registry.camelcase_to_snakecase(name)

merlin.models.utils.registry.snakecase_to_camelcase(name)

merlin.models.utils.registry.default_name(...)

Default name for a class or function.

merlin.models.utils.registry.default_object_name(obj)

merlin.models.utils.registry.Registry(...[, ...])

Dict-like class for managing function registrations.

merlin.models.utils.registry.RegistryMixin(...)

merlin.models.utils.registry.display_list_by_prefix(...)

Creates a help string for names_list grouped by prefix.
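
The registry helpers back the string-based lookup of blocks, normalizing class names to snake case. A minimal sketch::

    from merlin.models.utils.registry import camelcase_to_snakecase

    print(camelcase_to_snakecase("TwoTowerModel"))  # expected: "two_tower_model"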