# Copyright 2021 NVIDIA Corporation. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

# Each user is responsible for checking the content of datasets and the
# applicable licenses and determining if suitable for the intended use.

Getting Started MovieLens: Training with TensorFlow

This notebook was created using the latest stable merlin-tensorflow-training container.

Overview

In this notebook, we will train a Merlin Models model implementing the Deep and Cross Network (DCN) architecture.

Merlin Models streamlines the training process, so even though we use a fairly elaborate deep learning architecture, we will only need to write a few lines of code.

Additionally, to accelerate the training, we will leverage the Merlin Dataloader.

The following notebooks provide a great overview of the concepts in Merlin Models. To learn more about the Merlin Dataloader, please take a look at its repository.

Learning objectives

This notebook explains how to use the Merlin Dataloader to accelerate TensorFlow training.

  1. Use the Merlin Dataloader with a TensorFlow Keras model.

  2. Export the model for performing inference on the Triton Inference Server.

MovieLens25M

MovieLens25M is a popular dataset for recommender systems and is widely used in academic publications. It contains 25M movie ratings for 62,000 movies given by 162,000 users. Many projects use only the user/item/rating information of MovieLens, but the original dataset also provides metadata for the movies, such as the genres each movie belongs to.

In this notebook we will train a Merlin Models model (Deep and Cross Network) to predict the rating a user is likely to give a movie. To make full use of our hardware, we will leverage the Merlin Dataloader, which loads data in a highly optimized way and keeps the GPU busy during training.

Data Preparation

# External dependencies
import os
import glob
os.environ["TF_GPU_ALLOCATOR"]="cuda_malloc_async"

import nvtabular as nvt

We define our base input directory, containing the data.

INPUT_DATA_DIR = os.environ.get(
    "INPUT_DATA_DIR", os.path.expanduser("/workspace/nvt-examples/movielens/data/")
)
# path to save the models
MODEL_DIR = os.environ.get("MODEL_DIR", os.path.expanduser("/workspace/nvt-examples/models"))
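
The two settings above follow a common pattern: read a directory from an environment variable and fall back to a default when the variable is unset. The sketch below isolates that pattern. The helper name `resolve_dir` and its `environ` parameter are hypothetical and not part of the notebook; unlike the cell above, it applies `os.path.expanduser` to whichever value wins, so a `~` in an override would also be expanded.

```python
import os


def resolve_dir(env_var, default, environ=os.environ):
    """Return the directory named by env_var, or default when it is unset.

    `environ` is injectable only so the behaviour can be shown without
    touching the real process environment (an assumption of this sketch).
    """
    return os.path.expanduser(environ.get(env_var, default))


default_dir = "/workspace/nvt-examples/movielens/data/"

# Variable unset: the default path is used.
print(resolve_dir("INPUT_DATA_DIR", default_dir, environ={}))
# Variable set: the override wins.
print(resolve_dir("INPUT_DATA_DIR", default_dir,
                  environ={"INPUT_DATA_DIR": "/data/movielens"}))
```

Setting `INPUT_DATA_DIR` or `MODEL_DIR` before starting the notebook is therefore enough to point it at a different location without editing any cell.
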

import numpy as np

from nvtabular.ops import *
from merlin.schema.tags import Tags
from merlin.models.utils.example_utils import workflow_fit_transform, save_results

import merlin.models.tf as mm
from merlin.io.dataset import Dataset

import tensorflow as tf
2023-01-20 11:26:39.373230: I tensorflow/core/platform/cpu_feature_guard.cc:194] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE3 SSE4.1 SSE4.2 AVX
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING:tensorflow:Please fix your imports. Module tensorflow.python.training.tracking.data_structures has been moved to tensorflow.python.trackable.data_structures. The old module will be deleted in version 2.11.
2023-01-20 11:26:40.514976: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:996] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-01-20 11:26:41.475508: I tensorflow/core/common_runtime/gpu/gpu_process_state.cc:222] Using CUDA malloc Async allocator for GPU: 0
2023-01-20 11:26:41.475570: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1637] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 24576 MB memory:  -> device: 0, name: Quadro RTX 8000, pci bus id: 0000:08:00.0, compute capability: 7.5
/usr/local/lib/python3.8/dist-packages/tqdm/auto.py:22: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm

Let’s read the train and validation sets that we created in the previous notebooks.

train_ds = nvt.Dataset(f'{INPUT_DATA_DIR}/train', engine='parquet', dtypes={'rating': np.int8})
valid_ds = nvt.Dataset(f'{INPUT_DATA_DIR}/valid', engine='parquet', dtypes={'rating': np.int8})

# I am modifying the schema here as we will not use the `genres` column for training
train_ds.schema = train_ds.schema.remove_col('genres')
valid_ds.schema = valid_ds.schema.remove_col('genres')

# specifying the target column
target_column = train_ds.schema.select_by_tag(Tags.TARGET).column_names[0]
target_column
'rating'
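
The lookup above picks the target column by its tag rather than by a hard-coded name. As a rough illustration of the idea, the sketch below models a schema as a plain dictionary from column name to a set of tag strings. This toy structure, the tag strings, and the `select_by_tag` function are assumptions made for the example and are not the Merlin schema API.

```python
# Toy stand-in for a schema: column name -> set of tags (hypothetical values).
toy_schema = {
    "userId": {"categorical", "user_id"},
    "movieId": {"categorical", "item_id"},
    "rating": {"target"},
}


def select_by_tag(schema, tag):
    """Return the names of all columns carrying the given tag."""
    return [name for name, tags in schema.items() if tag in tags]


target_columns = select_by_tag(toy_schema, "target")
if not target_columns:
    # Indexing [0] on an empty selection would raise IndexError,
    # so fail with a clearer message instead.
    raise ValueError("no column is tagged as a target")
target_column = target_columns[0]
print(target_column)
```

The guard matters in practice: taking element `[0]` of an empty selection fails, so it is worth checking that the preprocessing step actually tagged a target column.
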

Training our model

Let us now train our model. The process is highly streamlined, as this is what Merlin Models was designed to facilitate.

Only a few lines of code are needed to carry out the training.

Model definition

model = mm.DCNModel(
    train_ds.schema,
    depth=2,
    deep_block=mm.MLPBlock([64, 32]),
    prediction_tasks=mm.BinaryOutput(target_column),
)
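
The model above ends in a binary prediction task. Assuming that task pairs a sigmoid output with binary cross-entropy, which is the conventional choice and is an assumption here rather than something this notebook states, the loss can be sketched in plain Python as follows. The labels and predictions are made up for illustration.

```python
import math


def sigmoid(z):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(labels, probs, eps=1e-7):
    """Mean binary cross-entropy; probs are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


labels = [1, 0, 1, 1]
# An uninformed model outputs a logit of 0, i.e. a probability of 0.5.
uninformed = [sigmoid(0.0)] * len(labels)
baseline = binary_cross_entropy(labels, uninformed)
print(baseline)  # equals ln(2), about 0.6931
```

A model that always predicts 0.5 scores ln(2) ≈ 0.693 regardless of the labels, which gives a reference point for reading the loss of about 0.66 reported in the training log later in this notebook.
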

Specifying Hyperparameters

batch_size = 16 * 1024
LR = 0.03
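
To get a feel for what this batch size implies, the following sketch relates it to the number of steps per epoch. The step count of 1221 is taken from the training log further down in this notebook; the assumption that the last, possibly partial, batch is still processed (rather than dropped) is mine.

```python
batch_size = 16 * 1024      # 16384 rows per step, as defined above
steps_per_epoch = 1221      # taken from the training log in this notebook


def steps_for(num_rows, batch_size):
    """Number of steps needed to see num_rows once (integer ceiling division)."""
    return -(-num_rows // batch_size)


# Every step except possibly the last consumes a full batch, so the
# training set size must lie between these two bounds.
min_rows = (steps_per_epoch - 1) * batch_size + 1
max_rows = steps_per_epoch * batch_size
print(min_rows, max_rows)  # 19988481 20004864
```

Under that assumption the training set holds roughly 20 million rows, and doubling the batch size would roughly halve the number of steps per epoch.
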

Training our model

During training, we pass our dataset to the fit function of the model and everything is taken care of for us.

Internally, Merlin Dataloader is used to feed the data in a highly optimized way to our model during training.
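
Conceptually, any dataloader turns a table of rows into a sequence of mini-batches. The sketch below shows only that idea in plain Python; it is not how the Merlin Dataloader is implemented, and the function name `iterate_batches` and the sample rows are invented for the example.

```python
import random


def iterate_batches(rows, batch_size, shuffle=True, seed=0):
    """Yield lists of at most batch_size rows, optionally in shuffled order."""
    indices = list(range(len(rows)))
    if shuffle:
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [rows[i] for i in indices[start:start + batch_size]]


# Made-up interactions with the same column names used in this notebook.
rows = [
    {"userId": 1, "movieId": 10, "rating": 1},
    {"userId": 1, "movieId": 11, "rating": 0},
    {"userId": 2, "movieId": 10, "rating": 1},
    {"userId": 3, "movieId": 12, "rating": 0},
    {"userId": 3, "movieId": 13, "rating": 1},
]
batches = list(iterate_batches(rows, batch_size=2))
print([len(batch) for batch in batches])  # [2, 2, 1]
```

A naive loop like this one assembles each batch row by row in Python, which is the kind of per-batch overhead an optimized dataloader is meant to avoid.
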

DCN-V2 is an architecture proposed as an improvement upon the original DCN model. Explicit feature interactions of the inputs are learned through cross layers and then combined with a deep network that learns complementary implicit interactions. The overall model architecture is depicted in the figure below, with two ways to combine the cross network with the deep network: (1) stacked and (2) parallel. The output of the embedding layer is the concatenation of all the embedded vectors and the normalized dense features: $x_0 = [x_{\text{embed},1}; \ldots; x_{\text{embed},n}; x_{\text{dense}}]$.

Figure: DCN-V2 architecture (stacked and parallel structures)

Image Source: DCN V2 paper
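
To make the cross layers more concrete, here is a minimal sketch of the cross-layer update from the DCN V2 paper, x_{l+1} = x_0 ⊙ (W_l x_l + b_l) + x_l, where ⊙ is the elementwise product. It is written in plain Python for readability and is not the Merlin Models implementation; the weights and inputs are made up so the result can be checked by hand.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def cross_layer(x0, xl, W, b):
    """One cross layer: x0 * (W @ xl + b) + xl, with an elementwise product."""
    projected = [p + bi for p, bi in zip(matvec(W, xl), b)]
    return [x0_i * p + xl_i for x0_i, p, xl_i in zip(x0, projected, xl)]


def cross_network(x0, layers):
    """Apply the cross layers in sequence; every layer reuses the input x0."""
    x = x0
    for W, b in layers:
        x = cross_layer(x0, x, W, b)
    return x


x0 = [1.0, 2.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
zero_bias = [0.0, 0.0]
# Two layers, mirroring depth=2 in the model definition above.
out = cross_network(x0, [(identity, zero_bias), (identity, zero_bias)])
print(out)  # [4.0, 18.0]
```

With identity weights the first layer yields x0 + x0², i.e. [2.0, 6.0], and the second yields [4.0, 18.0], which shows how each additional layer raises the polynomial degree of the feature interactions by one.
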

opt = tf.keras.optimizers.Adagrad(learning_rate=LR)
model.compile(optimizer=opt, run_eagerly=False, metrics=[tf.keras.metrics.AUC()])
model.fit(train_ds, validation_data=valid_ds, batch_size=batch_size)
1221/1221 [==============================] - 9s 6ms/step - loss: 0.6609 - auc: 0.5281 - regularization_loss: 0.0000e+00 - loss_batch: 0.6609 - val_loss: 0.6588 - val_auc: 0.5626 - val_regularization_loss: 0.0000e+00 - val_loss_batch: 0.6537
<keras.callbacks.History at 0x7f7b5959ed30>
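
The `auc` values in the log above can be read as the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one, so 0.5 corresponds to random ranking. The sketch below computes that quantity exactly by comparing all pairs; the function name and sample data are illustrative. `tf.keras.metrics.AUC` approximates the same quantity with a fixed set of thresholds, so its values can differ slightly.

```python
def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(roc_auc(labels, scores))  # 0.75
```

The pairwise loop is quadratic in the number of examples, so it is only suitable for small samples such as this one.
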

Saving the model for inference

We are now ready to save the model for inference.

from merlin.systems.dag.ensemble import Ensemble
from merlin.systems.dag.ops.workflow import TransformWorkflow
from merlin.systems.dag.ops.tensorflow import PredictTensorflow

Let’s create the serving operator that we will use to serve predictions from our model on the Triton Inference Server (TIS), and write it to disk along with the config files that will be loaded onto the server in the subsequent notebook.

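# Serving graph: pass the two ID columns to the trained TensorFlow model.
serving_operators = ['userId', 'movieId'] >> PredictTensorflow(model)
# Input schema for serving: the training schema without the target column.
ensemble = Ensemble(serving_operators, train_ds.schema.remove_by_tag(Tags.TARGET).remove_col('genres'))

export_path = os.path.join(MODEL_DIR, "ensemble")
# exist_ok=True lets this cell be re-run without raising FileExistsError.
os.makedirs(export_path, exist_ok=True)

ens_conf, node_confs = ensemble.export(export_path)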
INFO:tensorflow:Unsupported signature for serialization: ((Prediction(outputs={'rating/binary_output': TensorSpec(shape=(None, 1), dtype=tf.float32, name='outputs/outputs/rating/binary_output')}, targets={'rating/binary_output': TensorSpec(shape=(None, 1), dtype=tf.float32, name='outputs/targets/rating/binary_output')}, sample_weight={'rating/binary_output': None}, features=None, negative_candidate_ids=None), <tensorflow.python.framework.func_graph.UnknownArgument object at 0x7f7b58732c40>), {}).
WARNING:absl:Function `_wrapped_model` contains input name(s) movieId, userId with unsupported characters which will be renamed to movieid, userid in the SavedModel.
WARNING:absl:Found untraced functions such as train_compute_metrics, model_context_layer_call_fn, model_context_layer_call_and_return_conditional_losses, dense_6_layer_call_fn, dense_6_layer_call_and_return_conditional_losses while saving (showing 5 of 47). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: /tmp/tmp5eyrbewk/model.savedmodel/assets
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
INFO:tensorflow:Assets written to: /workspace/nvt-examples/models/ensemble/0_predicttensorflowtriton/1/model.savedmodel/assets
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
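
After the export it can be useful to check what was written before moving on to the serving notebook. The sketch below lists every file under the export directory using only the standard library. The helper `list_tree` is hypothetical, and the path is rebuilt from the same `MODEL_DIR` default used earlier, so the listing is printed only when that directory exists.

```python
import os


def list_tree(root):
    """Return the sorted relative paths of all files below root."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            paths.append(os.path.relpath(full_path, root))
    return sorted(paths)


# Same location as export_path in the cell above (default taken from this notebook).
model_dir = os.environ.get("MODEL_DIR", "/workspace/nvt-examples/models")
export_dir = os.path.join(model_dir, "ensemble")

if os.path.isdir(export_dir):
    for rel_path in list_tree(export_dir):
        print(rel_path)
else:
    print(f"{export_dir} does not exist yet; run the export cell first")
```

Judging by the log lines above, the output should include a `model.savedmodel` directory under `0_predicttensorflowtriton/1/`.
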