Additional Resources

Transformers4Rec and Session-Based Recommendation

Competitions

  • SIGIR eCommerce Workshop Data Challenge 2021 (organized by Coveo) - The NVIDIA Merlin team won this competition by using Transformer architectures to predict the next products users would interact with in e-commerce sessions. For more information about our solution, refer to our blog post and paper.

  • WSDM WebTour Challenge 2021 (organized by Booking.com) - The NVIDIA Merlin team won this competition by leveraging a Transformers4Rec model in the final ensemble. For more information about our solution, refer to our blog post and paper.

NVIDIA Merlin

Transformers4Rec is part of the NVIDIA Merlin ecosystem for recommender systems, which includes the following components:

  • NVTabular - A feature engineering and preprocessing library for tabular data, designed to easily manipulate terabyte-scale datasets and train deep learning (DL) based recommender systems.

  • Triton Inference Server - Provides a cloud and edge inferencing solution that is optimized for both CPUs and GPUs. Transformers4Rec models can be exported and served with Triton.

  • HugeCTR - A GPU-accelerated recommender framework designed to distribute training across multiple GPUs and nodes and estimate Click-Through Rates (CTRs).

Supported Hugging Face Architectures and Pre-Training Approaches

Transformers4Rec supports the following masking tasks:

  • Causal Language Modeling (CLM) - predicts the next item in the sequence from the items that precede it.

  • Masked Language Modeling (MLM) - randomly masks items in the sequence and predicts them from the unmasked items.

  • Permutation Language Modeling (PLM) - predicts items following a permuted factorization order of the sequence, as in XLNet.

  • Replacement Token Detection (RTD) - detects which items in the sequence were replaced by sampled alternatives, as in ELECTRA.
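
In the library's PyTorch API, the masking task is selected with a string identifier when the sequential input module is created. A minimal sketch, assuming a Merlin schema file is available (the file path, sequence length, and dimensions are illustrative):

```python
import transformers4rec.torch as tr
from merlin_standard_lib import Schema

# Load a schema describing the session features
# (the file path here is illustrative).
schema = Schema().from_proto_text("schema.pbtxt")

# The masking task is selected with one of the string identifiers
# "clm", "mlm", "plm", or "rtd" when the input module is built.
inputs = tr.TabularSequenceFeatures.from_schema(
    schema,
    max_sequence_length=20,
    d_output=64,
    masking="mlm",
)
```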

In Transformers4Rec, we decouple the pre-training approaches from the transformer architectures and provide a TransformerBlock module that links the config class of the transformer architecture to the masking task. Transformers4Rec also defines a transformer_registry of pre-defined T4RecConfig constructors that automatically set the arguments of the related Hugging Face Transformers configuration classes.
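
For example, building on the inputs module from the sketch above, a pre-registered XLNetConfig constructor and the TransformerBlock module can be combined into a next-item prediction model. This is a sketch in the spirit of the library's example notebooks; the hyperparameter values are illustrative:

```python
# Pre-registered T4RecConfig constructor: builds the corresponding
# Hugging Face XLNetConfig from a few generic arguments.
transformer_config = tr.XLNetConfig.build(
    d_model=64, n_head=4, n_layer=2, total_seq_length=20
)

# TransformerBlock links the architecture config to the masking task
# defined on the input module.
body = tr.SequentialBlock(
    inputs,
    tr.MLPBlock([64]),
    tr.TransformerBlock(transformer_config, masking=inputs.masking),
)

# A next-item prediction head completes the model.
model = tr.Model(tr.Head(body, tr.NextItemPredictionTask(weight_tying=True)))
```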

The table below lists the architectures currently supported in Transformers4Rec and the masking tasks each one can be trained with. The Registered column indicates which architectures already have pre-registered T4RecConfig classes.

Tip: Registering additional HF Transformers config classes into Transformers4Rec can be a great first contribution; a sketch of the registration pattern follows the table.

| Model          | CLM | MLM | PLM | RTD | Registered |
| -------------- | --- | --- | --- | --- | ---------- |
| AlBERT         |     | ✅  |     |     | ✅         |
| BERT           |     | ✅  |     |     | ✅         |
| ConvBERT       |     | ✅  |     |     |            |
| DeBERTa        |     | ✅  |     |     |            |
| DistilBERT     |     | ✅  |     |     |            |
| GPT-2          | ✅  |     |     |     | ✅         |
| Longformer     |     | ✅  |     |     | ✅         |
| MegatronBert   |     | ✅  |     |     |            |
| MPNet          |     | ✅  |     |     |            |
| RoBERTa        |     | ✅  |     |     | ✅         |
| RoFormer       |     | ✅  |     |     |            |
| Transformer-XL | ✅  |     |     |     | ✅         |
| XLNet          | ✅  | ✅  | ✅  | ✅  | ✅         |
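
Registering a config class amounts to subclassing T4RecConfig alongside the corresponding Hugging Face config class and exposing a build constructor that maps the generic Transformers4Rec arguments onto the architecture-specific ones. Below is a hypothetical sketch for DistilBERT, following the pattern of the configs already present in transformer_registry; the argument mapping and defaults are illustrative, not part of the library:

```python
import transformers

from transformers4rec.config.transformer import T4RecConfig, transformer_registry

# Hypothetical registration sketch; DistilBERT is not registered in the
# library, and the argument mapping below is illustrative.
@transformer_registry.register("distilbert")
class DistilBertConfig(T4RecConfig, transformers.DistilBertConfig):
    @classmethod
    def build(cls, d_model, n_head, n_layer, total_seq_length=None):
        # Map the generic T4Rec arguments onto DistilBERT's own
        # configuration names.
        return cls(
            dim=d_model,
            hidden_dim=d_model * 4,
            n_heads=n_head,
            n_layers=n_layer,
            max_position_embeddings=total_seq_length or 512,
        )
```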

Note: The following HF architectures will eventually be supported: Reformer, Funnel Transformer, and ELECTRA.