Additional Resources
Transformers4Rec and Session-Based Recommendation
Transformers4Rec: Bridging the Gap between NLP and Sequential / Session-Based Recommendation - Paper presented at ACM RecSys’21 in which we discuss the relationship between NLP and RecSys, introduce Transformers4Rec and its core features, and provide a comprehensive empirical analysis comparing Transformer architectures with session-based recommendation algorithms. For the paper’s online appendices and instructions for reproducing the experiments, refer to this README.
Blog post - Briefly introduces Transformers4Rec.
End-to-end session based recommendation demo - Recorded demo presented at ACM RecSys’21 about end-to-end session-based recommendation using NVTabular, Transformers4Rec, and Triton.
Session-based recommenders - Provides session-based recommendation resources.
Competitions
SIGIR eCommerce Workshop Data Challenge 2021 (organized by Coveo) - The NVIDIA Merlin team won this competition by using Transformer architectures to predict the products that users would interact with next in their e-commerce sessions. For more information about our solution, refer to our blog post and paper.
WSDM WebTour Challenge 2021 (organized by Booking.com) - The NVIDIA Merlin team won this competition by leveraging a Transformers4Rec model in the final ensemble. For more information about our solution, refer to our blog post and paper.
NVIDIA Merlin
Transformers4Rec is part of the NVIDIA Merlin ecosystem for recommender systems, which includes the following components:
NVTabular - A feature engineering and preprocessing library for tabular data that is designed to easily manipulate datasets at terabyte scale and train deep learning (DL) based recommender systems.
Triton Inference Server - Provides a cloud and edge inferencing solution that is optimized for both CPUs and GPUs. Transformers4Rec models can be exported and served with Triton.
HugeCTR - A GPU-accelerated recommender framework designed to distribute training across multiple GPUs and nodes and estimate Click-Through Rates (CTRs).
Supported Hugging Face Architectures and Pre-Training Approaches
Transformers4Rec supports the following masking tasks (see the sketch after this list):
Causal Language Modeling (CLM)
Masked Language Modeling (MLM)
Permutation Language Modeling (PLM)
Replacement Token Detection (RTD)
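As a minimal sketch of how a masking task is selected (assuming the Transformers4Rec PyTorch API and a `schema` object that describes the preprocessed session features, for example one exported by NVTabular), the approach is chosen through the `masking` argument of the sequential input module; the argument values below are illustrative placeholders:

```python
from transformers4rec import torch as tr

# `schema` is assumed to describe the preprocessed sequential features
# (for example, a schema produced by an NVTabular workflow).
input_module = tr.TabularSequenceFeatures.from_schema(
    schema,
    max_sequence_length=20,
    continuous_projection=64,
    d_output=64,
    masking="mlm",  # one of "clm", "mlm", "plm", "rtd"
)
```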
In Transformers4Rec, we decouple the pre-training approaches from the transformer architectures and provide a `TransformerBlock` module that links the config class of the transformer architecture to the masking task. Transformers4Rec also defines a `transformer_registry`, which includes pre-defined `T4RecConfig` constructors that automatically set the arguments of the related Hugging Face Transformers configuration classes.
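As an illustrative sketch only (continuing from the input module above and following the library’s documented PyTorch quick-start; parameter values are placeholders), a registered config constructor such as `XLNetConfig.build` can be linked to the masking task through a `TransformerBlock`, or turned directly into an end-to-end model:

```python
# Pre-registered constructor: sets the arguments of the underlying
# Hugging Face XLNetConfig from a few high-level parameters.
transformer_config = tr.XLNetConfig.build(
    d_model=64, n_head=4, n_layer=2, total_seq_length=20
)

# TransformerBlock links the transformer config to the masking task
# defined on the input module above.
transformer_block = tr.TransformerBlock(
    transformer_config, masking=input_module.masking
)

# Alternatively, build an end-to-end next-item prediction model directly.
model = transformer_config.to_torch_model(
    input_module, tr.NextItemPredictionTask(weight_tying=True)
)
```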
The table below lists the architectures that are currently supported in Transformers4Rec and the masking tasks that can be used with each. The Registered column indicates whether a pre-registered `T4RecConfig` class is available for the architecture.
Tip: Consider registering HF Transformers config classes into Transformers4Rec as this can be a great first contribution.
| Model | CLM | MLM | PLM | RTD | Registered |
|---|---|---|---|---|---|
| AlBERT | ❌ | ✅ | ❌ | ✅ | ✅ |
| BERT | ❌ | ✅ | ❌ | ✅ | ✅ |
| ConvBERT | ❌ | ✅ | ❌ | ✅ | ❌ |
| DeBERTa | ❌ | ✅ | ❌ | ✅ | ❌ |
| DistilBERT | ❌ | ✅ | ❌ | ✅ | ❌ |
| GPT-2 | ✅ | ❌ | ❌ | ❌ | ✅ |
| Longformer | ✅ | ✅ | ❌ | ❌ | ✅ |
| MegatronBert | ❌ | ✅ | ❌ | ✅ | ❌ |
| MPNet | ❌ | ✅ | ❌ | ✅ | ❌ |
| RoBERTa | ❌ | ✅ | ❌ | ✅ | ✅ |
| RoFormer | ✅ | ✅ | ❌ | ✅ | ❌ |
| Transformer-XL | ✅ | ❌ | ❌ | ❌ | ✅ |
| XLNet | ✅ | ✅ | ✅ | ✅ | ✅ |
Note: The following HF architectures will eventually be supported: Reformer, Funnel Transformer, and ELECTRA.