Additional Resources

Transformers4Rec and Session-based recommendation

Transformers4Rec: Bridging the Gap between NLP and Sequential / Session-Based Recommendation - Paper presented at ACM RecSys’21, in which we discuss the relationship between NLP and RecSys and introduce the Transformers4Rec library, describing its focus and core features. We also provide a comprehensive empirical analysis comparing Transformer architectures with session-based recommendation algorithms, in which the Transformer architectures outperform those algorithms. The paper's online appendix and instructions for reproducing the experiments can be found here.
Blog post with a gentle introduction to the Transformers4Rec library
End-to-end session-based recommendation demo - Recorded demo presented at ACM RecSys’21 on end-to-end session-based recommendation using NVTabular, Transformers4Rec and Triton
Session-based recommenders - NVIDIA Developer page about the NVIDIA Merlin solution for session-based recommendation

Competitions

SIGIR eCommerce Workshop Data Challenge 2021, organized by Coveo - The NVIDIA Merlin team was one of the winners of this competition on predicting the next products that users interact with in e-commerce sessions. Our solution used only Transformer architectures. Check our post and paper.
WSDM WebTour Challenge 2021, organized by Booking.com - Competition on next-destination prediction for multi-city trips, won by NVIDIA. We leveraged a model from the Transformers4Rec library in the final ensemble. Here is our solution post and paper.

NVIDIA Merlin

Transformers4Rec is part of the NVIDIA Merlin ecosystem for Recommender Systems. Check our other libraries:

NVTabular - A feature engineering and preprocessing library for tabular data, designed to easily manipulate terabyte-scale datasets and train deep learning (DL)-based recommender systems.
Triton Inference Server - Provides a cloud and edge inferencing solution optimized for both CPUs and GPUs. Transformers4Rec models can be exported and served with Triton.
HugeCTR - A GPU-accelerated recommender framework designed to distribute training across multiple GPUs and nodes and estimate Click-Through Rates (CTRs).

Supported HuggingFace architectures and pre-training approaches

Transformers4Rec supports the following four masking tasks:

Acronym	Definition
CLM	Causal Language Modeling
MLM	Masked Language Modeling
PLM	Permutation Language Modeling
RTD	Replacement Token Detection
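To make the first two tasks concrete, here is a minimal sketch of how CLM and MLM would turn one session of item ids into inputs and prediction targets. It is only an illustration in plain Python, not the masking code used by Transformers4Rec: the function names, the `MASK` id, and the `mask_prob` parameter are all hypothetical. PLM and RTD are not covered.

```python
import random

MASK = 0  # hypothetical id reserved for the mask token; real item ids must differ


def clm_example(session):
    """Causal LM: at each position, predict the next item from the items before it."""
    inputs = session[:-1]   # everything except the last item
    labels = session[1:]    # the same sequence shifted one step to the left
    return inputs, labels


def mlm_example(session, mask_prob=0.3, seed=0):
    """Masked LM: hide a random subset of items and predict them from both sides."""
    rng = random.Random(seed)
    positions = {i for i in range(len(session)) if rng.random() < mask_prob}
    if not positions:
        # Always mask at least one item so there is something to predict.
        positions = {rng.randrange(len(session))}
    inputs = [MASK if i in positions else item for i, item in enumerate(session)]
    labels = {i: session[i] for i in positions}  # position -> hidden item id
    return inputs, labels


if __name__ == "__main__":
    session = [11, 42, 7, 93, 5]
    print(clm_example(session))
    print(mlm_example(session))
```

In the CLM case every target depends only on earlier items, which matches next-item prediction at inference time. In the MLM case the model may look at items on both sides of a hidden position during training.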

In Transformers4Rec, we decouple the pre-training approaches from the transformer architectures and provide a TransformerBlock module that links the config class of the transformer architecture to the masking task. Transformers4Rec also defines a transformer_registry that includes pre-defined T4RecConfig constructors, which automatically set the arguments of the related HuggingFace Transformers configuration classes.

The table below lists the architectures currently supported in Transformers4Rec and the masking tasks each one can be used with. The Registered column shows which architectures have a pre-registered T4RecConfig class.

Tip: Registering HF Transformers config classes into Transformers4Rec is a good opportunity for your first contributions to the library ;)

Model	CLM	MLM	PLM	RTD	Registered
ALBERT	❌	✅	❌	✅	✅
BERT	❌	✅	❌	✅	✅
ConvBERT	❌	✅	❌	✅	❌
DeBERTa	❌	✅	❌	✅	❌
DistilBERT	❌	✅	❌	✅	❌
GPT-2	✅	❌	❌	❌	✅
Longformer	✅	✅	❌	❌	✅
MegatronBert	❌	✅	❌	✅	❌
MPNet	❌	✅	❌	✅	❌
RoBERTa	❌	✅	❌	✅	✅
RoFormer	✅	✅	❌	✅	❌
Transformer-XL	✅	❌	❌	❌	✅
XLNet	✅	✅	✅	✅	✅
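When choosing an architecture programmatically, the table above can be encoded as a small lookup. The sketch below copies the table's values into a plain dictionary. The names `SUPPORT`, `supports`, `architectures_for`, and `is_registered` are hypothetical helpers for illustration and are not part of the Transformers4Rec API.

```python
# (supported masking tasks, pre-registered T4RecConfig?) per architecture,
# transcribed from the table above.
SUPPORT = {
    "ALBERT": ({"MLM", "RTD"}, True),
    "BERT": ({"MLM", "RTD"}, True),
    "ConvBERT": ({"MLM", "RTD"}, False),
    "DeBERTa": ({"MLM", "RTD"}, False),
    "DistilBERT": ({"MLM", "RTD"}, False),
    "GPT-2": ({"CLM"}, True),
    "Longformer": ({"CLM", "MLM"}, True),
    "MegatronBert": ({"MLM", "RTD"}, False),
    "MPNet": ({"MLM", "RTD"}, False),
    "RoBERTa": ({"MLM", "RTD"}, True),
    "RoFormer": ({"CLM", "MLM", "RTD"}, False),
    "Transformer-XL": ({"CLM"}, True),
    "XLNet": ({"CLM", "MLM", "PLM", "RTD"}, True),
}

# Case-insensitive index so "xlnet" and "XLNet" resolve to the same row.
_BY_LOWER = {name.lower(): name for name in SUPPORT}


def _row(model):
    try:
        return SUPPORT[_BY_LOWER[model.lower()]]
    except KeyError:
        raise ValueError(f"Unknown architecture: {model!r}") from None


def supports(model, task):
    """Return True if the table marks `task` as available for `model`."""
    return task.upper() in _row(model)[0]


def is_registered(model):
    """Return True if the table lists a pre-registered config for `model`."""
    return _row(model)[1]


def architectures_for(task):
    """Return the sorted names of all architectures that support `task`."""
    return sorted(name for name, (tasks, _) in SUPPORT.items() if task.upper() in tasks)


if __name__ == "__main__":
    print(architectures_for("CLM"))
    print(supports("GPT-2", "MLM"), is_registered("GPT-2"))
```

A lookup like this has to be updated by hand whenever the table changes, so treat it as a convenience rather than a source of truth.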

Note: The following HF architectures will be supported in a future release: Reformer, Funnel Transformer, and ELECTRA.

Other Resources

NVIDIA Merlin engineering blog
NVIDIA Developer blog
- Post series on how to build winning RecSys
- Using Neural Networks for Your Recommender System