Additional Resources
Transformers4Rec and Session-based recommendation
Transformers4Rec: Bridging the Gap between NLP and Sequential / Session-Based Recommendation - Paper presented at ACM RecSys'21, where we discuss the relationship between NLP and RecSys and introduce the Transformers4Rec library, describing its focus and core features. We also provide a comprehensive empirical analysis comparing Transformer architectures with classical session-based recommendation algorithms, which the Transformers outperform. The paper's online appendix and instructions for reproducing the experiments can be found here.
Blog post with a gentle introduction to the Transformers4Rec library
End-to-end session-based recommendation demo - Recorded demo presented at ACM RecSys'21 on end-to-end session-based recommendation using NVTabular, Transformers4Rec, and Triton
Session-based recommenders - NVIDIA Developer page about the NVIDIA Merlin solution for session-based recommendation
Competitions
SIGIR eCommerce Workshop Data Challenge 2021, organized by Coveo - The NVIDIA Merlin team was one of the winners of this competition on predicting the next interacted products for user sessions on an e-commerce site. Our solution used only Transformer architectures. Check out our post and paper.
WSDM WebTour Challenge 2021, organized by Booking.com - Competition on next-destination prediction for multi-city trips, won by NVIDIA. We leveraged a model from the Transformers4Rec library in the final ensemble. Here are our solution post and paper.
NVIDIA Merlin
Transformers4Rec is part of the NVIDIA Merlin ecosystem for Recommender Systems. Check our other libraries:
NVTabular - A feature engineering and preprocessing library for tabular data, designed to easily manipulate terabyte-scale datasets and train deep learning (DL) based recommender systems.
Triton Inference Server - Provides a cloud and edge inferencing solution optimized for both CPUs and GPUs. Transformers4Rec models can be exported and served with Triton.
HugeCTR - A GPU-accelerated recommender framework designed to distribute training across multiple GPUs and nodes and estimate Click-Through Rates (CTRs).
Supported HuggingFace architectures and pre-training approaches
Transformers4Rec supports the following four masking tasks:
Acronym | Definition |
---|---|
CLM | Causal Language Modeling |
MLM | Masked Language Modeling |
PLM | Permutation Language Modeling |
RTD | Replacement Token Detection |
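The masking task is selected on the sequential input module. Below is a minimal sketch using the library's PyTorch API; the schema path and the hyperparameter values are illustrative assumptions:

```python
import transformers4rec.torch as tr
from merlin_standard_lib import Schema

# Load the dataset schema (the file path is an illustrative assumption).
schema = Schema().from_proto_text("schema.pbtxt")

# The `masking` argument selects the pre-training approach,
# e.g. "clm", "mlm", "plm", or "rtd".
input_module = tr.TabularSequenceFeatures.from_schema(
    schema,
    max_sequence_length=20,
    d_output=64,
    masking="mlm",  # Masked Language Modeling
)
```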
In Transformers4Rec, we decouple the pre-training approaches from the transformer architectures and provide a `TransformerBlock` module that links the config class of the transformer architecture to the masking task. Transformers4Rec also defines a `transformer_registry` that includes pre-defined `T4RecConfig` constructors, which automatically set the arguments of the related HuggingFace Transformers configuration classes.
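Continuing the sketch above, a pre-registered config such as `XLNetConfig` can be built through its `T4RecConfig` constructor and linked to the input module's masking task via `TransformerBlock` (hyperparameter values are again illustrative):

```python
# Build a pre-registered transformer config; `build` sets the arguments of
# the underlying HuggingFace XLNetConfig from a few common hyperparameters.
transformer_config = tr.XLNetConfig.build(
    d_model=64, n_head=4, n_layer=2, total_seq_length=20
)

# TransformerBlock ties the architecture config to the masking task
# defined on the input module.
body = tr.SequentialBlock(
    input_module,
    tr.MLPBlock([64]),
    tr.TransformerBlock(transformer_config, masking=input_module.masking),
)
```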
The table below lists the architectures currently supported in Transformers4Rec and maps them to the masking tasks they support. It also indicates the pre-registered `T4RecConfig` classes in the Registered column.
Tip: Registering HF Transformers config classes into Transformers4Rec is a good opportunity for your first contribution to the library ;) A sketch of the registration pattern follows the table.
Model | CLM | MLM | PLM | RTD | Registered |
---|---|---|---|---|---|
ALBERT | ❌ | ✅ | ❌ | ✅ | ✅ |
BERT | ❌ | ✅ | ❌ | ✅ | ✅ |
ConvBERT | ❌ | ✅ | ❌ | ✅ | ❌ |
DeBERTa | ❌ | ✅ | ❌ | ✅ | ❌ |
DistilBERT | ❌ | ✅ | ❌ | ✅ | ❌ |
GPT-2 | ✅ | ❌ | ❌ | ❌ | ✅ |
Longformer | ✅ | ✅ | ❌ | ❌ | ✅ |
MegatronBert | ❌ | ✅ | ❌ | ✅ | ❌ |
MPNet | ❌ | ✅ | ❌ | ✅ | ❌ |
RoBERTa | ❌ | ✅ | ❌ | ✅ | ✅ |
RoFormer | ✅ | ✅ | ❌ | ✅ | ❌ |
Transformer-XL | ✅ | ❌ | ❌ | ❌ | ✅ |
XLNet | ✅ | ✅ | ✅ | ✅ | ✅ |
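As a rough illustration of the tip above, a registered config couples `T4RecConfig` with the corresponding HuggingFace config class and exposes a `build` constructor. The sketch below uses ConvBERT, which is not yet registered; the decorator name, import path, and argument mapping are assumptions based on the existing registered configs, so mirror one of them when contributing:

```python
import transformers
from transformers4rec.config.transformer import T4RecConfig, transformer_registry

# Hypothetical registration of a new HF architecture. Names and the
# argument mapping are illustrative assumptions, not the library's
# definitive pattern; check an existing registered config in the source.
@transformer_registry.register("convbert")
class ConvBertConfig(T4RecConfig, transformers.ConvBertConfig):
    @classmethod
    def build(cls, d_model, n_head, n_layer, total_seq_length=None, **kwargs):
        # Map T4Rec's common hyperparameters to the HF config's own
        # argument names.
        return cls(
            hidden_size=d_model,
            num_attention_heads=n_head,
            num_hidden_layers=n_layer,
            max_position_embeddings=total_seq_length,
            **kwargs,
        )
```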
Note: The following HF architectures will be supported in a future release: `Reformer`, `Funnel Transformer`, and `ELECTRA`.
Other Resources
NVIDIA Developer blog