Scaling to Large Datasets with Criteo

Criteo provides the largest publicly available dataset for recommender systems: 1TB of uncompressed click logs containing 4 billion examples. This example shows how to scale NVTabular by:

  • Using multiple GPUs and multiple nodes with NVTabular for ETL

  • Training a recommender system model with the NVTabular dataloader for PyTorch
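
To give a feel for the scale involved, the figures above can be turned into rough sizing arithmetic. The following is only a back-of-the-envelope sketch: it assumes "1TB" means 10^12 bytes, and the 1 GB partition size and 8 GPU workers are hypothetical values chosen for illustration, not settings taken from NVTabular or from this example.

```python
import math

# Figures quoted above; "1TB" is taken as 10**12 bytes (assumption).
DATASET_BYTES = 10**12
NUM_EXAMPLES = 4 * 10**9


def bytes_per_example(total_bytes, num_examples):
    """Average uncompressed size of one click-log example."""
    return total_bytes / num_examples


def partitions_per_worker(total_bytes, part_bytes, num_workers):
    """Split the dataset into fixed-size partitions and spread them over workers.

    Returns (total partitions, partitions each worker handles at most).
    """
    num_parts = math.ceil(total_bytes / part_bytes)
    return num_parts, math.ceil(num_parts / num_workers)


avg = bytes_per_example(DATASET_BYTES, NUM_EXAMPLES)

# Hypothetical setup: 1 GB partitions processed by 8 GPU workers.
num_parts, per_worker = partitions_per_worker(DATASET_BYTES, 10**9, 8)

print(f"average bytes per example: {avg:.0f}")
print(f"partitions: {num_parts}, per worker: {per_worker}")
```

Under these assumptions each example averages a few hundred bytes, and the work divides into many independent partitions. That independence is what makes spreading ETL across multiple GPUs and nodes practical.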