DLRM using SparseOperationKit

This tutorial demonstrates how to build a DLRM model with SparseOperationKit (SOK).

You can find the source code in sparse_operation_kit/documents/tutorials/DLRM/.


Generate datasets

This tutorial uses the Criteo Terabyte dataset. Download the files first, then generate the datasets in one of the following ways.


  1. Follow TensorFlow’s instructions to process these files and save them as CSV files.

  2. Follow HugeCTR’s instructions to process these files, then convert the generated binary files to CSV files:

$ python3 bin2csv.py \
    --input_file="YourBinaryFilePath/train.bin" \
    --num_output_files=1024 \
    --output_path="./train/"
$ python3 bin2csv.py \
    --input_file="YourBinaryFilePath/test.bin" \
    --num_output_files=64 \
    --output_path="./test/"
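
Before training, it can help to confirm that the conversion produced the CSV shards the run commands expect. The following is a minimal sketch, not part of the tutorial sources: the helper names `list_shards` and `check_dataset` are hypothetical, and the default patterns assume the output paths `./train/` and `./test/` used above.

```python
import glob
import os


def list_shards(pattern):
    """Return the sorted list of files matching a glob pattern."""
    return sorted(glob.glob(pattern))


def check_dataset(train_pattern="./train/*.csv", test_pattern="./test/*.csv"):
    """Count the CSV shards and report any that are empty (hypothetical helper)."""
    train_files = list_shards(train_pattern)
    test_files = list_shards(test_pattern)
    # An empty shard usually means the conversion was interrupted.
    empty = [f for f in train_files + test_files if os.path.getsize(f) == 0]
    return {"train": len(train_files), "test": len(test_files), "empty": empty}


if __name__ == "__main__":
    print(check_dataset())
```

With the commands above, the expected counts would be 1024 training shards and 64 test shards, matching the `--num_output_files` values.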

Set common params

$ export EMBEDDING_DIM=32
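
The run commands in this tutorial pass `$EMBEDDING_DIM` both as `--embedding_vec_size` and as the last width of `--bottom_stack`. The reason is DLRM's feature interaction: the bottom MLP output is combined with every embedding vector through pairwise dot products, so all of these vectors must have the same length. The sketch below illustrates that interaction in plain Python. It follows the description in the DLRM paper and is not the code in `main.py`; the function name `dot_interaction` is hypothetical, and the count of 26 categorical features is the one in the Criteo data.

```python
import random


def dot_interaction(dense_vec, embedding_vecs):
    """Return the dot product of every distinct pair of feature vectors.

    dense_vec is the bottom-MLP output; embedding_vecs holds one vector
    per categorical feature. All vectors must share one dimension.
    """
    features = [dense_vec] + list(embedding_vecs)
    dim = len(dense_vec)
    for vec in features:
        if len(vec) != dim:
            raise ValueError(
                "dimension mismatch: expected %d, got %d" % (dim, len(vec))
            )
    interactions = []
    for i in range(len(features)):
        for j in range(i):  # each unordered pair once, no self-products
            interactions.append(
                sum(a * b for a, b in zip(features[i], features[j]))
            )
    return interactions


EMBEDDING_DIM = 32
dense = [random.random() for _ in range(EMBEDDING_DIM)]
embeddings = [[random.random() for _ in range(EMBEDDING_DIM)] for _ in range(26)]
print(len(dot_interaction(dense, embeddings)))  # 27 * 26 / 2 = 351 pairs
```

If the bottom stack ended in a width other than the embedding size, the dimension check in this sketch would fail, which is why the two values are set from one variable.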

Run DLRM with TensorFlow

$ mpiexec --allow-run-as-root -np 4 \
    python3 main.py \
        --global_batch_size=16384 \
        --train_file_pattern="./train/*.csv" \
        --test_file_pattern="./test/*.csv" \
        --embedding_layer="TF" \
        --embedding_vec_size=$EMBEDDING_DIM \
        --bottom_stack 512 256 $EMBEDDING_DIM \
        --top_stack 1024 1024 512 256 1 \
        --distribute_strategy="multiworker"

Run DLRM with SOK

$ mpiexec --allow-run-as-root -np 4 \
    python3 main.py \
        --global_batch_size=16384 \
        --train_file_pattern="./train/*.csv" \
        --test_file_pattern="./test/*.csv" \
        --embedding_layer="SOK" \
        --embedding_vec_size=$EMBEDDING_DIM \
        --bottom_stack 512 256 $EMBEDDING_DIM \
        --top_stack 1024 1024 512 256 1
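
Both run commands launch 4 processes with `--global_batch_size=16384`. Assuming the global batch is divided evenly across the processes, as is usual for data-parallel training (the tutorial text does not state how `main.py` splits it), the per-process batch can be worked out as in this small sketch. The helper name `per_worker_batch_size` is hypothetical.

```python
def per_worker_batch_size(global_batch_size, num_workers):
    """Split a global batch evenly across workers (assumed behaviour)."""
    if num_workers <= 0:
        raise ValueError("num_workers must be positive")
    if global_batch_size % num_workers != 0:
        raise ValueError("global batch size must be divisible by num_workers")
    return global_batch_size // num_workers


print(per_worker_batch_size(16384, 4))  # 4096 samples per process
```

When changing `-np`, keeping the global batch size divisible by the number of processes avoids uneven shards under this assumption.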


References

  1. DLRM (https://arxiv.org/pdf/1906.00091.pdf)

  2. Criteo TeraBytes Datasets (https://labs.criteo.com/2013/12/download-terabyte-click-logs/)

  3. TensorFlow DLRM model (https://github.com/tensorflow/models/tree/master/official/recommendation/ranking)