Hierarchical Parameter Server

Hierarchical Parameter Server (HPS) is a recommender system parameter server for embedding lookup that is part of NVIDIA HugeCTR.

HPS offers flexible deployment and configuration options to meet site-specific recommender system needs. It consists of the following components:

HPS Database Backend: Provides a three-level storage architecture. The first and highest-performing level is GPU memory, followed by CPU memory as the second level. The third level can be high-speed local SSDs, with or without a distributed database. The key benefit of the HPS database backend is that it can serve embedding tables that exceed the combined capacity of GPU and CPU memory while keeping frequently used embeddings in the fastest available level.
HPS plugin for TensorFlow: Provides high-performance, scalable, low-latency access to embedding tables for TensorFlow deep learning models that have large embedding tables.
HPS Backend for Triton Inference Server: An inference deployment framework that integrates HPS with Triton Inference Server for end-to-end inference. Documentation for the backend is available from the hugectr_backend repository.
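
To make the three-level storage idea concrete, the following is a minimal Python sketch of a tiered embedding lookup. It is not the HPS API: the class name TieredEmbeddingStore, its fields, and the promote-on-hit policy are hypothetical, and plain dictionaries stand in for GPU memory, CPU memory, and the SSD or distributed database level. It only illustrates the general pattern of checking the fastest level first and falling back to slower ones.

```python
from typing import Dict, List

Vector = List[float]


class TieredEmbeddingStore:
    """Toy three-level lookup. Hypothetical illustration, not the HPS API."""

    def __init__(self, persistent: Dict[int, Vector], default: Vector):
        self.gpu_cache: Dict[int, Vector] = {}   # stand-in for level 1 (GPU memory)
        self.cpu_cache: Dict[int, Vector] = {}   # stand-in for level 2 (CPU memory)
        self.persistent = persistent             # stand-in for level 3 (SSD / database)
        self.default = default                   # returned when a key is unknown
        self.hits = {"gpu": 0, "cpu": 0, "persistent": 0, "miss": 0}

    def lookup(self, key: int) -> Vector:
        # Level 1: fastest, checked first.
        if key in self.gpu_cache:
            self.hits["gpu"] += 1
            return self.gpu_cache[key]
        # Level 2: on a hit, promote the vector to level 1.
        if key in self.cpu_cache:
            self.hits["cpu"] += 1
            vec = self.cpu_cache[key]
            self.gpu_cache[key] = vec
            return vec
        # Level 3: on a hit, promote the vector to levels 2 and 1.
        if key in self.persistent:
            self.hits["persistent"] += 1
            vec = self.persistent[key]
            self.cpu_cache[key] = vec
            self.gpu_cache[key] = vec
            return vec
        # Unknown key: fall back to the default vector.
        self.hits["miss"] += 1
        return self.default


store = TieredEmbeddingStore({7: [0.1, 0.2]}, default=[0.0, 0.0])
store.lookup(7)   # first access is served from the third level and promoted
store.lookup(7)   # second access is served from the first level
print(store.hits)
```

The sketch leaves out capacity limits and eviction, which any real cache needs because the upper levels are smaller than the tables they front. It also looks up one key at a time, whereas a production system would typically process keys in batches.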