Hierarchical Parameter Server

Hierarchical Parameter Server (HPS) is a recommender system parameter server for embedding lookup that is part of NVIDIA HugeCTR.

HPS offers flexible deployment and configuration options to meet site-specific recommender system needs.

HPS Database Backend

Provides a three-level storage architecture. The first and highest-performing level is GPU memory, followed by CPU memory. The third level can be high-speed local SSDs, with or without a distributed database. The key benefit of the HPS database backend is serving embedding tables that exceed GPU and CPU memory capacity while providing the highest possible lookup performance.
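To make the tiered design concrete, here is a minimal plain-Python sketch of a three-level lookup that mirrors the hierarchy described above (GPU memory, then CPU memory, then SSD/database). All class and method names are hypothetical illustrations, not the HugeCTR API, and the eviction policy is deliberately naive.

```python
class TieredEmbeddingStore:
    """Illustrative sketch of a three-tier embedding lookup (not the HPS API)."""

    def __init__(self, gpu_capacity, cpu_capacity, disk_store):
        self.gpu = {}            # tier 1: fastest, smallest (stands in for the GPU cache)
        self.cpu = {}            # tier 2: larger, slower
        self.disk = disk_store   # tier 3: the complete table (SSD / database)
        self.gpu_capacity = gpu_capacity
        self.cpu_capacity = cpu_capacity

    def lookup(self, key):
        # Search tiers in order of speed; promote hits toward the GPU tier
        # so frequently accessed embeddings migrate into faster memory.
        if key in self.gpu:
            return self.gpu[key]
        if key in self.cpu:
            vec = self.cpu[key]
        else:
            vec = self.disk[key]  # authoritative copy; always complete
            self._admit(self.cpu, key, vec, self.cpu_capacity)
        self._admit(self.gpu, key, vec, self.gpu_capacity)
        return vec

    @staticmethod
    def _admit(cache, key, vec, capacity):
        # Naive eviction: drop an arbitrary entry when the tier is full.
        # (A real parameter server uses proper cache-replacement policies.)
        if len(cache) >= capacity:
            cache.pop(next(iter(cache)))
        cache[key] = vec
```

The sketch shows why the hierarchy can serve tables larger than GPU and CPU memory combined: only the third tier must hold the full table, while the upper tiers cache the hot subset.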

HPS plugin for TensorFlow

Provides scalable, low-latency, high-performance access to embedding tables for TensorFlow deep learning models with large embedding tables.

HPS Backend for Triton Inference Server

The backend for Triton Inference Server is an inference deployment framework that integrates HPS for end-to-end inference on Triton. Documentation for the backend is available from the hugectr_backend repository.
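Deploying a model on Triton is driven by a per-model config.pbtxt. The fragment below is only a sketch using standard Triton configuration fields; the model name, backend identifier, tensor names, and shapes are assumptions for illustration and should be taken from the hugectr_backend documentation, not from here.

```
# Illustrative config.pbtxt sketch; all names and values are assumptions.
name: "hps_demo"          # hypothetical model name
backend: "hps"            # assumed backend identifier; verify in hugectr_backend docs
max_batch_size: 64
input [
  { name: "KEYS", data_type: TYPE_INT64, dims: [ -1 ] }    # embedding keys (assumed)
]
output [
  { name: "OUTPUT0", data_type: TYPE_FP32, dims: [ -1 ] }  # embedding vectors (assumed)
]
instance_group [ { count: 1, kind: KIND_GPU } ]
```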