Template Class HashTable
Defined in File merlin_hashtable.cuh
Class Documentation
-
template<typename
K
, typenameV
, typenameS
= uint64_t, intStrategy
= EvictStrategy::kLru, typenameArchTag
= Sm80>
classnv::merlin
::
HashTable
: public nv::merlin::HashTableBase<K, V, uint64_t> A HierarchicalKV hash table is a concurrent and hierarchical hash table that is powered by GPUs and can use HBM and host memory as storage for key-value pairs. Support for SSD storage is a future consideration.
The
score
is introduced to define the importance of each key, the larger, the more important, the less likely they will be evicted. Eviction occurs automatically when a bucket is full. The keys with the minimumscore
value are evicted first. In a customized eviction strategy, we recommend using the timestamp or frequency of the key occurrence as thescore
value for each key. You can also assign a special value to thescore
to perform a customized eviction strategy.Note
By default configuration, this class is thread-safe.
- Template Parameters
K – The data type of the key.
V – The data type of the vector’s item type. The item data type should be a basic data type of C++/CUDA.
S – The data type for
score
. The currently supported data type is onlyuint64_t
.
Public Types
-
using
size_type
= size_t
-
using
Pred
= EraseIfPredict<key_type, score_type>
-
using
allocator_type
= BaseAllocator
Public Functions
-
inline
HashTable
() Default constructor for the hash table class.
-
inline
~HashTable
() Frees the resources used by the hash table and destroys the hash table object.
-
inline void
init
(const HashTableOptions &options, allocator_type *allocator = nullptr) Initialize a merlin::HashTable.
- Parameters
options – The configuration options.
-
inline void
insert_or_assign
(const size_type n, const key_type *keys, const value_type *values, const score_type *scores = nullptr, cudaStream_t stream = 0, bool unique_key = true, bool ignore_evict_strategy = false) Insert new key-value-score tuples into the hash table. If the key already exists, the values and scores are assigned new values.
If the target bucket is full, the keys with minimum score will be overwritten by new key unless the score of the new key is even less than minimum score of the target bucket.
- Parameters
n – Number of key-value-score tuples to insert or assign.
keys – The keys to insert on GPU-accessible memory with shape (n).
values – The values to insert on GPU-accessible memory with shape (n, DIM).
scores – The scores to insert on GPU-accessible memory with shape (n).
The scores should be a
uint64_t
value. You can specify a value that such as the timestamp of the key insertion, number of the key occurrences, or another value to perform a custom eviction strategy.The
scores
should benullptr
, when the LRU eviction strategy is applied.stream – The CUDA stream that is used to execute the operation.
unique_key – If all keys in the same batch are unique.
ignore_evict_strategy – A boolean option indicating whether if the insert_or_assign ignores the evict strategy of table with current scores anyway. If true, it does not check whether the scores conforms to the evict strategy. If false, it requires the scores follow the evict strategy of table.
-
inline void
insert_and_evict
(const size_type n, const key_type *keys, const value_type *values, const score_type *scores, key_type *evicted_keys, value_type *evicted_values, score_type *evicted_scores, size_type *d_evicted_counter, cudaStream_t stream = 0, bool unique_key = true, bool ignore_evict_strategy = false) Insert new key-value-score tuples into the hash table. If the key already exists, the values and scores are assigned new values.
If the target bucket is full, the keys with minimum score will be overwritten by new key unless the score of the new key is even less than minimum score of the target bucket. The overwritten key with minimum score will be evicted, with its values and score, to evicted_keys, evicted_values, evcted_scores seperately in compact format.
- Parameters
n – Number of key-value-score tuples to insert or assign.
keys – The keys to insert on GPU-accessible memory with shape (n).
values – The values to insert on GPU-accessible memory with shape (n, DIM).
scores – The scores to insert on GPU-accessible memory with shape (n). @params evicted_keys The output of keys replaced with minimum score. @params evicted_values The output of values replaced with minimum score on keys. @params evicted_scores The output of scores replaced with minimum score on keys.
The scores should be a
uint64_t
value. You can specify a value that such as the timestamp of the key insertion, number of the key occurrences, or another value to perform a custom eviction strategy.The
scores
should benullptr
, when the LRU eviction strategy is applied.d_evicted_counter – The number of elements evicted on GPU-accessible memory. @notice The caller should guarantee it is set to
0
before calling.stream – The CUDA stream that is used to execute the operation.
unique_key – If all keys in the same batch are unique.
ignore_evict_strategy – A boolean option indicating whether if the insert_or_assign ignores the evict strategy of table with current scores anyway. If true, it does not check whether the scores confroms to the evict strategy. If false, it requires the scores follow the evict strategy of table.
-
inline size_type
insert_and_evict
(const size_type n, const key_type *keys, const value_type *values, const score_type *scores, key_type *evicted_keys, value_type *evicted_values, score_type *evicted_scores, cudaStream_t stream = 0, bool unique_key = true, bool ignore_evict_strategy = false) Insert new key-value-score tuples into the hash table. If the key already exists, the values and scores are assigned new values.
If the target bucket is full, the keys with minimum score will be overwritten by new key unless the score of the new key is even less than minimum score of the target bucket. The overwritten key with minimum score will be evicted, with its values and score, to evicted_keys, evicted_values, evcted_scores seperately in compact format.
- Parameters
n – Number of key-value-score tuples to insert or assign.
keys – The keys to insert on GPU-accessible memory with shape (n).
values – The values to insert on GPU-accessible memory with shape (n, DIM).
scores – The scores to insert on GPU-accessible memory with shape (n). @params evicted_keys The output of keys replaced with minimum score. @params evicted_values The output of values replaced with minimum score on keys. @params evicted_scores The output of scores replaced with minimum score on keys.
The scores should be a
uint64_t
value. You can specify a value that such as the timestamp of the key insertion, number of the key occurrences, or another value to perform a custom eviction strategy.The
scores
should benullptr
, when the LRU eviction strategy is applied.stream – The CUDA stream that is used to execute the operation.
unique_key – If all keys in the same batch are unique.
ignore_evict_strategy – A boolean option indicating whether if the insert_or_assign ignores the evict strategy of table with current scores anyway. If true, it does not check whether the scores confroms to the evict strategy. If false, it requires the scores follow the evict strategy of table.
- Returns
The number of elements evicted.
-
inline void
accum_or_assign
(const size_type n, const key_type *keys, const value_type *value_or_deltas, const bool *accum_or_assigns, const score_type *scores = nullptr, cudaStream_t stream = 0, bool ignore_evict_strategy = false) Searches for each key in
keys
in the hash table. If the key is found and the corresponding value inaccum_or_assigns
istrue
, thevectors_or_deltas
is treated as a delta to the old value, and the delta is added to the old value of the key.If the key is not found and the corresponding value in
accum_or_assigns
isfalse
, thevectors_or_deltas
is treated as a new value and the key-value pair is updated in the table directly.Note
When the key is found and the value of
accum_or_assigns
isfalse
, or when the key is not found and the value ofaccum_or_assigns
istrue
, nothing is changed and this operation is ignored. The algorithm assumes these situations occur while the key was modified or removed by other processes just now.- Parameters
n – The number of key-value-score tuples to process.
keys – The keys to insert on GPU-accessible memory with shape (n).
value_or_deltas – The values or deltas to insert on GPU-accessible memory with shape (n, DIM).
accum_or_assigns – The operation type with shape (n). A value of
true
indicates to accum andfalse
indicates to assign.scores – The scores to insert on GPU-accessible memory with shape (n).
The scores should be a
uint64_t
value. You can specify a value that such as the timestamp of the key insertion, number of the key occurrences, or another value to perform a custom eviction strategy.The
scores
should benullptr
, when the LRU eviction strategy is applied.stream – The CUDA stream that is used to execute the operation.
ignore_evict_strategy – A boolean option indicating whether if the accum_or_assign ignores the evict strategy of table with current scores anyway. If true, it does not check whether the scores confroms to the evict strategy. If false, it requires the scores follow the evict strategy of table.
-
inline void
find_or_insert
(const size_type n, const key_type *keys, value_type *values, score_type *scores = nullptr, cudaStream_t stream = 0, bool unique_key = true, bool ignore_evict_strategy = false) Searches the hash table for the specified keys. When a key is missing, the value in
values
andscores
will be inserted.- Parameters
n – The number of key-value-score tuples to search or insert.
keys – The keys to search on GPU-accessible memory with shape (n).
values – The values to search on GPU-accessible memory with shape (n, DIM).
scores – The scores to search on GPU-accessible memory with shape (n).
If
scores
isnullptr
, the score for each key will not be returned.stream – The CUDA stream that is used to execute the operation.
unique_key – If all keys in the same batch are unique.
-
inline void
find_or_insert
(const size_type n, const key_type *keys, value_type **values, bool *founds, score_type *scores = nullptr, cudaStream_t stream = 0, bool unique_key = true, bool ignore_evict_strategy = false) Searches the hash table for the specified keys and returns address of the values. When a key is missing, the value in
values
andscores
will be inserted.Warning
This API returns internal addresses for high-performance but thread-unsafe. The caller is responsible for guaranteeing data consistency.
- Parameters
n – The number of key-value-score tuples to search or insert.
keys – The keys to search on GPU-accessible memory with shape (n).
values – The addresses of values to search on GPU-accessible memory with shape (n).
founds – The status that indicates if the keys are found on
scores – The scores to search on GPU-accessible memory with shape (n).
If
scores
isnullptr
, the score for each key will not be returned.stream – The CUDA stream that is used to execute the operation.
unique_key – If all keys in the same batch are unique.
-
inline void
assign
(const size_type n, const key_type *keys, const value_type *values, const score_type *scores = nullptr, cudaStream_t stream = 0, bool unique_key = true) Assign new key-value-score tuples into the hash table. If the key doesn’t exist, the operation on the key will be ignored.
- Parameters
n – Number of key-value-score tuples to insert or assign.
keys – The keys to insert on GPU-accessible memory with shape (n).
values – The values to insert on GPU-accessible memory with shape (n, DIM).
scores – The scores to insert on GPU-accessible memory with shape (n).
The scores should be a
uint64_t
value. You can specify a value that such as the timestamp of the key insertion, number of the key occurrences, or another value to perform a custom eviction strategy.The
scores
should benullptr
, when the LRU eviction strategy is applied.stream – The CUDA stream that is used to execute the operation.
unique_key – If all keys in the same batch are unique.
-
inline void
assign_scores
(const size_type n, const key_type *keys, const score_type *scores = nullptr, cudaStream_t stream = 0, bool unique_key = true) Assign new scores for keys. If the key doesn’t exist, the operation on the key will be ignored.
- Parameters
n – Number of key-score pairs to assign.
keys – The keys to insert on GPU-accessible memory with shape (n).
The scores should be a
uint64_t
value. You can specify a value that such as the timestamp of the key insertion, number of the key occurrences, or another value to perform a custom eviction strategy.The
scores
should benullptr
, when the LRU eviction strategy is applied.stream – The CUDA stream that is used to execute the operation.
unique_key – If all keys in the same batch are unique.
-
inline void
assign
(const size_type n, const key_type *keys, const score_type *scores = nullptr, cudaStream_t stream = 0, bool unique_key = true) Alias of
assign_scores
.
-
inline void
assign_values
(const size_type n, const key_type *keys, const value_type *values, cudaStream_t stream = 0, bool unique_key = true) Assign new values for each keys . If the key doesn’t exist, the operation on the key will be ignored.
- Parameters
n – Number of key-value pairs to assign.
keys – The keys need to be operated, which must be on GPU-accessible memory with shape (n).
values – The values need to be updated, which must be on GPU-accessible memory with shape (n, DIM).
stream – The CUDA stream that is used to execute the operation.
unique_key – If all keys in the same batch are unique.
-
inline void
find
(const size_type n, const key_type *keys, value_type *values, bool *founds, score_type *scores = nullptr, cudaStream_t stream = 0) const Searches the hash table for the specified keys.
Note
When a key is missing, the value in
values
is not changed.- Parameters
n – The number of key-value-score tuples to search.
keys – The keys to search on GPU-accessible memory with shape (n).
values – The values to search on GPU-accessible memory with shape (n, DIM).
founds – The status that indicates if the keys are found on GPU-accessible memory with shape (n).
scores – The scores to search on GPU-accessible memory with shape (n).
If
scores
isnullptr
, the score for each key will not be returned.stream – The CUDA stream that is used to execute the operation.
-
inline void
find
(const size_type n, const key_type *keys, value_type *values, key_type *missed_keys, int *missed_indices, int *missed_size, score_type *scores = nullptr, cudaStream_t stream = 0) const Searches the hash table for the specified keys.
Note
When the searched keys are not hit, missed keys/indices/size can be obtained.
- Parameters
n – The number of key-value-score tuples to search.
keys – The keys to search on GPU-accessible memory with shape (n).
values – The values to search on GPU-accessible memory with shape (n, DIM).
missed_keys – The missed keys to search on GPU-accessible memory with shape (n).
missed_indices – The missed indices to search on GPU-accessible memory with shape (n).
missed_size – The size of
missed_keys
andmissed_indices
.scores – The scores to search on GPU-accessible memory with shape (n).
If
scores
isnullptr
, the score for each key will not be returned.stream – The CUDA stream that is used to execute the operation.
-
inline void
find
(const size_type n, const key_type *keys, value_type **values, bool *founds, score_type *scores = nullptr, cudaStream_t stream = 0, bool unique_key = true) const Searches the hash table for the specified keys and returns address of the values.
Note
When a key is missing, the data in
values
won’t change.Warning
This API returns internal addresses for high-performance but thread-unsafe. The caller is responsible for guaranteeing data consistency.
- Parameters
n – The number of key-value-score tuples to search.
keys – The keys to search on GPU-accessible memory with shape (n).
values – The addresses of values to search on GPU-accessible memory with shape (n).
founds – The status that indicates if the keys are found on GPU-accessible memory with shape (n).
scores – The scores to search on GPU-accessible memory with shape (n).
If
scores
isnullptr
, the score for each key will not be returned.stream – The CUDA stream that is used to execute the operation.
unique_key – If all keys in the same batch are unique.
-
inline void
contains
(const size_type n, const key_type *keys, bool *founds, cudaStream_t stream = 0) const Checks if there are elements with key equivalent to
keys
in the table.- Parameters
n – The number of
keys
to check.keys – The keys to search on GPU-accessible memory with shape (n).
founds – The result that indicates if the keys are found, and should be allocated by caller on GPU-accessible memory with shape (n).
stream – The CUDA stream that is used to execute the operation.
-
inline void
erase
(const size_type n, const key_type *keys, cudaStream_t stream = 0) Removes specified elements from the hash table.
- Parameters
n – The number of keys to remove.
keys – The keys to remove on GPU-accessible memory.
stream – The CUDA stream that is used to execute the operation.
-
template<template<typename, typename> class
PredFunctor
>
inline size_typeerase_if
(const key_type &pattern, const score_type &threshold, cudaStream_t stream = 0) Erases all elements that satisfy the predicate
pred
from the hash table.template <class K, class S> struct EraseIfPredFunctor { __forceinline__ __device__ bool operator()(const K& key, S& score, const K& pattern, const S& threshold) { return ((key & 0x1 == pattern) && (score < threshold)); } };
- Template Parameters
PredFunctor – The predicate template <typename K, typename S> function with operator signature (bool*)(const K&, const S&, const K&, const threshold) that returns
true
if the element should be erased. The value forpred
should be a function with typePred
defined like the following example:- Parameters
pattern – The third user-defined argument to
pred
with key_type type.threshold – The fourth user-defined argument to
pred
with score_type type.stream – The CUDA stream that is used to execute the operation.
- Returns
The number of elements removed.
-
inline void
clear
(cudaStream_t stream = 0) Removes all of the elements in the hash table with no release object.
-
inline void
export_batch
(size_type n, const size_type offset, size_type *d_counter, key_type *keys, value_type *values, score_type *scores = nullptr, cudaStream_t stream = 0) const Exports a certain number of the key-value-score tuples from the hash table.
- Parameters
n – The maximum number of exported pairs.
offset – The position of the key to search.
d_counter – Accumulates amount of successfully exported values.
keys – The keys to dump from GPU-accessible memory with shape (n).
values – The values to dump from GPU-accessible memory with shape (n, DIM).
scores – The scores to search on GPU-accessible memory with shape (n).
If
scores
isnullptr
, the score for each key will not be returned.stream – The CUDA stream that is used to execute the operation.
- Throws
CudaException – If the key-value size is too large for GPU shared memory. Reducing the value for
n
is currently required if this exception occurs.- Returns
The number of elements dumped.
-
inline size_type
export_batch
(const size_type n, const size_type offset, key_type *keys, value_type *values, score_type *scores = nullptr, cudaStream_t stream = 0) const
-
template<template<typename, typename> class
PredFunctor
>
inline voidexport_batch_if
(const key_type &pattern, const score_type &threshold, size_type n, const size_type offset, size_type *d_counter, key_type *keys, value_type *values, score_type *scores = nullptr, cudaStream_t stream = 0) const Exports a certain number of the key-value-score tuples which match.
template <class K, class S> struct ExportIfPredFunctor { __forceinline__ __device__ bool operator()(const K& key, S& score, const K& pattern, const S& threshold) { return score >= threshold; } };
- Template Parameters
PredFunctor – A functor with template <K, S> defined an operator with signature: device (bool*)(const K&, S&, const K&, const S&). specified condition from the hash table.
- Parameters
n – The maximum number of exported pairs. The value for
pred
should be a function with typePred
defined like the following example:pattern – The third user-defined argument to
pred
with key_type type.threshold – The fourth user-defined argument to
pred
with score_type type.offset – The position of the key to remove.
keys – The keys to dump from GPU-accessible memory with shape (n).
values – The values to dump from GPU-accessible memory with shape (n, DIM).
scores – The scores to search on GPU-accessible memory with shape (n).
If
scores
isnullptr
, the score for each key will not be returned.stream – The CUDA stream that is used to execute the operation.
- Throws
CudaException – If the key-value size is too large for GPU shared memory. Reducing the value for
n
is currently required if this exception occurs.- Returns
The number of elements dumped.
-
inline bool
empty
(cudaStream_t stream = 0) const Indicates if the hash table has no elements.
- Parameters
stream – The CUDA stream that is used to execute the operation.
- Returns
true
if the table is empty andfalse
otherwise.
-
inline size_type
size
(cudaStream_t stream = 0) const Returns the hash table size.
- Parameters
stream – The CUDA stream that is used to execute the operation.
- Returns
The table size.
-
inline size_type
capacity
() const Returns the hash table capacity.
Note
The value that is returned might be less than the actual capacity of the hash table because the hash table currently keeps the capacity to be a power of 2 for performance considerations.
- Returns
The table capacity.
-
inline void
reserve
(const size_type new_capacity, cudaStream_t stream = 0) Sets the number of buckets to the number that is needed to accommodate at least
new_capacity
elements without exceeding the maximum load factor. This method rehashes the hash table. Rehashing puts the elements into the appropriate buckets considering that total number of buckets has changed.Note
If the value of
new_capacity
or double ofnew_capacity
is greater or equal thanoptions_.max_capacity
, the reserve does not perform any change to the hash table.- Parameters
new_capacity – The requested capacity for the hash table.
stream – The CUDA stream that is used to execute the operation.
-
inline float
load_factor
(cudaStream_t stream = 0) const Returns the average number of elements per slot, that is, size() divided by capacity().
- Parameters
stream – The CUDA stream that is used to execute the operation.
- Returns
The load factor
-
inline void
set_max_capacity
(size_type new_max_capacity) Set max_capacity of the table.
- Parameters
new_max_capacity – The new expecting max_capacity. It must be power of 2. Otherwise it will raise an error.
-
inline size_type
dim
() const noexcept Returns the dimension of the vectors.
- Returns
The dimension of the vectors.
-
inline size_type
max_bucket_size
() const noexcept Returns The length of each bucket.
- Returns
The length of each bucket.
-
inline size_type
bucket_count
() const noexcept Returns the number of buckets in the table.
- Returns
The number of buckets in the table.
-
inline size_type
save
(BaseKVFile<K, V, S> *file, const size_t max_workspace_size = 1L * 1024 * 1024, cudaStream_t stream = 0) const Save keys, vectors, scores in table to file or files.
- Parameters
file – A BaseKVFile object defined the file format on host filesystem.
max_workspace_size – Saving is conducted in chunks. This value denotes the maximum amount of temporary memory to use when dumping the table. Larger values can lead to higher performance.
stream – The CUDA stream used to execute the operation.
- Returns
Number of KV pairs saved to file.
-
inline size_type
load
(BaseKVFile<K, V, S> *file, const size_t max_workspace_size = 1L * 1024 * 1024, cudaStream_t stream = 0) Load keys, vectors, scores from file to table.
- Parameters
file – An BaseKVFile defined the file format within filesystem.
max_workspace_size – Loading is conducted in chunks. This value denotes the maximum size of such chunks. Larger values can lead to higher performance.
stream – The CUDA stream used to execute the operation.
- Returns
Number of keys loaded from file.
-
inline void
set_global_epoch
(const uint64_t epoch)