merlin.models.utils.schema_utils.get_embedding_sizes_from_schema

merlin.models.utils.schema_utils.get_embedding_sizes_from_schema(schema: merlin.schema.schema.Schema, multiplier: float = 2.0, ensure_multiple_of_8: bool = False) → Dict[str, int]

Provides a heuristic (from Google) that suggests embedding sizes as a function (the fourth root) of the cardinalities of the categorical features, which are obtained from the schema.

Parameters

schema (Schema) – Features schema
multiplier (float, optional) – Multiplier applied to the fourth root of the cardinality. Google recommends a multiplier in the [2.0, 10.0] range. By default 2.0
ensure_multiple_of_8 (bool, optional) – If enabled, rounds the embedding dimension up to a multiple of 8, for best performance with GPU ops. By default False

Returns

A dict mapping each feature name to its suggested embedding size, based on the feature cardinalities obtained from the schema

Return type

Dict[str, int]
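
Example

The heuristic described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the helper name suggest_embedding_sizes is hypothetical, it takes a plain dict of cardinalities rather than a Schema, and rounding the result up to an integer is an assumption that the description above does not confirm.

```python
import math
from typing import Dict


def suggest_embedding_sizes(
    cardinalities: Dict[str, int],
    multiplier: float = 2.0,
    ensure_multiple_of_8: bool = False,
) -> Dict[str, int]:
    """Hypothetical stand-in: size = multiplier * cardinality ** 0.25."""
    sizes = {}
    for name, cardinality in cardinalities.items():
        # Assumption: the fractional result is rounded up to an integer.
        size = int(math.ceil(multiplier * cardinality ** 0.25))
        if ensure_multiple_of_8:
            # Round up to a multiple of 8 (a value already divisible by 8 is kept).
            size = int(math.ceil(size / 8) * 8)
        sizes[name] = size
    return sizes


# Illustrative cardinalities, not taken from any real dataset.
example = {"item_id": 1000, "country": 50}
print(suggest_embedding_sizes(example))
# {'item_id': 12, 'country': 6}
print(suggest_embedding_sizes(example, ensure_multiple_of_8=True))
# {'item_id': 16, 'country': 8}
```

Because the size grows with the fourth root, a 20-fold increase in cardinality (50 to 1000) only doubles the suggested size in this sketch. Raising multiplier toward 10.0 scales every size proportionally.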