TensorFlow Dataloader

TensorFlow Layers

class nvtabular.framework_utils.tensorflow.layers.embedding.DenseFeatures(*args, **kwargs)[source]

Bases: keras.engine.base_layer.Layer

Layer which maps a dictionary of input tensors to a dense, continuous vector digestible by a neural network. Meant to reproduce the API exposed by tf.keras.layers.DenseFeatures while reducing overhead for the case of one-hot categorical and scalar numeric features.

Uses TensorFlow feature_column to represent inputs to the layer, but does not perform any preprocessing associated with those columns. As such, it should only be passed numeric_column objects and their subclasses, embedding_column and indicator_column. Preprocessing functionality should be moved to NVTabular.

For multi-hot categorical or vector continuous data, represent the data for a feature with a dictionary entry “<feature_name>__values” corresponding to the flattened array of all values in the batch. For multi-hot categorical data, there should be a corresponding “<feature_name>__nnzs” entry that describes how many categories are present in each sample (and so has length batch_size).
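The flattened layout described above can be sketched in plain Python. This is a minimal illustration, not NVTabular code: the helper name flatten_multi_hot and the feature name "genres" are hypothetical, and plain lists stand in for the tensors the layer would actually receive.

```python
def flatten_multi_hot(name, samples):
    """Flatten per-sample category lists into the "__values"/"__nnzs" layout."""
    # All category ids in the batch, concatenated sample after sample.
    values = [category for sample in samples for category in sample]
    # Number of categories present in each sample (length == batch size).
    nnzs = [len(sample) for sample in samples]
    return {f"{name}__values": values, f"{name}__nnzs": nnzs}


def unflatten_multi_hot(values, nnzs):
    """Recover the per-sample lists from the flattened representation."""
    rows, start = [], 0
    for count in nnzs:
        rows.append(values[start:start + count])
        start += count
    return rows


# Hypothetical batch of three samples for a multi-hot feature named "genres".
batch = [[3, 7], [1], [4, 5, 9]]
features = flatten_multi_hot("genres", batch)
restored = unflatten_multi_hot(features["genres__values"], features["genres__nnzs"])
```

Here the "__nnzs" entry has one element per sample, while the "__values" entry has one element per category occurrence, so the two together are enough to rebuild the original ragged batch.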

Note that categorical columns should be wrapped in embedding or indicator columns first, consistent with the API used by tf.keras.layers.DenseFeatures.

Example usage:

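import tensorflow as tf
from nvtabular.framework_utils.tensorflow.layers.embedding import DenseFeatures

# Scalar numeric feature "a" and categorical feature "b" with 100 categories.
column_a = tf.feature_column.numeric_column("a", (1,))
column_b = tf.feature_column.categorical_column_with_identity("b", 100)
# Categorical columns must be wrapped in an embedding (or indicator) column.
column_b_embedding = tf.feature_column.embedding_column(column_b, 4)

inputs = {
    "a": tf.keras.Input(name="a", shape=(1,), dtype=tf.float32),
    "b": tf.keras.Input(name="b", shape=(1,), dtype=tf.int64)
}
x = DenseFeatures([column_a, column_b_embedding])(inputs)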
Parameters
  • feature_columns (list of tf.feature_column) – feature columns describing the inputs to the layer

  • aggregation (str in ("concat", "stack")) – how to combine the embeddings from multiple features
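The difference between the two aggregation modes can be sketched with nested Python lists. This is an illustration under assumptions rather than the layer's implementation: the helper aggregate is hypothetical, and both the choice of axis and the requirement that "stack" needs equal-width features are assumptions not stated in this reference.

```python
def aggregate(features, aggregation="concat"):
    """Combine per-feature batches, each a list of per-sample vectors."""
    batch_size = len(features[0])
    if aggregation == "concat":
        # Join each sample's feature vectors end to end: one flat vector per sample.
        return [sum((feat[i] for feat in features), []) for i in range(batch_size)]
    if aggregation == "stack":
        # Assumption: stacking only makes sense when every feature has the same width.
        widths = {len(feat[0]) for feat in features}
        if len(widths) != 1:
            raise ValueError("stack requires features of equal width")
        # Keep one row per feature: a (num_features x width) matrix per sample.
        return [[feat[i] for feat in features] for i in range(batch_size)]
    raise ValueError(f"unknown aggregation: {aggregation!r}")


# Two hypothetical features, batch size 2, width 2 each.
feature_a = [[1.0, 2.0], [3.0, 4.0]]
feature_b = [[5.0, 6.0], [7.0, 8.0]]
concatenated = aggregate([feature_a, feature_b], "concat")
stacked = aggregate([feature_a, feature_b], "stack")
```

With "concat" each sample becomes a single vector of length 4; with "stack" each sample keeps a separate row per feature, which is the shape an interaction-style layer would typically consume.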

build(input_shapes)[source]
call(inputs)[source]
compute_output_shape(input_shapes)[source]
get_config()[source]
class nvtabular.framework_utils.tensorflow.layers.embedding.LinearFeatures(*args, **kwargs)[source]

Bases: keras.engine.base_layer.Layer

Layer which implements a linear combination of one-hot categorical and scalar numeric features. Based on the “wide” branch of the Wide & Deep network architecture.

Uses TensorFlow feature_columns to represent inputs to the layer, but does not perform any preprocessing associated with those columns. As such, it should only be passed numeric_column and categorical_column_with_identity. Preprocessing functionality should be moved to NVTabular.

Also note that, unlike DenseFeatures, categorical columns should NOT be wrapped in embedding or indicator columns first.

Example usage:

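import tensorflow as tf
from nvtabular.framework_utils.tensorflow.layers.embedding import LinearFeatures

# Categorical column "b" is passed directly, without an embedding or indicator wrapper.
column_a = tf.feature_column.numeric_column("a", (1,))
column_b = tf.feature_column.categorical_column_with_identity("b", 100)

inputs = {
    "a": tf.keras.Input(name="a", shape=(1,), dtype=tf.float32),
    "b": tf.keras.Input(name="b", shape=(1,), dtype=tf.int64)
}
x = LinearFeatures([column_a, column_b])(inputs)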
Parameters
  • feature_columns (list of tf.feature_column) – feature columns describing the inputs to the layer
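The linear combination that a "wide" branch computes can be sketched for a single sample in plain Python. This is a conceptual illustration, not the layer's code: the function linear_combination, the weight layout, and the presence of a bias term are all assumptions made for the example.

```python
def linear_combination(numeric, categorical, numeric_weights, category_weights, bias=0.0):
    """Weighted sum over scalar numeric and one-hot categorical features."""
    total = bias
    for name, value in numeric.items():
        # Scalar numeric feature: one weight multiplied by the value.
        total += numeric_weights[name] * value
    for name, index in categorical.items():
        # A one-hot vector dotted with a weight vector selects a single weight,
        # so a table lookup gives the same result without building the vector.
        total += category_weights[name][index]
    return total


# Hypothetical weights for a numeric feature "a" and a 4-category feature "b".
output = linear_combination(
    numeric={"a": 2.0},
    categorical={"b": 3},
    numeric_weights={"a": 0.5},
    category_weights={"b": [0.0, 0.25, 0.5, 0.75]},
    bias=0.25,
)
```

Because each categorical feature contributes exactly one looked-up weight, an identity categorical column carries everything such a sum needs, which is consistent with the note that the columns should not be wrapped in embedding or indicator columns.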

build(input_shapes)[source]
call(inputs)[source]
compute_output_shape(input_shapes)[source]
get_config()[source]
class nvtabular.framework_utils.tensorflow.layers.interaction.DotProductInteraction(*args, **kwargs)[source]

Bases: keras.engine.base_layer.Layer

build(input_shape)[source]
call(value)[source]
compute_output_shape(input_shape)[source]
get_config()[source]