TensorFlow Dataloader
TensorFlow Layers
class nvtabular.framework_utils.tensorflow.layers.embedding.DenseFeatures(*args, **kwargs)
Bases: keras.engine.base_layer.Layer
Layer which maps a dictionary of input tensors to a dense, continuous vector digestible by a neural network. Meant to reproduce the API exposed by tf.keras.layers.DenseFeatures while reducing overhead for the case of one-hot categorical and scalar numeric features.
Uses TensorFlow feature_column to represent inputs to the layer, but does not perform any preprocessing associated with those columns. As such, it should only be passed numeric_column objects and their subclasses, embedding_column and indicator_column. Preprocessing functionality should be moved to NVTabular.
For multi-hot categorical or vector continuous data, represent the data for a feature with a dictionary entry “<feature_name>__values” corresponding to the flattened array of all values in the batch. For multi-hot categorical data, there should be a corresponding “<feature_name>__nnzs” entry that describes how many categories are present in each sample (and so has length batch_size).
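As a rough illustration of the layout described above, the following sketch flattens a batch of variable-length category lists into the two entries. The feature name "genres", the sample data, and the helper to_values_and_nnzs are hypothetical, and plain Python lists stand in for tensors; the exact shapes and dtypes the layer expects are not shown here.

```python
def to_values_and_nnzs(feature_name, batch):
    """Flatten per-sample category lists into the
    "<feature_name>__values" / "<feature_name>__nnzs" layout."""
    # All category ids in the batch, in sample order, as one flat list.
    values = [category for sample in batch for category in sample]
    # Number of categories present in each sample (length == batch size).
    nnzs = [len(sample) for sample in batch]
    return {
        f"{feature_name}__values": values,
        f"{feature_name}__nnzs": nnzs,
    }


# Hypothetical batch of three samples with 2, 1, and 3 categories each.
batch = [[3, 7], [1], [4, 5, 9]]
encoded = to_values_and_nnzs("genres", batch)
```

The counts always sum to the length of the flattened values, which is what lets the per-sample lists be recovered from the two flat arrays.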
Note that categorical columns should be wrapped in embedding or indicator columns first, consistent with the API used by tf.keras.layers.DenseFeatures.
Example usage:
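    import tensorflow as tf
    from nvtabular.framework_utils.tensorflow.layers.embedding import DenseFeatures

    # One scalar numeric feature and one categorical feature with 100 ids.
    column_a = tf.feature_column.numeric_column("a", (1,))
    column_b = tf.feature_column.categorical_column_with_identity("b", 100)
    # Categorical columns must be wrapped before being passed to DenseFeatures.
    column_b_embedding = tf.feature_column.embedding_column(column_b, 4)

    inputs = {
        "a": tf.keras.Input(name="a", shape=(1,), dtype=tf.float32),
        "b": tf.keras.Input(name="b", shape=(1,), dtype=tf.int64),
    }
    x = DenseFeatures([column_a, column_b_embedding])(inputs)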
Parameters
- feature_columns (list of tf.feature_column) – feature columns describing the inputs to the layer
- aggregation (str in ("concat", "stack")) – how to combine the embeddings from multiple features
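The two aggregation options can be pictured in terms of output shape. The sketch below uses plain Python lists in place of tensors and assumes that "concat" joins the per-feature vectors end to end, while "stack" keeps them as separate rows and therefore needs equal widths. The axis conventions of the actual layer are an assumption here, and aggregate is a hypothetical helper, not part of NVTabular.

```python
def aggregate(features, aggregation="concat"):
    """Combine per-feature embeddings for a batch.

    features: list with one entry per feature; each entry is a list
    (one item per sample) of embedding vectors.
    """
    if aggregation == "concat":
        # Result shape: (batch, sum of embedding widths).
        return [[x for vec in sample for x in vec] for sample in zip(*features)]
    if aggregation == "stack":
        widths = {len(vec) for feature in features for vec in feature}
        if len(widths) > 1:
            raise ValueError("stack requires all embeddings to share one width")
        # Result shape: (batch, number of features, embedding width).
        return [list(sample) for sample in zip(*features)]
    raise ValueError(f"unknown aggregation: {aggregation!r}")


# Two hypothetical features, batch size 2, embedding width 2.
emb_a = [[0.1, 0.2], [0.3, 0.4]]
emb_b = [[1.0, 2.0], [3.0, 4.0]]
concatenated = aggregate([emb_a, emb_b], "concat")
stacked = aggregate([emb_a, emb_b], "stack")
```

Under these assumptions, "concat" suits a plain feed-forward network that takes one flat vector per sample, while "stack" preserves a per-feature axis for models that operate on features individually.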
class nvtabular.framework_utils.tensorflow.layers.embedding.LinearFeatures(*args, **kwargs)
Bases: keras.engine.base_layer.Layer
Layer which implements a linear combination of one-hot categorical and scalar numeric features. Based on the “wide” branch of the Wide & Deep network architecture.
Uses TensorFlow feature_columns to represent inputs to the layer, but does not perform any preprocessing associated with those columns. As such, it should only be passed numeric_column and categorical_column_with_identity objects. Preprocessing functionality should be moved to NVTabular.
Also note that, unlike DenseFeatures, categorical columns should NOT be wrapped in embedding or indicator columns first.
Example usage:
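    import tensorflow as tf
    from nvtabular.framework_utils.tensorflow.layers.embedding import LinearFeatures

    # One scalar numeric feature and one categorical feature with 100 ids.
    column_a = tf.feature_column.numeric_column("a", (1,))
    # Passed as-is: no embedding or indicator wrapper for LinearFeatures.
    column_b = tf.feature_column.categorical_column_with_identity("b", 100)

    inputs = {
        "a": tf.keras.Input(name="a", shape=(1,), dtype=tf.float32),
        "b": tf.keras.Input(name="b", shape=(1,), dtype=tf.int64),
    }
    x = LinearFeatures([column_a, column_b])(inputs)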
Parameters
- feature_columns (list of tf.feature_column) – feature columns describing the inputs to the layer
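To make the "wide" linear combination concrete, here is a minimal plain-Python sketch for one scalar numeric feature and one identity categorical feature. The weights, the bias, and the helper linear_features are hypothetical; in the layer these would be trainable variables, and this is not NVTabular's implementation. Because a one-hot input selects exactly one weight, the categorical term reduces to a table lookup per sample.

```python
def linear_features(numeric, category_ids, numeric_weight, category_weights, bias=0.0):
    """Compute w * x + category_weights[id] + bias for each sample."""
    outputs = []
    for x, idx in zip(numeric, category_ids):
        # A one-hot vector dotted with the weight table picks out a
        # single entry, so a direct index lookup is equivalent.
        outputs.append(numeric_weight * x + category_weights[idx] + bias)
    return outputs


# Hypothetical parameters: three categories, one numeric weight, one bias.
category_weights = [0.0, 0.5, -1.0]
y = linear_features(
    numeric=[2.0, 3.0],
    category_ids=[1, 2],
    numeric_weight=0.25,
    category_weights=category_weights,
    bias=1.0,
)
```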