merlin.dag.Node
- class merlin.dag.Node(selector=None)
Bases: object
A Node is a group of columns that you want to apply the same transformations to. Nodes can be transformed by shifting operators onto them, which returns a new Node with the transformations applied. This lets you define the graph of operations that makes up your workflow.
- Parameters:
selector (ColumnSelector) – Defines which columns to select from the input Dataset using column names and tags.
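To illustrate the idea of shifting operators onto a node and getting a new node back, here is a minimal pure-Python sketch. `ToyNode` and `add_suffix` are hypothetical names invented for this example; the sketch does not use or reproduce the `merlin.dag.Node` implementation, and it assumes an operator can be modeled as a function that renames a column.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class ToyNode:
    """Hypothetical stand-in for a graph node; NOT merlin.dag.Node."""

    columns: Tuple[str, ...]
    op: Optional[Callable[[str], str]] = None
    parents: Tuple["ToyNode", ...] = ()

    def __rshift__(self, op: Callable[[str], str]) -> "ToyNode":
        # Return a NEW node whose parent is this node; self is left unchanged.
        return ToyNode(columns=self.columns, op=op, parents=(self,))

    def output_columns(self) -> List[str]:
        # A node without an operator passes its selected columns through.
        if self.op is None:
            return list(self.columns)
        # Otherwise apply the operator to every column produced by the parent.
        return [self.op(name) for name in self.parents[0].output_columns()]


def add_suffix(suffix: str) -> Callable[[str], str]:
    """Hypothetical 'operator' that renames each column it touches."""
    return lambda name: f"{name}_{suffix}"


selected = ToyNode(columns=("price", "quantity"))
transformed = selected >> add_suffix("scaled") >> add_suffix("clipped")
```

Each `>>` builds one more link in the chain, so `transformed` has a parent that in turn has `selected` as its parent, while `selected` itself is never modified.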
Methods
__init__([selector])
add_child(child) – Adding a child node to this node
add_dependency(dep) – Adding a dependency node to this node
add_parent(parent) – Adding a parent node to this node
compute_schemas(root_schema[, preserve_dtypes]) – Defines the input and output schema
construct_from(nodable) – Convert Node-like objects to a Node or list of Nodes.
export(output_path[, node_id, version]) – Export a directory for this node, containing the required artifacts to run in the target context.
exportable([backend])
remove_child(child) – Removing a child node from this node
remove_inputs(input_cols) – Remove input columns and all output columns that depend on them.
validate_schemas(root_schema[, strict_dtypes]) – Check if this Node's input schema matches the output schemas of parents and dependencies
Attributes
export_name – Name for the exported node directory.
- property selector
- add_dependency(dep: Node | Operator | str | List[str] | ColumnSelector | List[Node | Operator | str | List[str] | ColumnSelector])
Adding a dependency node to this node
- Parameters:
dep (Union[str, ColumnSelector, Node, List[Union[str, Node, ColumnSelector]]]) – Dependency to be added
- add_parent(parent: Node | Operator | str | List[str] | ColumnSelector | List[Node | Operator | str | List[str] | ColumnSelector])
Adding a parent node to this node
- Parameters:
parent (Union[str, ColumnSelector, Node, List[Union[str, Node, ColumnSelector]]]) – Parent to be added
- add_child(child: Node | Operator | str | List[str] | ColumnSelector | List[Node | Operator | str | List[str] | ColumnSelector])
Adding a child node to this node
- Parameters:
child (Union[str, ColumnSelector, Node, List[Union[str, Node, ColumnSelector]]]) – Child to be added
- remove_child(child: Node | Operator | str | List[str] | ColumnSelector | List[Node | Operator | str | List[str] | ColumnSelector])
Removing a child node from this node
- Parameters:
child (Union[str, ColumnSelector, Node, List[Union[str, Node, ColumnSelector]]]) – Child to be removed
- compute_schemas(root_schema: Schema, preserve_dtypes: bool = False)
Defines the input and output schema
- validate_schemas(root_schema: Schema, strict_dtypes: bool = False)
Check if this Node’s input schema matches the output schemas of parents and dependencies
- Parameters:
root_schema (Schema)
strict_dtypes (bool, optional) – by default False.
- Raises:
ValueError – If parents and dependencies don’t provide an expected column based on the input schema
ValueError – If the dtype of a column from parents and dependencies doesn’t match the expected dtype based on the input schema
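The two error conditions above can be sketched in plain Python. This is only an illustration under stated assumptions: schemas are modeled as hypothetical dictionaries mapping column names to dtype names, `validate_input_schema` is an invented helper rather than a Merlin function, and the sketch assumes that `strict_dtypes` decides whether a dtype mismatch is treated as an error.

```python
from typing import Dict


def validate_input_schema(
    expected: Dict[str, str],
    provided: Dict[str, str],
    strict_dtypes: bool = False,
) -> None:
    """Toy check; both arguments map column names to dtype names (assumption)."""
    for name, expected_dtype in expected.items():
        # Condition 1: an expected column is not provided upstream.
        if name not in provided:
            raise ValueError(
                f"Missing column '{name}' from parents and dependencies"
            )
        # Condition 2: the column exists but its dtype differs.
        # Assumption: only enforced when strict_dtypes is True.
        if strict_dtypes and provided[name] != expected_dtype:
            raise ValueError(
                f"Column '{name}' has dtype {provided[name]}, "
                f"expected {expected_dtype}"
            )


# Passes: the column is present and dtypes are not checked strictly.
validate_input_schema({"user_id": "int64"}, {"user_id": "int32"})
```

Calling the same check with `strict_dtypes=True` raises `ValueError` for the `int32` versus `int64` mismatch, and an empty `provided` mapping raises `ValueError` for the missing column.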
- remove_inputs(input_cols: List[str]) -> List[str]
Remove input columns and all output columns that depend on them.
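One way to picture this dependency-based removal is the sketch below. It is not Merlin's implementation: `toy_remove_inputs` is a hypothetical helper, the column mapping is assumed to be a dictionary from each output column to the input columns it is derived from, and the return value is assumed to be the list of removed output columns.

```python
from typing import Dict, List


def toy_remove_inputs(
    column_mapping: Dict[str, List[str]], input_cols: List[str]
) -> List[str]:
    """Drop every output column that depends on any of `input_cols`.

    Mutates `column_mapping` in place and returns the removed output names.
    """
    to_remove = set(input_cols)
    # An output is removed if at least one of its inputs is being removed.
    removed_outputs = [
        out
        for out, inputs in column_mapping.items()
        if to_remove.intersection(inputs)
    ]
    for out in removed_outputs:
        del column_mapping[out]
    return removed_outputs


mapping = {
    "a_squared": ["a"],
    "a_plus_b": ["a", "b"],
    "b_log": ["b"],
}
removed = toy_remove_inputs(mapping, ["a"])
```

Here removing the input `a` drops both `a_squared` and `a_plus_b`, because each depends on `a`, and leaves `b_log` in place.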
- export(output_path: str | PathLike, node_id: int | None = None, version: int = 1)
Export a directory for this node, containing the required artifacts to run in the target context.
- Parameters:
output_path (Union[str, os.PathLike]) – The base path to write this node’s export directory.
node_id (int, optional) – The id of this node in a larger graph (for disambiguation), by default None.
version (int, optional) – The version of the node to use for this export, by default 1.
- property export_name
Name for the exported node directory.
- Returns:
Name supplied by this node’s operator.
- Return type:
- property parents_with_dependencies
- property grouped_parents_with_dependencies
- property input_columns
- property output_columns
- property column_mapping
- property dependency_columns
- property label
- property graph
- classmethod construct_from(nodable: Node | Operator | str | List[str] | ColumnSelector | List[Node | Operator | str | List[str] | ColumnSelector])
Convert Node-like objects to a Node or list of Nodes.
- Parameters:
nodable (Nodable) – Node-like objects to convert to a Node or list of Nodes.
- Returns:
New Node(s) corresponding to the Node-like input objects
- Return type:
Union["Node", List["Node"]]
- Raises:
TypeError – If supplied input cannot be converted to a Node or list of Nodes
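The conversion pattern described here, where several Node-like inputs are normalized into nodes and anything else raises `TypeError`, can be sketched as follows. `SketchNode` and `to_nodes` are hypothetical names for this example only. The rule that a list of strings becomes a single node while a mixed list becomes a list of nodes is an assumption of the sketch, not a statement about how `construct_from` behaves.

```python
from typing import List, Union


class SketchNode:
    """Hypothetical minimal node holding column names; NOT merlin.dag.Node."""

    def __init__(self, columns: List[str]):
        self.columns = list(columns)


def to_nodes(nodable) -> Union[SketchNode, List[SketchNode]]:
    """Convert node-like input into a SketchNode or a list of SketchNodes."""
    if isinstance(nodable, SketchNode):
        # Already a node: return it unchanged.
        return nodable
    if isinstance(nodable, str):
        # A single column name selects one column.
        return SketchNode([nodable])
    if isinstance(nodable, list):
        # Assumption: a list made only of strings selects several columns
        # through one node.
        if all(isinstance(item, str) for item in nodable):
            return SketchNode(nodable)
        # Otherwise convert each element separately.
        return [to_nodes(item) for item in nodable]
    raise TypeError(f"Cannot convert {type(nodable).__name__} to a node")


single = to_nodes("user_id")
multi = to_nodes(["user_id", "item_id"])
mixed = to_nodes([SketchNode(["a"]), "b"])
```

Passing an unsupported value such as an integer falls through every branch and raises `TypeError`, which mirrors the error contract documented above.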