merlin_standard_lib package
Subpackages
Submodules
merlin_standard_lib.registry module
-
merlin_standard_lib.registry.
default_name
(class_or_fn)[source] Default name for a class or function.
This is the naming function by default for registries expecting classes or functions.
- Parameters
class_or_fn – class or function to be named.
- Returns
Default name for registration.
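As a rough illustration of what a default naming function for a registry can look like, the sketch below derives a snake_case key from an object's `__name__`. The helper `snake_case_name` and the snake_case convention are assumptions made for this example; the text above only states that `default_name` returns a default name for registration.

```python
import re


def snake_case_name(class_or_fn):
    """Hypothetical naming function: CamelCase __name__ -> snake_case key."""
    name = class_or_fn.__name__
    # Insert an underscore before every capital letter that is not at the start.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


class MyCustomBlock:
    pass


def my_func():
    pass


print(snake_case_name(MyCustomBlock))  # my_custom_block
print(snake_case_name(my_func))        # my_func
```

A function of this shape could be passed as `default_key_fn` so that classes and functions registered without an explicit key get a predictable one.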
-
class
merlin_standard_lib.registry.
Registry
(registry_name, default_key_fn=<function default_name>, validator=None, on_set=None, value_transformer=<function Registry.<lambda>>)[source] Bases:
object
Dict-like class for managing function registrations.
Example usage:
my_registry = Registry("custom_name")

@my_registry.register
def my_func():
    pass

@my_registry.register()
def another_func():
    pass

@my_registry.register("non_default_name")
def third_func(x, y, z):
    pass

def foo():
    pass

my_registry.register()(foo)
my_registry.register("baz")(lambda x, y: x + y)
my_registry.register("bar")  # returns a callback; nothing is registered yet

print(list(my_registry))
# ["my_func", "another_func", "non_default_name", "foo", "baz"]
# (order may vary)
print(my_registry["non_default_name"] is third_func)  # True
print("third_func" in my_registry)  # False
print("bar" in my_registry)  # False
my_registry["non-existent_key"]  # raises KeyError
Optional validation, an on_set callback, and a value transform are also supported.
- Parameters
registry_name (str) – identifier for the given registry. Used in error messages.
default_key_fn (callable, optional) – function mapping value -> key for registration when a key is not provided.
validator (callable, optional) – if given, this is run before setting a given (key, value) pair. Accepts (key, value) and should raise if there is a problem. Overwriting existing keys is not allowed and is checked separately. Values are also separately checked to be callable.
on_set (callable, optional) – callback function accepting a (key, value) pair, which is run after an item is successfully set.
value_transformer (callable, optional) – if given, __getitem__ will return value_transformer(key, registered_value).
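The interplay of the three optional hooks can be sketched with a small stand-in class. `MiniRegistry` below is hypothetical and is not the library's implementation; it only mimics the behavior described in the parameter list (callable check, no overwriting, validator before set, on_set after set, value_transformer on lookup). The order of the checks is an assumption.

```python
class MiniRegistry:
    """Hypothetical stand-in mimicking the documented hooks; not the library class."""

    def __init__(self, registry_name, validator=None, on_set=None,
                 value_transformer=lambda key, value: value):
        self._name = registry_name
        self._items = {}
        self._validator = validator
        self._on_set = on_set
        self._value_transformer = value_transformer

    def __setitem__(self, key, value):
        if not callable(value):
            raise ValueError(f"{self._name}: value for {key!r} must be callable")
        if key in self._items:
            raise KeyError(f"{self._name}: key {key!r} already registered")
        if self._validator is not None:
            self._validator(key, value)  # expected to raise on a problem
        self._items[key] = value
        if self._on_set is not None:
            self._on_set(key, value)  # runs only after a successful set

    def __getitem__(self, key):
        if key not in self._items:
            raise KeyError(f"{self._name}: {key!r} not registered")
        return self._value_transformer(key, self._items[key])

    def __contains__(self, key):
        return key in self._items


def require_lowercase(key, value):
    """Example validator (an assumption): reject keys containing capitals."""
    if key != key.lower():
        raise ValueError(f"key {key!r} must be lowercase")


log = []
registry = MiniRegistry(
    "demo",
    validator=require_lowercase,
    on_set=lambda key, value: log.append(key),
    value_transformer=lambda key, value: (key, value()),
)

registry["answer"] = lambda: 42
print(registry["answer"])  # ('answer', 42)
print(log)                 # ['answer']

try:
    registry["Bad"] = lambda: 0  # rejected by the validator
except ValueError as err:
    print(err)
print("Bad" in registry)   # False
```

Because the validator raises before the item is stored, a rejected key never reaches `on_set` and never appears in the registry.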
-
classmethod
class_registry
(registry_name, default_key_fn=<function default_name>, validator=None, on_set=None)[source]
-
property
name
-
register
(key_or_value=None)[source] Decorator to register a function, or registration itself. This is primarily intended for use as a decorator, either with or without a key/parentheses.
Example Usage:
@my_registry.register('key1')
def value_fn(x, y, z):
    pass

@my_registry.register()
def another_fn(x, y):
    pass

@my_registry.register
def third_func():
    pass
Note that if key_or_value is provided as a non-callable, registration only occurs once the returned callback is called with a callable as its only argument:
callback = my_registry.register('different_key')
'different_key' in my_registry  # False
callback(lambda x, y: x + y)
'different_key' in my_registry  # True
- Parameters
key_or_value (optional) – key to access the registered value with, or the function itself. If None (default), self.default_key will be called on value once the returned callback is called with value as the only arg. If key_or_value is itself callable, it is assumed to be the value and the key is given by self.default_key(key_or_value).
- Returns
The decorated callback, or a callback that generates a decorated function.
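One way to support the three call forms described above (bare decorator, empty parentheses, explicit key) is to branch on whether the argument is callable. `TinyRegistry` is a hypothetical stand-in written for this sketch, not the library's code, and its default key function (the object's `__name__`) is an assumption.

```python
class TinyRegistry:
    """Hypothetical stand-in showing one way to dispatch on key_or_value."""

    def __init__(self, default_key_fn=lambda fn: fn.__name__):
        self._items = {}
        self._default_key_fn = default_key_fn

    def __contains__(self, key):
        return key in self._items

    def __getitem__(self, key):
        return self._items[key]

    def register(self, key_or_value=None):
        def decorator(value, key=None):
            # Fall back to the default key function when no key was given.
            if key is None:
                key = self._default_key_fn(value)
            self._items[key] = value
            return value

        if callable(key_or_value):
            # Bare @register: the argument is the value itself.
            return decorator(key_or_value)
        # @register() or @register("key"): registration happens later,
        # when the returned callback receives the callable.
        return lambda value: decorator(value, key=key_or_value)


reg = TinyRegistry()


@reg.register
def first():
    pass


@reg.register()
def second():
    pass


@reg.register("custom")
def third():
    pass


callback = reg.register("later")
print("later" in reg)  # False
callback(lambda x, y: x + y)
print("later" in reg)  # True
print("third" in reg)  # False
```

Returning the value unchanged from the decorator keeps the decorated name bound to the original function, so registration has no effect on normal calls.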
Module contents
-
class
merlin_standard_lib.
ColumnSchema
(name: str = <betterproto._PLACEHOLDER object>, deprecated: bool = <betterproto._PLACEHOLDER object>, presence: merlin_standard_lib.proto.schema_bp.FeaturePresence = <betterproto._PLACEHOLDER object>, group_presence: merlin_standard_lib.proto.schema_bp.FeaturePresenceWithinGroup = <betterproto._PLACEHOLDER object>, shape: merlin_standard_lib.proto.schema_bp.FixedShape = <betterproto._PLACEHOLDER object>, value_count: merlin_standard_lib.proto.schema_bp.ValueCount = <betterproto._PLACEHOLDER object>, value_counts: merlin_standard_lib.proto.schema_bp.ValueCountList = <betterproto._PLACEHOLDER object>, type: merlin_standard_lib.proto.schema_bp.FeatureType = <betterproto._PLACEHOLDER object>, domain: str = <betterproto._PLACEHOLDER object>, int_domain: merlin_standard_lib.proto.schema_bp.IntDomain = <betterproto._PLACEHOLDER object>, float_domain: merlin_standard_lib.proto.schema_bp.FloatDomain = <betterproto._PLACEHOLDER object>, string_domain: merlin_standard_lib.proto.schema_bp.StringDomain = <betterproto._PLACEHOLDER object>, bool_domain: merlin_standard_lib.proto.schema_bp.BoolDomain = <betterproto._PLACEHOLDER object>, struct_domain: merlin_standard_lib.proto.schema_bp.StructDomain = <betterproto._PLACEHOLDER object>, natural_language_domain: merlin_standard_lib.proto.schema_bp.NaturalLanguageDomain = <betterproto._PLACEHOLDER object>, image_domain: merlin_standard_lib.proto.schema_bp.ImageDomain = <betterproto._PLACEHOLDER object>, mid_domain: merlin_standard_lib.proto.schema_bp.MIDDomain = <betterproto._PLACEHOLDER object>, url_domain: merlin_standard_lib.proto.schema_bp.URLDomain = <betterproto._PLACEHOLDER object>, time_domain: merlin_standard_lib.proto.schema_bp.TimeDomain = <betterproto._PLACEHOLDER object>, time_of_day_domain: merlin_standard_lib.proto.schema_bp.TimeOfDayDomain = <betterproto._PLACEHOLDER object>, distribution_constraints: merlin_standard_lib.proto.schema_bp.DistributionConstraints = <betterproto._PLACEHOLDER object>, annotation: 
merlin_standard_lib.proto.schema_bp.Annotation = <betterproto._PLACEHOLDER object>, skew_comparator: merlin_standard_lib.proto.schema_bp.FeatureComparator = <betterproto._PLACEHOLDER object>, drift_comparator: merlin_standard_lib.proto.schema_bp.FeatureComparator = <betterproto._PLACEHOLDER object>, in_environment: List[str] = <betterproto._PLACEHOLDER object>, not_in_environment: List[str] = <betterproto._PLACEHOLDER object>, lifecycle_stage: merlin_standard_lib.proto.schema_bp.LifecycleStage = <betterproto._PLACEHOLDER object>, unique_constraints: merlin_standard_lib.proto.schema_bp.UniqueConstraints = <betterproto._PLACEHOLDER object>)[source] Bases:
merlin_standard_lib.proto.schema_bp.Feature
-
classmethod
create_categorical
(name: str, num_items: int, shape: Optional[Union[Tuple[int, …], List[int]]] = None, value_count: Optional[Union[merlin_standard_lib.proto.schema_bp.ValueCount, merlin_standard_lib.proto.schema_bp.ValueCountList]] = None, min_index: int = 0, tags: Optional[Union[List[str], List[merlin_standard_lib.schema.tag.Tag], List[Union[merlin_standard_lib.schema.tag.Tag, str]]]] = None, **kwargs) → merlin_standard_lib.schema.schema.ColumnSchema[source]
-
classmethod
create_continuous
(name: str, is_float: bool = True, min_value: Optional[Union[int, float]] = None, max_value: Optional[Union[int, float]] = None, disallow_nan: bool = False, disallow_inf: bool = False, is_embedding: bool = False, shape: Optional[Union[Tuple[int, …], List[int]]] = None, value_count: Optional[Union[merlin_standard_lib.proto.schema_bp.ValueCount, merlin_standard_lib.proto.schema_bp.ValueCountList]] = None, tags: Optional[Union[List[str], List[merlin_standard_lib.schema.tag.Tag], List[Union[merlin_standard_lib.schema.tag.Tag, str]]]] = None, **kwargs) → merlin_standard_lib.schema.schema.ColumnSchema[source]
-
copy
(**kwargs) → merlin_standard_lib.schema.schema.ColumnSchema[source]
-
with_properties
(properties: Dict[str, Union[str, int, float]]) → merlin_standard_lib.schema.schema.ColumnSchema[source]
-
property
properties
-
class
merlin_standard_lib.
Schema
(feature: Sequence[merlin_standard_lib.proto.schema_bp.Feature] = <betterproto._PLACEHOLDER object>, sparse_feature: List[merlin_standard_lib.proto.schema_bp.SparseFeature] = <betterproto._PLACEHOLDER object>, weighted_feature: List[merlin_standard_lib.proto.schema_bp.WeightedFeature] = <betterproto._PLACEHOLDER object>, string_domain: List[merlin_standard_lib.proto.schema_bp.StringDomain] = <betterproto._PLACEHOLDER object>, float_domain: List[merlin_standard_lib.proto.schema_bp.FloatDomain] = <betterproto._PLACEHOLDER object>, int_domain: List[merlin_standard_lib.proto.schema_bp.IntDomain] = <betterproto._PLACEHOLDER object>, default_environment: List[str] = <betterproto._PLACEHOLDER object>, annotation: merlin_standard_lib.proto.schema_bp.Annotation = <betterproto._PLACEHOLDER object>, dataset_constraints: merlin_standard_lib.proto.schema_bp.DatasetConstraints = <betterproto._PLACEHOLDER object>, tensor_representation_group: Dict[str, merlin_standard_lib.proto.schema_bp.TensorRepresentationGroup] = <betterproto._PLACEHOLDER object>)[source] Bases:
merlin_standard_lib.proto.schema_bp._Schema
A collection of column schemas for a dataset.
-
feature
: List[merlin_standard_lib.schema.schema.ColumnSchema] = Field(name=None,type=None,default=<betterproto._PLACEHOLDER object>,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'betterproto': FieldMetadata(number=1, proto_type='message', map_types=None, group=None, wraps=None)}),_field_type=None)
-
classmethod
create
(column_schemas: Optional[Union[List[Union[merlin_standard_lib.schema.schema.ColumnSchema, str]], Dict[str, Union[merlin_standard_lib.schema.schema.ColumnSchema, str]]]] = None, **kwargs)[source]
-
apply
(selector) → merlin_standard_lib.schema.schema.Schema[source]
-
apply_inverse
(selector) → merlin_standard_lib.schema.schema.Schema[source]
-
select_by_type
(to_select) → merlin_standard_lib.schema.schema.Schema[source]
-
remove_by_type
(to_remove) → merlin_standard_lib.schema.schema.Schema[source]
-
select_by_tag
(to_select) → merlin_standard_lib.schema.schema.Schema[source]
-
remove_by_tag
(to_remove) → merlin_standard_lib.schema.schema.Schema[source]
-
select_by_name
(to_select) → merlin_standard_lib.schema.schema.Schema[source]
-
remove_by_name
(to_remove) → merlin_standard_lib.schema.schema.Schema[source]
-
map_column_schemas
(map_fn: Callable[[merlin_standard_lib.schema.schema.ColumnSchema], merlin_standard_lib.schema.schema.ColumnSchema]) → merlin_standard_lib.schema.schema.Schema[source]
-
filter_column_schemas
(filter_fn: Callable[[merlin_standard_lib.schema.schema.ColumnSchema], bool], negate=False) → merlin_standard_lib.schema.schema.Schema[source]
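The signatures of map_column_schemas and filter_column_schemas suggest a functional style in which each call returns a new schema. The sketch below illustrates that pattern with plain dataclasses; `Column` and `Columns` are hypothetical stand-ins, and the reading of `negate` as "keep the columns for which the predicate is false" is an assumption, since the source does not describe it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for a column schema."""
    name: str
    tags: tuple = ()


@dataclass(frozen=True)
class Columns:
    """Hypothetical stand-in for a schema: an ordered collection of columns."""
    columns: tuple = ()

    def filter(self, filter_fn, negate=False):
        # Keep columns where the predicate holds, or where it fails if negate=True.
        kept = tuple(c for c in self.columns if bool(filter_fn(c)) != negate)
        return Columns(kept)

    def map(self, map_fn):
        # Apply map_fn to every column and wrap the results in a new collection.
        return Columns(tuple(map_fn(c) for c in self.columns))

    @property
    def column_names(self):
        return [c.name for c in self.columns]


schema = Columns((
    Column("item_id", ("item_id", "categorical")),
    Column("price", ("continuous",)),
    Column("user_id", ("user_id", "categorical")),
))


def is_categorical(column):
    return "categorical" in column.tags


print(schema.filter(is_categorical).column_names)               # ['item_id', 'user_id']
print(schema.filter(is_categorical, negate=True).column_names)  # ['price']

renamed = schema.map(lambda c: Column(c.name.upper(), c.tags))
print(renamed.column_names)  # ['ITEM_ID', 'PRICE', 'USER_ID']
print(schema.column_names)   # ['item_id', 'price', 'user_id']
```

Returning a new collection rather than mutating in place lets selections be chained without affecting the original.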
-
property
column_names
-
property
column_schemas
-
property
item_id_column_name
-
from_json
(value: Union[str, bytes]) → merlin_standard_lib.schema.schema.Schema[source]
-
from_proto_text
(path_or_proto_text: str) → merlin_standard_lib.schema.schema.Schema[source]
-
copy
(**kwargs) → merlin_standard_lib.schema.schema.Schema[source]
-
add
(other, allow_overlap=True) → merlin_standard_lib.schema.schema.Schema[source]
-
-
class
merlin_standard_lib.
Tag
(value)[source] Bases:
enum.Enum
An enumeration.
-
CATEGORICAL
= 'categorical'
-
CONTINUOUS
= 'continuous'
-
LIST
= 'list'
-
TEXT
= 'text'
-
TEXT_TOKENIZED
= 'text_tokenized'
-
TIME
= 'time'
-
USER
= 'user'
-
USER_ID
= 'user_id'
-
ITEM
= 'item'
-
ITEM_ID
= 'item_id'
-
SESSION
= 'session'
-
SESSION_ID
= 'session_id'
-
CONTEXT
= 'context'
-
TARGETS
= 'target'
-
BINARY_CLASSIFICATION
= 'binary_classification'
-
MULTI_CLASS_CLASSIFICATION
= 'multi_class'
-
REGRESSION
= 'regression'
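Because Tag derives from enum.Enum, members can be looked up by their string value using standard enum behavior. The sketch below redeclares a subset of the members listed above so that it runs on its own; the `normalize_tags` helper is hypothetical and only illustrates how a mixed list of strings and tags could be unified.

```python
from enum import Enum


class Tag(Enum):
    # Subset of the documented members, redeclared so the example is self-contained.
    CATEGORICAL = "categorical"
    CONTINUOUS = "continuous"
    ITEM_ID = "item_id"
    TARGETS = "target"


def normalize_tags(tags):
    """Hypothetical helper: convert strings to Tag members, keep Tag members as-is."""
    return [t if isinstance(t, Tag) else Tag(t) for t in tags]


print(Tag("item_id") is Tag.ITEM_ID)        # True
print(Tag.TARGETS.name, Tag.TARGETS.value)  # TARGETS target
print(Tag.ITEM_ID == "item_id")             # False

mixed = normalize_tags(["categorical", Tag.ITEM_ID])
print([t.value for t in mixed])             # ['categorical', 'item_id']

try:
    Tag("not_a_tag")
except ValueError as err:
    print(err)
```

Note that the member name and its value can differ (TARGETS has the value 'target'), and that a plain Enum member does not compare equal to its string value.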
-
-
class
merlin_standard_lib.
Registry
(registry_name, default_key_fn=<function default_name>, validator=None, on_set=None, value_transformer=<function Registry.<lambda>>)[source] Bases:
object
Dict-like class for managing function registrations.
Example usage:
my_registry = Registry("custom_name")

@my_registry.register
def my_func():
    pass

@my_registry.register()
def another_func():
    pass

@my_registry.register("non_default_name")
def third_func(x, y, z):
    pass

def foo():
    pass

my_registry.register()(foo)
my_registry.register("baz")(lambda x, y: x + y)
my_registry.register("bar")  # returns a callback; nothing is registered yet

print(list(my_registry))
# ["my_func", "another_func", "non_default_name", "foo", "baz"]
# (order may vary)
print(my_registry["non_default_name"] is third_func)  # True
print("third_func" in my_registry)  # False
print("bar" in my_registry)  # False
my_registry["non-existent_key"]  # raises KeyError
Optional validation, an on_set callback, and a value transform are also supported.
- Parameters
registry_name (str) – identifier for the given registry. Used in error messages.
default_key_fn (callable, optional) – function mapping value -> key for registration when a key is not provided.
validator (callable, optional) – if given, this is run before setting a given (key, value) pair. Accepts (key, value) and should raise if there is a problem. Overwriting existing keys is not allowed and is checked separately. Values are also separately checked to be callable.
on_set (callable, optional) – callback function accepting a (key, value) pair, which is run after an item is successfully set.
value_transformer (callable, optional) – if given, __getitem__ will return value_transformer(key, registered_value).
-
classmethod
class_registry
(registry_name, default_key_fn=<function default_name>, validator=None, on_set=None)[source]
-
property
name
-
register
(key_or_value=None)[source] Decorator to register a function, or registration itself. This is primarily intended for use as a decorator, either with or without a key/parentheses.
Example Usage:
@my_registry.register('key1')
def value_fn(x, y, z):
    pass

@my_registry.register()
def another_fn(x, y):
    pass

@my_registry.register
def third_func():
    pass
Note that if key_or_value is provided as a non-callable, registration only occurs once the returned callback is called with a callable as its only argument:
callback = my_registry.register('different_key')
'different_key' in my_registry  # False
callback(lambda x, y: x + y)
'different_key' in my_registry  # True
- Parameters
key_or_value (optional) – key to access the registered value with, or the function itself. If None (default), self.default_key will be called on value once the returned callback is called with value as the only arg. If key_or_value is itself callable, it is assumed to be the value and the key is given by self.default_key(key_or_value).
- Returns
The decorated callback, or a callback that generates a decorated function.