logml.data.transformers.imputation
Classes
| MICEImputeTransformer | Provides MICE imputation functionality (Multivariate Imputation by Chained Equations). |
| SimpleImputeTransformer | Provides imputation functionality. |
- class logml.data.transformers.imputation.SimpleImputerParams
Bases:
logml.data.config.BaseTransformerParams
JSON schema:
{
  "title": "SimpleImputerParams",
  "description": "Defines schema for transformer params.\n\nColumns inclusion/exclusion schema (see also `get_affected_columns`):\n\n- make set by union all columns that match `include_columns` filter.\n- subtract columns that match `exclude_columns` filter.\n\nFiltering expressions are identified by prefix:\n\n- 're:' or empty - regular expression. Any valid python regular expression, e.g. \".*_DNA$\"\n- 'g:' - columns' group filter. Should completely match group name, e.g. \"g:clinical_data\".\n- '$' - keyword:\n - $features - all features (input columns, covariates).\n - $numeric_features - only numeric features.\n - $cat_features - only categorical features.\n - $target - target feature. (For survival problems will be two columns - time+event).\n - $all - all columns except key columns.\n\nIf no known prefix detected, the filter is considered as regular expression.",
  "type": "object",
  "properties": {
    "columns_to_include": {
      "title": "Columns To Include",
      "description": "List of filtering expressions. By default, all columns are included.",
      "default": [".*"],
      "type": "array",
      "items": {"type": "string"}
    },
    "columns_to_exclude": {
      "title": "Columns To Exclude",
      "description": "List of filtering expressions. Empty by default.",
      "default": [],
      "type": "array",
      "items": {"type": "string"}
    },
    "strategy": {
      "title": "Strategy",
      "description": "(mean, median, most_frequent, constant)",
      "default": "mean",
      "type": "string"
    },
    "fill_value": {
      "title": "Fill Value"
    }
  }
}
- Fields
- field strategy: str = 'mean'
Imputation strategy. One of: mean, median, most_frequent, constant.
- field fill_value: Any = None
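The inclusion/exclusion rule in the schema description (union of all include matches, minus all exclude matches) can be sketched in plain Python. This is only an illustration, not logml's `get_affected_columns`: the helper name `select_columns` is hypothetical, only the 're:' and bare-regex forms are handled ('g:' groups and '$' keywords are omitted), and anchoring patterns at the start of the column name via `re.match` is an assumption.

```python
import re


def select_columns(columns, include=(".*",), exclude=()):
    """Hypothetical helper: union of include matches minus exclude matches."""

    def matches(name, expr):
        # Strip the optional 're:' prefix; a bare expression is also a regex.
        pattern = expr[3:] if expr.startswith("re:") else expr
        # Assumption: patterns are anchored at the start, as with re.match.
        return re.match(pattern, name) is not None

    # Step 1: keep every column matched by at least one include filter.
    included = [c for c in columns if any(matches(c, e) for e in include)]
    # Step 2: drop every column matched by at least one exclude filter.
    return [c for c in included if not any(matches(c, e) for e in exclude)]


columns = ["age", "TP53_DNA", "KRAS_DNA", "KRAS_RNA"]
selected = select_columns(
    columns,
    include=[".*_DNA$", "re:age"],
    exclude=["KRAS.*"],
)
print(selected)
```

With the default arguments the helper returns all columns, mirroring the schema defaults of `[".*"]` for inclusion and an empty exclusion list.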
- class logml.data.transformers.imputation.SimpleImputeTransformer(**kwargs)
Bases:
logml.data.base.BaseTransformer
Provides imputation functionality.
- LABEL = 'impute'
- CONFIG_CLASS
alias of
logml.data.transformers.imputation.SimpleImputerParams
- fit(dataframe: pandas.core.frame.DataFrame, dataset_metadata: Optional[logml.data.metadata.DatasetMetadata] = None, **kwargs)
Prepares the transformer for subsequent ‘transform’ calls.
- transform(dataframe: pandas.core.frame.DataFrame) -> pandas.core.frame.DataFrame
Applies transformations and returns the result dataframe.
- update_transform_log(change: logml.data.utils.DataTransformLogItem)
Add custom data to the log.
- params: BaseTransformerParams
- global_params: Dict
- metadata_cfg: ModelingTaskSpec
- affected_columns_: List[str]
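The fit/transform contract and the four strategies listed above can be illustrated with a small standard-library sketch. The class `SketchImputer` is a hypothetical stand-in, not logml's `SimpleImputeTransformer`: it works on a dict of lists instead of a DataFrame, marks missing values with `None`, and assumes that `fill_value` is used only when the strategy is `constant`.

```python
from statistics import mean, median, mode


class SketchImputer:
    """Hypothetical stand-in, not logml's SimpleImputeTransformer."""

    def __init__(self, strategy="mean", fill_value=None):
        self.strategy = strategy
        self.fill_value = fill_value
        self.fill_values_ = {}

    def _statistic(self, observed):
        # Map the strategy name to the value used to fill gaps.
        if self.strategy == "mean":
            return mean(observed)
        if self.strategy == "median":
            return median(observed)
        if self.strategy == "most_frequent":
            return mode(observed)
        if self.strategy == "constant":
            # Assumption: fill_value only matters for this strategy.
            return self.fill_value
        raise ValueError(f"unknown strategy: {self.strategy!r}")

    def fit(self, table):
        # Learn one fill value per column from the non-missing entries.
        self.fill_values_ = {
            name: self._statistic([v for v in values if v is not None])
            for name, values in table.items()
        }
        return self

    def transform(self, table):
        # Replace missing entries with the values learned in fit().
        return {
            name: [self.fill_values_[name] if v is None else v for v in values]
            for name, values in table.items()
        }


train = {"age": [30, None, 50], "dose": [1.0, 2.0, None]}
imputer = SketchImputer(strategy="mean").fit(train)
filled = imputer.transform({"age": [None, 41], "dose": [None, 0.5]})
print(filled)
```

The point of the split is that the fill values come from the data passed to `fit`, so later `transform` calls on new data reuse them rather than recomputing statistics.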
- class logml.data.transformers.imputation.MICEImputeTransformer(*args, **kwargs)
Bases:
logml.data.base.BaseTransformer
Provides MICE (Multivariate Imputation by Chained Equations) imputation functionality. Note: affected columns are additionally filtered to numeric columns only.
- LABEL = 'impute_mice'
- CONFIG_CLASS
- fit(dataframe: pandas.core.frame.DataFrame, dataset_metadata: Optional[logml.data.metadata.DatasetMetadata] = None, **kwargs)
Prepares the transformer for subsequent ‘transform’ calls.
- transform(dataframe: pandas.core.frame.DataFrame) -> pandas.core.frame.DataFrame
Applies transformations and returns the result dataframe.
- update_transform_log(change: logml.data.utils.DataTransformLogItem)
Add custom data to the log.
- params: BaseTransformerParams
- global_params: Dict
- metadata_cfg: ModelingTaskSpec
- affected_columns_: List[str]
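To give a feel for the chained-equations idea, here is a minimal standard-library sketch for two numeric columns. It is not logml's implementation, which this page does not describe. All names (`fit_line`, `chained_impute`, the sample `table`) are hypothetical, and the sketch is a deterministic, single-imputation simplification: full MICE typically draws imputations from a predictive distribution and regresses each column on all others. Non-numeric columns are skipped first, in the spirit of the note above.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def chained_impute(x, y, n_iter=10):
    """Fill None entries in two columns by alternating regressions."""
    miss_x = [v is None for v in x]
    miss_y = [v is None for v in y]
    # Initialise every gap with the mean of the observed values.
    x_mean = mean([v for v in x if v is not None])
    y_mean = mean([v for v in y if v is not None])
    x = [x_mean if m else v for v, m in zip(x, miss_x)]
    y = [y_mean if m else v for v, m in zip(y, miss_y)]
    for _ in range(n_iter):
        # Regress x on y over rows where x was observed, then refill x gaps.
        a, b = fit_line(
            [yv for yv, m in zip(y, miss_x) if not m],
            [xv for xv, m in zip(x, miss_x) if not m],
        )
        x = [a + b * yv if m else xv for xv, yv, m in zip(x, y, miss_x)]
        # Regress y on x over rows where y was observed, then refill y gaps.
        a, b = fit_line(
            [xv for xv, m in zip(x, miss_y) if not m],
            [yv for yv, m in zip(y, miss_y) if not m],
        )
        y = [a + b * xv if m else yv for xv, yv, m in zip(x, y, miss_y)]
    return x, y


table = {
    "x": [1, 2, 3, 4, None],
    "y": [2, 4, None, 8, 10],
    "site": ["a", None, "b", "a", "b"],  # categorical, left untouched
}
# Keep only columns whose observed values are all numeric.
numeric = [
    name
    for name, values in table.items()
    if all(isinstance(v, (int, float)) for v in values if v is not None)
]
x_filled, y_filled = chained_impute(table["x"], table["y"])
print(numeric, x_filled, y_filled)
```

Each pass uses the current estimates of one column to re-predict the gaps in the other, which is the "chained" part of the method; observed values are never overwritten.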