logml.eda.artifacts_producers.metadata

Functions

filter_columns(dataframe[, columns_to_exclude])

Returns the dataframe's column names that are not in columns_to_exclude.

get_categorical_columns(dataframe[, ...])

Returns a list of all categorical column names.

get_numeric_columns(dataframe, ...)

Returns a list of all numeric column names.

Classes

MetadataProducer(metadata_cfg, global_params)

Produces lists of numerical, categorical, and all columns.

class logml.eda.artifacts_producers.metadata.MetadataProducer(metadata_cfg: logml.configuration.modeling.ModelingTaskSpec, global_params: dict, dataset_metadata: Optional[logml.data.metadata.DatasetMetadata] = None, logger=None, eda_params: Optional[logml.configuration.eda.EDAArtifactsGenerationParameters] = None)

Bases: logml.eda.artifacts_producers.base.BaseEDAArtifactsProducer

Produces:

  • list of numerical columns

  • list of categorical columns

  • list of all columns

Dependencies: none.

LABEL = 'metadata'
DEPENDENCIES = []
ALIAS = 'Metadata producer'
produce(dataframe: pandas.core.frame.DataFrame)

Generate dataset metadata artifact.
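The three lists named above can be pictured with a small stand-alone sketch. It does not use logml and is not the MetadataProducer implementation; the function name produce_metadata_sketch, the dictionary keys, and the rule "every non-numeric column is categorical" are assumptions made only for illustration.

```python
from typing import Dict, List

import pandas as pd


def produce_metadata_sketch(dataframe: pd.DataFrame) -> Dict[str, List[str]]:
    """Hypothetical stand-in: group column names into three lists."""
    all_columns = list(dataframe.columns)
    # Columns with a numeric dtype (ints, floats).
    numerical = list(dataframe.select_dtypes(include="number").columns)
    # Assumption: anything that is not numeric counts as categorical.
    categorical = [col for col in all_columns if col not in numerical]
    return {
        "numerical_columns": numerical,
        "categorical_columns": categorical,
        "all_columns": all_columns,
    }


df = pd.DataFrame({"age": [31, 45], "sex": ["F", "M"]})
metadata = produce_metadata_sketch(df)
```

The real artifact format produced by produce() may differ; check the logml sources before relying on any particular key names.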

logml.eda.artifacts_producers.metadata.get_numeric_columns(dataframe: pandas.core.frame.DataFrame, columns_to_exclude: List[str]) → List[str]

Returns a list of all numeric column names.
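A minimal sketch of how such a selection could be written with plain pandas, mirroring the documented signature. The name numeric_columns_sketch is hypothetical, and the use of select_dtypes is an assumption about one reasonable approach, not a description of logml's code.

```python
from typing import List

import pandas as pd


def numeric_columns_sketch(
    dataframe: pd.DataFrame, columns_to_exclude: List[str]
) -> List[str]:
    """Hypothetical stand-in: names of numeric columns, minus exclusions."""
    excluded = set(columns_to_exclude)
    # select_dtypes(include="number") keeps integer and float columns.
    numeric = dataframe.select_dtypes(include="number").columns
    return [col for col in numeric if col not in excluded]


df = pd.DataFrame(
    {"age": [31, 45], "weight": [70.5, 82.1], "sex": ["F", "M"], "id": [1, 2]}
)
numeric = numeric_columns_sketch(df, columns_to_exclude=["id"])
```

Here "id" is numeric but excluded explicitly, which is the typical reason for passing columns_to_exclude.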

logml.eda.artifacts_producers.metadata.get_categorical_columns(dataframe: pandas.core.frame.DataFrame, columns_to_exclude: Optional[List[str]] = None, accept_numeric_columns: bool = True) → List[str]

Returns a list of all categorical column names.
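The documentation does not say what accept_numeric_columns does. The sketch below assumes one plausible reading: when the flag is set, numeric columns with few distinct values are also treated as categorical. The function name categorical_columns_sketch and the max_unique threshold are hypothetical and are not part of the logml API.

```python
from typing import List, Optional

import pandas as pd


def categorical_columns_sketch(
    dataframe: pd.DataFrame,
    columns_to_exclude: Optional[List[str]] = None,
    accept_numeric_columns: bool = True,
    max_unique: int = 10,  # hypothetical threshold, not a logml parameter
) -> List[str]:
    """Hypothetical stand-in: names of columns treated as categorical."""
    excluded = set(columns_to_exclude or [])
    result = []
    for col in dataframe.columns:
        if col in excluded:
            continue
        series = dataframe[col]
        if not pd.api.types.is_numeric_dtype(series):
            # Any non-numeric dtype is treated as categorical in this sketch.
            result.append(col)
        elif accept_numeric_columns and series.nunique() <= max_unique:
            # Assumption: low-cardinality numeric columns also qualify.
            result.append(col)
    return result


df = pd.DataFrame(
    {
        "sex": ["F", "M", "F", "M"],
        "grade": [1, 2, 1, 2],
        "weight": [70.5, 82.1, 64.0, 90.3],
    }
)
with_numeric = categorical_columns_sketch(df, max_unique=2)
without_numeric = categorical_columns_sketch(
    df, accept_numeric_columns=False, max_unique=2
)
```

With the flag on, "grade" (two distinct values) is picked up alongside "sex"; with it off, only the non-numeric column remains.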

logml.eda.artifacts_producers.metadata.filter_columns(dataframe: pandas.core.frame.DataFrame, columns_to_exclude: Optional[List[str]] = None) → Iterable[str]

Returns the dataframe's column names that are not in columns_to_exclude.
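The behaviour described here can be approximated in a few lines of pandas. This is a sketch under stated assumptions, not logml's code: the name filter_columns_sketch is hypothetical, and treating columns_to_exclude=None as "exclude nothing" is inferred from the Optional default.

```python
from typing import Iterable, List, Optional

import pandas as pd


def filter_columns_sketch(
    dataframe: pd.DataFrame, columns_to_exclude: Optional[List[str]] = None
) -> Iterable[str]:
    """Hypothetical stand-in: column names not present in the exclusion list."""
    # Assumption: None means that no columns are excluded.
    excluded = set(columns_to_exclude or [])
    return [col for col in dataframe.columns if col not in excluded]


df = pd.DataFrame({"a": [1], "b": [2], "c": [3]})
kept = list(filter_columns_sketch(df, columns_to_exclude=["b"]))
everything = list(filter_columns_sketch(df))
```

Column order follows the dataframe, so the result can be used directly for indexing, for example df[kept].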