logml.eda.artifacts_producers.distributions
Functions
- create_hist_features: Retrieves histogram values and concatenates them.
- get_histograms: Returns histograms for a given list of numeric columns.
Classes
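- DistributionsSummaryProducer: Produces histograms for numerical columns and a similarity ordering for the calculated histograms.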
- class logml.eda.artifacts_producers.distributions.DistributionsSummaryProducer(metadata_cfg: logml.configuration.modeling.ModelingTaskSpec, global_params: dict, dataset_metadata: Optional[logml.data.metadata.DatasetMetadata] = None, logger=None, eda_params: Optional[logml.configuration.eda.EDAArtifactsGenerationParameters] = None)
Bases: logml.eda.artifacts_producers.base.BaseEDAArtifactsProducer
Produces: histograms for numerical columns; similarity ordering for calculated histograms.
Dependencies: metadata artifact.
- LABEL = 'distributions'
- DEPENDENCIES = ['metadata']
- ALIAS = 'Distributions summary producer'
- produce(dataframe: pandas.core.frame.DataFrame)
Creates and dumps the EDA artifact for the given dataframe.
- logml.eda.artifacts_producers.distributions.get_histograms(dataframe: pandas.core.frame.DataFrame, numeric_columns: List[str], bins: int = 30) -> List[Tuple[str, Dict]]
Returns histograms for a given list of numeric columns.
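The following is a minimal sketch of what a function with this signature might do, not the logml implementation. The name get_histograms_sketch, the use of numpy.histogram, the dropping of missing values, and the "counts" and "edges" dictionary keys are all assumptions made for illustration; the actual contents of the returned Dict are not documented here.

```python
from typing import Dict, List, Tuple

import numpy as np
import pandas as pd


def get_histograms_sketch(
    dataframe: pd.DataFrame, numeric_columns: List[str], bins: int = 30
) -> List[Tuple[str, Dict]]:
    """Hypothetical stand-in for get_histograms (not the logml code)."""
    histograms = []
    for column in numeric_columns:
        # Assumption: missing values are dropped before binning.
        values = dataframe[column].dropna().to_numpy(dtype=float)
        # np.histogram returns `bins` counts and `bins + 1` bin edges.
        counts, edges = np.histogram(values, bins=bins)
        # The "counts" and "edges" keys are invented for this sketch.
        histograms.append((column, {"counts": counts, "edges": edges}))
    return histograms


# Toy data, for illustration only.
df = pd.DataFrame(
    {"age": [21, 35, 35, 48, 62], "score": [0.1, 0.4, 0.4, 0.9, None]}
)
result = get_histograms_sketch(df, ["age", "score"], bins=4)
```

Each element of result pairs a column name with its histogram dictionary, which matches the documented return type List[Tuple[str, Dict]].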
- logml.eda.artifacts_producers.distributions.create_hist_features(histograms: Dict[str, Dict[str, numpy.array]]) -> numpy.array
Retrieves histogram values and concatenates them.
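One plausible reading of "concatenates" is that the arrays stored for each column are flattened and joined into a single feature vector, giving one row per column. The sketch below illustrates that reading only; the name create_hist_features_sketch, the "counts" and "edges" keys, and the row-per-column layout are assumptions, and the real function may arrange its output differently.

```python
from typing import Dict

import numpy as np


def create_hist_features_sketch(
    histograms: Dict[str, Dict[str, np.ndarray]]
) -> np.ndarray:
    """Hypothetical stand-in for create_hist_features (not the logml code)."""
    rows = []
    for histogram in histograms.values():
        # Flatten and join every array stored for this column.
        rows.append(np.concatenate([np.ravel(v) for v in histogram.values()]))
    # Assumption: all columns share the same bin count, so rows stack cleanly.
    return np.vstack(rows)


# Toy histograms, for illustration only.
histograms = {
    "age": {
        "counts": np.array([1, 2, 1, 1]),
        "edges": np.array([21.0, 31.25, 41.5, 51.75, 62.0]),
    },
    "score": {
        "counts": np.array([1, 2, 0, 1]),
        "edges": np.array([0.1, 0.3, 0.5, 0.7, 0.9]),
    },
}
features = create_hist_features_sketch(histograms)
```

A matrix of this shape could then feed a distance computation between columns, which is one way a similarity ordering of histograms might be derived.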