logml.eda.artifacts_producers.dimensionality_reduction
Functions
get_n_components(pca): Returns the number of principal components (PCs) needed to cover 95% of the variance.
Classes
DimensionalityReductionSummaryProducer: Produces PCA, TSNE, LDA, and MCA outputs.
- class logml.eda.artifacts_producers.dimensionality_reduction.DimensionalityReductionSummaryProducer(metadata_cfg: logml.configuration.modeling.ModelingTaskSpec, global_params: dict, dataset_metadata: Optional[logml.data.metadata.DatasetMetadata] = None, logger=None, eda_params: Optional[logml.configuration.eda.EDAArtifactsGenerationParameters] = None)
Bases: logml.eda.artifacts_producers.base.BaseEDAArtifactsProducer
Produces:
PCA output (+explained variance, feature weights)
TSNE output
LDA output
MCA output
Dependencies:
metadata artifact
- LABEL = 'dim_reduction'
- DEPENDENCIES = ['metadata']
- ALIAS = 'Dimensionality reduction summary producer'
- get_numeric_columns(dataframe: pandas.core.frame.DataFrame, target_column: Optional[str] = None) -> List[str]
Applies basic filtering to the list of numerical columns.
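The filtering rules themselves are not documented here. As a rough illustration only, the sketch below shows one plausible reading: keep columns with a numeric dtype and leave out the target column. The function name `numeric_columns_sketch` and the sample frame are hypothetical, and the actual logml method may apply different or additional filters.

```python
from typing import List, Optional

import pandas as pd


def numeric_columns_sketch(
    dataframe: pd.DataFrame, target_column: Optional[str] = None
) -> List[str]:
    """Hypothetical stand-in, not the logml implementation."""
    # Keep only columns with a numeric dtype.
    columns = dataframe.select_dtypes(include="number").columns.tolist()
    # Assumed filter: the target is not a feature, so leave it out.
    return [name for name in columns if name != target_column]


# Made-up example data, for illustration only.
frame = pd.DataFrame(
    {"age": [31, 45], "dose": [0.5, 1.0], "arm": ["a", "b"], "label": [0, 1]}
)
selected = numeric_columns_sketch(frame, target_column="label")
```

Here `selected` holds the numeric feature columns, with the string column `arm` and the target `label` excluded.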
- produce(dataframe: pandas.core.frame.DataFrame)
Creates and dumps EDA artifact for a given dataframe.
- logml.eda.artifacts_producers.dimensionality_reduction.get_n_components(pca: sklearn.decomposition._pca.PCA) -> Tuple[int, numpy.array]
Returns the number of principal components (PCs) needed to cover 95% of the variance.
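A minimal sketch of how such a count can be derived, assuming it comes from the cumulative sum of the per-component explained variance ratios (the values a fitted scikit-learn PCA exposes as `explained_variance_ratio_`). The helper name `n_components_for_variance`, the `threshold` parameter, and the sample ratios are hypothetical. Returning the cumulative array as the second tuple element is also an assumption, since the meaning of the array returned by the logml function is not documented here.

```python
from typing import Tuple

import numpy as np


def n_components_for_variance(
    explained_variance_ratio: np.ndarray, threshold: float = 0.95
) -> Tuple[int, np.ndarray]:
    """Hypothetical helper: count components needed to reach `threshold`."""
    cumulative = np.cumsum(explained_variance_ratio)
    # Index of the first cumulative value >= threshold, plus one to get a count.
    n_components = int(np.searchsorted(cumulative, threshold, side="left")) + 1
    # Rounding can leave the total just below the threshold; never report
    # more components than exist.
    n_components = min(n_components, len(cumulative))
    return n_components, cumulative


# Made-up ratios, for illustration only.
ratios = np.array([0.5, 0.25, 0.125, 0.0625, 0.0625])
n, cumulative = n_components_for_variance(ratios)
```

With a fitted PCA object, the same helper could be called as `n_components_for_variance(pca.explained_variance_ratio_)`.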