EDA Artifact Types Registry
- EDA Artifacts
Provides registry functionality for EDA artifact producers. For implementation details, see EligibleEDAArtifactsProducers.
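The DEPENDENCIES attributes listed for each artifact type on this page imply an execution order: metadata has to be produced before any artifact that depends on it. The following is a minimal sketch of that ordering using only the Python standard library. The `DEPENDENCIES` dict and the `resolve_order` function are hypothetical names made up for illustration; they are not part of the logml API, and the real registry may resolve dependencies differently.

```python
from graphlib import TopologicalSorter

# Artifact name -> names it depends on, copied from the DEPENDENCIES
# attributes documented on this page. The dict itself is illustrative only.
DEPENDENCIES = {
    "metadata": [],
    "correlation": ["metadata"],
    "dim_reduction": ["metadata"],
    "distributions": ["metadata"],
    "missingness": ["metadata"],
    "statistics": ["metadata"],
}


def resolve_order(requested):
    """Return the requested artifacts plus their dependencies, dependencies first."""
    needed = set()
    stack = list(requested)
    while stack:
        name = stack.pop()
        if name not in DEPENDENCIES:
            raise KeyError(f"unknown artifact type: {name}")
        if name not in needed:
            needed.add(name)
            stack.extend(DEPENDENCIES[name])
    # TopologicalSorter expects a mapping of node -> predecessors.
    graph = {name: DEPENDENCIES[name] for name in sorted(needed)}
    return list(TopologicalSorter(graph).static_order())
```

Calling `resolve_order(["correlation", "statistics"])` returns a list that starts with `"metadata"`; the relative order of the two requested artifacts is not significant because neither depends on the other.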
correlation
- Description
Produces:
  - Pearson/Spearman correlation for numerical columns
  - an ordering of the artifact by similarity, computed with AgglomerativeClustering
  - a list of correlation groups
See logml.eda.artifacts.correlation.CorrelationSummary.
Dependencies:
  - metadata artifact
For implementation details, see CorrelationSummaryProducer.
- Attributes
DEPENDENCIES: ['metadata']
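To make the difference between the two correlation measures concrete, here is a small standard-library sketch. The function names `pearson`, `ranks`, and `spearman` are hypothetical and do not reflect how CorrelationSummaryProducer is implemented; the clustering-based ordering and the correlation groups are not covered here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(ranks(xs), ranks(ys))
```

For a monotonic but non-linear relationship such as `xs = [1, 2, 3, 4, 5]` and `ys = [1, 4, 9, 16, 25]`, `spearman` is 1 (up to floating-point error) while `pearson` is slightly below 1, which is why both measures are useful side by side.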
dim_reduction
- Description
Produces:
  - PCA output (plus explained variance and feature weights)
  - TSNE output
  - LDA output
  - MCA output
Dependencies:
  - metadata artifact
For implementation details, see DimensionalityReductionSummaryProducer.
- Attributes
DEPENDENCIES: ['metadata']
distributions
- Description
Produces:
  - histograms for numerical columns
  - similarity ordering for calculated histograms
Dependencies:
  - metadata artifact
For implementation details, see DistributionsSummaryProducer.
- Attributes
DEPENDENCIES: ['metadata']
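As a rough illustration of what a histogram for one numerical column contains, the sketch below bins values into equal-width intervals. The `histogram` function and its `bins` parameter are hypothetical; the binning strategy used by DistributionsSummaryProducer is not described here and may differ. The similarity ordering of histograms is not shown.

```python
def histogram(values, bins=10):
    """Count values into `bins` equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 so constant columns do not divide by zero.
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for v in values:
        # The maximum value is clamped into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges
```

The function returns the per-bin counts together with the `bins + 1` bin edges, which is the usual pair needed to draw or compare histograms.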
metadata
- Description
Produces:
  - list of numerical columns
  - list of categorical columns
  - list of all columns
Dependencies: none.
For implementation details, see MetadataProducer.
Attributes
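The three column lists can be pictured with a small standard-library sketch. The `split_columns` function is hypothetical, and the rule it applies (a column is numerical when all of its non-missing values are numbers) is an assumption made for illustration; MetadataProducer may classify columns differently.

```python
from numbers import Number


def split_columns(rows):
    """Classify columns of a list-of-dicts table as numerical or categorical."""
    # Collect column names in first-seen order.
    all_columns = []
    for row in rows:
        for name in row:
            if name not in all_columns:
                all_columns.append(name)

    numerical, categorical = [], []
    for name in all_columns:
        present = [row[name] for row in rows if row.get(name) is not None]
        # Assumption: booleans are treated as categorical, not numerical.
        is_numeric = bool(present) and all(
            isinstance(v, Number) and not isinstance(v, bool) for v in present
        )
        (numerical if is_numeric else categorical).append(name)
    return {"numerical": numerical, "categorical": categorical, "all": all_columns}
```

Because every other artifact type on this page depends on metadata, a classification of this kind determines which columns each downstream producer works on.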
missingness
- Description
Produces:
  - per-column missing-value summaries (for numerical, categorical, and all columns)
  - per-row missing-value summaries (for numerical, categorical, and all columns)
  - complete dataset summaries (for numerical, categorical, and all columns)
  - similarity ordering by pairwise NaN distances
Dependencies:
  - metadata artifact
For implementation details, see MissingnessSummaryProducer.
- Attributes
DEPENDENCIES: ['metadata']
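A minimal sketch of per-column and per-row missing-value counts follows, assuming missing cells are represented as `None` in a list-of-dicts table. The `missingness_summary` function and its output keys are hypothetical and do not mirror the structure produced by MissingnessSummaryProducer; the NaN-distance ordering is not shown.

```python
def missingness_summary(rows, columns):
    """Count missing (None) cells per column and per row, plus complete rows."""
    per_column = {c: sum(1 for r in rows if r.get(c) is None) for c in columns}
    per_row = [sum(1 for c in columns if r.get(c) is None) for r in rows]
    # A row is complete when none of the selected columns is missing.
    complete_rows = sum(1 for n in per_row if n == 0)
    return {
        "per_column": per_column,
        "per_row": per_row,
        "complete_rows": complete_rows,
    }
```

Passing only the numerical or only the categorical column names as `columns` gives the corresponding restricted summary, which is one way to read the "numerical, categorical, and all columns" variants listed above.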
statistics
- Description
Produces:
  - multiple statistics for numerical columns
Dependencies:
  - metadata artifact
For implementation details, see StatisticsSummaryProducer.
- Attributes
DEPENDENCIES: ['metadata']
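This page does not say which statistics are computed, so the sketch below simply picks a few common ones for a single numerical column. The `describe` function and the chosen set of statistics are assumptions for illustration, not the output of StatisticsSummaryProducer.

```python
import statistics


def describe(values):
    """Basic summary statistics for one numerical column, ignoring None."""
    values = [v for v in values if v is not None]
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        # The sample standard deviation needs at least two values.
        "std": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }
```

Missing values are dropped before computing anything, so `count` reports the number of non-missing entries rather than the column length.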