logml.report.plotters.feature_importance
Functions
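build_bootstrapping_result_plots | Produces custom plots for a given model, assuming its bootstrapped coefficients are used as feature importance.
build_fi_rankings_heatmap | For a set of FI rankings (per method) shows a heatmap.
build_fi_rankings_heatmap_h | For a set of FI rankings (per method) shows a heatmap.
filter_ranks_df | Filters ranks dataset by rank.
generate_buttons | Returns a list of simple plotly-compatible button configurations.
plot_stata_similarity_heatmaps | Plot strata similarity heatmaps
plot_strata_ranks_clustermap | Display clustermap of strata features ranking.
show_categorical_target_vs_numerical_features_associations | Creates an interactive view for examining target vs features relationships.
show_cross_strata_fi_rankings_scatter | Simple scatter plot displaying average rankings for two given strata.
show_numerical_target_vs_categorical_features_associations | Creates an interactive view for examining target vs features relationships.
show_numerical_target_vs_numerical_features_associations | Creates an interactive view for examining target vs features relationships.
show_ranking_scatter_plot | Shows ranks for all pairs of strata.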
- logml.report.plotters.feature_importance.build_fi_rankings_heatmap(results: pandas.core.frame.DataFrame, title: str, **kwargs)
For a set of FI rankings (per method) shows a heatmap. The brighter the color, the better the rank.
- logml.report.plotters.feature_importance.build_fi_rankings_heatmap_h(results: pandas.core.frame.DataFrame, title: str, max_name_len=22, **kwargs)
For a set of FI rankings (per method) shows a heatmap. The brighter the color, the better the rank.
Differs from build_fi_rankings_heatmap in that the chart is laid out horizontally, because plotly provides sliders only for the x-axis.
- logml.report.plotters.feature_importance.build_bootstrapping_result_plots(result: pandas.core.frame.DataFrame)
Produces custom plots for a given model assuming we use its bootstrapped coefficients as feature importance.
Before plotting, the (frequency, coefficient) pairs are filtered as follows:
coefficients
a) If any RANDOM features were used, the maximum absolute value of their coefficients is used as the threshold.
b) Otherwise, 1e-2 is used as the threshold for absolute-value filtering.
frequency
a) If there are enough (> 20) features with frequency > 0.7, only those are plotted.
b) Otherwise, all features are plotted.
Plots produced:
Summary plot - lists all resulting features as a table.
Bar chart - to visually compare feature coefficient magnitudes.
Scatter plot - to visually assess coefficients and frequencies.
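The filtering rules above can be sketched roughly as follows. This is an illustration only, not the library's code: the function name filter_bootstrap_pairs, the column names feature, coefficient and frequency, the "RANDOM" name prefix used to detect random features, the strict comparisons, and the order of the two filters are all assumptions.

```python
import pandas as pd


def filter_bootstrap_pairs(
    result: pd.DataFrame,
    random_prefix: str = "RANDOM",    # assumed naming of random features
    default_threshold: float = 1e-2,  # fallback threshold from the docs
    min_frequency: float = 0.7,
    min_count: int = 20,
) -> pd.DataFrame:
    """Hypothetical sketch of the (frequency, coefficient) filtering."""
    abs_coef = result["coefficient"].abs()
    is_random = result["feature"].str.startswith(random_prefix)

    # Coefficient rule: random features set the threshold when present.
    if is_random.any():
        threshold = abs_coef[is_random].max()
    else:
        threshold = default_threshold
    kept = result[abs_coef > threshold]

    # Frequency rule: restrict to frequent features only if enough remain.
    frequent = kept[kept["frequency"] > min_frequency]
    return frequent if len(frequent) > min_count else kept
```

With a strict comparison, the random features themselves never pass the coefficient threshold, so they do not need to be dropped separately in this sketch.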
- logml.report.plotters.feature_importance.show_cross_strata_fi_rankings_scatter(data: pandas.core.frame.DataFrame, x_column: str, y_column: str)
Simple scatter plot displaying average rankings for two given strata.
- logml.report.plotters.feature_importance.show_ranking_scatter_plot(rank_data: pandas.core.frame.DataFrame)
Shows ranks for all pairs of strata.
- logml.report.plotters.feature_importance.filter_ranks_df(df, max_rank=30)
Filters ranks dataset by rank. Expected df format: index = feature names, columns = strata names, values = ranks. Ranks are 1-based.
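A small example of the expected input layout, with one plausible reading of the filter. The documentation does not state the exact rule, so the rule used here (keep a feature if it ranks within max_rank in at least one stratum) and the name filter_ranks_sketch are assumptions for illustration.

```python
import pandas as pd


def filter_ranks_sketch(df: pd.DataFrame, max_rank: int = 30) -> pd.DataFrame:
    """Hypothetical filter: keep rows with any rank <= max_rank."""
    # Missing ranks (NaN) compare as False and never qualify a feature.
    keep = (df <= max_rank).any(axis=1)
    return df[keep]


# index = feature names, columns = strata names, values = 1-based ranks
ranks = pd.DataFrame(
    {"stratum_a": [1, 40, 55], "stratum_b": [2, 12, 70]},
    index=["age", "weight", "height"],
)
top = filter_ranks_sketch(ranks, max_rank=30)
```

Here "height" is dropped because it ranks below 30 in both strata, while "weight" survives on the strength of its rank in stratum_b.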
- logml.report.plotters.feature_importance.plot_strata_ranks_clustermap(df, cols_filter=None, replace_empty=None, max_rank=None, cmap='Greens_r', figsize=(12, 22), title=None)
Display clustermap of strata features ranking.
- logml.report.plotters.feature_importance.plot_stata_similarity_heatmaps(col_width, labels, plots_per_row, row_height, stats_data)
Plot strata similarity heatmaps
- logml.report.plotters.feature_importance.show_numerical_target_vs_numerical_features_associations(dataframe: pandas.core.frame.DataFrame, target: str, features: List[str])
Creates an interactive view for examining target vs features relationships.
Target column is assumed to be numerical. Features are assumed to be numerical.
- logml.report.plotters.feature_importance.show_numerical_target_vs_categorical_features_associations(dataframe: pandas.core.frame.DataFrame, target: str, features: List[str])
Creates an interactive view for examining target vs features relationships.
Target column is assumed to be numerical. Features are assumed to be categorical.
- logml.report.plotters.feature_importance.show_categorical_target_vs_numerical_features_associations(dataframe: pandas.core.frame.DataFrame, target: str, features: List[str])
Creates an interactive view for examining target vs features relationships.
Target column is assumed to be categorical. Features are assumed to be numerical.
- logml.report.plotters.feature_importance.generate_buttons(dataframe: pandas.core.frame.DataFrame, target: str, features: List[str], swap_axes: bool = False, **kwargs) → List[Dict]
Returns a list with simple plotly-compatible buttons configurations.
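For orientation, plotly button configurations are plain dictionaries with label, method and args keys, placed under layout.updatemenus. The sketch below builds one button per feature that shows only that feature's trace. It is not the implementation of generate_buttons: the helper name make_visibility_buttons and the one-trace-per-feature layout are assumptions.

```python
from typing import Dict, List


def make_visibility_buttons(features: List[str]) -> List[Dict]:
    """Hypothetical helper: one button per feature, assuming one trace each."""
    buttons = []
    for i, name in enumerate(features):
        # Show only the i-th trace and hide the others.
        visible = [j == i for j in range(len(features))]
        buttons.append(
            {
                "label": name,
                "method": "update",  # updates trace data and layout together
                "args": [{"visible": visible}, {"title.text": name}],
            }
        )
    return buttons


buttons = make_visibility_buttons(["age", "weight"])
```

Such a list would typically be attached to a figure with fig.update_layout(updatemenus=[{"buttons": buttons}]).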