logml.survival_analysis.optcutoff

Survival Optimal Cutoff module - Searches for a feature’s split which maximizes survival.

This is a work in progress and is intended to replace everything in the survival_analysis folder.

Functions

binarize(values, threshold)

Binarizes values based on a threshold.

get_column_opt_cutoff(df, column, ...[, ...])

Finds the optimal cutoff for the given column.

get_columns_opt_cutoff(df, columns, ...[, ...])

Calls get_column_opt_cutoff for each of the columns provided.

get_valid_cutoffs(values[, n_percentiles, ...])

Returns a list of percentile values that split the value range into valid parts.

Classes

ProgressParallel([use_tqdm, total])

SAOptimalCutoff(params, time_column, ...[, ...])

Survival Optimal cutoff - searches for a feature's split which maximizes survival.

class logml.survival_analysis.optcutoff.ProgressParallel(use_tqdm=True, total=None, *args, **kwargs)

Bases: joblib.parallel.Parallel

print_progress()

Displays the progress of the parallel execution only a fraction of the time, controlled by self.verbose.

logml.survival_analysis.optcutoff.get_valid_cutoffs(values: numpy.ndarray, n_percentiles: int = 50, min_population: float = 0.0) → numpy.ndarray

Returns a list of percentile values that split the value range into valid parts.
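The page does not say what makes a part "valid". The sketch below is one plausible reading, not the logml implementation: it takes evenly spaced percentiles of the values and keeps only those that leave at least a `min_population` fraction of samples on each side. The function name `valid_cutoffs_sketch` and the "strictly greater than" split rule are assumptions made for illustration.

```python
import numpy as np


def valid_cutoffs_sketch(values, n_percentiles=50, min_population=0.0):
    """Hypothetical stand-in for get_valid_cutoffs (assumed behaviour)."""
    values = np.asarray(values, dtype=float)
    values = values[~np.isnan(values)]
    if values.size == 0:
        return np.array([])

    # Evenly spaced percentiles strictly inside (0, 100).
    qs = np.linspace(0, 100, n_percentiles + 2)[1:-1]
    candidates = np.unique(np.percentile(values, qs))

    n = values.size
    keep = []
    for c in candidates:
        n_high = int(np.sum(values > c))  # assumed rule: "high" means > cutoff
        n_low = n - n_high
        # Both groups must be non-empty and hold at least min_population of the samples.
        if n_low > 0 and n_high > 0 and min(n_low, n_high) >= min_population * n:
            keep.append(c)
    return np.array(keep)


# Usage: candidate cutoffs for 100 evenly spread values.
cutoffs = valid_cutoffs_sketch(np.arange(1, 101), n_percentiles=9, min_population=0.25)
```

Comparing counts rather than fractions avoids floating-point surprises when a group sits exactly on the `min_population` boundary.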

logml.survival_analysis.optcutoff.binarize(values: numpy.ndarray, threshold: float)

Binarizes values based on a threshold.
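As a minimal sketch of what thresholding looks like in NumPy: the page does not state whether the comparison is strict or what the output dtype is, so both choices below (strict `>`, integer 0/1 output) are assumptions, and `binarize_sketch` is a hypothetical name.

```python
import numpy as np


def binarize_sketch(values, threshold):
    """Hypothetical stand-in for binarize (assumed behaviour)."""
    # Assumption: values strictly above the threshold map to 1, the rest to 0.
    return (np.asarray(values) > threshold).astype(int)


groups = binarize_sketch([1.0, 5.0, 10.0], threshold=5.0)
```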

logml.survival_analysis.optcutoff.get_column_opt_cutoff(df: pandas.core.frame.DataFrame, column: str, event_column: str, time_column: str, min_population: float = 0.2, n_percentiles: int = 50, cox_cols_mapping: Optional[dict] = None, errors: str = 'report') → dict

Finds the optimal cutoff for the given column.
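The scoring criterion is not documented here (the `cox_cols_mapping` parameter suggests a Cox model may be involved). To illustrate the general idea of a cutoff scan, the sketch below swaps in a two-group log-rank chi-square statistic and picks the candidate cutoff with the largest value. The names `logrank_chi2` and `scan_cutoffs_sketch` and the returned dict keys are hypothetical, and the code works on plain arrays; with a DataFrame you would pass something like `df[column].to_numpy()`.

```python
import numpy as np


def logrank_chi2(time, event, group):
    """Two-group log-rank chi-square statistic (illustrative criterion only)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event).astype(bool)
    group = np.asarray(group).astype(bool)

    obs_minus_exp = 0.0
    variance = 0.0
    for t in np.unique(time[event]):
        at_risk = time >= t
        n = at_risk.sum()                 # subjects still at risk at time t
        n1 = (at_risk & group).sum()      # of those, members of group 1
        died = (time == t) & event
        d = died.sum()                    # events at time t
        d1 = (died & group).sum()         # events at time t in group 1
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return float(obs_minus_exp ** 2 / variance) if variance > 0 else 0.0


def scan_cutoffs_sketch(values, time, event, cutoffs):
    """Return the candidate cutoff with the largest log-rank statistic."""
    values = np.asarray(values, dtype=float)
    best = {"cutoff": None, "chi2": 0.0}
    for c in cutoffs:
        chi2 = logrank_chi2(time, event, values > c)  # assumed split rule: > cutoff
        if best["cutoff"] is None or chi2 > best["chi2"]:
            best = {"cutoff": c, "chi2": chi2}
    return best


# Usage with toy data: every subject has an observed event.
feature = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
durations = np.array([1, 2, 3, 4, 10, 11, 12, 13], dtype=float)
observed = np.ones(8, dtype=int)
result = scan_cutoffs_sketch(feature, durations, observed, cutoffs=[2.5, 4.5, 6.5])
```

Scanning many cutoffs and reporting the best statistic inflates significance, so any p-value derived from the winning split would need a multiple-testing correction.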

logml.survival_analysis.optcutoff.get_columns_opt_cutoff(df: pandas.core.frame.DataFrame, columns: List[str], event_column: str, time_column: str, min_population: float = 0.2, n_percentiles: int = 50, cox_cols_mapping: Optional[dict] = None, errors: str = 'report') → List

Calls get_column_opt_cutoff for each of the columns provided.

class logml.survival_analysis.optcutoff.SAOptimalCutoff(params: logml.survival_analysis.extractors.optimal_cut_off.OptimalCutOffSAParams, time_column: str, event_column: str, logger=None, group_labels: Optional[dict] = None, n_jobs: int = 1)

Bases: object

Survival Optimal cutoff - searches for a feature’s split which maximizes survival.

fit(df: pandas.core.frame.DataFrame)

Finds the optimal split for each feature.

Parameters

df – DataFrame with numerical columns.